Skip to content

Feature implementation from commits 70dc62d..9e60c32 #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 15 commits into
base: feature-base-branch-1
Choose a base branch
from

Conversation

codeOwlAI
Copy link
Owner

@codeOwlAI codeOwlAI commented Jun 30, 2025

PR Summary

Code Quality Improvements and Feature Enhancements

Overview

This PR includes widespread refactoring to improve code quality, adds support for function calling APIs, implements parallel multi-trace experiment generation, and enhances resource management. It also removes deprecated code related to the v3 proposal version and improves Docker container cleanup mechanisms.

Change Types

Type Description
Enhancement Added support for function calling in LLM interactions and parallel multi-trace experiment generation
Refactor Simplified code patterns, improved type annotations, and removed deprecated functionality
Bugfix Enhanced resource management for Docker containers with proper cleanup in finally blocks
Removal Eliminated unused imports, deprecated v3 proposal version support, and DMDockerConf/DMDockerEnv classes

Affected Modules

Module / File Change Description
oai/backend/* Added function calling support with appropriate interfaces
scenarios/data_science/proposal/exp_gen/parallel.py Added ParallelMultiTraceExpGen class for asynchronous parallel exploration
utils/env.py Improved Docker container resource management with cleanup_container function
scenarios/data_science/loop.py Enhanced trace selection behavior and refactored experiment handling
scenarios/data_science/proposal/exp_gen/proposal.py Consolidated functionality by removing v3 code and adding function calling
app/qlib_rd_loop/* Changed report handling and simplified workflow

Breaking Changes

  • Changed function signatures in scenarios/kaggle/kaggle_crawler.py by removing default parameters
  • Changed extract_hypothesis_and_exp_from_reports return value from tuple to single return value
  • Changed report_limit from optional to fixed value of 10000
  • Changed return type annotations for multiple experiment loaders

Notes for Reviewers

  • The PR implements consistent error handling patterns with try/finally blocks for resource cleanup
  • Function calling support is conditionally enabled based on backend capabilities
  • Experiment selection logic has been made more robust for cases with multiple leaves
  • Docker container cleanup has been significantly improved with the new cleanup_container utility

WinstonLiyt and others added 15 commits June 18, 2025 10:58
* use simple stdout and stderr
* add live_output config in LocalConf
* refactor rdagent(q) conf files

* fix

* fix ci
* feat: parameterize cache paths with USER to avoid conflicts

* guide for missing training_hyperparameters

* guidance for  KeyError: 'concise_reason'

* fixed three bugs in the test

* fix general_model task bug

* fixed some bugs in the med_model scenario

* delete comments

* format with black

* fix mypy error

* fix ruff error

* fix isort error

* sync code

* revert cache_path code

* revert cache_path code

* delete data mining scenario

* fix factor report loop

* fix LiteLLMAPIBackend log_llm_chat_content setting

* refine fin factor report scenario

* remove unused LogColors

* fix UI

* remove medical scenario docs

* change **kaggle** to **data_science**

* remove default dataset_path in create_debug_data

* remove KAGGLE_SETTINGS in kaggle_crawler

* limit litellm versions

* reformat with black

* change README

* fix_data_science_docs

* make hypothesis observations string

* Hiding old versions of kaggle docs

* hidding kaggle agent docs

---------

Co-authored-by: Young <afe.young@gmail.com>
Co-authored-by: Bowen Xian <xianbowen@outlook.com>
Co-authored-by: yuanteli <1957922024@qq.com>
Release-As: 0.5.0
* add coder version

* merge cooder and feedback prompts

* align v2 and v3 proposal prompts

* fix a small bug

* fix a bug

* fix another bug

* support both function calling and json mode in v2 proposal

* fix minor bug

* reformat

* remove proposal v3

* fix a small bug in json mode

* fix CI

* remove tmp file

* remove v3 check

---------

Co-authored-by: Xu Yang <xuyang1@microsoft.com>
…down (microsoft#975)

* Initial plan for issue

* Fix Docker container cleanup issue by using try-finally block

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

* Fix additional Docker container leaks in health_check and GPU test functions

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

* Remove temporary test files and finalize Docker container cleanup fix

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

* Refactor container cleanup code to reduce duplication as requested in review feedback

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

* Refactor container cleanup to use shared function and always stop before remove

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

* fix CI

* Fix mypy type checking errors for Docker container cleanup

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

* fix CI

* Remove unnecessary _cleanup_container wrapper method in DockerEnv class

Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: peteryang1 <25981102+peteryang1@users.noreply.github.com>
Co-authored-by: Xu Yang <xuyang1@microsoft.com>
* feat: add parquet preview and extract common DataFrame preview logic

* refactor: improve error messages, prompts, regex, and session loading

* lint
* merge support more traces
* use feedback from all traces
Co-authored-by: Xu Yang <xuyang1@microsoft.com>
* refactor: rename failed_exp_and_feedback_list to include _after_sota suffix

* refactor: merge prompts_v3 into prompts_v2 and update references
* start to work on multi-trace + async

* init ver of async-multi-tarce, to test

* add eng-ver log

* complete version of async+ mul-trace

* debug

* fix bug on         DS_RD_SETTING.get()

* update

* fix bug + simplif the usage of async in multi-trace

* fix mini bug of arg_name

* Move local_selection into class Experiment & clean the code
Comment on lines +561 to +562
problem_formatted_str += f"## {i+1}. {problem_name}\n"
problem_formatted_str += f"{problem_dict['problem']}\n"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Potential KeyError Exception on Missing Dictionary Key.

The safety check for 'problem' key existence was removed, which could cause runtime KeyError exceptions if any problem_dict is missing this key.

Current Code (Diff):

- problem_formatted_str += f"## {i+1}. {problem_name}\n"
- problem_formatted_str += f"{problem_dict['problem']}\n"
+ if "problem" not in problem_dict:
+     continue
+ problem_formatted_str += f"## {i+1}. {problem_name}\n"
+ problem_formatted_str += f"{problem_dict['problem']}\n"
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
problem_formatted_str += f"## {i+1}. {problem_name}\n"
problem_formatted_str += f"{problem_dict['problem']}\n"
if "problem" not in problem_dict:
continue
problem_formatted_str += f"## {i+1}. {problem_name}\n"
problem_formatted_str += f"{problem_dict['problem']}\n"

print(table)
for log in logs:
decoded_log = log.strip().decode()
decoded_log = self.replace_time_info(decoded_log) if remove_timestamp else decoded_log
Console().print(decoded_log, markup=False)
log_output += decoded_log + "\n"
exit_status = container.wait()["StatusCode"]
container.stop()
container.remove()
print(Rule("[bold green]Docker Logs End[/bold green]", style="dark_orange"))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Docker Container Resource Leak.

The removal of container cleanup calls will cause Docker containers to remain running indefinitely, leading to resource exhaustion over time.

Current Code (Diff):

- print(Rule("[bold green]Docker Logs End[/bold green]", style="dark_orange"))
+ container.stop()
+ container.remove()
+ print(Rule("[bold green]Docker Logs End[/bold green]", style="dark_orange"))
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
print(Rule("[bold green]Docker Logs End[/bold green]", style="dark_orange"))
container.stop()
container.remove()
print(Rule("[bold green]Docker Logs End[/bold green]", style="dark_orange"))

logger.info(f"Processing number {self.pdf_file_index} report: {report_file_path}")
self.pdf_file_index += 1
exp, hypothesis = extract_hypothesis_and_exp_from_reports(str(report_file_path))
report_file_path = self.judge_pdf_data_items[self.loop_idx]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Undefined variable self.loop_idx causes AttributeError.

The code references self.loop_idx which is not initialized anywhere in the class, causing an AttributeError at runtime.

Current Code (Diff):

-             report_file_path = self.judge_pdf_data_items[self.loop_idx]
+             report_file_path = self.judge_pdf_data_items[self.loop_n - 1]
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
report_file_path = self.judge_pdf_data_items[self.loop_idx]
report_file_path = self.judge_pdf_data_items[self.loop_n - 1]

self.pdf_file_index += 1
exp, hypothesis = extract_hypothesis_and_exp_from_reports(str(report_file_path))
report_file_path = self.judge_pdf_data_items[self.loop_idx]
logger.info(f"Processing number {self.loop_idx} report: {report_file_path}")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Undefined variable self.loop_idx in log statement.

The logging statement references self.loop_idx which is not initialized, causing an AttributeError at runtime.

Current Code (Diff):

-             logger.info(f"Processing number {self.loop_idx} report: {report_file_path}")
+             logger.info(f"Processing number {self.loop_n - 1} report: {report_file_path}")
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
logger.info(f"Processing number {self.loop_idx} report: {report_file_path}")
logger.info(f"Processing number {self.loop_n - 1} report: {report_file_path}")

@@ -105,13 +108,15 @@ def load(self, model_dict: dict) -> list:
architecture=model_data["architecture"],
variables=model_data["variables"],
hyperparameters=model_data["hyperparameters"],
training_hyperparameters=model_data["training_hyperparameters"],
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Potential KeyError Exception.

Direct access to model_data["training_hyperparameters"] without checking if this key exists will cause runtime crashes when processing model data dictionaries that don't contain this field.

Current Code (Diff):

-                training_hyperparameters=model_data["training_hyperparameters"],
+                training_hyperparameters=model_data.get("training_hyperparameters", {}),
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
training_hyperparameters=model_data["training_hyperparameters"],
training_hyperparameters=model_data.get("training_hyperparameters", {}),

):
sms_all = sms
sms = sms.loc[QLIB_SELECTED_METRICS]
sms = msg.content.result
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Removed Null Check Causes Potential Crash.

The code now directly accesses msg.content.result without checking if it's None first, which can cause a NullPointerException at runtime.

Current Code (Diff):

+                         if msg.content.result is None:
+                             # Handle null case appropriately
+                             continue
                          sms = msg.content.result
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
sms = msg.content.result
if msg.content.result is None:
# Handle null case appropriately
continue
sms = msg.content.result

Comment on lines +223 to +226
sms.name = f"Round {state.lround}"
sms_all.name = f"Round {state.lround}"
state.metric_series.append(sms)
state.all_metric_series.append(sms_all)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Undefined Variable Reference.

The variable sms_all is only initialized within the QLib scenarios conditional block but is used unconditionally, causing an undefined variable error in non-QLib scenarios.

Current Code (Diff):

                         sms.name = f"Round {state.lround}"
-                         sms_all.name = f"Round {state.lround}"
                         state.metric_series.append(sms)
-                         state.all_metric_series.append(sms_all)
+                         if isinstance(state.scenario, (QlibModelScenario, QlibFactorFromReportScenario, QlibFactorScenario, QlibQuantScenario)):
+                             sms_all.name = f"Round {state.lround}"
+                             state.all_metric_series.append(sms_all)
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
sms.name = f"Round {state.lround}"
sms_all.name = f"Round {state.lround}"
state.metric_series.append(sms)
state.all_metric_series.append(sms_all)
sms.name = f"Round {state.lround}"
if isinstance(state.scenario, (QlibModelScenario, QlibFactorFromReportScenario, QlibFactorScenario, QlibQuantScenario)):
sms_all.name = f"Round {state.lround}"
state.all_metric_series.append(sms_all)
state.metric_series.append(sms)

showed_keys = []
for k, v in rdict.items():
if k.startswith("model_") and k.endswith(".py"):
if k.endswith(".py"):
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔒 Security Issue

Security: Expanded File Visibility.

This change broadens file visibility from model_*.py files to all Python files, potentially exposing sensitive configuration or credentials.

Current Code (Diff):

-                                if k.endswith(".py"):
+                                if k.startswith("model_") and k.endswith(".py"):
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
if k.endswith(".py"):
if k.startswith("model_") and k.endswith(".py"):

Comment on lines +170 to +173
success_fb_trace_count = sum(1 for fb_list in trace_fbs if fb_list)
success_fb_list = [
fb for fb_list in trace_fbs for fb in fb_list[-(num_to_slice // success_fb_trace_count) :]
]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Division by Zero Risk.

If all feedback lists are empty but non-zero in count, success_fb_trace_count will be zero, causing a division by zero error when calculating the slice size.

Current Code (Diff):

-             success_fb_trace_count = sum(1 for fb_list in trace_fbs if fb_list)
-             success_fb_list = [
-                 fb for fb_list in trace_fbs for fb in fb_list[-(num_to_slice // success_fb_trace_count) :]
-             ]
+             success_fb_trace_count = sum(1 for fb_list in trace_fbs if fb_list)
+             if success_fb_trace_count == 0:
+                 success_fb_list = []
+             else:
+                 success_fb_list = [
+                     fb for fb_list in trace_fbs for fb in fb_list[-(num_to_slice // success_fb_trace_count) :]
+                 ]
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
success_fb_trace_count = sum(1 for fb_list in trace_fbs if fb_list)
success_fb_list = [
fb for fb_list in trace_fbs for fb in fb_list[-(num_to_slice // success_fb_trace_count) :]
]
success_fb_trace_count = sum(1 for fb_list in trace_fbs if fb_list)
if success_fb_trace_count == 0:
success_fb_list = []
else:
success_fb_list = [
fb for fb_list in trace_fbs for fb in fb_list[-(num_to_slice // success_fb_trace_count) :]
]

@@ -724,7 +761,7 @@ def task_gen(
description=description,
)
new_workflow_desc = task_dict.get("workflow_update", "No update needed")
exp = DSExperiment(pending_tasks_list=[[task]], hypothesis=hypothesis)
exp = DSExperiment(pending_tasks_list=[[task]], hypothesis=hypotheses[0])
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Potential IndexError when accessing hypotheses[0].

Accessing hypotheses[0] will crash the application if the list is empty, causing a runtime exception.

Current Code (Diff):

-        exp = DSExperiment(pending_tasks_list=[[task]], hypothesis=hypotheses[0])
+        exp = DSExperiment(pending_tasks_list=[[task]], hypothesis=hypotheses[0] if hypotheses else None)
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
exp = DSExperiment(pending_tasks_list=[[task]], hypothesis=hypotheses[0])
exp = DSExperiment(pending_tasks_list=[[task]], hypothesis=hypotheses[0] if hypotheses else None)

Comment on lines +561 to +562
problem_formatted_str += f"## {i+1}. {problem_name}\n"
problem_formatted_str += f"{problem_dict['problem']}\n"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Missing Key Validation Causing Potential KeyError.

The code removed the previous validation check for 'problem' key, which could cause KeyError exceptions if any problem dictionary is missing this key.

Current Code (Diff):

  problem_formatted_str += f"## {i+1}. {problem_name}\n"
+ if "problem" not in problem_dict:
+     continue
  problem_formatted_str += f"{problem_dict['problem']}\n"
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
problem_formatted_str += f"## {i+1}. {problem_name}\n"
problem_formatted_str += f"{problem_dict['problem']}\n"
problem_formatted_str += f"## {i+1}. {problem_name}\n"
if "problem" not in problem_dict:
continue
problem_formatted_str += f"{problem_dict['problem']}\n"

@@ -358,6 +356,18 @@ def preview_csv(p: Path, file_name: str, simple=True, show_nan_columns=False) ->
return "\n".join(out)


def preview_csv(p: Path, file_name: str, simple=True, show_nan_columns=False) -> str:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Duplicate Function Definition.

The preview_csv function is defined twice which will cause a Python syntax error and crash the application at runtime.

Current Code (Diff):

- def preview_csv(p: Path, file_name: str, simple=True, show_nan_columns=False) -> str:
-     """Generate a textual preview of a csv file"""
-     df = pd.read_csv(p)
-     return preview_df(df, file_name, simple=simple, show_nan_columns=show_nan_columns)

def download_notebooks(
competition: str, local_path: str = f"{KAGGLE_IMPLEMENT_SETTING.local_data_path}/notebooks", num: int = 15
) -> None:
def download_notebooks(competition: str, local_path: str, num: int = 15) -> None:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Breaking API Change - Removed Parameter Default.

Removing the default value for local_path parameter will break any existing code that calls this function without explicitly providing this parameter.

Current Code (Diff):

- def download_notebooks(competition: str, local_path: str, num: int = 15) -> None:
+ def download_notebooks(competition: str, local_path: str = f"{KAGGLE_IMPLEMENT_SETTING.local_data_path}/notebooks", num: int = 15) -> None:
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
def download_notebooks(competition: str, local_path: str, num: int = 15) -> None:
def download_notebooks(competition: str, local_path: str = f"{KAGGLE_IMPLEMENT_SETTING.local_data_path}/notebooks", num: int = 15) -> None:

@@ -302,7 +314,7 @@ def convert_notebooks_to_text(
print(f"Converted {converted_num} notebooks to text files.")


def collect_knowledge_texts(local_path: str = KAGGLE_IMPLEMENT_SETTING.local_data_path) -> dict[str, list[str]]:
def collect_knowledge_texts(notebooks_path: str | Path) -> dict[str, list[str]]:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

Removed default parameter value breaks backward compatibility.

Removing the default parameter value will cause runtime errors in all code that calls this function without explicitly providing the parameter.

Current Code (Diff):

- def collect_knowledge_texts(notebooks_path: str | Path) -> dict[str, list[str]]:
+ def collect_knowledge_texts(notebooks_path: str | Path = KAGGLE_IMPLEMENT_SETTING.local_data_path) -> dict[str, list[str]]:
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
def collect_knowledge_texts(notebooks_path: str | Path) -> dict[str, list[str]]:
def collect_knowledge_texts(notebooks_path: str | Path = KAGGLE_IMPLEMENT_SETTING.local_data_path) -> dict[str, list[str]]:

🔄 Dependencies Affected

rdagent/scenarios/kaggle/kaggle_crawler.py

Function: Unknown function using collect_knowledge_texts

Issue: Any call to collect_knowledge_texts() without explicitly providing the parameter will fail with a missing argument error

Suggestion: Update all calls to provide the notebooks_path parameter explicitly


raise FileNotFoundError(f"No session file found in {path}")

# iterate the dump steps in increasing order
files = sorted(path.glob("*/*_*"), key=lambda f: (int(f.parent.name), int(f.name.split("_")[0])))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Correctness Issue

ValueError from non-integer filenames.

The sorting key function will raise ValueError if directory names or filenames don't follow the expected integer-based naming convention

Current Code (Diff):

-             files = sorted(path.glob("*/*_*"), key=lambda f: (int(f.parent.name), int(f.name.split("_")[0])))
-             path = files[-1]
+             try:
+                 files = sorted(path.glob("*/*_*"), key=lambda f: (int(f.parent.name), int(f.name.split("_")[0])))
+                 if not files:
+                     raise FileNotFoundError(f"No session files found in {path}")
+                 path = files[-1]
+             except ValueError:
+                 raise ValueError(f"Session files in {path} don't follow the expected naming convention (int/int_*)")
📝 Committable suggestion

‼️ IMPORTANT
Trust, but verify! 🕵️ Please review this suggestion with the care of a code archaeologist - check that it perfectly replaces the highlighted code, preserves all lines, maintains proper indentation, and won't break anything in production. Your future self will thank you! 🚀

Suggested change
files = sorted(path.glob("*/*_*"), key=lambda f: (int(f.parent.name), int(f.name.split("_")[0])))
try:
files = sorted(path.glob("*/*_*"), key=lambda f: (int(f.parent.name), int(f.name.split("_")[0])))
if not files:
raise FileNotFoundError(f"No session files found in {path}")
path = files[-1]
except ValueError:
raise ValueError(f"Session files in {path} don't follow the expected naming convention (int/int_*)")

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants