-
Notifications
You must be signed in to change notification settings - Fork 3
Use a more flexible cache directory #707
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…edirs # Conflicts: # executorlib/task_scheduler/file/task_scheduler.py
for more information, see https://pre-commit.ci
WalkthroughThe changes refactor cache directory handling in the file-based task scheduler and executor. The explicit Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant FileTaskScheduler
participant execute_tasks_h5
Client->>FileTaskScheduler: Initialize with resource_dict (includes cache_directory)
FileTaskScheduler->>execute_tasks_h5: Call with resource_dict (per-task cache_directory)
execute_tasks_h5->>execute_tasks_h5: Extract cache_directory from each task's resource_dict
execute_tasks_h5-->>FileTaskScheduler: Task execution results (per-task cache directory used)
Possibly related PRs
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #707 +/- ##
=======================================
Coverage 96.89% 96.89%
=======================================
Files 29 29
Lines 1319 1320 +1
=======================================
+ Hits 1278 1279 +1
Misses 41 41 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…ble_cache # Conflicts: # executorlib/task_scheduler/file/task_scheduler.py
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
executorlib/task_scheduler/file/shared.py(3 hunks)executorlib/task_scheduler/file/task_scheduler.py(4 hunks)tests/test_cache_fileexecutor_serial.py(4 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/test_cache_fileexecutor_serial.py (2)
executorlib/task_scheduler/file/task_scheduler.py (1)
create_file_executor(79-116)tests/test_singlenodeexecutor_noblock.py (1)
resource_dict(11-12)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: unittest_mpich (ubuntu-latest, 3.12)
- GitHub Check: unittest_old
- GitHub Check: unittest_flux_mpich
- GitHub Check: notebooks
- GitHub Check: mypy
- GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
- GitHub Check: minimal
- GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
- GitHub Check: notebooks_integration
🔇 Additional comments (12)
tests/test_cache_fileexecutor_serial.py (5)
61-62: LGTM! Good test coverage for the new required parameter.The addition of this test case correctly verifies that
create_file_executor()raises aTypeErrorwhen called without the now-requiredresource_dictparameter, which aligns with the API change.
64-66: LGTM! Correctly updated test calls to match the new API.The addition of
resource_dict={}to these test calls properly adapts them to the new function signature whereresource_dictis now a required parameter.
112-112: LGTM! Correctly moved cache_directory into resource_dict.The change from passing
cache_directoryas a top-level parameter to embedding it withinresource_dictaligns with the API refactoring described in the PR objectives.
152-152: LGTM! Consistent with the API change.Moving
cache_directoryfrom the top-levelkwargsto inside theresource_dictis consistent with the refactoring pattern applied throughout the codebase.
192-192: LGTM! Consistent API usage.The change maintains consistency with the new approach of embedding
cache_directorywithin theresource_dictrather than passing it as a separate parameter.executorlib/task_scheduler/file/shared.py (4)
82-82: LGTM! Proper initialization of per-task cache directory tracking.The addition of
cache_dir_dictis necessary to store cache directories per task, supporting the new flexible cache directory approach.
106-106: LGTM! Correct extraction of cache directory from task resources.The change to extract
cache_directoryfromtask_resource_dictusingpop()and convert it to an absolute path is appropriate. Usingpop()ensures the cache directory is removed from the resource dict after extraction, preventing it from being passed to downstream functions that don't expect it.
150-150: LGTM! Proper storage of cache directory for task tracking.Storing the cache directory in
cache_dir_dictusing the task key allows for per-task cache directory retrieval later in the execution flow.
155-158: LGTM! Consistent use of per-task cache directories.The change to use
cache_dir_dict[key]instead of a single globalcache_directoryvariable properly completes the refactoring to support per-task cache directory configuration.executorlib/task_scheduler/file/task_scheduler.py (3)
53-53: LGTM! Good consolidation of cache directory configuration.Setting the default cache directory within the
resource_dictconsolidates cache directory configuration into a single location, which aligns with the PR objectives for more flexible cache directory handling.
63-63: LGTM! Consistent with the API refactoring.Using
resource_dictinstead of a separatecache_directoryparameter in_process_kwargsis consistent with the broader refactoring to consolidate cache directory configuration.
103-104: LGTM! Maintains backward compatibility for cache_directory parameter.The logic to add
cache_directorytoresource_dictwhen provided maintains backward compatibility with existing code that passes thecache_directoryparameter, while integrating it into the new unified resource dictionary approach.
|
|
||
|
|
||
| def create_file_executor( | ||
| resource_dict: dict, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Verification agent
🧩 Analysis chain
Verify the breaking change is intentional and documented.
The change from resource_dict: Optional[dict] = None to resource_dict: dict makes this parameter required, which is a breaking change. This should be intentional and properly documented.
🏁 Script executed:
#!/bin/bash
# Description: Check if this breaking change is mentioned in documentation or changelog
# Expected: Find references to the API change in documentation files
# Search for documentation about this change
fd -e md -e rst -e txt -e py --exec grep -l "resource_dict" {} \; | head -10
echo "---"
# Look for changelog or migration docs
fd -e md -e rst -e txt -e py -I --exec grep -l -i "breaking\|changelog\|migration" {} \; | head -5Length of output: 608
Add documentation for the breaking change to resource_dict
I didn’t find any mentions of this API change in your documentation or changelog. Since making resource_dict required is a breaking change, please:
- Add an entry under a “Breaking Changes” section in your CHANGELOG (e.g.
docs/CHANGELOG.md), noting that
executorlib/task_scheduler/file/task_scheduler.py: TaskScheduler.__init__now requiresresource_dict: dict. - Update any migration guide or README to reflect the new required parameter.
- Consider adding a docstring note in
executorlib/task_scheduler/file/task_scheduler.py(around line 80) calling out the parameter requirement.
🤖 Prompt for AI Agents
In executorlib/task_scheduler/file/task_scheduler.py around line 80, the
resource_dict parameter in TaskScheduler.__init__ is now required, which is a
breaking change. Add an entry under a "Breaking Changes" section in
docs/CHANGELOG.md describing this new requirement. Update any migration guides
or README files to mention the new required resource_dict parameter. Also, add a
note in the __init__ method's docstring near line 80 to document that
resource_dict is now a mandatory argument.
liamhuber
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left a doc-migration nit, but otherwise LGTM!
Co-authored-by: Liam Huber <liam.huber@gmail.com>
Summary by CodeRabbit
Refactor
Tests