-
Notifications
You must be signed in to change notification settings - Fork 3
Define error log file in resource_dict #736
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughThis update introduces an optional Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant Executor
participant Backend
participant ErrorLogger as Error Log File
User->>Executor: Submit task with resource_dict (may include error_log_file)
Executor->>Backend: Execute function with apply_dict
Backend->>Backend: Try to run function
alt Exception occurs
Backend->>ErrorLogger: backend_write_error_file(error, apply_dict)
Backend->>Executor: Return error response
else Success
Backend->>Executor: Return result
end
Executor->>User: Deliver result or raise exception
Possibly related PRs
Poem
✨ Finishing Touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
for more information, see https://pre-commit.ci
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #736 +/- ##
==========================================
+ Coverage 96.92% 96.96% +0.04%
==========================================
Files 30 31 +1
Lines 1364 1383 +19
==========================================
+ Hits 1322 1341 +19
Misses 42 42 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
executorlib/standalone/error.py (1)
4-21: Suggest improvements for robustness and formatting.The core implementation is solid, but consider these enhancements:
- Add error handling for file operations in case of permission issues or disk space problems
- Add separators between different error entries for better readability
- Consider thread safety for concurrent access scenarios
def backend_write_error_file(error: Exception, apply_dict: dict) -> None: """ Write an error to a file if specified in the apply_dict. Args: error (Exception): The error to be written. apply_dict (dict): Dictionary containing additional parameters. Returns: None """ error_log_file = apply_dict.get("error_log_file") if error_log_file is not None: - with open(error_log_file, "a") as f: - f.write("function: " + str(apply_dict["fn"]) + "\n") - f.write("args: " + str(apply_dict["args"]) + "\n") - f.write("kwargs: " + str(apply_dict["kwargs"]) + "\n") - traceback.print_exception(error, file=f) + try: + with open(error_log_file, "a") as f: + f.write("=" * 50 + "\n") + f.write("function: " + str(apply_dict["fn"]) + "\n") + f.write("args: " + str(apply_dict["args"]) + "\n") + f.write("kwargs: " + str(apply_dict["kwargs"]) + "\n") + traceback.print_exception(error, file=f) + f.write("\n") + except (IOError, OSError) as e: + # Silently ignore file write errors to avoid breaking main execution + pass
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (17)
.github/workflows/pipeline.yml(1 hunks)executorlib/backend/cache_parallel.py(2 hunks)executorlib/backend/interactive_parallel.py(2 hunks)executorlib/backend/interactive_serial.py(2 hunks)executorlib/executor/flux.py(5 hunks)executorlib/executor/single.py(5 hunks)executorlib/executor/slurm.py(5 hunks)executorlib/standalone/cache.py(1 hunks)executorlib/standalone/error.py(1 hunks)executorlib/task_scheduler/file/backend.py(2 hunks)executorlib/task_scheduler/file/hdf.py(1 hunks)executorlib/task_scheduler/file/shared.py(1 hunks)executorlib/task_scheduler/interactive/shared.py(2 hunks)pyproject.toml(0 hunks)tests/test_cache_fileexecutor_serial.py(1 hunks)tests/test_singlenodeexecutor_cache.py(1 hunks)tests/test_standalone_hdf.py(1 hunks)
💤 Files with no reviewable changes (1)
- pyproject.toml
🧰 Additional context used
🧬 Code Graph Analysis (6)
executorlib/backend/cache_parallel.py (1)
executorlib/standalone/error.py (1)
backend_write_error_file(4-21)
executorlib/backend/interactive_serial.py (1)
executorlib/standalone/error.py (1)
backend_write_error_file(4-21)
executorlib/task_scheduler/file/backend.py (1)
executorlib/standalone/error.py (1)
backend_write_error_file(4-21)
executorlib/backend/interactive_parallel.py (1)
executorlib/standalone/error.py (1)
backend_write_error_file(4-21)
tests/test_cache_fileexecutor_serial.py (4)
executorlib/task_scheduler/file/task_scheduler.py (1)
FileTaskScheduler(30-79)executorlib/task_scheduler/file/subprocess_spawner.py (1)
execute_in_subprocess(10-59)tests/test_singlenodeexecutor_cache.py (1)
get_error(16-17)tests/test_cache_backend_execute.py (1)
get_error(22-23)
tests/test_singlenodeexecutor_cache.py (3)
executorlib/executor/single.py (1)
SingleNodeExecutor(20-189)executorlib/standalone/serialize.py (1)
cloudpickle_register(9-28)tests/test_cache_fileexecutor_serial.py (1)
get_error(29-30)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: codecov/patch
- GitHub Check: unittest_openmpi (macos-latest, 3.13)
- GitHub Check: unittest_openmpi (ubuntu-latest, 3.12)
- GitHub Check: unittest_openmpi (ubuntu-latest, 3.11)
- GitHub Check: unittest_openmpi (ubuntu-latest, 3.13)
- GitHub Check: notebooks_integration
- GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
- GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
- GitHub Check: unittest_slurm_mpich
- GitHub Check: notebooks
- GitHub Check: unittest_win
🔇 Additional comments (22)
executorlib/standalone/cache.py (1)
13-13: LGTM! Clean addition of error_log_file support.The addition of the
error_log_filekey to thegroup_dictis consistent with the existing pattern and enables HDF5 cache files to store error log file information alongside other task data.tests/test_standalone_hdf.py (1)
78-78: LGTM! Test data updated for error_log_file support.The addition of the
error_log_filekey to the test data ensures that HDF5 serialization/deserialization works correctly with the new error logging parameter.executorlib/backend/interactive_serial.py (2)
5-5: LGTM! Clean import of error logging functionality.The import of
backend_write_error_fileis properly added and will be used for enhanced error logging.
62-65: LGTM! Proper integration of error logging.The call to
backend_write_error_fileis correctly placed in the exception handling block and doesn't interfere with the existing error response mechanism. The function will write detailed error information to a log file if theerror_log_fileparameter is specified in the input dictionary.executorlib/backend/cache_parallel.py (2)
7-7: LGTM! Clean import of error logging functionality.The import of
backend_write_error_fileis properly added and will be used for enhanced error logging in the MPI parallel backend.
57-60: LGTM! Proper integration of error logging in MPI context.The call to
backend_write_error_fileis correctly placed in the exception handling block and properly restricted to MPI rank 0, ensuring that error logging doesn't cause conflicts in the parallel execution environment. The function will write detailed error information to a log file if theerror_log_fileparameter is specified in the apply_dict.executorlib/task_scheduler/file/backend.py (2)
5-5: LGTM! Clean import of error logging functionality.The import of
backend_write_error_fileis properly added and will be used for enhanced error logging in the file-based task scheduler backend.
81-84: LGTM! Proper integration of error logging in task execution.The call to
backend_write_error_fileis correctly placed in the exception handling block and appropriately positioned before the existingbackend_write_filecall. This ensures that detailed error information is logged to the specified error log file if theerror_log_fileparameter is provided in the apply_dict.executorlib/task_scheduler/file/hdf.py (1)
55-58: LGTM! Consistent implementation pattern.The addition of
error_log_filehandling follows the same pattern as other optional parameters in the HDF5 loader, with proper existence checking and cloudpickle deserialization.executorlib/backend/interactive_parallel.py (2)
9-9: LGTM! Proper import addition.The import of
backend_write_error_fileis correctly placed and follows the existing import organization.
86-89: LGTM! Well-integrated error logging.The
backend_write_error_filecall is properly positioned after the existing error communication, ensuring it doesn't interfere with the main error handling flow while providing additional error logging capabilities.executorlib/task_scheduler/file/shared.py (2)
129-129: LGTM! Consistent parameter extraction.The
error_log_fileparameter is properly extracted from the task resource dictionary following the same pattern as other resource parameters likecache_keyandcache_directory.
137-137: LGTM! Proper parameter propagation.The
error_log_fileis correctly added to the data dictionary after serialization, ensuring it's available during task execution.executorlib/task_scheduler/interactive/shared.py (2)
29-29: LGTM! Proper parameter addition.The
error_log_fileparameter is correctly added to the function signature with appropriate typing and default value.
74-75: LGTM! Clean conditional parameter handling.The error log file is only added to the task dictionary when it's not None, which is a good practice to avoid unnecessary key-value pairs.
tests/test_singlenodeexecutor_cache.py (1)
61-72: LGTM! Well-structured test for error log file functionality.The test follows the existing pattern and properly verifies that:
- The error log file is created when specified in resource_dict
- Exceptions are still raised as expected
- Cleanup is performed after the test
The test logic is sound and integrates well with the existing test suite.
executorlib/executor/flux.py (1)
44-45: LGTM! Consistent documentation updates for error_log_file parameter.The documentation accurately describes the new
error_log_fileparameter across all relevant classes and functions:
- FluxJobExecutor class and constructor
- FluxClusterExecutor class and constructor
- create_flux_executor function
The description is clear and consistent, properly explaining that this parameter specifies the name of the error log file for storing exceptions raised by submitted Python functions.
Also applies to: 131-132, 236-237, 317-318, 435-436
tests/test_cache_fileexecutor_serial.py (2)
91-91: LGTM! Good addition to verify expected file count.The assertion ensures that the existing error test produces exactly one file in the working directory, which helps establish the baseline behavior before testing the error log file functionality.
93-106: LGTM! Well-implemented test for error log file functionality.The test properly verifies the error log file feature:
- Creates a FileTaskScheduler with error_log_file in resource_dict
- Submits a function that raises an exception
- Verifies the exception is still raised
- Checks that exactly 2 files exist (cache file + error log file)
- Confirms the error log file is present by name
- Performs proper cleanup
The test follows established patterns and provides good coverage for the new feature.
executorlib/executor/slurm.py (1)
44-45: LGTM! Consistent documentation updates for error_log_file parameter.The documentation accurately describes the new
error_log_fileparameter across all relevant classes and functions:
- SlurmClusterExecutor class and constructor
- SlurmJobExecutor class and constructor
- create_slurm_executor function
The description is clear and consistent with other executor modules, properly explaining that this parameter specifies the name of the error log file for storing exceptions raised by submitted Python functions.
Also applies to: 125-126, 233-234, 316-317, 419-420
executorlib/executor/single.py (1)
42-43: LGTM! Consistent documentation updates for error_log_file parameter.The documentation accurately describes the new
error_log_fileparameter across all relevant classes and functions:
- SingleNodeExecutor class and constructor
- TestClusterExecutor class and constructor
- create_single_node_executor function
The description is clear and consistent with other executor modules, properly explaining that this parameter specifies the name of the error log file for storing exceptions raised by submitted Python functions.
Also applies to: 121-122, 209-210, 282-283, 392-393
.github/workflows/pipeline.yml (1)
403-408: environment-old.yml has no Python 3.9 pinA thorough search of
.ci_support/environment-old.ymlfound no explicit or implicit references to Python 3.9. The Conda environment file is compatible with the Python 3.10 request insetup-miniconda.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tests/test_standalone_error.py (1)
27-29: Enhance cleanup robustnessThe tearDown method is good practice but could be more robust by handling the dynamic filename if the test improvements above are implemented.
If implementing the tempfile approach suggested above, you could store the filename as an instance variable:
def test_backend_write_error_file(self): + self.error_file = None with tempfile.NamedTemporaryFile(mode='w', delete=False, suffix='.out') as tmp: - error_file = tmp.name + self.error_file = tmp.name # ... rest of test def tearDown(self): - if os.path.exists("error.out"): - os.remove("error.out") + if self.error_file and os.path.exists(self.error_file): + os.remove(self.error_file)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/test_standalone_error.py(1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
tests/test_standalone_error.py (1)
executorlib/standalone/error.py (1)
backend_write_error_file(4-21)
🪛 Ruff (0.12.2)
tests/test_standalone_error.py
23-23: Unnecessary mode argument
Remove mode argument
(UP015)
🔇 Additional comments (1)
tests/test_standalone_error.py (1)
1-3: LGTM: Clean import structureThe imports are appropriate for the test functionality.
| def test_backend_write_error_file(self): | ||
| backend_write_error_file( | ||
| error=ValueError(), | ||
| apply_dict={ | ||
| "error_log_file": "error.out", | ||
| "fn": 1, | ||
| "args": (1, 2, 3), | ||
| "kwargs": {"a": 1, "b": 2, "c": 3}, | ||
| } | ||
| ) | ||
| error_file_content = [ | ||
| 'function: 1\n', | ||
| 'args: (1, 2, 3)\n', | ||
| "kwargs: {'a': 1, 'b': 2, 'c': 3}\n", | ||
| 'ValueError\n' | ||
| ] | ||
| with open("error.out", "r") as f: | ||
| content = f.readlines() | ||
| self.assertEqual(error_file_content, content) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Improve test robustness and address static analysis issue
The test logic is sound but has several areas for improvement:
- Static analysis issue: Line 23 has an unnecessary mode argument as flagged by Ruff.
- Test isolation: The hardcoded filename "error.out" could cause issues in parallel test execution.
- Missing validation: The test doesn't verify the file exists before reading it.
- Unusual function identifier: Using
fn: 1instead of a function name seems atypical.
Apply this diff to address the static analysis issue:
- with open("error.out", "r") as f:
+ with open("error.out") as f:Consider these additional improvements:
+import tempfile
+
class TestErrorWriter(unittest.TestCase):
def test_backend_write_error_file(self):
+ # Use a temporary file for better test isolation
+ with tempfile.NamedTemporaryFile(mode='w', delete=False, suffix='.out') as tmp:
+ error_file = tmp.name
+
backend_write_error_file(
error=ValueError(),
apply_dict={
- "error_log_file": "error.out",
+ "error_log_file": error_file,
"fn": 1,
"args": (1, 2, 3),
"kwargs": {"a": 1, "b": 2, "c": 3},
}
)
+
+ # Verify file exists before reading
+ self.assertTrue(os.path.exists(error_file))
+
- with open("error.out") as f:
+ with open(error_file) as f:
content = f.readlines()
self.assertEqual(error_file_content, content)
+
+ # Clean up immediately
+ os.remove(error_file)Committable suggestion skipped: line range outside the PR's diff.
🧰 Tools
🪛 Ruff (0.12.2)
23-23: Unnecessary mode argument
Remove mode argument
(UP015)
🤖 Prompt for AI Agents
In tests/test_standalone_error.py around lines 7 to 25, fix the static analysis
issue by removing the unnecessary mode argument from the open() call on line 23.
Replace the hardcoded filename "error.out" with a temporary file or use a unique
filename per test run to ensure test isolation. Add an assertion to check that
the error file exists before attempting to read it. Finally, replace the unusual
function identifier fn: 1 with an actual function name or a mock function
reference to better reflect typical usage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/test_cache_backend_execute.py (1)
134-138: LGTM! Error log verification is comprehensive.The test effectively verifies that the error log file contains the expected function call details and error message. The assertions properly validate the error logging functionality.
Consider applying this minor optimization based on static analysis:
- with open(os.path.join(cache_directory, "error.out"), "r") as f: + with open(os.path.join(cache_directory, "error.out")) as f:The "r" mode is the default for
open()and can be omitted.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
tests/test_cache_backend_execute.py(2 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
tests/test_cache_backend_execute.py
134-134: Unnecessary mode argument
Remove mode argument
(UP015)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
- GitHub Check: notebooks
- GitHub Check: unittest_flux_openmpi
- GitHub Check: unittest_mpich (ubuntu-latest, 3.11)
- GitHub Check: unittest_win
- GitHub Check: unittest_openmpi (ubuntu-latest, 3.12)
- GitHub Check: unittest_mpich (macos-latest, 3.13)
- GitHub Check: unittest_mpich (ubuntu-latest, 3.13)
- GitHub Check: unittest_openmpi (macos-latest, 3.13)
- GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
- GitHub Check: unittest_openmpi (ubuntu-latest, 3.11)
- GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
- GitHub Check: notebooks_integration
- GitHub Check: unittest_flux_mpich
- GitHub Check: unittest_openmpi (ubuntu-latest, 3.13)
- GitHub Check: unittest_slurm_mpich
🔇 Additional comments (1)
tests/test_cache_backend_execute.py (1)
124-124: LGTM! Error log file configuration added correctly.The addition of the
error_log_fileparameter to the data dictionary is well-placed and follows the established pattern of using the cache directory for file operations.
Summary by CodeRabbit
New Features
Documentation
Bug Fixes
Tests
Chores