fix: Add log retrieval after job finalization #61
Actionable comments posted: 0
🧹 Nitpick comments (4)
.github/workflows/ci_mocked_api.yaml (2)
86-90: Consider adding a non-interactive installation flag.

When installing Docker Compose on line 89, it's often safer for CI to add the `-y` flag to avoid prompts:

```diff
- sudo apt-get install docker-compose
+ sudo apt-get install -y docker-compose
```
98-103: Consider verifying the downloaded MinIO Client binary.

For increased security and reliability, you could verify the binary's checksum before making it executable. Example:

```diff
 curl -O https://dl.min.io/client/mc/release/linux-amd64/mc
+echo "<INSERT_EXPECTED_SHA256> mc" > mc.sha256
+sha256sum -c mc.sha256
 chmod +x mc
 sudo mv mc /usr/local/bin/
```

tests/tests_mocked_api.py (1)
25-31: Consider adding a test case for log entries.

Currently, the logger mock returns an empty list. You may want to add a scenario where `list_entries` returns a sample log entry to validate that the `save_finished_job_logs` method writes logs correctly to the file.
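For illustration, a minimal sketch of such a case, assuming pytest with `unittest.mock`; the `executor` and `job_info` fixtures here are hypothetical stand-ins for the plugin's existing test setup, and only the mock wiring is the point:

```python
from unittest import mock

def test_save_finished_job_logs_writes_payload(tmp_path, executor, job_info):
    # Hypothetical sample entry; the executor only relies on `payload`.
    entry = mock.Mock(payload="hello from the batch task")
    with mock.patch(
        "snakemake_executor_plugin_googlebatch.executor.logging.Client"
    ) as client_cls:
        client_cls.return_value.logger.return_value.list_entries.return_value = [entry]
        job_info.aux["logfile"] = str(tmp_path / "job.log")
        executor.save_finished_job_logs(job_info)
    assert (tmp_path / "job.log").read_text() == "hello from the batch task\n"
```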
snakemake_executor_plugin_googlebatch/executor.py (1)

519-564: Excellent error-handling strategy with retry.

The implementation properly retries after hitting the rate limit (`ResourceExhausted`). Consider two minor details:

- Ensure `log_entry.payload` is always a string. If structured logging is used later, you may need a format conversion.
- Fix the minor spelling typo in line 551 ("retreive" → "retrieve").

Example fix:

```diff
- self.logger.warning(f"Trying to retreive logs for job {job_uid} once more.")
+ self.logger.warning(f"Trying to retrieve logs for job {job_uid} once more.")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `pyproject.toml` is excluded by `!pyproject.toml`
📒 Files selected for processing (3)
- `.github/workflows/ci_mocked_api.yaml` (1 hunks)
- `snakemake_executor_plugin_googlebatch/executor.py` (4 hunks)
- `tests/tests_mocked_api.py` (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.

Applied to files:

- `tests/tests_mocked_api.py`
- `snakemake_executor_plugin_googlebatch/executor.py`
🔇 Additional comments (5)
.github/workflows/ci_mocked_api.yaml (2)
91-97: LGTM for MinIO setup.

Setting up MinIO with an action is straightforward and well-contained here. No issues spotted.

104-106: LGTM for MinIO client configuration.

The alias setup is concise and clear. No concerns here.
tests/tests_mocked_api.py (1)
14-14: Good addition of the `uid` field in the mock.

Adding `uid="bar"` keeps the mock consistent with the new usage in the executor code that expects a unique job identifier.

snakemake_executor_plugin_googlebatch/executor.py (2)

2-16: New imports are appropriate.

The added imports (`time`, `ResourceExhausted`, `logging`) are necessary for the log retrieval implementation.

505-507: Log retrieval logic on job completion looks good.

Triggering `save_finished_job_logs` once the job is either "FAILED" or "SUCCEEDED" is a clear and logical approach to finalize logs.
@CodeRabbit do the review again

@tadasbar I'll review all the changes in your pull request again.

✅ Actions performed

Full review triggered.
Actionable comments posted: 1
🧹 Nitpick comments (4)
.github/workflows/ci_mocked_api.yaml (3)
86-92: Consider storing MinIO credentials in GitHub secrets.

Storing credentials directly in the workflow file can be a security risk if this repository is public or if the credentials are reused elsewhere. Here's an example of how you might reference secret values:

```diff
 - name: Setup minio
   uses: comfuture/minio-action@v1
   with:
-    access_key: minio
-    secret_key: minio123
     port: 9000
+    access_key: ${{ secrets.MINIO_ACCESS_KEY }}
+    secret_key: ${{ secrets.MINIO_SECRET_KEY }}
```
93-97: Verify download integrity.

When installing the MinIO client from a remote URL, consider verifying the authenticity of the binary via checksums to prevent supply chain attacks.

```diff
 - name: Install MinIO Client CLI
   run: |
     curl -O https://dl.min.io/client/mc/release/linux-amd64/mc
     chmod +x mc
     sudo mv mc /usr/local/bin/
+    # Example checksum verification step:
+    # echo "<expected-checksum> mc" | sha256sum --check
```
99-102: Use GitHub secrets to configure MinIO client.

Similar to the setup step, consider avoiding plaintext credentials in the workflow.

```diff
 - name: Configure MinIO client
   run: |
     mc alias set minio http://localhost:9000 minio minio123
+    # If secrets are stored:
+    # mc alias set minio http://localhost:9000 ${{ secrets.MINIO_ACCESS_KEY }} ${{ secrets.MINIO_SECRET_KEY }}
```

snakemake_executor_plugin_googlebatch/executor.py (1)
519-564: Consider using `str()` for log payloads and employing backoff strategies.

`log_entry.payload` may not always be a pure string and can cause unexpected exceptions if it's not cast. Additionally, you might consider exponential backoff instead of sleeping a fixed 60 seconds.

```diff
- logfile.write(log_entry.payload + "\n")
+ logfile.write(str(log_entry.payload) + "\n")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
- `pyproject.toml` is excluded by `!pyproject.toml`
📒 Files selected for processing (3)
- `.github/workflows/ci_mocked_api.yaml` (1 hunks)
- `snakemake_executor_plugin_googlebatch/executor.py` (4 hunks)
- `tests/tests_mocked_api.py` (2 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.

Applied to files:

- `snakemake_executor_plugin_googlebatch/executor.py`
- `tests/tests_mocked_api.py`
🔇 Additional comments (4)
snakemake_executor_plugin_googlebatch/executor.py (2)
505-507: Log retrieval logic is appropriate.

Invoking `save_finished_job_logs` upon job completion (either SUCCEEDED or FAILED) is a straightforward way to retrieve logs after the job ends.

567-567: Comment correction is fine.

No issues; this clarifies when and why jobs are canceled.

tests/tests_mocked_api.py (2)

14-14: Mock return value updated with UID.

This aligns well with the new code that relies on the job UID for log retrieval.

25-31: Mocking the logger for Google Cloud Logging.

This mock structure appropriately tests the `save_finished_job_logs` flow without requiring real API calls.
Actionable comments posted: 0
🧹 Nitpick comments (1)
snakemake_executor_plugin_googlebatch/executor.py (1)
519-564: Implementation for downloading and saving job logs.

The log retrieval functionality is well-implemented with appropriate retry logic for handling API rate limits. There are a few suggestions for improvement:

- There's an inconsistency in how the log payload is written to file: `str(log_entry.payload)` in the first try block vs `log_entry.payload` directly in the retry block.
- Consider using append mode (`'a'`) for the second attempt to avoid losing logs that were successfully written before hitting rate limits.
- Add additional exception handling for potential IO errors when writing to files.

Here's a suggested improvement to address these concerns:

```diff
 def save_finished_job_logs(
     self,
     job_info: SubmittedJobInfo,
     sleeps=60,
     page_size=1000,
 ):
     """
     Download logs using Google Cloud Logging API and save them locally.
     Since tail logging does not work, this function is run only at the end
     of the job.
     """
     job_uid = job_info.aux["batch_job"].uid
     filter_query = f"labels.job_uid={job_uid}"
     logfname = job_info.aux["logfile"]
     log_client = logging.Client(project=self.executor_settings.project)
     logger = log_client.logger("batch_task_logs")
     try:
         with open(logfname, "w", encoding="utf-8") as logfile:
             for log_entry in logger.list_entries(
                 filter_=filter_query,
                 page_size=page_size,
             ):
                 logfile.write(str(log_entry.payload) + "\n")
     except ResourceExhausted:
         self.logger.warning(
             "Too many requests to Google Logging API.\n"
             + f"Skipping logs for job {job_uid} and sleeping for {sleeps}s."
         )
         time.sleep(sleeps)
         self.logger.warning(f"Trying to retrieve logs for job {job_uid} once more.")
         try:
-            with open(logfname, "w", encoding="utf-8") as logfile:
+            with open(logfname, "a", encoding="utf-8") as logfile:
                 for log_entry in logger.list_entries(
                     filter_=filter_query,
                     page_size=page_size,
                 ):
-                    logfile.write(log_entry.payload + "\n")
+                    logfile.write(str(log_entry.payload) + "\n")
         except ResourceExhausted:
             self.logger.warning(
                 "Retry to retrieve logs failed, "
                 + f"the log file {logfname} might be incomplete."
             )
+        except IOError as e:
+            self.logger.warning(
+                f"Failed to write to log file {logfname}: {str(e)}"
+            )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `snakemake_executor_plugin_googlebatch/executor.py` (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.

Applied to files:

- `snakemake_executor_plugin_googlebatch/executor.py`
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: testing-mocked-api
🔇 Additional comments (5)
snakemake_executor_plugin_googlebatch/executor.py (5)
2-2: Import added for new log retrieval functionality.

The `time` module import is necessary for implementing the retry mechanism with `time.sleep()` in the new `save_finished_job_logs` method.

505-507: LGTM: Logs are now saved when jobs complete.

This is a good enhancement that calls the new log retrieval method when jobs finish (either successfully or with failures), ensuring logs are available for inspection.

567-567: Improved comment capitalization.

Minor improvement to the comment for consistency.

15-16: Ensure "google-cloud-logging" dependency is properly managed.

The code now imports and uses the Google Cloud Logging client, but it's important to ensure that the "google-cloud-logging" package is properly declared in your dependencies.

```bash
#!/bin/bash
echo "Searching for 'google-cloud-logging' in dependency files..."
for file in requirements.txt setup.py pyproject.toml; do
  if [ -f "$file" ]; then
    echo "=== $file ==="
    grep "google-cloud-logging" "$file" || echo "No reference in $file."
  else
    echo "$file not found."
  fi
done
```

Also applies to: 519-564
15-16: ❓ Verification inconclusive

New imports support log retrieval functionality.

The addition of the `ResourceExhausted` exception and the `logging` client are appropriate for implementing the new log retrieval functionality.

🏁 Script executed:

```bash
#!/bin/bash
# Verify that google-cloud-logging is properly installed/imported in the project
grep -r "google-cloud-logging" --include="*.py" --include="*.toml" --include="requirements.txt" --include="setup.py" .
```

Length of output: 114
I'm not seeing evidence of the dependency when searching for the string "google-cloud-logging", likely because the package is imported with its module name (`from google.cloud import logging`) rather than its PyPI package name. To improve our verification, please run the following script, which searches for the actual import statement and checks for the dependency in common dependency files. Once you verify that the expected import and dependency declarations are in place, we can confidently conclude that the new log retrieval functionality is properly supported by these changes.
🏁 Script executed:

```bash
#!/bin/bash
echo "Verifying 'from google.cloud import logging' in the repository:"
grep -R "from google.cloud import logging" .
echo ""
echo "Checking for dependency 'google-cloud-logging' in requirements.txt (if present):"
if [ -f requirements.txt ]; then
  grep -R "google-cloud-logging" requirements.txt || echo "Not found in requirements.txt"
else
  echo "requirements.txt not found"
fi
echo ""
echo "Checking for dependency 'google-cloud-logging' in setup.py (if present):"
if [ -f setup.py ]; then
  grep -R "google-cloud-logging" setup.py || echo "Not found in setup.py"
else
  echo "setup.py not found"
fi
```

Length of output: 833
Log Retrieval Import & Dependency Confirmation

- The new import statements in `snakemake_executor_plugin_googlebatch/executor.py` (specifically, importing `ResourceExhausted` and using `from google.cloud import logging`) adequately support the new log retrieval functionality.
- Our verification confirms that the module import is present in the code. However, no explicit reference to the `google-cloud-logging` package was found in `requirements.txt` or `setup.py`. If dependency management is handled elsewhere (for instance via a `pyproject.toml`, Pipfile, or an environment where this package is pre-installed), no further action is necessary. Otherwise, please ensure that the `google-cloud-logging` package is properly declared to avoid runtime issues.
Actionable comments posted: 0
🧹 Nitpick comments (1)
snakemake_executor_plugin_googlebatch/executor.py (1)
519-564: Implement proper error handling, but consider potential improvements.

The implementation does a good job handling the `ResourceExhausted` exception and attempting to retry after a delay. However, there are a few possible improvements to consider:

- The retry logic contains duplicated code for log retrieval
- There's no handling for other potential exceptions (e.g., permission issues, network errors)
- Consider implementing exponential backoff instead of a fixed delay

Here's a potential implementation with reduced duplication and more comprehensive error handling:

```diff
 def save_finished_job_logs(
     self,
     job_info: SubmittedJobInfo,
     sleeps=60,
     page_size=1000,
 ):
     """
     Download logs using Google Cloud Logging API and save them locally.
     Since tail logging does not work, this function is run only at the end
     of the job.
     """
     job_uid = job_info.aux["batch_job"].uid
     filter_query = f"labels.job_uid={job_uid}"
     logfname = job_info.aux["logfile"]
     log_client = logging.Client(project=self.executor_settings.project)
     logger = log_client.logger("batch_task_logs")
-    try:
-        with open(logfname, "w", encoding="utf-8") as logfile:
-            for log_entry in logger.list_entries(
-                filter_=filter_query,
-                page_size=page_size,
-            ):
-                logfile.write(str(log_entry.payload) + "\n")
-    except ResourceExhausted:
-        self.logger.warning(
-            "Too many requests to Google Logging API.\n"
-            + f"Skipping logs for job {job_uid} and sleeping for {sleeps}s."
-        )
-        time.sleep(sleeps)
-        self.logger.warning(f"Trying to retrieve logs for job {job_uid} once more.")
-        try:
-            with open(logfname, "w", encoding="utf-8") as logfile:
-                for log_entry in logger.list_entries(
-                    filter_=filter_query,
-                    page_size=page_size,
-                ):
-                    logfile.write(str(log_entry.payload) + "\n")
-        except ResourceExhausted:
-            self.logger.warning(
-                "Retry to retrieve logs failed, "
-                + f"the log file {logfname} might be incomplete."
-            )
+    max_retries = 1
+    retry_count = 0
+
+    while retry_count <= max_retries:
+        try:
+            with open(logfname, "w" if retry_count == 0 else "a", encoding="utf-8") as logfile:
+                if retry_count > 0:
+                    logfile.write("\n--- Log retrieval resumed after retry ---\n\n")
+
+                for log_entry in logger.list_entries(
+                    filter_=filter_query,
+                    page_size=page_size,
+                ):
+                    logfile.write(str(log_entry.payload) + "\n")
+
+            # If we get here, retrieval was successful
+            break
+
+        except ResourceExhausted:
+            retry_count += 1
+            if retry_count <= max_retries:
+                self.logger.warning(
+                    "Too many requests to Google Logging API.\n"
+                    + f"Skipping logs for job {job_uid} and sleeping for {sleeps}s before retry {retry_count}/{max_retries}."
+                )
+                time.sleep(sleeps)
+            else:
+                self.logger.warning(
+                    "Maximum retries reached, "
+                    + f"the log file {logfname} might be incomplete."
+                )
+        except Exception as e:
+            self.logger.warning(
+                f"Failed to retrieve logs for job {job_uid}: {str(e)}. "
+                + f"The log file {logfname} might be incomplete."
+            )
+            break
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `snakemake_executor_plugin_googlebatch/executor.py` (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.

Applied to files:

- `snakemake_executor_plugin_googlebatch/executor.py`
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: testing-mocked-api
🔇 Additional comments (3)
snakemake_executor_plugin_googlebatch/executor.py (3)
15-16: Ensure "google-cloud-logging" is included in project dependencies.

You've added imports for Google Cloud Logging functionality, but the project may be missing the explicit dependency for "google-cloud-logging" in its dependency files (requirements.txt, setup.py, or pyproject.toml).

505-507: LGTM: Good addition of log saving for completed jobs.

The implementation correctly calls `save_finished_job_logs` when a job reaches terminal states (FAILED or SUCCEEDED), which aligns with the PR objective to retrieve logs after job completion.

567-567: Small comment capitalization correction.

You've fixed the capitalization in the method documentation comment to maintain consistency.
jakevc left a comment
Could be cleaner to implement a retry function for the `with open(...)` operation so you don't have to call that logic twice.
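For illustration, the shared logic could be factored into a small retry helper along these lines; this is a sketch, and the helper name and signature are hypothetical rather than taken from the PR:

```python
import time

from google.api_core.exceptions import ResourceExhausted

def retry_on_resource_exhausted(func, retries=1, sleeps=60, warn=print):
    """Call `func`; on ResourceExhausted, sleep and retry up to `retries` times."""
    for attempt in range(retries + 1):
        try:
            return func()
        except ResourceExhausted:
            if attempt == retries:
                warn("Giving up; the log file might be incomplete.")
                return None
            warn(f"Rate limited; sleeping {sleeps}s before retrying.")
            time.sleep(sleeps)

# Hypothetical usage with the PR's helper:
# retry_on_resource_exhausted(
#     lambda: attempt_log_save(logfname, filter_query, page_size)
# )
```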
Actionable comments posted: 0
🧹 Nitpick comments (1)
snakemake_executor_plugin_googlebatch/executor.py (1)
526-571: Good implementation of log retrieval with proper error handling.

The new `save_finished_job_logs` method effectively handles the retrieval of logs from the Google Cloud Logging API, with appropriate error handling for rate limiting and retries. Consider enhancing the implementation with:

- Adding more comprehensive error handling for other potential API exceptions
- Returning a boolean indicating success or failure for logging/reporting purposes
- Adding debug logging for successful retrieval with the count of log entries

```diff
 def save_finished_job_logs(
     self,
     job_info: SubmittedJobInfo,
     sleeps=60,
     page_size=1000,
 ):
     """
     Download logs using Google Cloud Logging API and save them locally.
     Since tail logging does not work, this function is run only at the end
     of the job.
     """
     job_uid = job_info.aux["batch_job"].uid
     filter_query = f"labels.job_uid={job_uid}"
     logfname = job_info.aux["logfile"]
     log_client = logging.Client(project=self.executor_settings.project)
     logger = log_client.logger("batch_task_logs")

     def attempt_log_save(fname, query, page_size):
+        count = 0
         with open(fname, "w", encoding="utf-8") as logfile:
             for log_entry in logger.list_entries(
                 filter_=query,
                 page_size=page_size,
             ):
                 logfile.write(str(log_entry.payload) + "\n")
+                count += 1
+        return count

     self.logger.info(f"Saving logs for Batch job {job_uid} to {logfname}.")
+    success = False
     try:
-        attempt_log_save(logfname, filter_query, page_size)
+        count = attempt_log_save(logfname, filter_query, page_size)
+        self.logger.debug(f"Successfully saved {count} log entries to {logfname}")
+        success = True
     except ResourceExhausted:
         self.logger.warning(
             "Too many requests to Google Logging API.\n"
             + f"Skipping logs for job {job_uid} and sleeping for {sleeps}s."
         )
         time.sleep(sleeps)
         self.logger.warning(f"Trying to retrieve logs for job {job_uid} once more.")
         try:
-            attempt_log_save(logfname, filter_query, page_size)
+            count = attempt_log_save(logfname, filter_query, page_size)
+            self.logger.debug(f"Successfully saved {count} log entries to {logfname} on retry")
+            success = True
         except ResourceExhausted:
             self.logger.warning(
                 "Retry to retrieve logs failed, "
                 + f"the log file {logfname} might be incomplete."
             )
+    except Exception as e:
+        self.logger.error(f"Failed to retrieve logs for job {job_uid}: {str(e)}")
+
+    return success
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `snakemake_executor_plugin_googlebatch/executor.py` (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.

Applied to files:

- `snakemake_executor_plugin_googlebatch/executor.py`
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: testing-mocked-api
🔇 Additional comments (3)
snakemake_executor_plugin_googlebatch/executor.py (3)
15-16: Ensure google-cloud-logging dependency is properly managed.

The change adds a new import for the Google Cloud Logging client, which requires the `google-cloud-logging` package. A previous review already flagged this, but it's worth re-emphasizing.

```bash
#!/bin/bash
# Checking if google-cloud-logging is declared in dependency files
for file in requirements.txt setup.py pyproject.toml; do
  if [ -f "$file" ]; then
    echo "=== Checking $file ==="
    grep -q "google-cloud-logging" "$file" && echo "✅ Found google-cloud-logging in $file" || echo "❌ Not found in $file"
  fi
done
```

512-514: LGTM - Appropriate placement for log retrieval.

The code correctly calls the log retrieval function when a job reaches a terminal state (FAILED or SUCCEEDED), ensuring logs are captured immediately after job completion.

574-574: LGTM - Comment capitalization fix.

Good catch correcting the capitalization in the comment for better consistency.
Actionable comments posted: 0
🧹 Nitpick comments (1)
snakemake_executor_plugin_googlebatch/executor.py (1)
526-577: Well-implemented log retrieval with proper error handling.

The implementation of `save_finished_job_logs` is well-structured with:

- Clear parameter definitions including defaults for pagination and retry timing
- Appropriate error handling for `ResourceExhausted` exceptions with a retry mechanism
- Comprehensive logging to inform users of the process and any issues

The retry mechanism with a 60-second sleep interval addresses the API limitation of 60 requests per minute mentioned in the PR description.

Consider implementing a more robust retry mechanism with exponential backoff instead of a fixed sleep time. This would help in cases where the API might be under sustained heavy load:

```diff
-def save_finished_job_logs(
-    self,
-    job_info: SubmittedJobInfo,
-    sleeps=60,
-    page_size=1000,
-):
+def save_finished_job_logs(
+    self,
+    job_info: SubmittedJobInfo,
+    initial_sleep=60,
+    max_retries=2,
+    page_size=1000,
+):
```

And then in the exception handling:

```diff
-try:
-    attempt_log_save(logfname, filter_query, page_size)
-except ResourceExhausted:
-    self.logger.warning(
-        "Too many requests to Google Logging API.\n"
-        + f"Skipping logs for job {job_uid} and sleeping for {sleeps}s."
-    )
-    time.sleep(sleeps)
-
-    self.logger.warning(
-        f"Trying to retrieve logs for Batch job {job_uid} once more."
-    )
-    try:
-        attempt_log_save(logfname, filter_query, page_size)
-    except ResourceExhausted:
-        self.logger.warning(
-            "Retry to retrieve logs failed, "
-            + f"the log file {logfname} might be incomplete."
-        )
+retries = 0
+current_sleep = initial_sleep
+
+while True:
+    try:
+        attempt_log_save(logfname, filter_query, page_size)
+        break
+    except ResourceExhausted:
+        retries += 1
+        if retries >= max_retries:
+            self.logger.warning(
+                "Max retries reached. "
+                + f"The log file {logfname} might be incomplete."
+            )
+            break
+
+        self.logger.warning(
+            "Too many requests to Google Logging API.\n"
+            + f"Retry {retries}/{max_retries}: Sleeping for {current_sleep}s."
+        )
+        time.sleep(current_sleep)
+        current_sleep *= 2  # Exponential backoff
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `snakemake_executor_plugin_googlebatch/executor.py` (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.

Applied to files:

- `snakemake_executor_plugin_googlebatch/executor.py`
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: testing-mocked-api
🔇 Additional comments (4)
snakemake_executor_plugin_googlebatch/executor.py (4)
2-2: Import added for new functionality.

The `time` module is correctly imported to support the sleep functionality in the new log retrieval method.

512-514: Good placement for log retrieval after job completion.

The placement of the `save_finished_job_logs` call is correct - it's triggered for both successful and failed jobs, which ensures logs are always retrieved once a job is complete.

580-580: Fixed comment capitalization.

This is a minor fix to capitalize the first letter of the method comment.
15-16: ❓ Verification inconclusive

API exception handling imports look good.

The imports for `ResourceExhausted` and `logging` from Google Cloud libraries are appropriate for the new log retrieval functionality.

🏁 Script executed:

```bash
#!/bin/bash
# Verify google-cloud-logging dependency is properly managed
rg "google-cloud-logging" -g "requirements.txt" -g "setup.py" -g "pyproject.toml"
```

Length of output: 224
Final Verification of Google Cloud Logging Dependency Import

- The import statements at `snakemake_executor_plugin_googlebatch/executor.py` (lines 15-16) correctly bring in `DeadlineExceeded`, `ResourceExhausted`, and the logging module from Google Cloud, aligning with the needs of the log retrieval functionality.
- An initial search for the string "google-cloud-logging" in standard dependency files (i.e. `requirements.txt`, `setup.py`, or `pyproject.toml`) did not yield any output. This may be due to filtering by the search command or might indicate that the dependency isn't explicitly listed.
- Action Required: Please manually verify that the dependency for `google-cloud-logging` is properly declared or managed within your project's dependency files to ensure that the runtime environment includes the correct package.
Actionable comments posted: 2
🧹 Nitpick comments (2)
snakemake_executor_plugin_googlebatch/executor.py (2)
544-551: Consider adding docstring to the inner function.

The inner helper function `attempt_log_save` would benefit from a brief docstring describing its purpose and parameters. Add a docstring to improve code readability:

```diff
 def attempt_log_save(fname, logger, query, page_size):
+    """
+    Helper function to retrieve logs and write them to a file.
+
+    Args:
+        fname: Path to the log file
+        logger: Google Cloud logging logger
+        query: Filter query for logs
+        page_size: Number of log entries to retrieve per request
+    """
     with open(fname, "w", encoding="utf-8") as logfile:
         for log_entry in logger.list_entries(
             filter_=query,
             page_size=page_size,
         ):
             logfile.write(str(log_entry.payload) + "\n")
```
545-551: Consider adding directory existence check before writing to the log file.

The code should ensure that the directory for the log file exists before attempting to write to it. Add a directory existence check:

```diff
 def attempt_log_save(fname, logger, query, page_size):
+    # Ensure the directory exists
+    os.makedirs(os.path.dirname(fname), exist_ok=True)
     with open(fname, "w", encoding="utf-8") as logfile:
         for log_entry in logger.list_entries(
             filter_=query,
             page_size=page_size,
         ):
             logfile.write(str(log_entry.payload) + "\n")
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- `snakemake_executor_plugin_googlebatch/executor.py` (4 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
`**/*.py`: Do not try to improve formatting.
Do not suggest type annotations for functions that are defined inside of functions or methods.
Do not suggest type annotation of the `self` argument of methods.
Do not suggest type annotation of the `cls` argument of classmethods.
Do not suggest return type annotation if a function or method does not contain a `return` statement.

Applied to files:

- `snakemake_executor_plugin_googlebatch/executor.py`
🪛 Ruff (0.8.2)
snakemake_executor_plugin_googlebatch/executor.py
542-542: Local variable `logger` is assigned to but never used

Remove assignment to unused variable `logger`

(F841)
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: testing-mocked-api
🔇 Additional comments (2)
snakemake_executor_plugin_googlebatch/executor.py (2)
512-514: The log retrieval implementation looks good.

The code appropriately triggers log retrieval after a job has completed (either failed or succeeded), which aligns with the PR's objective of retrieving logs after job finalization.

15-16: Ensure "google-cloud-logging" is added as a dependency.

A previous review comment indicated that "google-cloud-logging" might not be explicitly listed in the project dependencies. This could cause runtime issues when importing the module.
Please ensure that "google-cloud-logging" is added to your dependency management file to avoid potential runtime issues.
Thanks for the suggestion @jakevc.
This PR adds log retrieval for finished Google Batch jobs.
Log tailing is not available via the Cloud Logging API, so we can only retrieve logs all at once. A simple solution is to retrieve them only after the job has finished.
There was a discussion in #14 about how the API only allows retrieving a single line per request and 60 requests per minute. However, I have discovered that we can retrieve 1,000 lines (stdout/stderr) per request, so 60,000 lines per minute.
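For concreteness, the retrieval call looks roughly like this sketch, which mirrors the PR's logger name and label filter; the project id and job UID are placeholders, and the details should be read as illustrative rather than the PR's exact code:

```python
from google.cloud import logging  # provided by the google-cloud-logging package

# Fetch up to 1,000 log lines per request for one finished Batch job.
client = logging.Client(project="my-gcp-project")  # placeholder project id
logger = client.logger("batch_task_logs")
for entry in logger.list_entries(
    filter_="labels.job_uid=some-job-uid",  # placeholder job UID
    page_size=1000,
):
    print(entry.payload)
```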
In this implementation, if a request fails, it retries after a 60-second delay. If the retry fails as well, the logfile is left as-is and may remain incomplete.
This implementation, while not perfect, improves the current situation, where only an empty `.snakemake/googlebatch_logs/` directory is created. Downloading these logs is highly beneficial, as browsing them in the Google Cloud Console's logging utility is painful.