Skip to content

Conversation

@joaquim-verges
Copy link
Member

@joaquim-verges joaquim-verges commented Nov 21, 2025

Summary by CodeRabbit

  • Chores
    • Enhanced job lifecycle monitoring with added timing instrumentation across workflow stages. Improved logging now includes duration metrics and per-stage performance tracking for better execution observability.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link

coderabbitai bot commented Nov 21, 2025

Walkthrough

Adds timing instrumentation and structured logging to the EOA worker's main execution stages. Each stage (crash recovery, confirm flow, send flow) now records duration and emits JOB_LIFECYCLE-prefixed logs with per-stage metrics. The final completion message is also prefixed consistently.

Changes

Cohort / File(s) Summary
Timing instrumentation and JOB_LIFECYCLE logging
executors/src/eoa/worker/mod.rs
Adds start_time recording and duration computation around crash recovery, confirm flow, and send flow stages. Each stage emits a JOB_LIFECYCLE-prefixed info log with duration_seconds and stage-specific metrics. Final completion log prepended with "JOB_LIFECYCLE - " prefix. Additional per-stage completion log added after "Check for remaining work" step.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

  • Straightforward timing instrumentation and logging additions with no logic changes or signature modifications
  • Verify consistency of JOB_LIFECYCLE prefix usage across all new log statements
  • Confirm duration calculation accuracy and proper metric formatting in each stage log

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately describes the main change: adding detailed job lifecycle logging to the EOA executor worker, which directly matches the AI-generated summary of timing instrumentation and JOB_LIFECYCLE-prefixed logging additions.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch Add_detailed_job_lifecycle_logging_to_EOA_executor_worker

Comment @coderabbitai help to get the list of available commands and usage tips.

@joaquim-verges joaquim-verges marked this pull request as ready for review November 21, 2025 01:53
Copy link
Member Author

This stack of pull requests is managed by Graphite. Learn more about stacking.

@joaquim-verges joaquim-verges merged commit f1ea797 into main Nov 21, 2025
3 of 4 checks passed
@joaquim-verges joaquim-verges deleted the Add_detailed_job_lifecycle_logging_to_EOA_executor_worker branch November 21, 2025 01:54
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
executors/src/eoa/worker/mod.rs (4)

308-324: LGTM! Timing instrumentation implemented correctly.

The crash recovery timing follows the correct pattern using millisecond timestamps and the calculate_duration_seconds helper. All logged fields are accessible and the metrics provide useful insights into recovery performance.

Consider using unique variable names like recovery_start_time and recovery_duration to improve clarity and avoid shadowing across the three workflow stages. This makes the code easier to scan and debug.

-let start_time = current_timestamp_ms();
+let recovery_start_time = current_timestamp_ms();
 let recovered = self
     .recover_borrowed_state()
     .await
     .inspect_err(|e| {
         tracing::error!(error = ?e, "Error in recover_borrowed_state");
     })
     .map_err(|e| e.handle())?;
-let duration = calculate_duration_seconds(start_time, current_timestamp_ms());
+let recovery_duration = calculate_duration_seconds(recovery_start_time, current_timestamp_ms());
 tracing::info!(
     eoa = ?self.eoa,
     chain_id = self.chain_id,
     worker_id = self.store.worker_id(),
-    duration_seconds = duration,
+    duration_seconds = recovery_duration,
     recovered_count = recovered,
     "JOB_LIFECYCLE - Crash recovery completed"
 );

327-344: LGTM! Confirm flow timing instrumentation is consistent.

The timing implementation follows the same correct pattern as crash recovery, with appropriate logging of confirmation metrics (confirmed and failed counts from the report).

Similar to the crash recovery stage, consider using descriptive variable names like confirm_start_time and confirm_duration:

-let start_time = current_timestamp_ms();
+let confirm_start_time = current_timestamp_ms();
 let confirmations_report = self
     .confirm_flow()
     .await
     .inspect_err(|e| {
         tracing::error!(error = ?e, "Error in confirm flow");
     })
     .map_err(|e| e.handle())?;
-let duration = calculate_duration_seconds(start_time, current_timestamp_ms());
+let confirm_duration = calculate_duration_seconds(confirm_start_time, current_timestamp_ms());
 tracing::info!(
     eoa = ?self.eoa,
     chain_id = self.chain_id,
     worker_id = self.store.worker_id(),
-    duration_seconds = duration,
+    duration_seconds = confirm_duration,
     confirmed = confirmations_report.moved_to_success,
     failed = confirmations_report.moved_to_pending,
     "JOB_LIFECYCLE - Confirm flow completed"
 );

347-363: LGTM! Send flow timing instrumentation is consistent.

The send flow timing follows the established pattern correctly, logging the number of transactions sent along with the duration.

Apply the same variable naming improvement for consistency:

-let start_time = current_timestamp_ms();
+let send_start_time = current_timestamp_ms();
 let sent = self
     .send_flow()
     .await
     .inspect_err(|e| {
         tracing::error!(error = ?e, "Error in send_flow");
     })
     .map_err(|e| e.handle())?;
-let duration = calculate_duration_seconds(start_time, current_timestamp_ms());
+let send_duration = calculate_duration_seconds(send_start_time, current_timestamp_ms());
 tracing::info!(
     eoa = ?self.eoa,
     chain_id = self.chain_id,
     worker_id = self.store.worker_id(),
-    duration_seconds = duration,
+    duration_seconds = send_duration,
     sent_count = sent,
     "JOB_LIFECYCLE - Send flow completed"
 );

385-385: Consider adding timing instrumentation for consistency.

While the "JOB_LIFECYCLE - " prefix is correctly added, this stage lacks the duration_seconds metric that's present in the crash recovery, confirm flow, and send flow logs. Adding timing here would provide a complete picture of where time is spent in the workflow.

Apply timing instrumentation to the "Check for remaining work" stage:

 // 4. CHECK FOR REMAINING WORK
+let start_time = current_timestamp_ms();
 let counts = self
     .store
     .get_all_counts()
     .await
     .map_err(EoaExecutorWorkerError::from)
     .inspect_err(|e| {
         tracing::error!(error = ?e, "Error in get_all_counts");
     })
     .map_err(|e| e.handle())?;
+let duration = calculate_duration_seconds(start_time, current_timestamp_ms());

 tracing::info!(
+    duration_seconds = duration,
     recovered = recovered,
     confirmed = confirmations_report.moved_to_success,
     temp_failed = confirmations_report.moved_to_pending,
     replacements = confirmations_report.moved_to_pending,
     currently_submitted = counts.submitted_transactions,
     currently_pending = counts.pending_transactions,
     currently_borrowed = counts.borrowed_transactions,
     currently_recycled = counts.recycled_nonces,
     "JOB_LIFECYCLE - Check for remaining work completed"
 );

Note: If you implement the variable naming suggestion from the previous comments, use check_start_time and check_duration instead.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled by default for public repositories

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a0d76d7 and 48dd4bc.

📒 Files selected for processing (1)
  • executors/src/eoa/worker/mod.rs (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-20T05:30:35.171Z
Learnt from: joaquim-verges
Repo: thirdweb-dev/engine-core PR: 48
File: executors/src/eoa/worker/send.rs:20-21
Timestamp: 2025-09-20T05:30:35.171Z
Learning: In executors/src/eoa/worker/send.rs, there is a critical bug where HEALTH_CHECK_INTERVAL is defined as 300 seconds but compared against millisecond timestamps, causing balance checks to occur every 300ms instead of every 5 minutes (1000x more frequent than intended).

Applied to files:

  • executors/src/eoa/worker/mod.rs
🧬 Code graph analysis (1)
executors/src/eoa/worker/mod.rs (2)
executors/src/metrics.rs (2)
  • current_timestamp_ms (236-238)
  • calculate_duration_seconds (225-227)
executors/src/eoa/store/atomic.rs (3)
  • eoa (84-86)
  • chain_id (89-91)
  • worker_id (94-96)
🔇 Additional comments (1)
executors/src/eoa/worker/mod.rs (1)

233-233: LGTM! Consistent lifecycle logging prefix.

The addition of the "JOB_LIFECYCLE - " prefix aligns well with the new structured logging approach and makes it easier to filter job lifecycle events.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants