-
Notifications
You must be signed in to change notification settings - Fork 9
Add detailed job lifecycle logging to EOA executor worker #88
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add detailed job lifecycle logging to EOA executor worker #88
Conversation
WalkthroughAdds timing instrumentation and structured logging to the EOA worker's main execution stages. Each stage (crash recovery, confirm flow, send flow) now records duration and emits JOB_LIFECYCLE-prefixed logs with per-stage metrics. The final completion message is also prefixed consistently. Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~8 minutes
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
executors/src/eoa/worker/mod.rs (4)
308-324: LGTM! Timing instrumentation implemented correctly.The crash recovery timing follows the correct pattern using millisecond timestamps and the
calculate_duration_secondshelper. All logged fields are accessible and the metrics provide useful insights into recovery performance.Consider using unique variable names like
recovery_start_timeandrecovery_durationto improve clarity and avoid shadowing across the three workflow stages. This makes the code easier to scan and debug.-let start_time = current_timestamp_ms(); +let recovery_start_time = current_timestamp_ms(); let recovered = self .recover_borrowed_state() .await .inspect_err(|e| { tracing::error!(error = ?e, "Error in recover_borrowed_state"); }) .map_err(|e| e.handle())?; -let duration = calculate_duration_seconds(start_time, current_timestamp_ms()); +let recovery_duration = calculate_duration_seconds(recovery_start_time, current_timestamp_ms()); tracing::info!( eoa = ?self.eoa, chain_id = self.chain_id, worker_id = self.store.worker_id(), - duration_seconds = duration, + duration_seconds = recovery_duration, recovered_count = recovered, "JOB_LIFECYCLE - Crash recovery completed" );
327-344: LGTM! Confirm flow timing instrumentation is consistent.The timing implementation follows the same correct pattern as crash recovery, with appropriate logging of confirmation metrics (confirmed and failed counts from the report).
Similar to the crash recovery stage, consider using descriptive variable names like
confirm_start_timeandconfirm_duration:-let start_time = current_timestamp_ms(); +let confirm_start_time = current_timestamp_ms(); let confirmations_report = self .confirm_flow() .await .inspect_err(|e| { tracing::error!(error = ?e, "Error in confirm flow"); }) .map_err(|e| e.handle())?; -let duration = calculate_duration_seconds(start_time, current_timestamp_ms()); +let confirm_duration = calculate_duration_seconds(confirm_start_time, current_timestamp_ms()); tracing::info!( eoa = ?self.eoa, chain_id = self.chain_id, worker_id = self.store.worker_id(), - duration_seconds = duration, + duration_seconds = confirm_duration, confirmed = confirmations_report.moved_to_success, failed = confirmations_report.moved_to_pending, "JOB_LIFECYCLE - Confirm flow completed" );
347-363: LGTM! Send flow timing instrumentation is consistent.The send flow timing follows the established pattern correctly, logging the number of transactions sent along with the duration.
Apply the same variable naming improvement for consistency:
-let start_time = current_timestamp_ms(); +let send_start_time = current_timestamp_ms(); let sent = self .send_flow() .await .inspect_err(|e| { tracing::error!(error = ?e, "Error in send_flow"); }) .map_err(|e| e.handle())?; -let duration = calculate_duration_seconds(start_time, current_timestamp_ms()); +let send_duration = calculate_duration_seconds(send_start_time, current_timestamp_ms()); tracing::info!( eoa = ?self.eoa, chain_id = self.chain_id, worker_id = self.store.worker_id(), - duration_seconds = duration, + duration_seconds = send_duration, sent_count = sent, "JOB_LIFECYCLE - Send flow completed" );
385-385: Consider adding timing instrumentation for consistency.While the "JOB_LIFECYCLE - " prefix is correctly added, this stage lacks the
duration_secondsmetric that's present in the crash recovery, confirm flow, and send flow logs. Adding timing here would provide a complete picture of where time is spent in the workflow.Apply timing instrumentation to the "Check for remaining work" stage:
// 4. CHECK FOR REMAINING WORK +let start_time = current_timestamp_ms(); let counts = self .store .get_all_counts() .await .map_err(EoaExecutorWorkerError::from) .inspect_err(|e| { tracing::error!(error = ?e, "Error in get_all_counts"); }) .map_err(|e| e.handle())?; +let duration = calculate_duration_seconds(start_time, current_timestamp_ms()); tracing::info!( + duration_seconds = duration, recovered = recovered, confirmed = confirmations_report.moved_to_success, temp_failed = confirmations_report.moved_to_pending, replacements = confirmations_report.moved_to_pending, currently_submitted = counts.submitted_transactions, currently_pending = counts.pending_transactions, currently_borrowed = counts.borrowed_transactions, currently_recycled = counts.recycled_nonces, "JOB_LIFECYCLE - Check for remaining work completed" );Note: If you implement the variable naming suggestion from the previous comments, use
check_start_timeandcheck_durationinstead.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
executors/src/eoa/worker/mod.rs(3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-20T05:30:35.171Z
Learnt from: joaquim-verges
Repo: thirdweb-dev/engine-core PR: 48
File: executors/src/eoa/worker/send.rs:20-21
Timestamp: 2025-09-20T05:30:35.171Z
Learning: In executors/src/eoa/worker/send.rs, there is a critical bug where HEALTH_CHECK_INTERVAL is defined as 300 seconds but compared against millisecond timestamps, causing balance checks to occur every 300ms instead of every 5 minutes (1000x more frequent than intended).
Applied to files:
executors/src/eoa/worker/mod.rs
🧬 Code graph analysis (1)
executors/src/eoa/worker/mod.rs (2)
executors/src/metrics.rs (2)
current_timestamp_ms(236-238)calculate_duration_seconds(225-227)executors/src/eoa/store/atomic.rs (3)
eoa(84-86)chain_id(89-91)worker_id(94-96)
🔇 Additional comments (1)
executors/src/eoa/worker/mod.rs (1)
233-233: LGTM! Consistent lifecycle logging prefix.The addition of the "JOB_LIFECYCLE - " prefix aligns well with the new structured logging approach and makes it easier to filter job lifecycle events.

Summary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings.