fix: replace os.Exit with graceful shutdown in mail watch#269
fix: replace os.Exit with graceful shutdown in mail watch#269luojiyin1987 wants to merge 5 commits intolarksuite:mainfrom
Conversation
Remove os.Exit(0) from signal handler and implement proper shutdown coordination using context cancellation and channel signaling. This makes the code testable and allows proper cleanup of resources.
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughMailbox watch shutdown now uses a cancellable context and coordinated signal handling: a watchCtx from context.WithCancel, a shutdownBySignal channel, deferred signal.Stop(sigCh), idempotent unsubscribe via sync.Once, atomic event counting, and cli.Start run with watchCtx with adjusted error suppression. Changes
Sequence Diagram(s)sequenceDiagram
participant OS as OS Signal
participant Sig as Signal Handler
participant Main as Main Goroutine
participant CLI as cli.Start
participant WS as WebSocket / Unsubscribe
OS->>Sig: SIGINT / SIGTERM
Sig->>Sig: close(shutdownBySignal)
Sig->>Main: call cancelWatch() / cancel(watchCtx)
Note right of Main: watchCtx cancelled (context.WithCancel)
Main->>CLI: cli.Start(watchCtx)
CLI->>WS: maintain connection / handle messages
CLI-->>Main: returns (err or nil)
Main->>WS: deferred unsubscribeWithLog() (sync.Once)
alt shutdownBySignal set or err == context.Canceled
Main-->>Main: return nil
else
Main-->>Main: return output.ErrNetwork(...)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Greptile SummaryThis PR removes The previously flagged race (cancel before close) is correctly resolved: Confidence Score: 5/5Safe to merge — shutdown logic is correct, channel/cancel ordering prevents the previously flagged race, all paths unsubscribe via defer All remaining concerns are P2 or lower. The ordering fix (close channel before cancel) is correctly implemented, goroutine leak is handled via the watchCtx.Done case, and atomic eventCount fixes the original data race. Prior review concern is resolved. No files require special attention Important Files Changed
Sequence DiagramsequenceDiagram
participant User as User (Ctrl+C)
participant Goroutine as Signal Goroutine
participant Main as Main Goroutine
participant WS as cli.Start (WebSocket)
participant API as Unsubscribe API
Main->>WS: cli.Start(watchCtx)
Main->>Goroutine: spawn (listening for sigCh / watchCtx.Done)
User->>Goroutine: SIGINT/SIGTERM
Goroutine->>Goroutine: signal.Stop + signal.Reset
Goroutine->>Main: close(shutdownBySignal)
Goroutine->>WS: cancelWatch() - context cancelled
WS-->>Main: return context.Canceled error
Main->>Main: select shutdownBySignal → return nil
Main->>API: defer unsubscribeWithLog() via sync.Once
API-->>Main: unsubscribed
Reviews (5): Last reviewed commit: "fix: harden mail watch signal shutdown p..." | Re-trigger Greptile |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
shortcuts/mail/mail_watch.go (1)
434-450:⚠️ Potential issue | 🟠 MajorPublish shutdown before cleanup, and let the signal goroutine exit when the watch stops.
Line 441 runs
unsubscribe()beforecancelWatch()/close(shutdownBySignal), so Ctrl+C can keepcli.Startblocked on a slow unsubscribe and can race intoErrNetworkif the socket exits while cleanup is in flight. Separately, line 435 only waits onsigCh; on parent-context cancel or any other non-signal exit,signal.Stop(sigCh)does not close the channel, so this goroutine leaks indefinitely.One way to structure it
shutdownBySignal := make(chan struct{}) go func() { defer func() { if r := recover(); r != nil { fmt.Fprintf(errOut, "panic in signal handler: %v\n", r) } }() - <-sigCh - info(fmt.Sprintf("\nShutting down... (received %d events)", eventCount)) - info("Unsubscribing mailbox events...") - if unsubErr := unsubscribe(); unsubErr != nil { - fmt.Fprintf(errOut, "Warning: unsubscribe failed: %v\n", unsubErr) - } else { - info("Mailbox unsubscribed.") - } - cancelWatch() - close(shutdownBySignal) + select { + case <-sigCh: + close(shutdownBySignal) + cancelWatch() + case <-watchCtx.Done(): + return + } }() info("Connected. Waiting for mail events... (Ctrl+C to stop)") if err := cli.Start(watchCtx); err != nil { select { case <-shutdownBySignal: + info(fmt.Sprintf("\nShutting down... (received %d events)", eventCount)) + info("Unsubscribing mailbox events...") + if unsubErr := unsubscribe(); unsubErr != nil { + fmt.Fprintf(errOut, "Warning: unsubscribe failed: %v\n", unsubErr) + } else { + info("Mailbox unsubscribed.") + } return nil default: }Also applies to: lines 454-458
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/mail/mail_watch.go` around lines 434 - 450, The signal-handler goroutine can block on a slow unsubscribe and never exit on non-signal shutdown; fix it by making the handler publish termination first and not block cli.Start: on receiving from sigCh (or ctx.Done), call cancelWatch() and close(shutdownBySignal) before calling unsubscribe(), then perform unsubscribe() in a separate goroutine (or with a bounded timeout) so it cannot block the main shutdown path; also make the goroutine select on both sigCh and the parent context (or ctx.Done) so it will exit when the watch is cancelled, referencing the symbols sigCh, shutdownBySignal, cancelWatch(), unsubscribe(), and cli.Start/ErrNetwork in your changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@shortcuts/mail/mail_watch.go`:
- Around line 434-450: The signal-handler goroutine can block on a slow
unsubscribe and never exit on non-signal shutdown; fix it by making the handler
publish termination first and not block cli.Start: on receiving from sigCh (or
ctx.Done), call cancelWatch() and close(shutdownBySignal) before calling
unsubscribe(), then perform unsubscribe() in a separate goroutine (or with a
bounded timeout) so it cannot block the main shutdown path; also make the
goroutine select on both sigCh and the parent context (or ctx.Done) so it will
exit when the watch is cancelled, referencing the symbols sigCh,
shutdownBySignal, cancelWatch(), unsubscribe(), and cli.Start/ErrNetwork in your
changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 28c6021c-938a-450b-91c0-50bd5a6c8723
📒 Files selected for processing (1)
shortcuts/mail/mail_watch.go
|
Addressed. Updated the signal-handler ordering in Commit: 9c93401 |
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
shortcuts/mail/mail_watch.go (1)
429-450:⚠️ Potential issue | 🟠 MajorDon't block shutdown behind
unsubscribe()or leave the signal goroutine parked.Line 444 does network I/O before Lines 449-450 publish the shutdown marker and cancel the websocket, so Ctrl+C can hang on
unsubscribe()andcli.Start()can still report a connection error before the signal path is visible. Also,signal.Stopon Line 429 does not closesigCh, so this goroutine never exits on non-signal paths.Suggested shape
shutdownBySignal := make(chan struct{}) go func() { defer func() { if r := recover(); r != nil { fmt.Fprintf(errOut, "panic in signal handler: %v\n", r) } }() - <-sigCh - info(fmt.Sprintf("\nShutting down... (received %d events)", eventCount)) - info("Unsubscribing mailbox events...") - if unsubErr := unsubscribe(); unsubErr != nil { - fmt.Fprintf(errOut, "Warning: unsubscribe failed: %v\n", unsubErr) - } else { - info("Mailbox unsubscribed.") - } - close(shutdownBySignal) - cancelWatch() + select { + case <-sigCh: + info(fmt.Sprintf("\nShutting down... (received %d events)", eventCount)) + close(shutdownBySignal) + cancelWatch() + info("Unsubscribing mailbox events...") + if unsubErr := unsubscribe(); unsubErr != nil { + fmt.Fprintf(errOut, "Warning: unsubscribe failed: %v\n", unsubErr) + } else { + info("Mailbox unsubscribed.") + } + case <-watchCtx.Done(): + return + } }()If you need to wait for
unsubscribe()before returning, do that aftercli.Start()unwinds or under a bounded timeout rather than before cancellation.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/mail/mail_watch.go` around lines 429 - 450, The signal handler currently performs network I/O by calling unsubscribe() before closing shutdownBySignal and cancelWatch(), and it never reliably exits because signal.Stop(sigCh) does not close sigCh; change the handler so that on receiving from sigCh you immediately log and then close shutdownBySignal and call cancelWatch() (so cli.Start() and websocket teardown are not blocked), then launch a separate goroutine to perform unsubscribe() with a bounded timeout/context and log its result; ensure the spawned unsubscribe goroutine does not block the signal handler and that the signal handler returns (so the goroutine can exit on non-signal paths), and keep using signal.Stop(sigCh) but do not rely on sigCh being closed to terminate the goroutine.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@shortcuts/mail/mail_watch.go`:
- Around line 429-450: The signal handler currently performs network I/O by
calling unsubscribe() before closing shutdownBySignal and cancelWatch(), and it
never reliably exits because signal.Stop(sigCh) does not close sigCh; change the
handler so that on receiving from sigCh you immediately log and then close
shutdownBySignal and call cancelWatch() (so cli.Start() and websocket teardown
are not blocked), then launch a separate goroutine to perform unsubscribe() with
a bounded timeout/context and log its result; ensure the spawned unsubscribe
goroutine does not block the signal handler and that the signal handler returns
(so the goroutine can exit on non-signal paths), and keep using
signal.Stop(sigCh) but do not rely on sigCh being closed to terminate the
goroutine.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6228aff1-b905-4072-992c-6f8c845d89cf
📒 Files selected for processing (1)
shortcuts/mail/mail_watch.go
- Add select with watchCtx.Done() to handle non-signal exit paths - Move close(shutdownBySignal) and cancelWatch() before unsubscribe() - Execute unsubscribe() in main goroutine after cli.Start() returns This ensures: - Signal handler goroutine can exit on non-signal paths - Shutdown signal is published immediately without blocking on network I/O - No race condition between shutdownBySignal and context.Canceled
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
shortcuts/mail/mail_watch.go (1)
475-476:⚠️ Potential issue | 🟡 MinorMissing unsubscribe when
cli.Startreturns without error.If
cli.Startreturnsnil(e.g., server-initiated clean disconnect), the function exits without unsubscribing. While this is likely an edge case for a long-running WebSocket watch, consider whether unsubscribe should occur on all exit paths for consistency.🛠️ Possible fix
} + // Clean exit without error - still unsubscribe for completeness + info("Unsubscribing mailbox events...") + if unsubErr := unsubscribe(); unsubErr != nil { + fmt.Fprintf(errOut, "Warning: unsubscribe failed: %v\n", unsubErr) + } else { + info("Mailbox unsubscribed.") + } return nil🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/mail/mail_watch.go` around lines 475 - 476, The code currently exits without calling unsubscribe when cli.Start returns nil; ensure unsubscribe is invoked on every exit path by moving the unsubscribe call into a deferred or finally-style cleanup that runs after the call to cli.Start, so that regardless of whether cli.Start returns an error or nil the unsubscribe routine is executed; specifically update the function that calls cli.Start to call unsubscribe (the existing unsubscribe function/closure) in a defer immediately after successful subscription (before calling cli.Start) or in a single cleanup block after cli.Start returns, so both error and clean-disconnect returns trigger unsubscribe.
🧹 Nitpick comments (1)
shortcuts/mail/mail_watch.go (1)
453-472: Consider extracting duplicate unsubscribe logic.The unsubscribe pattern at lines 455-460 and 466-470 is duplicated. This could be consolidated for maintainability.
♻️ Suggested refactor to reduce duplication
if err := cli.Start(watchCtx); err != nil { - select { - case <-shutdownBySignal: - info("Unsubscribing mailbox events...") - if unsubErr := unsubscribe(); unsubErr != nil { - fmt.Fprintf(errOut, "Warning: unsubscribe failed: %v\n", unsubErr) - } else { - info("Mailbox unsubscribed.") - } - return nil - default: - } - if errors.Is(err, context.Canceled) { - info("Unsubscribing mailbox events...") - if unsubErr := unsubscribe(); unsubErr != nil { - fmt.Fprintf(errOut, "Warning: unsubscribe failed: %v\n", unsubErr) - } else { - info("Mailbox unsubscribed.") - } - return nil - } + isSignalShutdown := false + select { + case <-shutdownBySignal: + isSignalShutdown = true + default: + } + if isSignalShutdown || errors.Is(err, context.Canceled) { + info("Unsubscribing mailbox events...") + if unsubErr := unsubscribe(); unsubErr != nil { + fmt.Fprintf(errOut, "Warning: unsubscribe failed: %v\n", unsubErr) + } else { + info("Mailbox unsubscribed.") + } + return nil + } unsubscribe() //nolint:errcheck // best-effort cleanup return output.ErrNetwork("WebSocket connection failed: %v", err) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@shortcuts/mail/mail_watch.go` around lines 453 - 472, The unsubscribe logic is duplicated; extract it into a single helper and call it from both branches. Create a small function (e.g., performUnsubscribe or handleUnsubscribe) that captures necessary closures/vars (unsubscribe, info, errOut) and performs the fmt.Fprintf on error or info on success, and then replace the repeated blocks in the select case and the errors.Is(err, context.Canceled) branch with a call to that helper and return nil. Ensure the helper lives in the same scope so it can access unsubscribe, info and errOut without changing signatures.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@shortcuts/mail/mail_watch.go`:
- Around line 475-476: The code currently exits without calling unsubscribe when
cli.Start returns nil; ensure unsubscribe is invoked on every exit path by
moving the unsubscribe call into a deferred or finally-style cleanup that runs
after the call to cli.Start, so that regardless of whether cli.Start returns an
error or nil the unsubscribe routine is executed; specifically update the
function that calls cli.Start to call unsubscribe (the existing unsubscribe
function/closure) in a defer immediately after successful subscription (before
calling cli.Start) or in a single cleanup block after cli.Start returns, so both
error and clean-disconnect returns trigger unsubscribe.
---
Nitpick comments:
In `@shortcuts/mail/mail_watch.go`:
- Around line 453-472: The unsubscribe logic is duplicated; extract it into a
single helper and call it from both branches. Create a small function (e.g.,
performUnsubscribe or handleUnsubscribe) that captures necessary closures/vars
(unsubscribe, info, errOut) and performs the fmt.Fprintf on error or info on
success, and then replace the repeated blocks in the select case and the
errors.Is(err, context.Canceled) branch with a call to that helper and return
nil. Ensure the helper lives in the same scope so it can access unsubscribe,
info and errOut without changing signatures.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6ca74fcd-8009-4599-bf32-501dd2e77070
📒 Files selected for processing (1)
shortcuts/mail/mail_watch.go
- Create unsubscribeWithLog() with sync.Once to ensure single execution - Use defer to guarantee cleanup on all exit paths - Remove duplicate unsubscribe logic from error handling branches This ensures unsubscribe runs exactly once regardless of how the function exits: - Normal exit - Early return (e.g., profile error) - Signal-triggered shutdown - Context cancellation - Network errors
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@shortcuts/mail/mail_watch.go`:
- Around line 262-273: The unsubscribe failure is currently only logged as a
warning by unsubscribeWithLog and the shutdown/signal handling code still
returns nil, so process exits 0 even if unsubscribe failed; change
unsubscribeWithLog to capture the unsubscribe error into a shared variable
(e.g., declare var unsubscribeErr error outside, call unsubscribeErr = err
inside the Do block when unsubscribe() fails) and have the deferred call
set/return that error via a named return or by wrapping the shutdown path: on
graceful-exit paths (the signal/cancel shutdown code that currently returns
nil), check unsubscribeErr (or call unsubscribe synchronously and capture its
error) and return that error instead of nil so unsubscribe failures propagate;
reference unsubscribeWithLog, unsubscribeLogOnce and the unsubscribe() call when
making these changes.
- Around line 452-456: The signal handler reads eventCount while handleEvent()
increments it, causing a data race; protect both sites with the same
synchronization primitive (e.g., add a sync.Mutex named eventCountMu or use
atomic.Int64) and use it whenever mutating or reading eventCount: lock/unlock
around the increment inside handleEvent() and lock/unlock (or atomic load)
around the read in the goroutine that selects on sigCh before logging and
closing shutdownBySignal; ensure the mutex/atomic is declared at the same scope
as eventCount so both code paths reference the same guard.
- Around line 440-456: The signal goroutine (reading from sigCh) currently
consumes the first interrupt but leaves signal.Notify active and doesn't restore
default behavior, so additional Ctrl+C presses are queued/dropped; update the
handler in the goroutine that handles sigCh so that when it receives the first
signal it (1) calls signal.Stop(sigCh) and signal.Reset(os.Interrupt,
syscall.SIGTERM) (importing os and syscall) to restore default handling, then
performs the existing shutdown steps (info(...), close(shutdownBySignal),
cancelWatch()); this ensures a second Ctrl+C will invoke the default terminate
behavior if shutdown blocks; keep using sigCh, shutdownBySignal and cancelWatch
names to locate the code.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 354ed9b8-9a8a-4989-97af-f5779534c8a1
📒 Files selected for processing (1)
shortcuts/mail/mail_watch.go
| var unsubscribeLogOnce sync.Once | ||
| unsubscribeWithLog := func() { | ||
| unsubscribeLogOnce.Do(func() { | ||
| info("Unsubscribing mailbox events...") | ||
| if err := unsubscribe(); err != nil { | ||
| fmt.Fprintf(errOut, "Warning: unsubscribe failed: %v\n", err) | ||
| } else { | ||
| info("Mailbox unsubscribed.") | ||
| } | ||
| }) | ||
| } | ||
| defer unsubscribeWithLog() |
There was a problem hiding this comment.
Propagate unsubscribe failures on the graceful-exit paths.
Lines 262-273 downgrade unsubscribe failures to a warning, and Lines 463-470 still return nil for signal/cancel shutdown. That means this command exits 0 even when the subscription it created earlier was not cleaned up successfully.
Also applies to: 463-470
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@shortcuts/mail/mail_watch.go` around lines 262 - 273, The unsubscribe failure
is currently only logged as a warning by unsubscribeWithLog and the
shutdown/signal handling code still returns nil, so process exits 0 even if
unsubscribe failed; change unsubscribeWithLog to capture the unsubscribe error
into a shared variable (e.g., declare var unsubscribeErr error outside, call
unsubscribeErr = err inside the Do block when unsubscribe() fails) and have the
deferred call set/return that error via a named return or by wrapping the
shutdown path: on graceful-exit paths (the signal/cancel shutdown code that
currently returns nil), check unsubscribeErr (or call unsubscribe synchronously
and capture its error) and return that error instead of nil so unsubscribe
failures propagate; reference unsubscribeWithLog, unsubscribeLogOnce and the
unsubscribe() call when making these changes.
|
Addressed in 8ea53f9. Implemented the two accepted points from this review:
The unsubscribe-exit-code behavior is intentionally unchanged in this commit. |
Replace os.Exit(0) in the signal handler of `mail +watch` with context-based graceful shutdown using context.WithCancel and a shutdown channel. The os.Exit(0) call had several problems: - Bypasses deferred cleanup functions in the call stack - Makes the code untestable (process terminates immediately) - Not idiomatic for Go CLI tools Changes: - Use context.WithCancel to create a cancellable watch context - Signal handler now cancels the context and closes shutdownCh - cli.Start uses watchCtx instead of the parent context - Distinguish between signal-triggered shutdown and real errors via select on shutdownCh Closes larksuite#269 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
os.Exit(0)from signal handler inmail watchcommandcontext.WithCanceland channel signalingsignal.Stop(sigCh)to defer for proper resource cleanupProblem
The
os.Exit(0)call in the signal handler:Solution
Use a
shutdownBySignalchannel to coordinate shutdown between the signal handler and the main goroutine. The signal handler now:shutdownBySignalchannelcancelWatch()to cancel the WebSocket contextThe main goroutine checks
shutdownBySignalto distinguish between signal-triggered shutdown and actual connection errors.Test plan
mail watchcommand with Ctrl+CCloses #268
Summary by CodeRabbit