Skip to content

fix(mail): remove os.Exit from watch signal handler#311

Closed
OwenYWT wants to merge 3 commits intolarksuite:mainfrom
OwenYWT:codex/mail-watch-graceful-shutdown
Closed

fix(mail): remove os.Exit from watch signal handler#311
OwenYWT wants to merge 3 commits intolarksuite:mainfrom
OwenYWT:codex/mail-watch-graceful-shutdown

Conversation

@OwenYWT
Copy link
Copy Markdown
Contributor

@OwenYWT OwenYWT commented Apr 8, 2026

Summary

Replace the mail +watch signal-path os.Exit(0) with graceful shutdown so Ctrl+C cancels the WebSocket client and keeps the command testable.

Changes

  • replace direct os.Exit(0) in the signal handler with a shutdown helper that unsubscribes, stops signal delivery, and cancels the watch context
  • run the WebSocket client on a cancellable context and return nil on signal-driven shutdown
  • add focused tests covering graceful shutdown and unsubscribe failure reporting

Related Issues

Test Plan

  • GOCACHE=/tmp/go-build-cache GOMODCACHE=/tmp/go-mod-cache /opt/homebrew/opt/go/bin/go test ./shortcuts/mail -run 'TestHandleMailWatchSignal_UnsubscribesAndCancels|TestHandleMailWatchSignal_ReportsUnsubscribeFailure|TestMailWatch'
  • GOCACHE=/tmp/go-build-cache GOMODCACHE=/tmp/go-mod-cache /opt/homebrew/opt/go/bin/go test ./shortcuts/mail

Summary by CodeRabbit

  • Bug Fixes

    • More graceful mail-watch shutdown: improved interrupt handling, idempotent unsubscribe with warning on failure, coordinated startup vs shutdown to avoid abrupt exits and clearer startup error reporting.
  • Refactor

    • Internal concurrency and shutdown flow hardened for safer, race-resistant behavior.
  • Tests

    • Added unit tests for signal-triggered shutdown, unsubscribe behavior, and startup/shutdown race scenarios.

@CLAassistant
Copy link
Copy Markdown

CLAassistant commented Apr 8, 2026

CLA assistant check
All committers have signed the CLA.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 8, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 57d46927-6f9f-4961-98e2-93f7e5dcf44c

📥 Commits

Reviewing files that changed from the base of the PR and between fcb54b9 and d034179.

📒 Files selected for processing (2)
  • shortcuts/mail/mail_watch.go
  • shortcuts/mail/mail_watch_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • shortcuts/mail/mail_watch.go

📝 Walkthrough

Walkthrough

Reworks mail-watch shutdown to use a cancellable context and coordinated signal flow instead of calling os.Exit. Introduces helpers for signal-triggered shutdown, idempotent unsubscribe, and websocket start/shutdown coordination, and adds unit tests covering these behaviors.

Changes

Cohort / File(s) Summary
Mail watch runtime & signal handling
shortcuts/mail/mail_watch.go
Rewrote MailWatch flow: create cancellable watchCtx, run cli.Start(watchCtx) in a goroutine with startErrCh, replace non-atomic eventCount with atomic.Int64, add handleMailWatchSignal, newMailWatchUnsubscribeOnce, and waitForMailWatchStart; remove direct os.Exit in signal handling and coordinate shutdown via channels.
Tests for signal handler & helpers
shortcuts/mail/mail_watch_test.go
Added tests exercising handleMailWatchSignal, unsubscribe success/failure paths, newMailWatchUnsubscribeOnce idempotence, and waitForMailWatchStart start-vs-shutdown race behaviors; added errors and os imports.

Sequence Diagram(s)

sequenceDiagram
    participant Main as Main Goroutine
    participant CLI as CLI (Start)
    participant SigH as Signal Handler
    participant Ch as Channels (startErrCh / shutdownBySignal)

    Main->>Main: create cancellable watchCtx
    Main->>CLI: go cli.Start(watchCtx) -> send err to startErrCh
    Main->>SigH: start listening on sigCh
    Main->>Ch: select on startErrCh or shutdownBySignal

    SigH->>SigH: receive interrupt
    SigH->>SigH: call handleMailWatchSignal() (unsubscribe, stop signal, cancel)
    SigH->>Ch: send shutdownBySignal

    CLI-->>Main: observes cancelled watchCtx and stops
    alt Shutdown by signal
        Ch-->>Main: shutdownBySignal received
        Main->>Main: return nil (graceful exit)
    else Startup error
        CLI-->>Main: startErrCh receives error
        Main->>Main: perform best-effort unsubscribe and return network error
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested reviewers

  • infeng

Poem

🐰 I nibble logs and gently pry,
No sudden exits — just a sigh.
I cancel contexts, tidy the nest,
Unsubscribe once, then let you rest.
Hop! A graceful shutdown — pass the zest!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately describes the primary change: removing os.Exit from the signal handler and replacing it with graceful shutdown mechanisms.
Description check ✅ Passed The PR description follows the repository template with clear Summary, Changes, Test Plan, and Related Issues sections that comprehensively document the motivation, implementation, and verification.
Linked Issues check ✅ Passed All key requirements from issue #268 are met: os.Exit(0) removed, graceful shutdown via context cancellation implemented, shutdown coordination added, and proper cleanup with signal.Stop deferred.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the PR objective: helper functions for shutdown coordination, atomic counter fix for data race, WebSocket context management, and comprehensive test coverage for the shutdown logic.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions bot added domain/mail PR touches the mail domain size/M Single-domain feat or fix with limited business impact labels Apr 8, 2026
@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Apr 8, 2026

Greptile Summary

This PR replaces the abrupt os.Exit(0) in the mail +watch signal handler with a structured graceful shutdown: a cancellable context drives the WebSocket client, sync.Once guards the unsubscribe call, and a pair of channels (shutdownRequested/shutdownComplete) let the main goroutine distinguish a clean signal exit from a real WebSocket error.

  • P1 — deadlock on panic: close(shutdownComplete) (line 503) is not deferred. If handleMailWatchSignal panics after close(shutdownRequested), the defer recover() swallows the panic but shutdownComplete is never closed; waitForMailWatchStart then blocks forever waiting on <-shutdownComplete.

Confidence Score: 4/5

Safe to merge once close(shutdownComplete) is deferred to prevent a deadlock when handleMailWatchSignal panics.

One P1 finding remains: close(shutdownComplete) is not deferred, creating a deadlock path if the signal handler goroutine panics after closing shutdownRequested. All other prior concerns from previous review threads have been addressed by the new channel-based coordination design.

shortcuts/mail/mail_watch.go — signal goroutine around lines 491–504

Vulnerabilities

No security concerns identified. The change removes an os.Exit call and introduces channel-based coordination; no new input surfaces, credential handling, or injection vectors are introduced.

Important Files Changed

Filename Overview
shortcuts/mail/mail_watch.go Replaces os.Exit(0) with context-cancellation–based graceful shutdown using three new helpers and a signal goroutine; one P1 deadlock risk if handleMailWatchSignal panics before close(shutdownComplete).
shortcuts/mail/mail_watch_test.go Adds focused unit tests for handleMailWatchSignal, newMailWatchUnsubscribeOnce, and waitForMailWatchStart covering success, error reporting, and signal-vs-error race; covers the expected paths well.

Sequence Diagram

sequenceDiagram
    participant Main
    participant WSGoroutine as WebSocket goroutine
    participant SigGoroutine as Signal goroutine

    Main->>WSGoroutine: go cli.Start(watchCtx)
    Main->>SigGoroutine: go (wait for signal)
    Main->>Main: waitForMailWatchStart(startErrCh, shutdownRequested, shutdownComplete)

    Note over SigGoroutine: Ctrl+C received
    SigGoroutine->>SigGoroutine: close(shutdownRequested)
    SigGoroutine->>SigGoroutine: handleMailWatchSignal → unsubscribe, stop, cancel()
    WSGoroutine-->>WSGoroutine: context cancelled → cli.Start returns
    WSGoroutine->>Main: startErrCh ← context.Canceled
    SigGoroutine->>SigGoroutine: close(shutdownComplete)  ⚠️ not deferred

    alt startErrCh arrives before shutdownComplete
        Main->>Main: inner select: shutdownRequested closed → wait for shutdownComplete
        SigGoroutine-->>Main: shutdownComplete closed
        Main->>Main: return nil ✓
    else shutdownComplete arrives first
        Main->>Main: return nil ✓
    end

    Note over SigGoroutine: ⚠️ panic in handleMailWatchSignal skips close(shutdownComplete) → deadlock
Loading

Reviews (3): Last reviewed commit: "fix(mail): avoid watch shutdown race" | Re-trigger Greptile

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
shortcuts/mail/mail_watch.go (1)

53-67: Consider using or removing the unused sig parameter.

The sig os.Signal parameter is accepted but never used in the function body. If you intend to log which signal triggered shutdown (SIGINT vs SIGTERM), consider adding it to the log message. Otherwise, if the parameter exists solely for test verification, that's acceptable but worth a brief comment.

♻️ Optional: Log the signal type
 func handleMailWatchSignal(errOut io.Writer, sig os.Signal, eventCount int, unsubscribe func() error, stopSignals func(), cancel context.CancelFunc) {
-	fmt.Fprintf(errOut, "\nShutting down... (received %d events)\n", eventCount)
+	fmt.Fprintf(errOut, "\nShutting down (signal: %v)... (received %d events)\n", sig, eventCount)
 	fmt.Fprintln(errOut, "Unsubscribing mailbox events...")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@shortcuts/mail/mail_watch.go` around lines 53 - 67, The function
handleMailWatchSignal currently accepts sig os.Signal but never uses it; either
remove the sig parameter from the signature and update any callers/tests, or
include it in the shutdown log (e.g., add sig to the fmt.Fprintf/Fprintln output
to show which signal triggered shutdown). If the parameter must remain only for
tests, keep it but add a short comment above handleMailWatchSignal explaining
it's intentionally unused for test verification. Ensure references to
unsubscribe, stopSignals, cancel behavior remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@shortcuts/mail/mail_watch.go`:
- Around line 464-468: The signal handler reads eventCount concurrently with
handleEvent mutating it, causing a data race; change eventCount to an
atomic.Int64 (declare var eventCount atomic.Int64), update the increment in
handleEvent to use eventCount.Add(1), and read it in the signal-handling
goroutine via eventCount.Load() (convert to int when calling
handleMailWatchSignal) so the read/write are synchronized; adjust imports to
include sync/atomic package as needed.

---

Nitpick comments:
In `@shortcuts/mail/mail_watch.go`:
- Around line 53-67: The function handleMailWatchSignal currently accepts sig
os.Signal but never uses it; either remove the sig parameter from the signature
and update any callers/tests, or include it in the shutdown log (e.g., add sig
to the fmt.Fprintf/Fprintln output to show which signal triggered shutdown). If
the parameter must remain only for tests, keep it but add a short comment above
handleMailWatchSignal explaining it's intentionally unused for test
verification. Ensure references to unsubscribe, stopSignals, cancel behavior
remain unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cfb2c0ff-fbee-46ed-8451-4722e100df80

📥 Commits

Reviewing files that changed from the base of the PR and between 12bb01a and 4967857.

📒 Files selected for processing (2)
  • shortcuts/mail/mail_watch.go
  • shortcuts/mail/mail_watch_test.go

@greptile-apps
Copy link
Copy Markdown

greptile-apps bot commented Apr 8, 2026

Tip:

Greploops — Automatically fix all review issues by running /greploops in Claude Code. It iterates: fix, push, re-review, repeat until 5/5 confidence.

Use the Greptile plugin for Claude Code to query reviews, search comments, and manage custom context directly from your terminal.

@chanthuang
Copy link
Copy Markdown
Collaborator

Feedback: correct core architecture, a few improvements possible

This is the strongest of the three PRs addressing this issue. The key architectural decision — running cli.Start(watchCtx) in a separate goroutine and using select in the main goroutine to wait on both shutdownBySignal and startErrCh — is critical and correct. Through local testing, we confirmed that the Lark WebSocket SDK does not return promptly after context cancellation, so this goroutine-based approach is the only way to ensure defers execute on signal shutdown. The other two PRs (#269, #273) both call cli.Start synchronously and hang as a result.

The extracted handleMailWatchSignal function and its tests are well done.

A few improvements to consider:

  1. Unified cleanup via defer: Currently unsubscribe() is called in multiple places (signal handler, error paths). Wrapping it with sync.Once and a defer at the top would guarantee exactly-once execution on all exit paths and eliminate the scattered cleanup calls.
  2. Forced termination support: Adding signal.Reset(os.Interrupt, syscall.SIGTERM) in the signal handler would restore default signal behavior, allowing a second Ctrl+C to force-kill the process if the graceful shutdown gets stuck.
  3. eventCount data race: Already fixed with atomic.Int64 in the second commit — nice catch.

@chanthuang
Copy link
Copy Markdown
Collaborator

Closing in favor of #350 which has been merged and addresses the same os.Exit issue.

#350 combines the most suitable approaches from all three PRs:

Thank you for the contribution — your work directly informed the final implementation.

@chanthuang chanthuang closed this Apr 10, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

domain/mail PR touches the mail domain size/M Single-domain feat or fix with limited business impact

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix: remove os.Exit from mail watch signal handler

3 participants