Skip to content

feat(daemon): add circuit breaker for sync handler#158

Merged
AnnatarHe merged 2 commits intomainfrom
feat/daemon-sync-circuit-breaker
Dec 25, 2025
Merged

feat(daemon): add circuit breaker for sync handler#158
AnnatarHe merged 2 commits intomainfrom
feat/daemon-sync-circuit-breaker

Conversation

@AnnatarHe
Copy link
Copy Markdown
Contributor

Summary

  • Add circuit breaker pattern to prevent overwhelming the server when experiencing errors
  • After 10 consecutive failures, circuit opens and saves sync data locally to ~/.shelltime/sync-pending.jsonl
  • 1-hour timer resets circuit and republishes saved data back to pub/sub topic for retry
  • Uses interface pattern (CircuitBreaker) for testability

Files Changed

File Change
model/db.go Add SYNC_PENDING_FILE constant
daemon/circuit_breaker.go NEW - CircuitBreaker interface + SyncCircuitBreakerService
cmd/daemon/main.go Initialize and start circuit breaker service
daemon/handlers.sync.go Use circuit breaker in handler

Behavior

State Action Result
Circuit Closed Server success Reset failure counter
Circuit Closed Server error Increment failure counter
Circuit Closed 10th consecutive error Open circuit, log error
Circuit Open Any sync request Save to local file, return nil
Circuit Open 1-hour timer fires Close circuit, republish to topic

Test plan

  • Build passes: go build ./...
  • Tests pass: go test ./daemon/...
  • Manual test: simulate server errors and verify circuit opens after 10 failures
  • Verify data is saved to ~/.shelltime/sync-pending.jsonl when circuit is open
  • Verify 1-hour timer republishes saved data

🤖 Generated with Claude Code

Add a circuit breaker pattern to prevent overwhelming the server when
it's experiencing errors:

- After 10 consecutive failures, circuit opens and saves data locally
- 1-hour timer resets circuit and republishes saved data to pub/sub
- Data is persisted to ~/.shelltime/sync-pending.jsonl when circuit open
- Uses interface pattern for testability

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@gemini-code-assist
Copy link
Copy Markdown
Contributor

Summary of Changes

Hello @AnnatarHe, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request integrates a circuit breaker pattern into the daemon's synchronization process. This enhancement aims to prevent system overload and improve robustness by temporarily halting sync operations to the server when a threshold of consecutive errors is met. Instead of failing outright, data is stored locally and automatically retried after a set interval, ensuring eventual consistency and better handling of transient server issues.

Highlights

  • Circuit Breaker Implementation: Introduces a circuit breaker pattern to the daemon's sync handler to enhance system resilience and prevent overwhelming the server during periods of error.
  • Error Handling & Local Storage: After 10 consecutive sync failures, the circuit opens, and subsequent sync data is saved locally to ~/.shelltime/sync-pending.jsonl instead of being lost.
  • Automated Retry Mechanism: A 1-hour timer is set to automatically reset the circuit and republish any locally saved pending sync data back to the pub/sub topic for retry, ensuring eventual consistency.
  • Testability: The circuit breaker functionality is designed with an interface (CircuitBreaker) to facilitate easier testing and promote a clean architecture.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 25, 2025

Codecov Report

❌ Patch coverage is 56.47059% with 74 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
model/circuit_breaker.go 70.58% 27 Missing and 13 partials ⚠️
daemon/circuit_breaker.go 0.00% 17 Missing ⚠️
daemon/handlers.sync.go 0.00% 8 Missing and 2 partials ⚠️
cmd/daemon/main.go 0.00% 6 Missing ⚠️
model/db.go 0.00% 1 Missing ⚠️
Flag Coverage Δ
unittests 20.46% <56.47%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
model/db.go 0.00% <0.00%> (ø)
cmd/daemon/main.go 0.00% <0.00%> (ø)
daemon/handlers.sync.go 55.07% <0.00%> (-9.34%) ⬇️
daemon/circuit_breaker.go 0.00% <0.00%> (ø)
model/circuit_breaker.go 70.58% <70.58%> (ø)

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a circuit breaker pattern for the sync handler, which is a great addition for improving the resilience of the application. When sync operations fail repeatedly, the circuit opens, and pending data is saved locally for later retry. The implementation is mostly solid, using a background goroutine with a ticker to reset the circuit and retry.

My review includes feedback on a few key areas:

  • Refactoring the use of a global variable for the circuit breaker service to use dependency injection for better testability and maintainability.
  • Improving the memory efficiency of the retry mechanism to handle large backlogs of pending data.
  • A couple of correctness improvements related to file path handling and dead code removal.

Overall, this is a valuable feature, and with a few adjustments, it can be made even more robust.

Comment thread daemon/circuit_breaker.go Outdated
}

// Global instance
var syncCircuitBreaker CircuitBreaker
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Using a global variable syncCircuitBreaker creates tight coupling and makes the code harder to test and reason about. It's assigned as a side effect in NewSyncCircuitBreakerService, which is not an ideal pattern.

A better approach is to use dependency injection. The SyncCircuitBreakerService instance should be created in main.go and passed explicitly to the components that need it, such as the handlePubSubSync handler. This makes dependencies explicit and significantly improves testability.

Comment thread daemon/circuit_breaker.go Outdated
}

func (s *SyncCircuitBreakerService) SaveForRetry(ctx context.Context, payload interface{}) error {
filePath := os.ExpandEnv(fmt.Sprintf("%s/%s", "$HOME", model.SYNC_PENDING_FILE))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

There are a couple of improvements that can be made to file path handling here.

  1. Duplication: The logic to construct filePath is duplicated in retryPendingData at line 157. It's better to extract this into a private helper function.
  2. Path Construction: Using path/filepath.Join is the idiomatic and cross-platform way to construct file paths instead of fmt.Sprintf.
  3. Directory Creation: os.OpenFile with O_CREATE will not create parent directories. If ~/.shelltime doesn't exist, this will fail. You should ensure the directory exists using os.MkdirAll.

Here's an example of a helper function incorporating these points that you could add to the service:

import (
	"path/filepath"
)

func (s *SyncCircuitBreakerService) getPendingFilePath() (string, error) {
	homeDir, err := os.UserHomeDir()
	if err != nil {
		return "", fmt.Errorf("could not get user home directory: %w", err)
	}
	
	filePath := filepath.Join(homeDir, model.SYNC_PENDING_FILE)
	dir := filepath.Dir(filePath)

	if err := os.MkdirAll(dir, 0750); err != nil {
		return "", fmt.Errorf("failed to create directory for pending sync file: %w", err)
	}

	return filePath, nil
}

You could then call this helper in SaveForRetry and retryPendingData.

Comment thread daemon/circuit_breaker.go Outdated
Comment on lines +170 to +215
var lines []string
scanner := bufio.NewScanner(file)
for scanner.Scan() {
line := scanner.Text()
if line != "" {
lines = append(lines, line)
}
}
file.Close()

if err := scanner.Err(); err != nil {
slog.Error("Error reading pending sync file", slog.Any("err", err))
return
}

if len(lines) == 0 {
slog.Debug("No pending sync data to retry")
return
}

slog.Info("Starting sync data retry", slog.Int("pendingCount", len(lines)))

var failedLines []string
successCount := 0

for _, line := range lines {
// Republish to pub/sub topic
msg := message.NewMessage(watermill.NewUUID(), []byte(line))
if err := s.publisher.Publish(PubSubTopic, msg); err != nil {
slog.Warn("Failed to republish sync data, keeping for next retry", slog.Any("err", err))
failedLines = append(failedLines, line)
} else {
successCount++
}
}

// Rewrite file with only failed lines
if err := s.rewriteLogFile(filePath, failedLines); err != nil {
slog.Error("Failed to update pending sync file", slog.Any("err", err))
return
}

slog.Info("Sync data retry completed",
slog.Int("republished", successCount),
slog.Int("remaining", len(failedLines)))
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of retryPendingData reads the entire pending data file into memory via the lines slice. If the service is unavailable for an extended period, this file could become very large, leading to excessive memory consumption.

A more memory-efficient, streaming approach is recommended:

  1. Open the pending file for reading.
  2. Create a new temporary file for writing failed retries.
  3. Read the pending file line by line.
  4. For each line, attempt to republish the message.
  5. If republishing fails, write the line to the temporary file.
  6. After processing all lines, close both files.
  7. Atomically replace the original pending file with the temporary file (or delete it if the temp file is empty).

This avoids loading all pending messages into memory at once and makes the retry mechanism more robust.

Comment thread cmd/daemon/main.go
Comment on lines +84 to +86
if err := syncCircuitBreakerService.Start(ctx); err != nil {
slog.Error("Failed to start sync circuit breaker service", slog.Any("err", err))
} else {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The syncCircuitBreakerService.Start function currently always returns nil, which makes this error check unreachable. The if err != nil block is effectively dead code.

You should either:

  1. Modify Start to return a potential error if one can occur during initialization.
  2. Change the signature of Start to not return an error (e.g., func (s *SyncCircuitBreakerService) Start(ctx context.Context)), and simplify the call site.

Given the current implementation of Start, the second option seems more appropriate.

syncCircuitBreakerService.Start(ctx)
slog.Info("Sync circuit breaker service started")
defer syncCircuitBreakerService.Stop()

- Move CircuitBreakerService from daemon to model package
- Add CircuitBreaker interface with generic []byte payload
- Add comprehensive tests for circuit breaker functionality
- Keep thin wrapper in daemon for SocketMessage handling
- Add CircuitBreaker to mockery config for mock generation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@AnnatarHe AnnatarHe merged commit 0e1cc16 into main Dec 25, 2025
3 checks passed
@AnnatarHe AnnatarHe deleted the feat/daemon-sync-circuit-breaker branch December 25, 2025 12:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant