Skip to content

feat: watch improvements#310

Merged
jason-lynch merged 1 commit intomainfrom
feat/watch-improvements
Mar 31, 2026
Merged

feat: watch improvements#310
jason-lynch merged 1 commit intomainfrom
feat/watch-improvements

Conversation

@jason-lynch
Copy link
Copy Markdown
Member

Summary

Makes several substantial improvements to the storage.Watch operations:

  • Watch now does a synchronous Get operation before starting the watch to get the current value(s) and to get the start revision for the watch. Callers no longer have to do this operation themselves.
  • Watch now restarts itself automatically using the most recently-fetched revision. We were previously repeating this restart logic in every caller. Restarts are rate-limited to 1 per second.
  • Watch reports errors over an error channel, matching the convention that we've established everywhere.

We had an unused Until operation, so I opted to remove it rather than update it to reflect these changes.

Testing

From the user's perspective, everything should function the same as it did before. The changes in this PR were more focused on improving the watch's usability and edge-case behavior, such as when a Put occurs between the initial Get and the Watch operations, or the recovery behavior after an outage.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 20, 2026

📝 Walkthrough

Walkthrough

This pull request refactors the watch mechanism across the storage system, removing error-driven watch closure (ErrWatchClosed, ErrWatchUntilTimedOut) and redesigning the WatchOp interface to use error-returning handlers and dedicated error channels. Watch initialization now performs an initial synchronous Get before watching changes. All dependent services (election, migration, scheduler) are updated to use the new API.

Changes

Cohort / File(s) Summary
Watch Interface & Implementation
server/internal/storage/interface.go, server/internal/storage/watch.go, server/internal/storage/errors.go
Replaced ErrWatchClosed and ErrWatchUntilTimedOut error constants; reworked WatchOp interface to use error-returning handlers, removed Until() method, added Error() channel and PropagateErrors() method; rewrote watch startup flow with initial synchronous Get, atomic concurrency guards, and error propagation via dedicated channel.
Watch Usage in Services
server/internal/election/candidate.go, server/internal/migrate/runner.go, server/internal/scheduler/service.go
Updated watch initialization to call store.Watch() upfront, reworked watch callbacks to return errors, removed ErrWatchClosed restart logic, added EventTypeUnknown handling, and changed watch errors to route via PropagateErrors() instead of direct error pushes.
Watch Tests
server/internal/storage/watch_test.go
Replaced Until-based tests with new tests verifying initial event delivery from pre-existing keys, handler error propagation to both Watch() result and Error() channel, and Create event detection; adjusted imports accordingly.
Test & Utility Updates
server/internal/election/candidate_test.go, server/internal/testutils/logger.go
Moved logger factory and election service initialization into subtests for per-test lifecycles; enhanced test logger to include test_name field when verbose mode is enabled.
Dependencies
go.mod
Marked golang.org/x/time v0.9.0 as a direct dependency by removing the // indirect comment.

Poem

🐰 Watch refactored, errors retired with grace,
Error channels now guide each async race,
Initial Gets anchor the watch stream,
PropagateErrors—a cleaner scheme,
Restarts reborn, no more closure dreams! 🎉

🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description is well-structured with Summary, Changes context, and Testing sections matching the template. However, the Changes section lacks explicit bulleted items and the Checklist section is completely missing. Add a bulleted Changes section with specific modifications and complete all checklist items (tests, docs, issue link, changelog, breaking changes).
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title 'feat: watch improvements' clearly summarizes the main changes (watch function improvements) and follows Conventional Commits style as required by the template.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/watch-improvements

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@jason-lynch jason-lynch force-pushed the feat/watch-improvements branch 2 times, most recently from f3edc39 to 0905973 Compare March 20, 2026 21:40
@jason-lynch
Copy link
Copy Markdown
Member Author

@coderabbitai review

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 23, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@jason-lynch jason-lynch force-pushed the feat/put-with-updated-version branch from fa785be to 64a9954 Compare March 24, 2026 00:16
@jason-lynch jason-lynch force-pushed the feat/watch-improvements branch from 0905973 to fa49e10 Compare March 24, 2026 00:16
@jason-lynch jason-lynch requested a review from tsivaprasad March 30, 2026 15:37
}

c.watchOp.Close()
c.done <- struct{}{}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of c.done <- struct{}{}, I think it should be close(c.done)

@jason-lynch jason-lynch force-pushed the feat/put-with-updated-version branch from 64a9954 to 2db5030 Compare March 31, 2026 17:26
@jason-lynch jason-lynch force-pushed the feat/watch-improvements branch from fa49e10 to f7cad02 Compare March 31, 2026 17:26
@codacy-production
Copy link
Copy Markdown

codacy-production bot commented Mar 31, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

🟢 Metrics 4 complexity . 0 duplication

Metric Results
Complexity 4
Duplication 0

View in Codacy

TIP This summary will be updated as you push new changes. Give us feedback

Base automatically changed from feat/put-with-updated-version to main March 31, 2026 18:03
Makes several substantial improvements to the `storage.Watch`
operations:

- `Watch` now does a synchronous `Get` operation before the starting the
  watch to get the current value(s) and to get the start revision for
  the watch. Callers no longer have to do this operation themselves.
- `Watch` now restarts itself automatically using the most recently-
  fetched revision. We were previously repeating this restart logic in
  every caller. Restarts are rate-limited to 1 per second.
- `Watch` reports errors over an error channel, matching the convention
  that we've established everywhere.

We had an unused `Until` operation and I opted to remove it rather than
update it for these changes.
@jason-lynch jason-lynch force-pushed the feat/watch-improvements branch from f7cad02 to 8fbc079 Compare March 31, 2026 18:07
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
server/internal/storage/watch_test.go (1)

86-102: Consider using unique key for test isolation.

Lines 88 and 97 use a hardcoded "watch-err" key while other tests use uuid.NewString(). For consistency and to prevent potential interference if tests are run with -parallel in the future, consider using a unique key.

♻️ Suggested change for consistency
 	t.Run("delivers error from handler", func(t *testing.T) {
 		ctx := t.Context()
-		watch := storage.NewWatchOp[*TestValue](client, "watch-err")
+		key := uuid.NewString()
+		watch := storage.NewWatchOp[*TestValue](client, key)
 
 		sentinel := assert.AnError
 		handler := func(e *storage.Event[*TestValue]) error {
 			return sentinel
 		}
 		require.NoError(t, watch.Watch(ctx, handler))
 		t.Cleanup(watch.Close)
 
-		err := storage.NewCreateOp(client, "watch-err", &TestValue{SomeField: "v"}).
+		err := storage.NewCreateOp(client, key, &TestValue{SomeField: "v"}).
 			Exec(ctx)
 		require.NoError(t, err)
 
 		require.ErrorIs(t, <-watch.Error(), sentinel)
 	})
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/storage/watch_test.go` around lines 86 - 102, The test uses a
hardcoded key "watch-err" for both NewWatchOp and NewCreateOp which can cause
interference when tests run in parallel; replace the literal with a unique key
(e.g., k := uuid.NewString()) and use that variable in
storage.NewWatchOp[*TestValue](client, k) and storage.NewCreateOp(client, k,
...) so NewWatchOp, watch, and NewCreateOp all reference the same per-test
unique key for isolation.
server/internal/storage/watch.go (1)

50-70: Handler invoked while holding mutex during initial load.

The load() method calls handle() at line 62 while holding o.mu. If the handler ever needs to call any watchOp method (e.g., Close()), this would deadlock. The current callers don't do this, but it's worth documenting or restructuring.

Consider releasing the lock before invoking handlers, or document that handlers must not call back into the watch operation.

♻️ Optional: Release lock before calling handlers
 func (o *watchOp[V]) load(ctx context.Context, handle func(e *Event[V]) error) error {
 	o.mu.Lock()
-	defer o.mu.Unlock()
 
 	resp, err := o.client.Get(ctx, o.key, o.options...)
 	if err != nil {
+		o.mu.Unlock()
 		return fmt.Errorf("failed to get initial items for watch: %w", err)
 	}
 
+	o.revision = resp.Header.Revision
+	o.mu.Unlock()
+
 	for _, kv := range resp.Kvs {
 		if err := handle(convertKVToEvent[V](kv)); err != nil {
 			return err
 		}
 	}
 
-	o.revision = resp.Header.Revision
-
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@server/internal/storage/watch.go` around lines 50 - 70, The load() method
currently holds o.mu while calling handle(), risking deadlock if handlers call
back into watchOp; to fix, while holding o.mu build a slice of events by
converting resp.Kvs with convertKVToEvent[V], set o.revision =
resp.Header.Revision, then release o.mu and iterate the collected events calling
handle(event) (returning any handler error). This ensures o.revision is updated
under the lock but handlers run without holding o.mu; update the load()
implementation accordingly (watchOp.load, o.mu, o.revision, convertKVToEvent[V],
handle).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@server/internal/storage/watch_test.go`:
- Around line 86-102: The test uses a hardcoded key "watch-err" for both
NewWatchOp and NewCreateOp which can cause interference when tests run in
parallel; replace the literal with a unique key (e.g., k := uuid.NewString())
and use that variable in storage.NewWatchOp[*TestValue](client, k) and
storage.NewCreateOp(client, k, ...) so NewWatchOp, watch, and NewCreateOp all
reference the same per-test unique key for isolation.

In `@server/internal/storage/watch.go`:
- Around line 50-70: The load() method currently holds o.mu while calling
handle(), risking deadlock if handlers call back into watchOp; to fix, while
holding o.mu build a slice of events by converting resp.Kvs with
convertKVToEvent[V], set o.revision = resp.Header.Revision, then release o.mu
and iterate the collected events calling handle(event) (returning any handler
error). This ensures o.revision is updated under the lock but handlers run
without holding o.mu; update the load() implementation accordingly
(watchOp.load, o.mu, o.revision, convertKVToEvent[V], handle).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e7f1fc90-45d4-468f-93e7-db92ae3c1b2e

📥 Commits

Reviewing files that changed from the base of the PR and between 0b4300c and 8fbc079.

📒 Files selected for processing (10)
  • go.mod
  • server/internal/election/candidate.go
  • server/internal/election/candidate_test.go
  • server/internal/migrate/runner.go
  • server/internal/scheduler/service.go
  • server/internal/storage/errors.go
  • server/internal/storage/interface.go
  • server/internal/storage/watch.go
  • server/internal/storage/watch_test.go
  • server/internal/testutils/logger.go
💤 Files with no reviewable changes (1)
  • server/internal/storage/errors.go

@jason-lynch jason-lynch merged commit f18292a into main Mar 31, 2026
3 checks passed
@jason-lynch jason-lynch deleted the feat/watch-improvements branch March 31, 2026 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants