
feat(sync): persist cloud sync queue state #77

Merged
shark0F0497 merged 2 commits into main from chore/misc-tasks on May 12, 2026


Conversation

@shark0F0497
Collaborator

Pull Request Checklist

Please ensure your PR meets the following requirements:

  • Code follows the style guidelines
  • Tests pass locally
  • Code is formatted
  • Documentation updated if needed
  • Commit messages follow conventional commits
  • PR description is complete and clear

Summary

This PR makes the Keystone cloud sync enqueue state durable by persisting queued work as sync_logs.status = 'pending' before worker execution. It also carries the seed script cleanup that was already on this branch.


Motivation

  • Cloud Sync Center exposed a queued state, but manual retries were only placed in an in-memory channel and usually jumped directly from failed to in_progress.
  • Operators could retry multiple failed episodes successfully while never seeing queued jobs in Synapse.
  • Durable pending rows make queued work observable, restart-recoverable, and consistent with the UI status model.
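The difference between the old and new enqueue behavior can be sketched roughly as follows. This is a minimal illustration, not the code in sync_worker.go: the `Store` type is a hypothetical in-memory stand-in for the sync_logs table, and `Enqueue` is an assumed name. The point it shows is that the pending state is written before the channel hand-off, so a full channel leaves an observable queued row instead of losing the work.

```go
package main

import (
	"fmt"
	"sync"
)

// Status values mirror the states named in this PR.
const (
	StatusFailed  = "failed"
	StatusPending = "pending"
)

// Store is a hypothetical in-memory stand-in for the sync_logs table.
type Store struct {
	mu     sync.Mutex
	status map[int]string // episode ID -> status
}

func NewStore() *Store { return &Store{status: make(map[int]string)} }

func (s *Store) Set(id int, st string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.status[id] = st
}

func (s *Store) Get(id int) string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.status[id]
}

// Enqueue records the episode as pending first, then attempts a non-blocking
// hand-off to the worker channel. It reports whether the hand-off succeeded.
// A full channel is not treated as an error: the pending row stays behind
// and can be picked up by a later poll.
func Enqueue(s *Store, queue chan<- int, id int) bool {
	s.Set(id, StatusPending)
	select {
	case queue <- id:
		return true
	default:
		return false
	}
}

func main() {
	store := NewStore()
	queue := make(chan int, 1) // deliberately tiny to show the queue-full case

	for _, id := range []int{101, 102} {
		store.Set(id, StatusFailed)
		sent := Enqueue(store, queue, id)
		fmt.Printf("episode %d: status=%s handed_off=%v\n", id, store.Get(id), sent)
	}
	// Prints:
	// episode 101: status=pending handed_off=true
	// episode 102: status=pending handed_off=false
}
```

In the sketch, episode 102 never reaches the channel, yet it is still visible as pending, which is the property the UI status model relies on.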

Changes

Modified Files

  • internal/services/sync_worker.go - Persists manual and automatic sync work as pending rows, dispatches persisted pending rows before other polling work, and claims pending rows into in_progress with attempt counting handled at claim time.
  • internal/services/sync_worker_test.go - Adds coverage for manual retry pending persistence, due-failure promotion, queue-full recovery, persisted pending dispatch, and pending-row claim behavior.
  • scripts/seed.py - Removes SOP skill_sequence seed payload entries already present on this branch.
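For reviewers, the pending -> in_progress claim with attempt counting at claim time can be pictured with the sketch below. The `SyncLog` struct, `Claim`, and `ClaimPending` are hypothetical names chosen for illustration, and the slice stands in for rows that the real worker would presumably update inside a database transaction. It shows only the state guard and where the attempt counter moves, not the actual implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// SyncLog is a hypothetical, simplified view of a sync_logs row.
type SyncLog struct {
	EpisodeID int
	Status    string
	Attempts  int
}

// ErrNotPending is returned when a claim targets a row in any other state.
var ErrNotPending = errors.New("row is not pending")

// Claim moves a row from pending to in_progress and counts the attempt at
// that moment. Rows in any other state are left untouched, so a row cannot
// be claimed twice.
func Claim(row *SyncLog) error {
	if row.Status != "pending" {
		return ErrNotPending
	}
	row.Status = "in_progress"
	row.Attempts++
	return nil
}

// ClaimPending claims every pending row in slice order and returns the IDs
// of the episodes it claimed. Non-pending rows are skipped.
func ClaimPending(rows []*SyncLog) []int {
	var claimed []int
	for _, row := range rows {
		if err := Claim(row); err == nil {
			claimed = append(claimed, row.EpisodeID)
		}
	}
	return claimed
}

func main() {
	rows := []*SyncLog{
		{EpisodeID: 1, Status: "pending"},
		{EpisodeID: 2, Status: "failed", Attempts: 1},
		{EpisodeID: 3, Status: "pending", Attempts: 2},
	}

	fmt.Println(ClaimPending(rows)) // [1 3]
	fmt.Println(rows[2].Attempts)   // 3

	// A second claim on an already claimed row is rejected.
	fmt.Println(errors.Is(Claim(rows[0]), ErrNotPending)) // true
}
```

Counting the attempt inside the claim, rather than at enqueue time, means a row that sits in pending across a restart is not charged an attempt until a worker actually takes it.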

Added Files

Deleted Files

  • None

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update (documentation changes only)
  • Refactoring (code improvement without functional changes)
  • Performance improvement (code changes that improve performance)
  • Test changes (adding, modifying, or removing tests)

Impact Analysis

Breaking Changes

None

Backward Compatibility

Fully backward compatible. The implementation reuses the existing sync_logs.status = 'pending' state and does not require a database migration.


Testing

Test Environment

  • Local development environment
  • Go test cache redirected with GOCACHE=/tmp/go-build

Test Cases

  • Unit tests pass locally
  • Integration tests pass locally
  • E2E tests pass (if applicable)
  • Manual testing completed

Manual Testing Steps

Manual UI testing was not performed in this pass. The backend behavior was covered with targeted SyncWorker tests and the full Go test suite with race detection.

Test Coverage

  • New tests added
  • Existing tests updated
  • Coverage maintained or improved

Commands run:

env GOCACHE=/tmp/go-build go test -cover -race -v ./...

Screenshots / Recordings

Not applicable.


Performance Impact

  • Memory usage: No meaningful change
  • CPU usage: No meaningful change
  • Throughput: No expected regression
  • Lock contention: Slightly increased per-episode DB locking during enqueue/claim, bounded to sync worker operations

Documentation


Related Issues

  • None

Additional Notes

  • With KEYSTONE_SYNC_MAX_CONCURRENT=2, retrying four failed episodes should now expose two active uploads and two queued rows before completion, assuming uploads are long enough for the UI to observe the transition.
  • This PR intentionally keeps stale in_progress recovery as a follow-up hardening item, as documented in the design note.
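The first note can be checked on paper with a small deterministic model. The sketch below is an assumption-laden simulation, not the worker itself: `Dispatch` and `Count` are hypothetical helpers, the map stands in for sync_logs, and the "completed" status is an assumed name for a finished upload. With a limit of 2 and four retried episodes, it yields two in_progress rows and two pending rows.

```go
package main

import "fmt"

// Dispatch claims pending episodes in the given order until maxConcurrent
// episodes are in_progress. It models the concurrency limit only.
func Dispatch(status map[int]string, order []int, maxConcurrent int) {
	active := Count(status, "in_progress")
	for _, id := range order {
		if active >= maxConcurrent {
			return
		}
		if status[id] == "pending" {
			status[id] = "in_progress"
			active++
		}
	}
}

// Count returns how many episodes currently have the given status.
func Count(status map[int]string, want string) int {
	n := 0
	for _, st := range status {
		if st == want {
			n++
		}
	}
	return n
}

func main() {
	order := []int{1, 2, 3, 4}
	status := make(map[int]string)
	for _, id := range order {
		status[id] = "pending" // four manual retries persisted as pending
	}

	Dispatch(status, order, 2)
	fmt.Println(Count(status, "in_progress"), Count(status, "pending")) // 2 2

	// One upload finishes ("completed" is an assumed status name), which
	// frees a slot for the next queued episode.
	status[1] = "completed"
	Dispatch(status, order, 2)
	fmt.Println(Count(status, "in_progress"), Count(status, "pending")) // 2 1
}
```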

Reviewers

No specific reviewers requested.


Notes for Reviewers

  • Please focus on the transaction boundaries in persistPendingSyncLog() and the pending -> in_progress claim path.
  • Verify the attempt-count semantics for fresh manual chains and automatic failed-row retries.

Checklist for Reviewers

  • Code changes are correct and well-implemented
  • Tests are adequate and pass
  • Documentation is updated and accurate
  • No unintended side effects
  • Performance impact is acceptable
  • Backward compatibility maintained (if applicable)

shark0F0497 merged commit e8ef064 into main on May 12, 2026
5 of 6 checks passed
shark0F0497 deleted the chore/misc-tasks branch on May 12, 2026 at 12:46

1 participant