…ad fallback
- Adds the dual-collection MetadataStore wrapper on DatabaseContext, opt-in via per-DB use_system_metadata_collection (overrides cluster default).
- Routes bootstrap metadata (registry, dbconfigs, cbgt cfg) into _system._mobile on a per-bucket basis, decided from a registry-location probe plus the cluster flag / per-DB opt-in. Reads fall back to _default._default until migration completes; writes pin to the owning collection so CAS stays consistent across retries.
- Tracks migration lifecycle in _sync:metadata_migration_status (born in _system._mobile): per-DB state map plus a bootstrap-copy phase that runs after every DB completes, copies bootstrap docs into _system._mobile, then disables fallback reads.
- New-DB fast path (probeLegacyPerDBMetadata) skips arming a migration when there is no legacy _sync:seq to migrate from.
- Does not perform the per-DatabaseContext data copy yet; that lands as CBG-5228. Writes already route to the new location after opt-in, and reads fall back so existing data remains accessible.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
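The read-fallback and write-pinning behaviour described in this commit can be illustrated with a small in-memory sketch. Everything here is hypothetical: `dualStore`, `collection`, and `ErrNotFound` are illustrative names, plain maps stand in for the two collections, and the real MetadataStore wrapper and bootstrap connection may be structured quite differently.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound is a stand-in for the datastore's "document not found" error.
var ErrNotFound = errors.New("not found")

// collection is a hypothetical in-memory stand-in for one collection.
type collection map[string][]byte

// dualStore is an illustrative wrapper, not the PR's MetadataStore.
type dualStore struct {
	primary           collection // stands in for _system._mobile
	fallback          collection // stands in for _default._default
	migrationComplete bool       // once true, fallback reads are disabled
}

// Get reads the primary collection first and consults the fallback only
// while migration is still pending. It also reports which side answered.
func (s *dualStore) Get(key string) ([]byte, string, error) {
	if v, ok := s.primary[key]; ok {
		return v, "primary", nil
	}
	if !s.migrationComplete {
		if v, ok := s.fallback[key]; ok {
			return v, "fallback", nil
		}
	}
	return nil, "", ErrNotFound
}

// Set writes to whichever collection already owns the key, so repeated
// writes (for example CAS retries) always hit the same document.
// Keys with no existing owner go to the primary collection.
func (s *dualStore) Set(key string, val []byte) string {
	if _, inPrimary := s.primary[key]; !inPrimary && !s.migrationComplete {
		if _, inFallback := s.fallback[key]; inFallback {
			s.fallback[key] = val
			return "fallback"
		}
	}
	s.primary[key] = val
	return "primary"
}

func main() {
	s := &dualStore{
		primary:  collection{},
		fallback: collection{"_sync:registry": []byte("legacy")},
	}
	_, from, _ := s.Get("_sync:registry")
	fmt.Println("read served by:", from)
	fmt.Println("existing doc written to:", s.Set("_sync:registry", []byte("v2")))
	fmt.Println("new doc written to:", s.Set("_sync:dbconfig:db1", []byte("cfg")))
}
```

Pinning a write to the owning collection means a retried update never creates a second copy of the same document in the other collection, which is one way to read the "CAS stays consistent across retries" remark above.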
Contributor
Pull request overview
Implements CBG-5226 by adding an opt-in path for storing Sync Gateway bootstrap metadata and per-database metadata in _system._mobile, with read-fallback to legacy _default._default during the migration window. This wires migration status tracking via a bucket-level _sync:metadata_migration_status document and adds rosmar + Couchbase bootstrap-connection support for dual-collection reads/writes.
Changes:
- Add per-DB `base.MetadataStore` wrapping to target `_system._mobile` when opted in, including a “new DB” fast path and per-bucket bootstrap-migration completion trigger wiring.
- Extend `base.BootstrapConnection` (CouchbaseCluster + RosmarCluster) to support dual-collection bootstrap metadata operations, migration status doc CRUD, and bucket-target caching.
- Add/adjust tests to exercise dual-collection bootstrap semantics and metadata migration gating behavior.
Reviewed changes
Copilot reviewed 11 out of 12 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| rest/server_context.go | Routes DB metadata to _system._mobile via MetadataStore, adds migration-status stamping/refresh logic, and adds bucket-level bootstrap migration completion flow. |
| rest/metadatamigrationtest/metadata_migration_test.go | Adjusts test setup to seed legacy metadata so migration gating behavior can be exercised. |
| rest/main.go | Plumbs startup flag into bootstrap connection creation to enable bootstrap dual-collection behavior. |
| rest/admin_api.go | Adds per-bucket bootstrap-target hinting prior to initial registry/dbconfig writes for new deployments. |
| db/database.go | Wires new migration status hooks into DatabaseContext and refines metadata migration arming behavior. |
| db/background_mgr_metadata_migration.go | Implements status-doc updates for per-DB migration lifecycle (stubbed copy step for follow-up PR). |
| base/bootstrap.go | Extends BootstrapConnection interface and CouchbaseCluster implementation for dual-collection bootstrap metadata + migration status handling. |
| base/rosmar_cluster.go | Implements the new dual-collection bootstrap behavior and migration-status APIs for rosmar. |
| base/metadata_migration_status.go | Introduces the bucket-level metadata migration status document model and helpers. |
| base/bootstrap_test.go | Adds dual-collection bootstrap tests (insert/write/touch/delete fallback semantics) and updates cluster constructors. |
| go.mod | Bumps rosmar dependency to a newer pseudo-version. |
| go.sum | Updates checksums for the rosmar bump. |
…edge
The CAS claim that flipped bootstrap.state not_started → in_progress was unsafe in two ways: (1) UpdateMetadataMigrationStatus re-invokes the mutator on CAS retries, so a stale `claimed = true` from an earlier iteration could let a non-claimant proceed alongside the real winner; (2) if a node crashed or MigrateBootstrapDocs returned an error after the in_progress write, the bucket was permanently wedged: the not_started guard prevented any peer from re-entering the claim path.
MigrateBootstrapDocs is already idempotent under concurrent execution (primary Insert tolerates ErrDocumentExists, fallback Remove uses observed-CAS), so the bucket-level claim was buying very little while introducing the wedge. Now the function runs the copy step directly and the only CAS-guarded write is the final not_started → complete transition, which short-circuits if a peer has already completed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
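Hazard (1) in the commit above is easy to reproduce in isolation. The sketch below is a simulation under stated assumptions: `casStore`, `status`, `contested`, `claimBuggy`, and `claimFixed` are all hypothetical names, and the retry loop only imitates a CAS update that re-invokes its mutator, which is the behaviour the commit message attributes to UpdateMetadataMigrationStatus. Note that the PR removed the claim entirely; `claimFixed` shows one conventional remedy for the stale flag only and does nothing about the wedge in (2).

```go
package main

import "fmt"

// status is a hypothetical stand-in for the bootstrap block of the status doc.
type status struct{ State string }

// casStore simulates a document guarded by a CAS value.
type casStore struct {
	doc status
	cas uint64
	// interfere, if set, runs once between the first read and write
	// to simulate a peer node winning the race.
	interfere func(*casStore)
}

// update re-invokes mutate after every CAS mismatch.
func (s *casStore) update(mutate func(status) (status, bool)) {
	for {
		cur, cas := s.doc, s.cas
		next, write := mutate(cur)
		if f := s.interfere; f != nil {
			s.interfere = nil
			f(s)
		}
		if !write {
			return
		}
		if s.cas != cas {
			continue // CAS mismatch: retry against the fresh document
		}
		s.doc, s.cas = next, s.cas+1
		return
	}
}

// claimBuggy sets the captured flag on the first attempt and never clears
// it, so a lost race still reports a successful claim.
func claimBuggy(s *casStore) bool {
	claimed := false
	s.update(func(cur status) (status, bool) {
		if cur.State == "not_started" {
			claimed = true
			cur.State = "in_progress"
			return cur, true
		}
		return cur, false
	})
	return claimed
}

// claimFixed resets the flag at the top of every mutator invocation.
func claimFixed(s *casStore) bool {
	claimed := false
	s.update(func(cur status) (status, bool) {
		claimed = false
		if cur.State == "not_started" {
			claimed = true
			cur.State = "in_progress"
			return cur, true
		}
		return cur, false
	})
	return claimed
}

// contested returns a store where a peer claims first.
func contested() *casStore {
	return &casStore{
		doc: status{State: "not_started"},
		interfere: func(s *casStore) {
			s.doc.State = "in_progress"
			s.cas++
		},
	}
}

func main() {
	fmt.Println("buggy claim after losing race:", claimBuggy(contested()))
	fmt.Println("fixed claim after losing race:", claimFixed(contested()))
}
```

In the contested run the first mutator call sets the flag, the simulated peer then wins the write, and the second call sees `in_progress` and declines to write. Only the variant that resets the flag reports the loss correctly.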
…bility
The bootstrap-block state machine is now two-valued (pending → complete) since the in_progress claim was removed. The name "not_started" is misleading in that context: multiple nodes can have attempted the migration, so the doc's status of "not_started" doesn't mean no work has happened. Renamed to "pending", with a comment clarifying the two-valued machine.
Added LastAttemptedAt and Attempts to BootstrapMigrationStatus as soft observability fields, written on each entry into the migration loop. These do not gate any behaviour; they're for operators trying to tell "no one has tried this bucket yet" from "we've been retrying for an hour." Encoding the latter as a state value would invite a future reader to add a gate against it and reintroduce the wedge condition.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
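A minimal sketch of the shape this commit describes is given below. Only the type name BootstrapMigrationStatus, the field names LastAttemptedAt and Attempts, and the two state values come from the commit message; the field types, JSON tags, and the `recordAttempt` and `shouldRun` helpers are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// The two values of the bootstrap state machine.
const (
	statePending  = "pending"
	stateComplete = "complete"
)

// BootstrapMigrationStatus uses the field names from the commit message.
// Types and JSON tags are assumed, not taken from the PR.
type BootstrapMigrationStatus struct {
	State           string    `json:"state"`
	LastAttemptedAt time.Time `json:"last_attempted_at"`
	Attempts        int       `json:"attempts"`
}

// recordAttempt (hypothetical helper) touches only the observability
// fields. It never changes State.
func (b *BootstrapMigrationStatus) recordAttempt() {
	b.LastAttemptedAt = time.Now().UTC()
	b.Attempts++
}

// shouldRun (hypothetical helper) gates on State alone, so a high attempt
// count can never block a later retry.
func (b *BootstrapMigrationStatus) shouldRun() bool {
	return b.State != stateComplete
}

func main() {
	b := BootstrapMigrationStatus{State: statePending}
	b.recordAttempt()
	b.recordAttempt()
	out, err := json.Marshal(b)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
	fmt.Println("still eligible to run:", b.shouldRun())
}
```

Keeping the retry information out of the state value means any gate written against State can only ever distinguish pending from complete, which matches the stated reason for not encoding "retrying" as a third state.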
gregns1
previously approved these changes
May 28, 2026
Contributor
gregns1
left a comment
Happy for these to be done in follow up PR/ticket so approving but everything else looks good to me.
seedLegacyBootstrapDoc previously branched on TestUseXattrs(), which conflates two unrelated concerns: SG's general application-metadata xattr mode and the bootstrap-config persistence mode (the useXattrConfig argument to NewCouchbaseCluster). The bootstrap persistence mode is its own setting and should be exercised explicitly in both states.
Threaded an explicit useXattrs parameter through seedLegacyBootstrapDoc and the dual-collection bootstrap test fixture, and converted the five tests built on that fixture to subtests covering both modes. Rosmar's bootstrap path has no xattr-mode variant, so the bootstrap_xattr=true subtest skips when running against Rosmar.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
gregns1
approved these changes
May 28, 2026
CBG-5226
- `MetadataStore` wrapper on `DatabaseContext` in SG based on opt-in configuration - as well as similar fallback read logic in the bootstrap code.
- `_system._mobile` storage for all bootstrap metadata (registry, dbconfigs) for:
- `_sync:metadata_migration_status` in `_system._mobile` to track per-bucket and per-database migration state. Used to bypass the dual metadata store wrapper post-migration, and also used to determine the last migrated database for bootstrap/registry migration.
- `DatabaseContext` migration - Upcoming PR as CBG-5228 - though all writes get routed to the new location after opting in, and reads do fall back to the fallback datastore even without the migration run.

Diagrams
Logic to determine which collection to use for bootstrap (not gated behind any opt-in - we only have the opt-in for databases). We optimistically write to `_system` if we find this is a new deployment. We write to `_default` if there is an existing deployment, but have a migration job to move bootstrap-level stuff over once the last database has been fully migrated.

flowchart TD
    A[First bootstrap-doc op on bucket] --> C[Read _sync:registry]
    C --> D{Where is it?}
    D -- _system._mobile --> E[Target: _system._mobile]
    D -- _default._default --> F[Target: _default._default]
    D -- not found --> G{Cluster flag<br/>OR per-DB opt-in?}
    G -- yes --> E
    G -- no --> F

Logic to determine whether db is new and can bypass metadata migration and write directly to `_system`, or if we need to run migrate.

flowchart TD
    A[Create or update db] --> B{resolveUseSystemMetadataCollection<br/>per-DB wins over cluster flag}
    B -- no --> J[MetadataStore = _default._default<br/>no wrapper]
    B -- yes --> C[Wrap MetadataStore<br/>base.NewMetadataStore]
    C --> D[probeLegacyPerDBMetadata]
    D --> E{_sync:seq<br/>or _sync:m_id:seq<br/>exists in _default._default?}
    E -- found --> F[Arm migration:<br/>not_started entry in<br/>migration_status]
    E -- none --> G[SetMigrationComplete<br/>on wrapper immediately]
    G --> H[shouldRunMetadataMigration<br/>short-circuits — no manager arm]
    F --> I[MetadataMigrationManager<br/>processes entry]

Integration Tests
- `TestChangeIndexPartitionsStartStopAndRestart` flake
- `TestMetadataMigrationStartsAfterAllNodesApplyConfig` Fixed w/ last commit
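Returning to the two decision flows under Diagrams, both reduce to small pure functions. The sketch below is an illustration only: `location`, `chooseBootstrapTarget`, `resolveOptIn`, `dbPlan`, and `planDatabase` are hypothetical names, and the real code performs datastore probes where this version takes booleans.

```go
package main

import "fmt"

// location says where _sync:registry was found on the bucket.
type location int

const (
	notFound location = iota
	systemMobile
	defaultDefault
)

func (l location) String() string {
	switch l {
	case systemMobile:
		return "_system._mobile"
	case defaultDefault:
		return "_default._default"
	}
	return "not found"
}

// chooseBootstrapTarget follows the first flowchart: an existing registry
// pins the target to its own collection, and only a missing registry
// consults optedIn (cluster flag OR per-DB opt-in).
func chooseBootstrapTarget(registryAt location, optedIn bool) location {
	if registryAt != notFound {
		return registryAt
	}
	if optedIn {
		return systemMobile
	}
	return defaultDefault
}

// resolveOptIn lets the per-DB setting win over the cluster flag.
// A nil perDB means the database did not set the option.
func resolveOptIn(perDB *bool, clusterFlag bool) bool {
	if perDB != nil {
		return *perDB
	}
	return clusterFlag
}

// dbPlan names the three outcomes of the second flowchart.
type dbPlan string

const (
	planNoWrapper           dbPlan = "no wrapper, use _default._default"
	planArmMigration        dbPlan = "wrap and arm migration"
	planCompleteImmediately dbPlan = "wrap and mark migration complete"
)

// planDatabase follows the second flowchart. legacyExists stands in for
// the result of the legacy-metadata probe (_sync:seq or _sync:m_id:seq).
func planDatabase(optedIn, legacyExists bool) dbPlan {
	if !optedIn {
		return planNoWrapper
	}
	if legacyExists {
		return planArmMigration
	}
	return planCompleteImmediately
}

func main() {
	fmt.Println(chooseBootstrapTarget(notFound, true))
	optOut := false
	fmt.Println(planDatabase(resolveOptIn(&optOut, true), true))
	fmt.Println(planDatabase(resolveOptIn(nil, true), false))
}
```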