Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR implements reconciliation of revision trees on write for Inter Sync Gateway Replication (ISGR) by taking the new revTree property and including those revisions in the write process. The changes ensure that revision IDs and versions are correctly synchronized between active and passive replicators.
Key changes:
- Updated tests to assert on both CV and RevID instead of just CV due to fixed reconciliation
- Added new revision tree reconciliation logic in CRUD operations
- Extended BLIP handlers to process revision tree properties from ISGR
Reviewed Changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| rest/replicatortest/replicator_test.go | Updated test assertions to check both CV and RevID, removed temporary workarounds for CBG-4790/4791 |
| rest/replicatortest/replicator_revtree_test.go | Added comprehensive test cases for revision tree reconciliation in both pull and push scenarios |
| db/utilities_hlv_testing.go | Removed outdated comment about CBG-4790/4791 being unimplemented |
| db/util_testing.go | Updated function call to include new revTreeISGRProperty parameter |
| db/import_test.go | Updated function call to include new revTreeISGRProperty parameter |
| db/crud_test.go | Updated function calls to include new revTreeISGRProperty parameter |
| db/crud.go | Implemented core revision tree reconciliation logic with new parameter handling |
| db/blip_sync_context.go | Enhanced known revs processing to handle CV extraction and legacy document detection |
| db/blip_handler.go | Updated revision processing to use revision tree property for ISGR clients |
| if versionVectorProtocol { | ||
| legacyRev = true | ||
| if !cvInKnownRevs { | ||
| // if cv wasn't the first element in known revs array, this will be legacy doc |
There was a problem hiding this comment.
This should probably never happen, but this comment isn't correct:
If you had known revs as:
["1@abc", "1-abc", "2@def"]
this would end up incorrect. I find this logic incredibly hard to follow and I was thinking whether we could split this into a function with some table driven inputs.
I'm thinking something like:
getKnownRevs(changeRequestRow []any) (deltaSrcRevID string, knownRevs map[string]bool)
Then we could list valid conditions for
- 3.x replications
- 4.x replications
- mixed?
I am having trouble holding in my head what the expected input and output is, and having a test for this can serve as a type of documentation.
There was a problem hiding this comment.
Moved to function and test runs on that function.
| } | ||
|
|
||
| func (db *DatabaseCollectionWithUser) PutExistingCurrentVersion(ctx context.Context, newDoc *Document, newDocHLV *HybridLogicalVector, existingDoc *sgbucket.BucketDocument, revTreeHistory []string) (doc *Document, cv *Version, newRevID string, err error) { | ||
| func (db *DatabaseCollectionWithUser) PutExistingCurrentVersion(ctx context.Context, newDoc *Document, newDocHLV *HybridLogicalVector, existingDoc *sgbucket.BucketDocument, revTreeHistory []string, revTreeISGRProperty bool) (doc *Document, cv *Version, newRevID string, err error) { |
There was a problem hiding this comment.
I think you should describe the arguments to this function in a docstring.
I'm trying to wrap my head around why you'd have revTreeHistory but revTreeISGRProperty is false. This could be done in some type of test as well.
One way to make this more clear (not sure) is to have something that is:
type putRevTreeHistory struct {
history []string // is this always going be ["1-abc", "2-abc"]
forceAlignment bool // do not check for revtree conflicts?
}
I don't like the above because I'm in fact not sure what this does.
My very shaky understanding is:
- revTreeHistory could contain the incoming cv
- revTreeHistory could contain the revtree from the incoming source? which would be the previous revisions? Could this be a 2@abc, 1-def?
Do we think that we should clarify with some type of typed information to make the subsequent calls easier to understand?
I really don't think this should mention ISGR at all, that's ultimately an implementation detail upstream. The boolean revTreeISGRProperty controls:
- If true, generate a new revtree id rather than putting the existing revtree id
- If true, will take all of this revtree history and add it to the existing revtree? or overwrite the history if there are conflicts?
- avoid rev tree history conflict checking?
Because this behavior is connected to revHistory that's why I thought they might belong in the same struct.
There was a problem hiding this comment.
Changed the naming and added comment to describe input parameters to the function.
There was a problem hiding this comment.
I will be documenting legacy rev handling better soon too for knowledge of the team given this area can get very confusing in the code IMO.
| // to add any new revision to rev tree below. | ||
| // Only check for rev tree conflicts if we haven't already checked above | ||
| if !revTreeConflictChecked && len(revTreeHistory) > 0 { | ||
| if !revTreeConflictChecked && len(revTreeHistory) > 0 && !revTreeISGRProperty { |
There was a problem hiding this comment.
Should we declare revTreeConflictChecked true if revTreeISGRProperty is set above? If not, I think that we should add to comment above why this isn't checked
There was a problem hiding this comment.
Added comment here for clarity, will be updating the doc in the description with more legacy rev handling info soon too for team knowledge.
| require.EventuallyWithT(t, func(c *assert.CollectT) { | ||
| assert.Equal(c, db.ReplicationStateStopped, ar.GetStatus(ctx1).Status) | ||
| }, time.Second*20, time.Millisecond*100) |
There was a problem hiding this comment.
For readability, I would be inclined to use:
rt1.WaitForReplicationStatus(replicationID, db.ReplicationStateStopped)
you might need to use RestTester.CreateReplication
There was a problem hiding this comment.
I want to be using the active replicator object here not the rest api for these so will have to leave these assertions as is.
Its on roadmap to have these active replicator tests running like blip tester client does in two modes, it may not make sense to have these running in two modes given they test 4.0 features but I would still like that option.
| assert.ElementsMatch(t, slices.Collect(maps.Keys(rt1Doc.History)), docHistoryList) | ||
| } | ||
|
|
||
| func TestActiveReplicatorPushRevTreeReconciliation(t *testing.T) { |
There was a problem hiding this comment.
Is this just the same test as the pull test? should these be parameterized? If this makes it too complicated I don't want to do that, but also if it really is mostly the same code then it might be more helpful?
The same is true for large size tests. If it doesn't make sense to use push and pull, does it make sense to parameterize doc numbers for the test?
There was a problem hiding this comment.
Parameterized these tests
| } | ||
| } | ||
|
|
||
| func (doc *Document) alignRevTreeHistory(ctx context.Context, newDoc *Document, revTreeHistory []string) error { |
There was a problem hiding this comment.
Add docstring and table test for this function with the expectations.
My understanding is that align might actually mean merge and overwrite if required, but it would be easier to see if we can have concrete examples.
There was a problem hiding this comment.
Done and added test
torcolvin
left a comment
There was a problem hiding this comment.
LGTM, some logging suggestions
| // extract cv from the known revs array | ||
| msgHLV, _, err := extractHLVFromBlipString(revID) | ||
| if err != nil { | ||
| base.DebugfCtx(ctx, base.KeySync, "Invalid known rev format for hlv on doc: %s falling back to full body replication.", base.UD(docID)) |
There was a problem hiding this comment.
we are swallowing the error here, so we don't have much chance of ever debugging this
There was a problem hiding this comment.
I don't think we want to error here fully given it can fall back to full body replication. Its possible that the first elem may be a revID (if the change response for this doc is a legacy rev) in this case we wouldn't want to error but maybe log to ensure its clear why its not being sent as a delta.
I have added the error message and the revID string to the log line though so we can see why we fall into this.
| if doc.HLV == nil { | ||
| // no hlv on local doc, mark as missing but send current rev as known rev (will be handled as legacy | ||
| // rev document on changes response handler) | ||
| base.DebugfCtx(ctx, base.KeyChanges, "Doc %s has no HLV, marking change version %s as missing", base.UD(docid), base.UD(rev)) |
There was a problem hiding this comment.
Do you think debug is too verbose for this? This seems common. If this is useful for testing, maybe KeyVV?
There was a problem hiding this comment.
Good point, i'll drop to trace
CBG-4790
revTreeproperty in ISGR and includes those revisions in the writeUpdated some info on ISGR changes thus far in this document: https://docs.google.com/document/d/1SrN_om0pBm3yGnZF1cenFrkODhwMYWfPkjPbTq_2cEE/edit?usp=sharing
Pre-review checklist
fmt.Print,log.Print, ...)base.UD(docID),base.MD(dbName))docs/apiDependencies (if applicable)
Integration Tests
GSI=true,xattrs=truehttps://jenkins.sgwdev.com/job/SyncGateway-Integration/32/ (test flake)