New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure to write old revision can result in malformed revision body in revtree #3692

Closed
adamcfraser opened this Issue Aug 4, 2018 · 4 comments

Comments

Projects
None yet
5 participants
@adamcfraser
Copy link
Contributor

adamcfraser commented Aug 4, 2018

When persisting temporary backups of old revision bodies, Sync Gateway adds a non-JSON byte to ensure they aren't included via N1QL, views, etc. On successful write of the temporary backup, the revision body is removed from the rev tree.

If the write of the temporary backup document fails, Sync Gateway isn't removing the revision body from the rev tree. In some scenarios, the revision body that's retained in the rev tree includes the leading non-JSON byte.

Subsequent updates to the document that target the should-be-removed document as ancestor will attempt to use the malformed revision as old doc. Sync functions that include oldDoc in their signature will fail attempting to parse the revision body, and cause the update to fail with a 500 error like:

WARNING: Sync fn exception: Unparseable JSRunner input: {"key1":"value2","v":"2b"; doc = map[key1:value2 v:3c _rev:3-c _id:doc1] -- db.(*Database).getChannelsAndAccess() at crud.go:1209

Write failure of the temporary backup should be tolerated - the backup is just a performance optimization - but on error the revision body must continue to be removed from the revtree.

@adamcfraser adamcfraser self-assigned this Aug 4, 2018

@bbrks bbrks closed this in #3694 Aug 6, 2018

bbrks added a commit that referenced this issue Aug 6, 2018

Ensure removal of obsolete rev from revtree (#3694)
Previously failures persisting temporary revision backups were also blocking revision removal, resulting in malformed obsolete revision bodies remaining in the revtree in some scenarios.

Fixes #3692
@ritamcouchbase

This comment has been minimized.

Copy link

ritamcouchbase commented Aug 27, 2018

@adamcfraser - Can you please help with a reproducible functional test case that can be automated.

@adamcfraser

This comment has been minimized.

Copy link
Contributor

adamcfraser commented Sep 4, 2018

@ritamcouchbase Reproducing this issue requires forcing a failure on a specific Couchbase Server write op. In our unit test we're using our bucket wrapper code (LeakyBucket) to force an error during this write. I'm not sure if the functional test framework has the same capability - maybe you would be able to do something like bringing down the vbucket that the specific document is targeted to, as long as no other documents in the test use that vbucket?

Here are the steps to repro, in any case:

  1. Write a document (creates revision 1-xxx)
  2. Make three updates to the document (creating rev 2-xxx, 3-xxx, and 4-xxx)
  3. Write a conflicting revision as a child of rev 1-xxx (2-yyy)
  4. Write a revision as a child of 2-yyy (3-yyy)
    • At this point Sync Gateway will be backing up the previous revision by writing a document named _sync:rev:[docid]:[len(revid)]:[revid] to the bucket. This write operation that needs to fail in order to reproduce the issue.
  5. Write a conflicting revision as a child of 2-yyy (3-zzz). Prior to the fix, this will fail due to this issue. Post-fix, it will succeed.

adamcfraser added a commit that referenced this issue Sep 27, 2018

Ensure removal of obsolete rev from revtree (#3694)
Previously failures persisting temporary revision backups were also blocking revision removal, resulting in malformed obsolete revision bodies remaining in the revtree in some scenarios.

Fixes #3692

adamcfraser added a commit that referenced this issue Sep 27, 2018

bbrks added a commit that referenced this issue Sep 28, 2018

Ensure removal of obsolete rev from revtree (#3694) (#3768)
Previously failures persisting temporary revision backups were also blocking revision removal, resulting in malformed obsolete revision bodies remaining in the revtree in some scenarios.

Fixes #3692

adamcfraser added a commit that referenced this issue Sep 28, 2018

Avoid error when processing documents impacted by #3692 (#3769)
* Avoid error when processing documents impacted by #3692

Fixes #3700

* Test tear-down cleanup

@djpongh djpongh added this to the 2.1.1 milestone Oct 8, 2018

@djpongh djpongh added the bug label Oct 8, 2018

adamcfraser added a commit that referenced this issue Oct 19, 2018

Avoid error when processing documents impacted by #3692 (#3769)
* Avoid error when processing documents impacted by #3692

Fixes #3700

* Test tear-down cleanup
@ArihantRk

This comment has been minimized.

Copy link

ArihantRk commented Oct 30, 2018

@adamcfraser

We are getting same kind of errors in sg-replicator logs and due to this replication is blocked/in loop. sg-replciation is bidirectional and only used for replication between clusters

.343+05:30 HTTP: #9642647: PUT /ptxdata/dc1536050635484_rec919916200045271020182352510_attachment?new_edits=false (ADMIN)
2018-10-29T17:51:14.426+05:30 CRUD+: Invoking sync on doc "dc1536050635484_rec919916200045271020182352510_attachment" rev 2-7aecbc654556c850acdf8331143ae570
2018-10-29T17:51:14.426+05:30 WARNING: Sync fn exception: Unparseable JSRunner input: ^A{"_attachments":{"919916200045_27_10_2018__23_52_51_0":{"content_type":"mp3","digest":"sha1-l/YafQe+yKEgHJHZvzjrSQsz39U=","length":837636,"revpos":1,"stub":true}},"_id":"dc1536050635484_rec919916200045271020182352510_attachment","_rev":"1-f28c38df64799fafaa06306b15484600","fileName":"919916200045_27_10_2018__23_52_51_0.mp3","state":"pending","type":"dispatcherrecording","userId":"919916200045","ver":"1.0"; doc = map[_id:dc1536050635484_rec919916200045271020182352510_attachment _revisions:map[ids:[7aecbc654556c850acdf8331143ae570 f28c38df64799fafaa06306b15484600] start:%!s(float64=2)] expiration:1543259420 type:dispatcherrecording userId:919916200045 ver:1.0 _attachments:map[919916200045_27_10_2018__23_52_51_0:map[revpos:%!s(int=2) content_type:mp3 length:%!s(int=837636) stub:%!s(bool=true) digest:sha1-l/YafQe+yKEgHJHZvzjrSQsz39U=]] _rev:2-7aecbc654556c850acdf8331143ae570 channels:[dc1536050635484-REC-CH] fileName:919916200045_27_10_2018__23_52_51_0.mp3 state:processed status:%!s(float64=0)] -- db.(*Database).getChannelsAndAccess() at crud.go:1199
2018-10-29T17:51:14.426+05:30 HTTP: #9642647: --> 500 Exception in JS sync function (82.7 ms)
2018-10-29T17:51:14.859+05:30 HTTP: #9642648: POST /ptxdata/_bulk_get?revs=true&attachments=true (ADMIN)
2018-10-29T17:51:14.871+05:30 HTTP+: #9642648: -->

Syncgateway response for the same document:

curl http://localhost:4985/ptxdata/dc1536050635484_rec919916200045271020182352510_attachment?rev=2-7aecbc654556c850acdf8331143ae570
{"_attachments":{"919916200045_27_10_2018__23_52_51_0":{"content_type":"mp3","digest":"sha1-l/YafQe+yKEgHJHZvzjrSQsz39U=","length":837636,"revpos":1,"stub":true}},"_id":"dc1536050635484_rec919916200045271020182352510_attachment","_rev":"2-7aecbc654556c850acdf8331143ae570","channels":["dc1536050635484-REC-CH"],"expiration":"1543259420","fileName":"919916200045_27_10_2018__23_52_51_0.mp3","state":"processed","status":0,"type":"dispatcherrecording","userId":"919916200045","ver":"1.0"}[root@GSG3 syncgw]#

[root@GSG3 syncgw]# curl http://localhost:4985/ptxdata/dc1536050635484_rec919916200045271020182352510_attachment?rev=1-f28c38df64799fafaa06306b15484600
{"error":"not_found","reason":"missing"}

curl http://localhost:4985/ptxdata/_raw/dc1536050635484_rec919916200045271020182352510_attachment
{"_attachments":{"919916200045_27_10_2018__23_52_51_0":{"content_type":"mp3","digest":"sha1-l/YafQe+yKEgHJHZvzjrSQsz39U=","length":837636,"revpos":3,"stub":true}},"_sync":{"rev":"3-28eccbeea456a64db6c5af8ec5fbe948","flags":24,"sequence":37104841,"recent_sequences":[37099909,37102106,37102814,37104841],"history":{"revs":["2-7aecbc654556c850acdf8331143ae570","2-7618d5b3ca382a01599f82661a57728f","3-28eccbeea456a64db6c5af8ec5fbe948","1-f28c38df64799fafaa06306b15484600"],"parents":[3,3,1,-1],"bodymap":{"0":"{"_attachments":{"919916200045_27_10_2018__23_52_51_0":{"content_type":"mp3","digest":"sha1-l/YafQe+yKEgHJHZvzjrSQsz39U=","length":837636,"revpos":1,"stub":true}},"channels":["dc1536050635484-REC-CH"],"expiration":"1543259420","fileName":"919916200045_27_10_2018__23_52_51_0.mp3","state":"processed","status":0,"type":"dispatcherrecording","userId":"919916200045","ver":"1.0"}"},"channels":[["dc1536050635484-REC-CH"],["dc1536050635484-REC-CH"],["dc1536050635484-REC-CH"],null]},"channels":{"dc1536050635484-REC-CH":null},"cas":"","time_saved":"2018-10-28T00:49:41.42519032+05:30"},"channels":["dc1536050635484-REC-CH"],"expiration":"1543259407","fileName":"919916200045_27_10_2018__23_52_51_0.mp3","state":"processed","status":0,"type":"dispatcherrecording","userId":"919916200045","ver":"1.0"}

Is this same issue we are observing.

@adamcfraser

This comment has been minimized.

Copy link
Contributor

adamcfraser commented Oct 30, 2018

@ArihantRk No, this doesn't sound like the same issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment