switch to [integer(), binary()] format for correct sorting in couchdb. #8

Merged
merged 1 commit into from Aug 24, 2011

2 participants
@rnewson
Member

rnewson commented Aug 23, 2011

It seems to be this simple. I've performed replication successfully between bigcouch and couchdb with this change.

@kocolosk

Member

kocolosk commented Aug 24, 2011

That regex confuses me; I can't quite parse it. Is it backwards-compatible? That is, if I upgrade and then try an incremental replication using the old "integer-binary" format for the value of since, will I have to start from scratch?

@rnewson

Member

rnewson commented Aug 24, 2011

No, you won't start from scratch; this regexp matches both the old and the new format (I verified). I'll explain the regexp:

(?[a-zA-Z0-9-_]+)(["]])*$

The expression is anchored at the end of the string. It ignores any trailing run of quotation marks and right brackets, and greedily captures the run of characters permitted in URL-safe base64 that precedes them.

Update sequences come in as either "1-foo" or "[1,"foo"]", so this expression matches both.
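
To make the description above concrete, here is a minimal sketch in Python rather than the project's Erlang. The pattern is an assumed reconstruction of the expression as quoted in this thread: it uses a plain capture group and escapes the right bracket inside the character class. The names `UPDATE_SEQ_RE` and `captured_run` are hypothetical and do not come from the commit.

```python
import re

# Assumed reconstruction of the quoted expression, not the committed code:
#   group 1  - a greedy run of URL-safe base64 characters
#   (["\]])* - any trailing quotation marks and right brackets
#   $        - anchor at the end of the string
UPDATE_SEQ_RE = re.compile(r'([a-zA-Z0-9_-]+)(["\]])*$')


def captured_run(seq_text):
    """Return the run captured at the end of seq_text, or None if nothing matches."""
    match = UPDATE_SEQ_RE.search(seq_text)
    return match.group(1) if match else None


new_capture = captured_run('[1,"foo"]')  # new [integer, binary] form, passed as a string
old_capture = captured_run('"1-foo"')    # old "integer-binary" form

print(new_capture)
print(old_capture)
```

Both forms match. In this sketch the capture for the old form still includes the numeric prefix, because "-" is itself one of the permitted characters; how the real code treats that prefix is not shown in this thread.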

@kocolosk

Member

kocolosk commented Aug 24, 2011

Thanks for the explanation @rnewson. Can you add a BugzID link and a little bit of detail in the commit message about the rationale for the switch and the supported formats?

Robert Newson
Make clustered update_seq sort correctly
The CouchDB replicator gets confused when it sees an update sequence
of "100-" follow "99-", because the value is compared as a string
(where CouchDB uses integers). This commit changes the format to
[integer(), binary()], which sorts correctly.

The revised regular expression matches both the old and new update_seq
pattern. It is anchored at the end of the string and ignores any
trailing quotation marks and right brackets (which are present when
the replicator passes an update_seq as a string). It then greedily
matches all consecutive base64 (url-encoded variant) characters.

BugzID: 10986
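
The ordering problem described in the commit message can be illustrated with a small sketch. It uses Python's built-in comparisons as a stand-in for the replicator's, which is an assumption: plain string comparison for the old format and element-by-element list comparison for the new one. The sample values and the helper `is_increasing` are hypothetical.

```python
# Old-style update sequences, "<integer>-<opaque>", compared as plain strings.
old_style = ["99-abc", "100-abc"]

# New-style update sequences, [integer, opaque], compared element by element.
new_style = [[99, "abc"], [100, "abc"]]


def is_increasing(seqs):
    """Return True if every sequence compares strictly greater than its predecessor."""
    return all(a < b for a, b in zip(seqs, seqs[1:]))


# As strings, "1" sorts before "9", so "100-abc" lands before "99-abc".
old_ok = is_increasing(old_style)

# As lists, the leading integers decide the order: 99 < 100.
new_ok = is_increasing(new_style)

print(old_ok, new_ok)
```

With the string form, a later sequence can compare as smaller than an earlier one, which is the confusion the commit message describes; the list form keeps the integer ordering.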

@kocolosk kocolosk merged commit d47bb25 into master Aug 24, 2011
