Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backupccl: allow rewriting spans keys that may not decode as keys #38341

Merged
merged 2 commits into from
Jun 26, 2019

Conversation

dt
Copy link
Member

@dt dt commented Jun 21, 2019

backupccl: allow rewriting spans keys that may not decode as keys

When rewriting the spans to pre-split and distribute work in RESTORE, we
use the key rewriter to rewrite span boundaries. This usually works
well, but in a few cases, valid span boundaries may no longer be
entirely made of valid value encodings. For example, .PrefixEnd alters
the trailing bytes to increment them, but this will break decoding of
byte string that assumes a very specific trailing terminator.

PeekLength, and key decoding in general, can fail when reading the last
value from a key that is coming from a span. Keys in spans are often
altered e.g. by calling Next() or PrefixEnd() to ensure a given span is
inclusive or for other reasons, but the manipulations sometimes change
the encoded bytes, meaning they can no longer successfully decode as
back to the original values. This is OK when span boundaries mostly are
only required to be even divisions of keyspace, but when we try to go
back to interpreting them as keys, it can fall apart. Partitioning a
table (and applying zone configs) eagerly creates splits at the defined
partition boundaries, using PrefixEnd for their ends, resulting in such
spans.

Fortunately, the only common span manipulations are to the trailing byte
of a key (e.g. incrementing or appending a null) so for our needs here,
if we fail to decode because of one fo those manipulations, we can
assume that we are at the end of the key as far as fields where a table
ID which needs to be replaced can appear and consider the rewrite of
this key as being completed successfully.

Finally unlike key rewrites of actual row-data, span rewrites do not
need to be perfect: spans are only rewritten for use in pre-splitting
and work distribution, so even if it turned out that this assumption was
incorrect, it could cause a performance degradation but does not pose a
correctness risk.

Release note (bug fix): fix an issue that prevented restoring some
backups if they included tables that were partitioned by columns of a
certain types while also interleaved by child tables.

@dt dt requested review from maddyblue, danhhz, thoszhang and a team June 21, 2019 03:26
@cockroach-teamcity
Copy link
Member

This change is Reviewable

pkg/ccl/storageccl/key_rewriter.go Outdated Show resolved Hide resolved
@dt dt closed this Jun 21, 2019
@dt dt deleted the interleave branch June 21, 2019 04:37
@dt dt restored the interleave branch June 21, 2019 04:37
@dt dt reopened this Jun 21, 2019
When rewriting the spans to pre-split and distribute work in RESTORE, we
use the _key_ rewriter to rewrite _span_ boundaries. This usually works
well, but in a few cases, valid span boundaries may no longer be
entirely made of valid value encodings. For example, .PrefixEnd alters
the trailing bytes to increment them, but this will break decoding of
byte string that assumes a very specific trailing terminator.

 PeekLength, and key decoding in general, can fail when reading the last
value from a key that is coming from a span. Keys in spans are often
altered e.g. by calling Next() or PrefixEnd() to ensure a given span is
inclusive or for other reasons, but the manipulations sometimes change
the encoded bytes, meaning they can no longer successfully decode as
back to the original values. This is OK when span boundaries mostly are
only required to be even divisions of keyspace, but when we try to go
back to interpreting them as keys, it can fall apart. Partitioning a
table (and applying zone configs) eagerly creates splits at the defined
partition boundaries, using PrefixEnd for their ends, resulting in such
spans.

Fortunately, the only common span manipulations are to the trailing byte
of a key (e.g. incrementing or appending a null) so for our needs here,
if we fail to decode because of one fo those manipulations, we can
assume that we are at the end of the key as far as fields where a table
ID which needs to be replaced can appear and consider the rewrite of
this key as being completed successfully.

Finally unlike key rewrites of actual row-data, span rewrites do not
need to be perfect: spans are only rewritten for use in pre-splitting
and work distribution, so even if it turned out that this assumption was
incorrect, it could cause a performance degradation but does not pose a
correctness risk.

Release note (bug fix): fix an issue that prevented restoring some
backups if they included tables that were partitioned by columns of a
certain types while also interleaved by child tables.
@dt
Copy link
Member Author

dt commented Jun 24, 2019

I think this is the third time that trying to decode spans as keys has confused RESTORE. I wonder if given that spans are only really needed for pre-splitting / sharing work evenly during RESTORE, we should allow RESTORE to continue even if they fail to rewrite so that you can at least still restore, though maybe slower. Though I fear that if we did that, it'd probably rot.

@danhhz
Copy link
Contributor

danhhz commented Jun 24, 2019

I don't have all this loaded up anymore, but is it possible to undo a PrefixEnd? If you hit an encoding error, you could attempt to undo a PrefixEnd and do it again. Maybe you'd also have to try to undo a Next

@dt
Copy link
Member Author

dt commented Jun 25, 2019

@danhhz Yep, that was exactly my first thought too and I started with a patch using that approach (there's an UndoPrefixEnd in encoding). However working though that approach, I ended up changing my mind as I went, switching to this approach instead.

Part of it is that 1) you can't always undo PrefixEnd since can throw away bytes after it carries its increment and 2) I couldn't convince myself that operating on the post-undo key was always correct and then 3) we can't even be sure a key was actually PrefixEnd'ed to start with. On top of those, there could be other things done to spans now or in the future that break this, or other encodings (which could also be added without this code and tests noticing). Given all those (in particular 1), I ended up concluding we'd still want to handle a non-decodable suffix, so I was back where I started.

All of that got thinking that baking assumptions into this code about spans and key encoding was making this brittle, and going further down that path seemed likely to be difficult to maintain. I tried to think about what we actually need from this code: a corse partitioning of the target keyspace for pre-splitting. Span bounds at least have a prefix of valid key encoding, since they serve to partition the key space of valid key byte strings, and if we can rewrite that prefix, we should end up with enough of a partitioning of the target keyspace.

@dt
Copy link
Member Author

dt commented Jun 26, 2019

based on offline discussion with @danhhz I think we concluded this is good to go for now -- longer term we might want a more general solution to avoid trying to decode spans entirely in the interest of making this less brittle, but for now this change should fix the known bug.

bors r+

craig bot pushed a commit that referenced this pull request Jun 26, 2019
38341:  backupccl: allow rewriting spans keys that may not decode as keys r=dt a=dt

backupccl: allow rewriting spans keys that may not decode as keys

When rewriting the spans to pre-split and distribute work in RESTORE, we
use the _key_ rewriter to rewrite _span_ boundaries. This usually works
well, but in a few cases, valid span boundaries may no longer be
entirely made of valid value encodings. For example, .PrefixEnd alters
the trailing bytes to increment them, but this will break decoding of
byte string that assumes a very specific trailing terminator.

 PeekLength, and key decoding in general, can fail when reading the last
value from a key that is coming from a span. Keys in spans are often
altered e.g. by calling Next() or PrefixEnd() to ensure a given span is
inclusive or for other reasons, but the manipulations sometimes change
the encoded bytes, meaning they can no longer successfully decode as
back to the original values. This is OK when span boundaries mostly are
only required to be even divisions of keyspace, but when we try to go
back to interpreting them as keys, it can fall apart. Partitioning a
table (and applying zone configs) eagerly creates splits at the defined
partition boundaries, using PrefixEnd for their ends, resulting in such
spans.

Fortunately, the only common span manipulations are to the trailing byte
of a key (e.g. incrementing or appending a null) so for our needs here,
if we fail to decode because of one fo those manipulations, we can
assume that we are at the end of the key as far as fields where a table
ID which needs to be replaced can appear and consider the rewrite of
this key as being completed successfully.

Finally unlike key rewrites of actual row-data, span rewrites do not
need to be perfect: spans are only rewritten for use in pre-splitting
and work distribution, so even if it turned out that this assumption was
incorrect, it could cause a performance degradation but does not pose a
correctness risk.

Release note (bug fix): fix an issue that prevented restoring some
backups if they included tables that were partitioned by columns of a
certain types while also interleaved by child tables.

38442: sql: Adding tests for loose index scan on interleaved tables r=rohany a=rohany

We were unsure of the support for the loose index scan table reader on interleaved tables, so tests have been added.

Addresses #38413

Co-authored-by: David Taylor <tinystatemachine@gmail.com>
Co-authored-by: Rohan Yadav <rohany@alumni.cmu.edu>
@craig
Copy link
Contributor

craig bot commented Jun 26, 2019

Build succeeded

@craig craig bot merged commit 6c52765 into cockroachdb:master Jun 26, 2019
@dt dt deleted the interleave branch June 28, 2019 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants