Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sort & validate TSI key value insertion. #8995

Merged
merged 1 commit into from
Oct 23, 2017

Conversation

benbjohnson
Copy link
Contributor

@benbjohnson benbjohnson commented Oct 20, 2017

Overview

The TagBlockEncoder value sorting was removed at some point and the underlying input was done via a range over a map which has unpredictable sorting. This fixes the sorting and adds additional validation to ensure inputs are always ordered when compacting index files.

Required for all non-trivial PRs
  • Rebased/mergable
  • Tests pass
  • CHANGELOG.md updated

@ghost ghost assigned benbjohnson Oct 20, 2017
@ghost ghost added the review label Oct 20, 2017
Copy link
Contributor

@e-dard e-dard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍 Just the question about the encoder ordering check via bytes.Compare.

@@ -539,6 +543,13 @@ func (enc *TagBlockEncoder) EncodeValue(value []byte, deleted bool, seriesIDs []
return fmt.Errorf("zero length tag value not allowed")
}

// Validate that keys are in-order.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How often will we call EncodeValue? I only ask as this bytes.Compare call could be an expensive way to test the encoder is working properly. Or is returning an error a "normal path" for this encoder?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's called once for every tag value in the index. I think the cost should be relatively small compared to the merging and writing. We can remove it if it becomes a performance bottleneck but I think it'd be good to keep it in the short-term at least to ensure we don't store unsorted data.

@benbjohnson benbjohnson merged commit 63324c2 into influxdata:master Oct 23, 2017
@ghost ghost removed the review label Oct 23, 2017
@benbjohnson benbjohnson deleted the tsi-sort-log-tag-values branch October 23, 2017 16:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants