Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

store global sequence number in FileMetadata #4270

Closed
petermattis opened this issue Aug 13, 2018 · 5 comments
Closed

store global sequence number in FileMetadata #4270

petermattis opened this issue Aug 13, 2018 · 5 comments

Comments

@petermattis
Copy link
Contributor

When an external sst is ingested it is assigned a global sequence number which applies to all of the entries in the sstable. This is done by setting the rocksdb.external_sst_file.global_seqno property stored in the sstable itself. The way the property is written is a bit awkward as it is done by having the property reading code store the byte offset of the property within the sstable and then overwriting that value when the file is ingested. Besides the awkwardness, this has the effect of mutating the ingested file violating the usual invariant that sstables are immutable (which causes CockroachDB to jump through a few hoops). It also prohibits using a global sequence number with an sstable created without the global sequence number property.

As an alternative, we could add a FileMetadata::global_seqno field and persist that field to the MANIFEST. Note that we're already writing the MANIFEST as part of ingestion anyway. Is there any downside to storing the global sequence number outside of the sstable? I suppose a tool like sst_dump won't see the correct sequence number, but that seems minor.

@riversand963
Copy link
Contributor

@petermattis thanks for your post. We noticed the problem you mentioned and have already provided an alternative way of writing global_seqno when ingesting external SST files.
Do you think #4188 can solve your problem? In this PR, we are using the information already stored in the MANIFEST to assign global seqno when we read the externally-ingested SST.

@petermattis
Copy link
Contributor Author

@riversand963 Yep, that is more or less the same idea, but without changing the MANIFEST format. There is a backwards compatibility issue, though (present in my proposal as well). A DB containing an sstable ingested with RocksDB #4188 will be silently corrupted by an earlier version of RocksDB.

@riversand963
Copy link
Contributor

@petermattis That's right. We currently allow users to force writing global_seqno anyway by setting IngestExternalFileOptions::write_global_seqno to true. Now this member variable is true by default. We plan to consider changing its default value in the future though.

@petermattis
Copy link
Contributor Author

@riversand963 Great. I didn't see that in #4188, but somehow the commit which got merged did have write_global_seqno.

@riversand963
Copy link
Contributor

@petermattis Sorry for the confusion. I was referring to #4172 . #4188 was abandoned by me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants