perf: faster ingestion #25
Comments
Would we need to create a new memtable on every ingestion so that iterators may respect sequence order within the flushables?
Yes. That could leave a significant amount of wasted space in the memtables, but I think that memory is accounted for and will eventually trigger a flush. As mentioned in cockroachdb/cockroach#62700, if the ingested sstables are small we could convert the ingestions into write batches. The idea is the same as in the issue description: the WAL entry would point to the sstable on disk, but rather than appending the sstable to the memtable list, we would loop over the contents of the sstable and insert them into the memtable.
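To make the two paths concrete, here is a rough sketch of the decision. It is a toy model, not Pebble code: `db`, `memTable`, `sstable`, `record`, and `smallIngestThreshold` are all hypothetical names, the threshold value is arbitrary, and the WAL entry is left out.

```go
package main

import "fmt"

// smallIngestThreshold is a hypothetical cutoff; the thread does not name one.
const smallIngestThreshold = 1 << 10

type entry struct{ key, value string }

// sstable is a toy stand-in for an ingested table.
type sstable struct {
	entries []entry
	size    int // size on disk in bytes
}

type record struct {
	key, value string
	seqNum     uint64
}

type memTable struct{ records []record }

// db is a toy model: one mutable memtable plus a queue of flushables.
type db struct {
	mutable *memTable
	queue   []any // immutable flushables, oldest first
	nextSeq uint64
}

// ingest applies a table that overlaps the memtable. Small tables are
// replayed into the mutable memtable like a write batch. Larger tables
// rotate the memtable and are queued behind it, so sequence order is
// preserved across the flushables.
func (d *db) ingest(t *sstable) (asBatch bool) {
	seq := d.nextSeq
	d.nextSeq++
	if t.size <= smallIngestThreshold {
		for _, e := range t.entries {
			d.mutable.records = append(d.mutable.records,
				record{key: e.key, value: e.value, seqNum: seq})
		}
		return true
	}
	d.queue = append(d.queue, d.mutable, t)
	d.mutable = &memTable{} // a fresh memtable for writes after the ingest
	return false
}

func main() {
	d := &db{mutable: &memTable{}, nextSeq: 1}
	small := &sstable{entries: []entry{{"a", "1"}, {"b", "2"}}, size: 64}
	big := &sstable{entries: []entry{{"c", "3"}}, size: 1 << 20}
	fmt.Println(d.ingest(small), len(d.mutable.records), len(d.queue))
	fmt.Println(d.ingest(big), len(d.mutable.records), len(d.queue))
}
```

The large-table path is where the wasted memtable space comes from: every such ingestion retires the current memtable regardless of how full it is.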
My understanding is that there is also a hiccup for concurrent normal writes that are assigned a seqnum after this ingest. They will need to wait until their seqnum becomes visible, which is blocked behind the ingest waiting for the memtable(s) to be flushed so that it can then update the manifest.
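A small model of that blocking, assuming seqnums are published strictly in order. This is a simplification for illustration, not Pebble's actual commit pipeline, and `visibility` and `complete` are hypothetical names.

```go
package main

import "fmt"

// visibility tracks the highest seqnum readers may observe.
// Seqnums become visible strictly in order.
type visibility struct {
	visible uint64          // highest published seqnum
	done    map[uint64]bool // finished but not yet published
}

// complete marks seq as finished and publishes every consecutive finished
// seqnum after the current visible one. It returns the new visible seqnum.
func (v *visibility) complete(seq uint64) uint64 {
	v.done[seq] = true
	for v.done[v.visible+1] {
		delete(v.done, v.visible+1)
		v.visible++
	}
	return v.visible
}

func main() {
	v := &visibility{visible: 4, done: map[uint64]bool{}}
	// Seqnum 5 belongs to an ingest still waiting on a memtable flush;
	// seqnum 6 belongs to a normal write that has already been applied.
	fmt.Println("after write 6 completes: visible =", v.complete(6))
	fmt.Println("after ingest 5 completes: visible =", v.complete(5))
}
```

In this model the write with seqnum 6 stays invisible until the ingest holding seqnum 5 finishes, which matches the stall described above.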
This PR implements the feature described in the RFC: https://github.com/cockroachdb/pebble/pull/1586/files Closes #25
`DB.Ingest` currently experiences a hiccup if the table being ingested overlaps with a memtable: ingestion needs to wait for the memtable to be flushed. This is necessary because the ingested sstable is given a sequence number newer than entries in the memtable. We cannot add the ingested table to the LSM until overlapping entries with older sequence numbers are written to L0.

One way to avoid the hiccup is to have ingestion lazily add the table to the LSM. If the ingested table overlaps with a memtable, an entry is added to the WAL (ensuring that the action will be completed in the face of a crash) and the ingested table is appended to the list of memtables. We'll have to add a small wrapper around the sstable so that it implements the `flushable` interface and add some logic so that the table isn't actually flushed, but it would then simply be added to the L0 metadata when it is time to flush (i.e. a new table would not be created).
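A minimal sketch of what such a wrapper might look like. The `flushable` interface here is a simplified, hypothetical version with a single method, not Pebble's real one, and `tableMeta`, `ingestedTable`, and `flushQueue` are illustrative names.

```go
package main

import (
	"fmt"
	"sort"
)

// tableMeta is a hypothetical stand-in for the L0 metadata of an sstable.
type tableMeta struct {
	fileNum           int
	smallest, largest string
}

// flushable is a simplified, hypothetical interface: flush returns the
// tables to add to L0 and whether a new file had to be written.
type flushable interface {
	flush(nextFileNum *int) (tables []tableMeta, wroteNewFile bool)
}

type memTable struct{ keys map[string]string }

// flush "writes" the memtable contents out as a new sstable.
func (m *memTable) flush(nextFileNum *int) ([]tableMeta, bool) {
	if len(m.keys) == 0 {
		return nil, false
	}
	keys := make([]string, 0, len(m.keys))
	for k := range m.keys {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	meta := tableMeta{fileNum: *nextFileNum, smallest: keys[0], largest: keys[len(keys)-1]}
	*nextFileNum++
	return []tableMeta{meta}, true
}

// ingestedTable wraps an sstable that is already on disk.
type ingestedTable struct{ meta tableMeta }

// flush writes nothing: the existing file is handed to L0 as-is.
func (t *ingestedTable) flush(_ *int) ([]tableMeta, bool) {
	return []tableMeta{t.meta}, false
}

// flushQueue drains the flushables oldest first, so older sequence
// numbers reach L0 before the ingested table layered on top of them.
func flushQueue(queue []flushable, nextFileNum *int) (l0 []tableMeta, filesWritten int) {
	for _, f := range queue {
		tables, wrote := f.flush(nextFileNum)
		l0 = append(l0, tables...)
		if wrote {
			filesWritten++
		}
	}
	return l0, filesWritten
}

func main() {
	next := 100
	queue := []flushable{
		&memTable{keys: map[string]string{"a": "1", "c": "3"}},
		&ingestedTable{meta: tableMeta{fileNum: 7, smallest: "b", largest: "d"}},
	}
	l0, written := flushQueue(queue, &next)
	fmt.Println(l0, written)
}
```

Only the memtable produces a new file; the ingested table keeps its existing file number and is simply recorded in the L0 metadata.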