Add support for deletes #18
New design: deletes will be maintained in a separate file that references the data file and file offset (possibly row number). We'll need to perform this lookup for each and every read. When a delete transaction ends, we will mark the file on disk as "done" and start a new file for all new rows. The idea is to limit the size of files once we know we have deleted rows. A secondary daemon process will come back later and compact files; thus we can safely remove old rows asynchronously without having to store the deletes forever. We will create n files until the compaction process can start. When we compact, all non-active files will be read and merged into a single larger file. We need to add support for reading from multiple files, scanning across each one. This also lays the foundation for online schema upgrades, where we could map an old schema to a new schema and convert the on-disk data asynchronously.
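A minimal Go sketch of the per-read lookup described above, using an in-memory index keyed by (file, offset). All names here (`DeleteKey`, `DeleteIndex`, `MarkDeleted`, `IsDeleted`) are hypothetical illustrations, not taken from this repository:

```go
package storage

// DeleteKey identifies a deleted row by the data file it lives in and the
// row's byte offset within that file (hypothetical names, not from this repo).
type DeleteKey struct {
	FileID int64
	Offset int64
}

// DeleteIndex is an in-memory view of the on-disk delete file. Every read
// consults it before returning a row; a compaction daemon would drop entries
// once the deleted rows have been physically removed from the data files.
type DeleteIndex struct {
	deleted map[DeleteKey]struct{}
}

func NewDeleteIndex() *DeleteIndex {
	return &DeleteIndex{deleted: make(map[DeleteKey]struct{})}
}

// MarkDeleted records a deletion; the same record would also be appended
// to the delete file on disk for durability.
func (d *DeleteIndex) MarkDeleted(fileID, offset int64) {
	d.deleted[DeleteKey{fileID, offset}] = struct{}{}
}

// IsDeleted is the per-read lookup: an O(1) map probe per row.
func (d *DeleteIndex) IsDeleted(fileID, offset int64) bool {
	_, ok := d.deleted[DeleteKey{fileID, offset}]
	return ok
}
```

With a map, the cost added to each read stays constant regardless of how many deletes have accumulated; the compaction daemon bounds the map's growth by rewriting files and retiring entries.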
This implements deletes in a single file. Initial groundwork is laid for multiple delete files and multiple data files.
Delete support is needed. Deletes can be done in multiple ways:

1) Mark the row as deleted inside the serialized message itself.
2) Zero out the row's space in the data file.
3) Keep a separate record of deleted rows (for example, by file offset).
"1)" or "2) "are pretty equivalent. The advantage to 2 is during a table scan one does not have to parse the message only to find it has been deleted. It might also be that option 1 makes roll back easier. However with 1 or 2 we still need to maintain a list of the ongoing rows touched in the transactions.
"3)" Does not seem to have a large benefit. If we keep the rows separate, then we just have to read that into memory and still do a comparison. The only upside compared to 1, is we don't have to parse capnp proto message to see if it is deleted or not, we can store the file offset and skip that way.
With option 2) we can also have a daemon process that periodically reorganizes a table that is closed, so zeroed-out space that no longer holds a message is removed, truncating the file.
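A rough sketch of what that reorganization pass could look like. It assumes an illustrative on-disk framing that is not taken from this project: each record is a uint32 little-endian length prefix followed by the message body, and deleting a row zeroes the body in place while leaving the prefix intact:

```go
package storage

import (
	"bytes"
	"encoding/binary"
	"io"
	"os"
	"path/filepath"
)

// CompactClosedFile rewrites a closed table file, dropping zeroed-out rows
// so the reclaimed space shrinks the file. The record framing is an
// assumption for this sketch, not the project's actual format.
func CompactClosedFile(path string) error {
	src, err := os.Open(path)
	if err != nil {
		return err
	}
	defer src.Close()

	// Build the compacted copy next to the original so the final rename
	// stays on the same filesystem.
	tmp, err := os.CreateTemp(filepath.Dir(path), "compact-*")
	if err != nil {
		return err
	}
	defer tmp.Close()

	var lenBuf [4]byte
	for {
		if _, err := io.ReadFull(src, lenBuf[:]); err == io.EOF {
			break // clean end of file
		} else if err != nil {
			return err
		}
		body := make([]byte, binary.LittleEndian.Uint32(lenBuf[:]))
		if _, err := io.ReadFull(src, body); err != nil {
			return err
		}
		// An all-zero body marks a deleted row; skip it entirely.
		if bytes.Equal(body, make([]byte, len(body))) {
			continue
		}
		if _, err := tmp.Write(lenBuf[:]); err != nil {
			return err
		}
		if _, err := tmp.Write(body); err != nil {
			return err
		}
	}
	// Replace the original with the compacted copy.
	return os.Rename(tmp.Name(), path)
}
```

Because the pass only runs on closed files, it never races with writers, and readers can be switched to the compacted copy after the rename.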