change log ids in data keys and current table #207
ekg
added some commits
Oct 16, 2014
This depends on mafintosh/level-dat#1
ekg
Oct 17, 2014
Contributor
It looks like the build error results from the requirement that we update the level-dat version.
maxogden
Nov 15, 2014
Member
Just commenting here for posterity, we have discussed this pull request and are still investigating if it is the right approach. Keeping it open for now
maxogden
Dec 4, 2014
Member
gonna close for now, but we will definitely make sure this gets in after @mafintosh refactors the storage stuff
ekg commented Oct 17, 2014
Adding change log ids to the data keys (after the versions) will allow us to quickly extract the state of the data at a particular point in the change log. This can be accomplished via a linear scan of the keys in the data table, keeping only the objects whose change id is <= the target point in the log.
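To make the scan concrete, here is a minimal sketch in Python rather than the project's own code. The key layout `(key, version, change_id)`, the `state_at` helper, and the sample rows are all assumptions for illustration; the actual key encoding in dat/level-dat is not shown in this pull request, and deletions are ignored here.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical key layout: (key, version, change_id). The real dat/level-dat
# encoding is not shown in this pull request.
Row = Tuple[Tuple[str, int, int], str]


def state_at(rows: Iterable[Row], target_change: int) -> Dict[str, str]:
    """Linear scan: per key, keep the value with the largest change id <= target."""
    best: Dict[str, Tuple[int, str]] = {}
    for (key, _version, change_id), value in rows:
        if change_id > target_change:
            continue  # written after the target point in the log
        if key not in best or change_id > best[key][0]:
            best[key] = (change_id, value)
    return {key: value for key, (_, value) in best.items()}


# Made-up sample data: four writes with change ids 1 through 4.
rows = [
    (("a", 1, 1), "a-v1"),
    (("b", 1, 2), "b-v1"),
    (("a", 2, 3), "a-v2"),
    (("c", 1, 4), "c-v1"),
]

snapshot = state_at(rows, 2)  # state of the data as of change id 2
```

With this sample data, `snapshot` holds the first versions of `a` and `b` only, since the later writes carry change ids above the target.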
If we did not store the change ids alongside the data, we would be forced to reconstruct the dataset from the change log. This would complicate the process of rolling back particular subsets of the data to predetermined points in the history. Additionally, it wouldn't be possible to quickly determine the relative age of two objects, which has a number of possible applications in reproducibility and logging.
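Both uses can be sketched against an in-memory stand-in for the data table. This is a Python illustration under assumptions, not this pull request's implementation: the `Table` shape and the `latest_change`, `is_older`, and `rollback_subset` helpers are hypothetical names.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical stand-in for the data table:
# key -> list of (change_id, value), in ascending change_id order.
Table = Dict[str, List[Tuple[int, str]]]


def latest_change(table: Table, key: str) -> int:
    """Change id of the most recent write to `key`."""
    return table[key][-1][0]


def is_older(table: Table, a: str, b: str) -> bool:
    """True if `a` was last written before `b`, judged only by change ids."""
    return latest_change(table, a) < latest_change(table, b)


def rollback_subset(table: Table, keys: Set[str], target_change: int) -> Table:
    """Return a copy in which only `keys` are rolled back to `target_change`."""
    result: Table = {}
    for key, entries in table.items():
        if key in keys:
            # Drop every write made after the target point in the log.
            entries = [e for e in entries if e[0] <= target_change]
        if entries:
            result[key] = list(entries)
    return result


# Made-up sample data.
table = {
    "a": [(1, "a-v1"), (3, "a-v2")],
    "b": [(2, "b-v1"), (5, "b-v2")],
}

rolled = rollback_subset(table, {"a"}, 2)  # only "a" goes back to change id 2
```

Here only `a` loses its later write, while `b` keeps its full history; neither operation needs to read the change log itself.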
No functionality based on the change ids is tested yet, but the next step should be to implement a commit/checkout or checkpoint/rollback model on top of them.