change log ids in data keys and current table by ekg · Pull Request #207 · dat-ecosystem/dat

ekg · 2014-10-17T19:12:35Z

Adding change log ids to the data keys (after the versions) will allow us to quickly extract the state of the data at a particular point in the change log. This can be done with a linear scan of the keys in the data table, keeping only objects whose change id is <= the target point in the log.
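A minimal sketch of that linear scan, for illustration only. The key layout (`<key>\xff<version>\xff<changeId>` with zero-padded numbers) and the helper names `encodeKey`, `decodeKey` and `stateAt` are assumptions made for this example, not the actual level-dat encoding or API.

```javascript
// Hypothetical key layout: "<key>\xff<version>\xff<changeId>".
// Numbers are zero-padded so lexicographic order matches numeric order.
const SEP = '\xff'
const pad = (n) => String(n).padStart(10, '0')

function encodeKey (key, version, change) {
  return [key, pad(version), pad(change)].join(SEP)
}

function decodeKey (dataKey) {
  const [key, version, change] = dataKey.split(SEP)
  return { key, version: Number(version), change: Number(change) }
}

// Linear scan over the data table keys: for each row key, keep the entry
// with the highest change id that is still <= the target point in the log.
function stateAt (dataKeys, target) {
  const state = new Map()
  for (const dataKey of dataKeys) {
    const entry = decodeKey(dataKey)
    if (entry.change > target) continue // written after the target point
    const seen = state.get(entry.key)
    if (!seen || entry.change > seen.change) state.set(entry.key, entry)
  }
  return state
}
```

Calling `stateAt(keys, 3)` on a table holding versions written at change ids 1, 2, 3 and 5 would return only the newest entry per key written at or before change 3.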

If we did not store the change id alongside the data, we would have to reconstruct the dataset by replaying the change log. This would complicate rolling back particular subsets of the data to predetermined points in the history. Additionally, it wouldn't be possible to quickly determine the relative age of two objects, which has a number of possible applications in reproducibility and logging.
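As a rough illustration of the relative-age point: once every stored object carries its change id, ordering two objects is a plain integer comparison. The record shape `{ key, change }` and the function names below are hypothetical, chosen only for this sketch.

```javascript
// Hypothetical records: each object carries the change id it was written under.
// Negative result: a was written before b. Positive: after. Zero: same change.
function compareAge (a, b) {
  return a.change - b.change
}

// Returns a new array ordered from the oldest write to the newest.
function oldestFirst (objects) {
  return [...objects].sort(compareAge)
}
```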

No functionality based on the change ids is tested yet, but the next step should be to implement a commit/checkout or checkpoint/rollback model on top of them.

By adding a reference to the change index id to data table keys, we can quickly revert the current table of the repository to a particular checkpoint. These changes only enable the storage of the change index keys. This is not a stable commit. A majority of tests now pass, but there are still significant issues.
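One way the checkpoint revert could look, sketched under loud assumptions: the "current" table is modelled as a `Map` from row key to `{ version, change }`, and `history` is an array of every stored `{ key, version, change }` entry. Neither shape nor the `revertCurrent` name comes from dat or level-dat. The sketch also ignores rows that were deleted after the checkpoint, which would need to be restored separately.

```javascript
// current: Map of row key -> { version, change } (hypothetical shape)
// history: array of { key, version, change } for all stored versions (hypothetical)
function revertCurrent (current, history, checkpoint) {
  for (const [key, entry] of current) {
    if (entry.change <= checkpoint) continue // untouched since the checkpoint

    // Find the newest stored version of this key at or before the checkpoint.
    let best = null
    for (const h of history) {
      if (h.key !== key || h.change > checkpoint) continue
      if (!best || h.change > best.change) best = h
    }

    if (best) current.set(key, { version: best.version, change: best.change })
    else current.delete(key) // the key did not exist at the checkpoint
  }
  return current
}
```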

The level-dat backend will support these change ids as of 4.5.0. With this update we now pass 616/616 tests.

ekg · 2014-10-17T19:23:42Z

This depends on mafintosh/level-dat#1

ekg · 2014-10-17T19:24:22Z

It looks like the build error results from the requirement that we update the level-dat version.

max-mapper · 2014-11-15T07:13:10Z

Just commenting here for posterity: we have discussed this pull request and are still investigating whether it is the right approach. Keeping it open for now.

max-mapper · 2014-12-04T00:12:52Z

gonna close for now, but we will definitely make sure this gets in after @mafintosh refactors the storage stuff

ekg added 3 commits October 16, 2014 19:13

Merge branch 'master' of https://github.com/maxogden/dat

030df1e

ekg mentioned this pull request Oct 17, 2014

change log ids in data keys and current table mafintosh/level-dat#1

Open

max-mapper closed this Dec 4, 2014

max-mapper mentioned this pull request Dec 23, 2014

snapshots/checkout support #256

Closed

ekg wants to merge 3 commits into dat-ecosystem:master from ekg:master
