irmin-pack #615

Open. @samoht wants to merge 6 commits into master from samoht:pack.

3 participants
@samoht (Member) commented Feb 11, 2019

/cc @dinosaure and @Julow

@samoht force-pushed the samoht:pack branch 3 times, most recently from 921992d to ca4a563, on Feb 11, 2019

@samoht force-pushed the samoht:pack branch 2 times, most recently from 8f690bb to 510daea, on Feb 19, 2019

@samoht changed the title from "[WIP] irmin-pack" to "irmin-pack" on Feb 20, 2019

@samoht (Member Author) commented Feb 20, 2019

This is now passing all the tests (at least locally, let's see what the CI is saying).

I still need to benchmark how much space we gain vs. other backends and get a better idea of the runtime costs.

@samoht force-pushed the samoht:pack branch from 510daea to 38c8c0d on Feb 28, 2019

@samoht (Member Author) commented Feb 28, 2019

I've rebased this against master.

@samoht force-pushed the samoht:pack branch 2 times, most recently from 929c614 to f4b9c5c, on Mar 3, 2019

@zshipko added this to the 2.0 milestone on Mar 6, 2019

@samoht force-pushed the samoht:pack branch 4 times, most recently from dd471a5 to fb44190, on Mar 6, 2019

@samoht force-pushed the samoht:pack branch 2 times, most recently from 6e12539 to dc9d15d, on Mar 14, 2019

@samoht (Member Author) commented Mar 17, 2019

What is needed before merging:

  • atomic update of offsets -- probably using a separate file + atomic rename
  • add a proper file header to distinguish between the various file kinds
  • improve the way the dictionary is loaded by reading pages
  • a few more benchmarks
  • use file offsets instead of index IDs in hash compression for packs
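
The first item in the list above (a separate file plus an atomic rename) can be sketched as follows. This is only an illustration of the general technique, written in Python for brevity rather than in the OCaml used by irmin-pack; the file name `store.offset`, the 8-byte big-endian layout, and both function names are assumptions and do not come from this PR.

```python
import os
import struct
import tempfile


def write_offset_atomically(path: str, offset: int) -> None:
    """Persist `offset` so that a reader sees either the old or the new value."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the same directory as the target,
    # so the final rename never crosses a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".offset-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(struct.pack(">Q", offset))  # hypothetical 8-byte layout
            tmp.flush()
            os.fsync(tmp.fileno())  # make the bytes durable before renaming
        os.replace(tmp_path, path)  # atomic replacement on POSIX systems
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def read_offset(path: str) -> int:
    """Read back the offset written by `write_offset_atomically`."""
    with open(path, "rb") as f:
        (offset,) = struct.unpack(">Q", f.read(8))
    return offset
```

Because the new value is fully written and synced before the rename, a crash in the middle of an update leaves the previous offsets file untouched.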

@samoht force-pushed the samoht:pack branch from dc9d15d to f5d1602 on Mar 26, 2019

@samoht force-pushed the samoht:pack branch 5 times, most recently from d67b04c to 7dabf55, on Apr 11, 2019

@samoht force-pushed the samoht:pack branch 4 times, most recently from 45cf064 to 365a454, on Apr 18, 2019

@samoht (Member Author) commented May 13, 2019

Early results, on the Tezos codebase.

# for 80k blocks (commits)
$ du -h ~/.tezos-node/context/*
9.9M    /home/samoht/.tezos-node/context/store.dict
185M    /home/samoht/.tezos-node/context/store.index
206M    /home/samoht/.tezos-node/context/store.pack

@avsm (Member) commented May 13, 2019

Wow, that's close to an order of magnitude improvement over current Tezos.

@samoht force-pushed the samoht:pack branch 7 times, most recently from b854baa to ba1de0a, on May 14, 2019

samoht added some commits Feb 20, 2019

irmin-pack: add a new "packed" backend
All objects are stored in a single file with a separate index
for fast random reads. The representations of hashes and filenames
are compressed using a separate dictionary.
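
The layout this commit message describes (one append-only pack, a separate index for random reads, and a dictionary for repeated names) can be illustrated with a minimal in-memory sketch. It is written in Python for brevity and is not the irmin-pack implementation: the class names, the use of SHA-1, and the `(offset, length)` index entries are all assumptions, since the thread does not describe the actual on-disk format.

```python
import hashlib
import io


class Dictionary:
    """Maps repeated strings (e.g. filenames) to small integer ids."""

    def __init__(self):
        self.ids = {}
        self.names = []

    def encode(self, name: str) -> int:
        # Assign the next free id the first time a name is seen.
        if name not in self.ids:
            self.ids[name] = len(self.names)
            self.names.append(name)
        return self.ids[name]

    def decode(self, ident: int) -> str:
        return self.names[ident]


class Pack:
    """Append-only store with a hash -> (offset, length) index."""

    def __init__(self):
        self.data = io.BytesIO()  # stands in for the single pack file
        self.index = {}           # stands in for the separate index file

    def append(self, payload: bytes) -> bytes:
        key = hashlib.sha1(payload).digest()  # hash choice is an assumption
        if key not in self.index:             # identical objects are stored once
            offset = self.data.seek(0, io.SEEK_END)
            self.data.write(payload)
            self.index[key] = (offset, len(payload))
        return key

    def find(self, key: bytes) -> bytes:
        # One index lookup, then one positioned read in the pack.
        offset, length = self.index[key]
        self.data.seek(offset)
        return self.data.read(length)
```

In this sketch a lookup costs one index access and one read at a known offset, which is the property the separate index is meant to provide.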

@samoht force-pushed the samoht:pack branch from ba1de0a to 84a1e5d on May 17, 2019

samoht added some commits May 17, 2019

@samoht force-pushed the samoht:pack branch from c217d1a to 7bb3a63 on May 17, 2019
