Some consistency checks to inodes #1253
src/irmin-pack/ext.ml
let check ~kind ~offset ~length k =
  match kind with
  | `Contents -> Ok ()
  | `Node -> X.Node.CA.integrity_check_inodes ~offset ~length k nodes
If I'm not mistaken, this is the source of the slowness in your check on Tezos.
`X.Node.CA.integrity_check_inodes` is called for every object in the pack file with the magic 'I' or 'N', but `X.Node.CA.integrity_check_inodes` itself recursively loads the whole inode tree below that object.
For each inode tree made of N objects, the number of calls to `Pack.unsafe_find` is therefore not O(N), as we would expect, but O(N*N).
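To make the cost concrete, here is a minimal toy model rather than irmin code. The store is a plain array, `find` is a hypothetical stand-in for `Pack.unsafe_find` that counts its calls, and `chain` builds a degenerate inode tree in which every object has a single child. All of these names are assumptions made for illustration.

```ocaml
(* Toy model: an inode tree stored as an array indexed by offset.
   [store.(i)] lists the offsets of the direct children of object [i].
   [find] stands in for Pack.unsafe_find and counts how often it is called. *)
let finds = ref 0

let find (store : int list array) (off : int) : int list =
  incr finds;
  store.(off)

(* Recursively load the whole tree below [off], as a per-object check would. *)
let rec check_subtree store off =
  List.iter (check_subtree store) (find store off)

(* Naive strategy: run the recursive check from every object in the file. *)
let check_every_object store =
  finds := 0;
  Array.iteri (fun off _ -> check_subtree store off) store;
  !finds

(* Alternative: run the recursive check from the root only. *)
let check_from_root store root =
  finds := 0;
  check_subtree store root;
  !finds

(* A degenerate chain of [n] objects: 0 -> 1 -> ... -> n-1. *)
let chain n = Array.init n (fun i -> if i + 1 < n then [ i + 1 ] else [])

let () =
  let store = chain 100 in
  Printf.printf "every object: %d finds\n" (check_every_object store);
  Printf.printf "root only:    %d finds\n" (check_from_root store 0)
```

On the 100-object chain, the per-object strategy performs 100 + 99 + ... + 1 = 5050 finds, against 100 when starting from the root only. The chain is the worst case: in general the total is the sum of all subtree sizes, so a well-balanced tree lands closer to N log N, which is still noticeably more than N.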
As discussed offline, I changed this to do a graph traversal instead, so that we check only the "root" nodes of the inode trees rather than every intermediate node.
I'm running it on the 42G store, and after around 3 hours I'm at
1252049k nodes / 3k commits
Maybe I missed something again and we can be faster? (The integrity check is also very slow: around 10 hours for 220k commits.)
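One way to picture the root-only traversal is the sketch below. It is not the code from this PR: the `obj` type, `check_tree`, and `check_roots` are hypothetical, and it assumes that root nodes are the ones referenced directly by commits. A `seen` table keeps a root shared by several commits from being checked twice.

```ocaml
(* Hypothetical toy store: each object is either a commit pointing at one
   root node, or an inode listing its children. Offsets are array indices. *)
type obj = Commit of int | Inode of int list

let loads = ref 0

(* Stand-in for the recursive inode check: loads the whole tree below [off]. *)
let rec check_tree store off =
  incr loads;
  match store.(off) with
  | Inode children -> List.iter (check_tree store) children
  | Commit _ -> invalid_arg "check_tree: expected an inode"

(* Walk the commits and check each distinct root node exactly once.
   Intermediate inodes are only reached through their root.
   Returns the number of distinct roots that were checked. *)
let check_roots store =
  let seen = Hashtbl.create 16 in
  loads := 0;
  Array.iter
    (function
      | Commit root when not (Hashtbl.mem seen root) ->
          Hashtbl.add seen root ();
          check_tree store root
      | Commit _ | Inode _ -> ())
    store;
  Hashtbl.length seen

let store =
  [| Inode [];        (* 0: leaf *)
     Inode [];        (* 1: leaf *)
     Inode [ 0; 1 ];  (* 2: root of a 3-object tree *)
     Commit 2;        (* 3: first commit *)
     Commit 2 |]      (* 4: second commit sharing the same root *)

let () =
  let roots = check_roots store in
  Printf.printf "%d root(s) checked, %d loads\n" roots !loads
```

In this toy store the two commits share one root, so a single tree of three objects is loaded. The sketch does not deduplicate subtrees shared between different roots, so the total work still grows with the number of distinct roots.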
Your traversal algorithms seem fine to me. I guess 3 hours is a normal duration to check a 42GB store...
Co-authored-by: Nicolas Goguey <ngoguey@student.42.fr>
… irmin-chunk, irmin-pack, irmin-test, irmin-http, irmin-unix, ppx_irmin, irmin-bench, irmin-graphql, irmin-containers, irmin-mirage-git and irmin-mirage-graphql (2.4.0)

CHANGES:

### Fixed

- **irmin-pack**
  - Fix a bug in `inode` where the `remove` function could cause hashing instabilities. No user-facing change since this function is not being used yet. (mirage/irmin#1247, @Ngoguey42, @icristescu)

- **irmin**
  - Ensure that `Tree.add_tree t k v` complexity does not depend on `v` size. (mirage/irmin#1267, @samoht @Ngoguey42 and @craigfe)

### Added

- **irmin**
  - Added a `Perms` module containing helper types for using phantom-typed capabilities as used by the store backends. (mirage/irmin#1262, @craigfe)
  - Added an `Exported_for_stores` module containing miscellaneous helper types for building backends. (mirage/irmin#1262, @craigfe)
  - Added new operations `Tree.update` and `Tree.update_tree` for efficient read-and-set on trees. (mirage/irmin#1274, @craigfe)

- **irmin-pack**:
  - Added `integrity-check-inodes` command to `irmin-fsck` for checking the integrity of inodes. (mirage/irmin#1253, @icristescu, @Ngoguey42)

- **irmin-bench**
  - Added benchmarks for tree operations. (mirage/irmin#1237, @icristescu, @Ngoguey42, @craigfe)

#### Changed

- The `irmin-mem` package is now included with the `irmin` package under the library name `irmin.mem`. It keeps the same top-level module name of `Irmin_mem`. (mirage/irmin#1276, @craigfe)

#### Removed

- `Irmin_mem` no longer provides the layered in-memory store `Make_layered`. This can be constructed manually via `Irmin_layers.Make`. (mirage/irmin#1276, @craigfe)