Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More features in bench/tree.exe #1269

Merged
merged 12 commits into from
Feb 8, 2021
Merged

Conversation

Ngoguey42
Copy link
Contributor

@Ngoguey42 Ngoguey42 commented Jan 28, 2021

Diff 1: --quick is now --mode quick, and --mode gives more choices.
Diff 2: The configurations for the quick and slow modes are now hard coded and can't be changed from the CLI. The other modes can be configured from the CLI.
Diff 3: Add --flatten for trace mode to collapse the 6 step-long hashes to single steps.
Diff 4: Add a progress bar using https://github.com/CraigFe/progress
Diff 5: Trace mode now tracks the time spent by the 8 unitary operation, using mtime and https://github.com/barko/bentov (more info in tree.ml).

I also have a python script to plot the histograms
tree_normal
tree_flatten

@Ngoguey42 Ngoguey42 added the no-changelog-needed No changelog is needed here label Jan 28, 2021
@Ngoguey42 Ngoguey42 mentioned this pull request Feb 1, 2021
@Ngoguey42 Ngoguey42 marked this pull request as draft February 2, 2021 16:47
@Ngoguey42 Ngoguey42 changed the title More options on tree.exe WIP: More options on tree.exe Feb 3, 2021
@Ngoguey42 Ngoguey42 force-pushed the profile_inodes branch 2 times, most recently from e0018e3 to bbba35e Compare February 4, 2021 12:02
@Ngoguey42 Ngoguey42 changed the title WIP: More options on tree.exe More features in bench/tree.exe Feb 4, 2021
@Ngoguey42 Ngoguey42 marked this pull request as ready for review February 4, 2021 12:12
@Ngoguey42 Ngoguey42 removed the no-changelog-needed No changelog is needed here label Feb 4, 2021
Copy link
Contributor

@icristescu icristescu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks great! Can you share your python script too?

bench/irmin-pack/tree.ml Outdated Show resolved Hide resolved
bench/irmin-pack/tree.ml Outdated Show resolved Hide resolved
bench/irmin-pack/tree.ml Outdated Show resolved Hide resolved
|> Hashtbl.of_seq

(** Find and print the largest directory. This may be useful in the future to
make sure that we are benching on million-sized directories. *)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should not add code that we don't use right now. If you think it's useful you can add a cli argument to print this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd have to refactor a big chunk of code to avoid printing that in the middle of the progress bar. My next PR on that file will most likely take care of it, so I'd like to keep it if that is OK for you.

bench/irmin-pack/tree.ml Outdated Show resolved Hide resolved
@Ngoguey42
Copy link
Contributor Author

Where would be a good place to put my python script? Slack? Gist? Irmin's repo?

@Ngoguey42
Copy link
Contributor Author

Ngoguey42 and others added 6 commits February 5, 2021 13:27
Diff 1: `--quick` is now `--mode quick`, and `--mode` gives more choices.
Diff 2: The configurations for the `quick` and `slow` modes are now hard coded and can't be changed from the CLI. The other modes can be configured from the CLI.
Diff 3: Add `--flatten` for trace mode to collapse the 6 step-long hashes to single steps.
Diff 4: Add a progress bar using https://github.com/CraigFe/progress
Diff 5: Trace mode now tracks the time spent by the 8 unitary operation, using mtime and https://github.com/barko/bentov (more info in tree.ml).
Co-authored-by: Ioana Cristescu  <icristescu@users.noreply.github.com>
@craigfe
Copy link
Member

craigfe commented Feb 5, 2021

I keep this sort of thing in an irmin-notebooks repository. Now might be a good time to put that somewhere we can all have access too; maybe tarides/irmin-notebooks?

@Ngoguey42
Copy link
Contributor Author

Yes, we should centralise those scripts.

Although I don't feel like turning this one into a notebook, I find it more convenient to manipulate image files in order to compare 2 revisions or 2 variations of a parameter.

@craigfe
Copy link
Member

craigfe commented Feb 5, 2021

Notebooks can run savefig too :-) I don't think it needs to be a case of one or the other.

@Ngoguey42
Copy link
Contributor Author

Okey I can do that. I also want to be able to run from command line, which is possible for a notebook after installing some more python tools.

@Ngoguey42
Copy link
Contributor Author

My latest commit writes to disk the json histogram.

If you have no more comments this PR may be merged today.

Copy link
Member

@craigfe craigfe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just style nits.

CHANGES.md Outdated Show resolved Hide resolved
|> Benchmark.run config
in
let+ () = Store.Repo.close repo in
(config, result)
fun fmt ->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is already in this file in a few places, but conventionally fmt is a format-string, and ppf is a Format.formatter.

bench/irmin-pack/tree.ml Outdated Show resolved Hide resolved
bench/irmin-pack/tree.ml Outdated Show resolved Hide resolved
bench/irmin-pack/tree.ml Outdated Show resolved Hide resolved
ngoguey and others added 3 commits February 5, 2021 16:51
@Ngoguey42 Ngoguey42 merged commit c526079 into mirage:master Feb 8, 2021
craigfe added a commit to craigfe/opam-repository that referenced this pull request Feb 16, 2021
… irmin-chunk, irmin-pack, irmin-test, irmin-http, irmin-unix, ppx_irmin, irmin-bench, irmin-graphql, irmin-containers, irmin-mirage-git and irmin-mirage-graphql (2.5.0)

CHANGES:

### Changed

- **irmin**
  - `Store.Tree.remove` is now much faster when operating on large directories.
    The commits following removals are also much faster. (mirage/irmin#1289, @Ngoguey42)

  - Changed `Store.Tree.{of_hash, shallow}` to take kinded hashes, allowing the
    creation of unforced contents values. (mirage/irmin#1285, @craigfe)

  - Changed `Tree.destruct` to return _lazy_ contents values, which may be forced
    with `Tree.Contents.force`. (mirage/irmin#1285, @craigfe)

- **irmin-bench**
  - New features in benchmarks for tree operations (mirage/irmin#1269, @Ngoguey42)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants