Use --include=* or --anything instead of --copies 0 to speed up get_content_annexinfo #7230
4603ccc
to
710b692
The macOS failure is due to an outdated Mac build image.
…t_annexinfo

Apparently `--copies 0` could result in up to a 10x (or more) penalty when running find or findref. `--include=*` was recommended by Joey in https://git-annex.branchable.com/todo/add_--all___40__or_alike__41___to_find_and_findref/

Closes datalad#7038
…support

Since there is still no release between those dates, I think such a comparison
- should be safe
- would allow us to immediately take advantage of this OPT while testing datalad/git-annex against datalad master.

Here are some timings of all 3 possible options:

    ❯ pwd
    /home/yoh/datalad/dandi/dandisets/000026
    ❯ time git annex find --copies 0 | wc -l
    20575
    git annex find --copies 0  12.67s user 1.17s system 120% cpu 11.513 total
    wc -l  0.01s user 0.10s system 0% cpu 11.513 total
    ❯ time git annex find --include='*' | wc -l
    20575
    git annex find --include='*'  1.18s user 0.14s system 134% cpu 0.984 total
    wc -l  0.01s user 0.02s system 3% cpu 0.984 total
    ❯ time git annex find --anything | wc -l
    20575
    git annex find --anything  0.71s user 0.18s system 157% cpu 0.567 total
    wc -l  0.02s user 0.01s system 5% cpu 0.566 total

So --anything is almost twice as fast as --include=*, so it is worth it.
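For reference, the speedups implied by the wall-clock totals above can be worked out with a few lines of Python. This is only a sketch of the arithmetic; the dictionary keys and the `speedup` helper are names made up for illustration, and the numbers are the "total" values from the timings quoted in the commit message.

```python
# Wall-clock totals (seconds) copied from the `time` output above.
timings = {
    "--copies 0": 11.513,
    "--include=*": 0.984,
    "--anything": 0.567,
}


def speedup(slow, fast):
    """Return how many times faster `fast` is than `slow` (hypothetical helper)."""
    return timings[slow] / timings[fast]


# Ratios rounded to one decimal place.
ratios = {
    "include_vs_copies": round(speedup("--copies 0", "--include=*"), 1),
    "anything_vs_copies": round(speedup("--copies 0", "--anything"), 1),
    "anything_vs_include": round(speedup("--include=*", "--anything"), 1),
}
print(ratios)
```

On this one dataset, `--include=*` comes out roughly 12 times faster than `--copies 0`, and `--anything` roughly 1.7 times faster than `--include=*`, which is consistent with the "almost twice" remark. These are single runs, so the ratios are indicative only.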
710b692
to
a57f461
great -- thanks! I rebased. The main point is that "it works" so we should proceed with this PR. I also added now support for new
I think we are ready, taking it out of draft -- your feedback is very welcome! I hope it makes the cut for 0.18.0.
Codecov Report
Base: 88.73% // Head: 88.73% // Decreases project coverage by
Additional details and impacted files

    @@            Coverage Diff             @@
    ##           master    #7230      +/-   ##
    ==========================================
    - Coverage   88.73%   88.73%   -0.01%
    ==========================================
      Files         325      325
      Lines       44124    44184      +60
      Branches     5867     5880      +13
    ==========================================
    + Hits        39154    39205      +51
    - Misses       4955     4964       +9
      Partials       15       15
Hm, the crippled one freaked out:
which has no "track" in our issue tracker besides that it used to fail and was disabled on Windows: #5126. Rerunning that job now.
I'm a bit confused. The PR title and the diff don't quite fit - there's no `--largerthan`.
Changed approach or incomplete PR, @yarikoptic?
Other than that confusion, looks good to me!
So, if you feel no changelog required (which is fine with me), I'm ready to merge this.
Thank you @bpoldrack. Indeed, I changed the approach, and it does need a changelog.
Code Climate has analyzed commit ef96e8a and detected 0 issues on this pull request.
PR released in |
Apparently `--copies 0` could result in up to a 10x (or more) penalty when running find or findref.

TODO
- `--anything`, which was added in 10.20221212-17-g0b2dd374d on Dec 20 (so we can compare against the `10.20221220` version) -- but we have yet to wait for datalad/git-annex to get that build
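The version comparison described above could look roughly like the following. This is a minimal sketch, not the code from this PR: `parse_annex_version`, `match_all_option`, and the `ANYTHING_MIN_VERSION` constant are hypothetical names, and the threshold of `10.20221220` is taken from the note above rather than from the actual diff.

```python
import re

# Assumption: first git-annex version treated as supporting --anything,
# per the "compare against 10.20221220" note in the PR description.
ANYTHING_MIN_VERSION = (10, 20221220)


def parse_annex_version(version):
    """Return (major, date) from strings like '10.20221212-17-g0b2dd374d'."""
    match = re.match(r"(\d+)\.(\d{8})", version)
    if match is None:
        raise ValueError(f"unrecognized git-annex version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def match_all_option(version):
    """Pick the cheapest 'match everything' option for find/findref."""
    if parse_annex_version(version) >= ANYTHING_MIN_VERSION:
        return "--anything"
    # Older git-annex: fall back to the option recommended before
    # --anything existed.
    return "--include=*"


print(match_all_option("10.20221220"))
print(match_all_option("10.20221212-17-g0b2dd374d"))
```

Note that this sketch deliberately ignores the `-17-g0b2dd374d` suffix, so a development build that already has `--anything` would still get `--include=*`. That is slower but still correct, which seems the safer side to err on.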