OPT: status/diff performance on dataset hierarchies #3314

Merged
merged 1 commit into from Apr 10, 2019

@mih mih commented Apr 10, 2019

  • Avoid subdataset recursion with diffstatus(to != None)
    This will give a stellar performance boost for `rev-diff` on large dataset hierarchies, but even for a test case with a small dataset that has 10 direct (empty) subdatasets, the performance gain is already ~25% (down from 1.4s to 1s). On the /// dataset, the runtime goes down from 7min to 1.4s.
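
The idea behind this change can be sketched roughly as follows. This is a minimal illustration only, not the code in `datalad/support/gitrepo.py`: `Record`, `diff_states`, and the dict-based state layout are hypothetical names made up for the example. It assumes each state maps a path to its type and recorded Git SHA, so a subdataset is compared by its recorded commit alone and is never entered.

```python
from typing import Dict, NamedTuple


class Record(NamedTuple):
    """Hypothetical per-path record: content type and recorded Git SHA."""
    type: str       # e.g. "file" or "dataset"
    gitshasum: str


def diff_states(fr: Dict[str, Record], to: Dict[str, Record]) -> Dict[str, str]:
    """Label each path by comparing two recorded states of ONE repository.

    Subdatasets are compared by their recorded commit SHA only; nothing
    here descends into them.
    """
    status = {}
    for path in sorted(set(fr) | set(to)):
        old, new = fr.get(path), to.get(path)
        if old is None:
            status[path] = "added"
        elif new is None:
            status[path] = "deleted"
        elif old != new:
            status[path] = "modified"
        else:
            status[path] = "clean"
    return status


before = {
    "README": Record("file", "1111"),
    "sub1": Record("dataset", "aaaa"),
    "sub2": Record("dataset", "cccc"),
}
after = {
    "README": Record("file", "1111"),
    "sub1": Record("dataset", "bbbb"),  # recorded commit moved
    "sub2": Record("dataset", "cccc"),
}
print(diff_states(before, after))
# {'README': 'clean', 'sub1': 'modified', 'sub2': 'clean'}
```

In this sketch the cost depends only on the number of entries in the two states, not on the depth of the hierarchy. Reporting what changed inside `sub1` would be left to a higher-level caller.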

@codecov codecov bot commented Apr 10, 2019

Codecov Report

Merging #3314 into master will increase coverage by 0.02%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3314      +/-   ##
==========================================
+ Coverage   91.02%   91.04%   +0.02%     
==========================================
  Files         263      263              
  Lines       34230    34230              
==========================================
+ Hits        31157    31164       +7     
+ Misses       3073     3066       -7

| Impacted Files | Coverage Δ |
|---|---|
| datalad/support/gitrepo.py | 89.1% <100%> (ø) ⬆️ |
| datalad/downloaders/http.py | 85.31% <0%> (+2.77%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2e283e6...8991bec. Read the comment docs.

@mih mih added the TERRIFIC! label Apr 10, 2019
kyleam approved these changes Apr 10, 2019
# specifically what the changes in subdatasets are
# this is done by a high-level command like rev-diff
# so the comparison within this repo and the present
# `state` label are all we need, and they are done already

@kyleam kyleam Apr 10, 2019

Makes sense.


@mih mih commented Apr 10, 2019

Thx @kyleam !

@mih mih merged commit 73145b4 into datalad:master Apr 10, 2019
5 checks passed
@mih mih deleted the enh-statusperf branch Apr 10, 2019