Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: Expose status() file type inspection mode switch #3701

Merged
merged 1 commit into from Sep 25, 2019

Conversation

mih
Copy link
Member

@mih mih commented Sep 22, 2019

This can make an essential performance difference in datasets with a large number of files.

Demo:

% git annex info | grep '^annexed'
annexed files in working tree: 20054

% time datalad status -t eval
datalad status -t eval  79.15s user 1.51s system 98% cpu 1:21.53 total

% time datalad status -t raw
datalad status -t raw  2.13s user 0.64s system 100% cpu 2.765 total

Ping #3342 @yarikoptic

@yarikoptic
Copy link
Member

yarikoptic commented Sep 22, 2019

Hard to say for this PR along where it would help since raw isn't used anywhere yet in the code, is it?

@mih
Copy link
Member Author

mih commented Sep 22, 2019

The now exposed internal switch should be in use. However, there were complaints about status itself being slow when called directly. This is for that kind of use case.

This is not to say that a careful inspection would not bring up more cases, where we should use this internally as well.

This can make an essential performance difference in datasets with a
large number of files.
@codecov
Copy link

codecov bot commented Sep 23, 2019

Codecov Report

Merging #3701 into master will decrease coverage by 59.3%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           master    #3701       +/-   ##
===========================================
- Coverage   82.78%   23.47%   -59.31%     
===========================================
  Files         273      270        -3     
  Lines       35808    35802        -6     
===========================================
- Hits        29644     8406    -21238     
- Misses       6164    27396    +21232
Impacted Files Coverage Δ
datalad/core/local/status.py 67% <ø> (-31%) ⬇️
datalad/customremotes/tests/__init__.py 0% <0%> (-100%) ⬇️
datalad/tests/test_protocols.py 0% <0%> (-100%) ⬇️
datalad/interface/tests/test_rerun_merges.py 0% <0%> (-100%) ⬇️
datalad/tests/test_constraints.py 0% <0%> (-100%) ⬇️
datalad/tests/test_misc.py 0% <0%> (-100%) ⬇️
datalad/distribution/tests/test_clone.py 0% <0%> (-100%) ⬇️
datalad/interface/run.py 0% <0%> (-100%) ⬇️
datalad/local/tests/test_subdataset.py 0% <0%> (-100%) ⬇️
datalad/support/tests/test_ansi_colors.py 0% <0%> (-100%) ⬇️
... and 221 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c7aa628...30b1b68. Read the comment docs.

@mih
Copy link
Member Author

mih commented Sep 25, 2019

Ok, no objections. Even if this helps with nothing, but figuring out what impact file type checking would have in a given scenario, it is helpful IMHO -- merging.

@mih mih merged commit 481fcf9 into datalad:master Sep 25, 2019
@mih mih deleted the enh-status branch Sep 25, 2019
mih added a commit to mih/datalad that referenced this pull request Sep 25, 2019
According to my current logic and assessment, the status() run done in
top-level save() does not need to inspect filetypes. The tests will show
if that is true.

If true, performance gains of the same magnitude as shown in
datalad#3701 for status() also become
available for save().
mih added a commit to mih/datalad that referenced this pull request Sep 29, 2019
According to my current logic and assessment, the status() run done in
top-level save() does not need to inspect filetypes. The tests will show
if that is true.

If true, performance gains of the same magnitude as shown in
datalad#3701 for status() also become
available for save().
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants