OPT: Faster approach to GitRepo.get_hexsha() #4806

Merged: @mih merged 1 commit into datalad:maint on Aug 6, 2020

Member

@mih mih commented Aug 6, 2020

The previous `git-show` call can be slow for more complex commits (e.g.
an octopus merge commit, see #4801).

Switch to `git-rev-parse` instead, as suggested by @kyleam in
#4801 (comment).

The behavior of `GitRepo.get_hexsha()` should stay the same for all
observed usage patterns. In particular, raising a `ValueError` when
querying for a specific but nonexistent commit-ish is critical
for subdataset handling.

Fixes gh-4801

I will tackle an analogous change (using git-log) in `GitRepo.format_commit()` in a separate PR.
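
For readers who want to see the shape of the change, here is a minimal standalone sketch of resolving a commit-ish with `git rev-parse --verify` and raising `ValueError` for an unknown one. It is not the code merged in this PR: the helper `build_rev_parse_cmd`, the free function `get_hexsha(repo_path, ...)`, the use of `subprocess`, the error message text, and the `None` return for a repository without commits are all assumptions made for illustration.

```python
import subprocess
from typing import List, Optional


def build_rev_parse_cmd(commitish: Optional[str] = None) -> List[str]:
    """Build a git command line that resolves a commit-ish to a commit hash.

    Hypothetical helper, not part of datalad.
    """
    # '<rev>^{commit}' asks git to peel the object (e.g. a tag) to a commit
    # and to fail if that is not possible.
    return [
        "git", "rev-parse", "--quiet", "--verify",
        "{}^{{commit}}".format(commitish or "HEAD"),
    ]


def get_hexsha(repo_path: str, commitish: Optional[str] = None) -> Optional[str]:
    """Return the full hash for `commitish` (default: HEAD) in `repo_path`."""
    proc = subprocess.run(
        build_rev_parse_cmd(commitish),
        cwd=repo_path,
        capture_output=True,
        text=True,
    )
    if proc.returncode == 0:
        return proc.stdout.strip()
    if commitish is None:
        # Assumption: no explicit commit-ish and no resolvable HEAD means
        # "repository without commits", reported as None.
        return None
    # A specific commit-ish was requested but does not exist.
    raise ValueError("Unknown commit identifier: {}".format(commitish))
```

Calling `get_hexsha("/path/to/repo", "no-such-ref")` would raise `ValueError` under these assumptions, which is the behavior the description above calls critical for subdataset handling.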

@mih mih added the performance Improve performance of an existing feature label Aug 6, 2020

codecov bot commented Aug 6, 2020

Codecov Report

Merging #4806 into maint will decrease coverage by 0.04%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##            maint    #4806      +/-   ##
==========================================
- Coverage   89.67%   89.62%   -0.05%     
==========================================
  Files         288      288              
  Lines       40352    40356       +4     
==========================================
- Hits        36184    36168      -16     
- Misses       4168     4188      +20     

| Impacted Files | Coverage Δ |
|---|---|
| datalad/support/gitrepo.py | 90.32% <100.00%> (-0.13%) ⬇️ |
| datalad/downloaders/http.py | 81.85% <0.00%> (-2.71%) ⬇️ |
| datalad/downloaders/tests/test_http.py | 87.71% <0.00%> (-2.22%) ⬇️ |
| datalad/core/distributed/clone.py | 88.62% <0.00%> (-0.60%) ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update cf54d79...de28483.

# use --quiet because the 'Needed a single revision' error message
# that is the result of running this in a repo with no commits
# isn't useful to report
cmd = ['rev-parse', '--verify', '{}^{{commit}}'.format(
Collaborator

@kyleam kyleam Aug 6, 2020

The comment above mentions --quiet, but it's not passed here.
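
To make the effect of `--quiet` concrete, the following sketch runs `git rev-parse --verify` with and without it in a freshly initialised repository that has no commits. The function names `rev_parse_head` and `compare_quiet` are hypothetical, and the script assumes a `git` executable is available on `PATH`.

```python
import subprocess
import tempfile


def rev_parse_head(repo_path, quiet):
    """Run `git rev-parse --verify HEAD^{commit}` and return the finished process."""
    cmd = ["git", "rev-parse", "--verify"]
    if quiet:
        # In --verify mode, --quiet suppresses the error message for an
        # unresolvable name; the exit status is still non-zero.
        cmd.append("--quiet")
    cmd.append("HEAD^{commit}")
    return subprocess.run(cmd, cwd=repo_path, capture_output=True, text=True)


def compare_quiet():
    """Return (loud, quiet) results from a repository without any commits."""
    with tempfile.TemporaryDirectory() as tmp:
        subprocess.run(
            ["git", "init", "--quiet", tmp], check=True, capture_output=True
        )
        return rev_parse_head(tmp, quiet=False), rev_parse_head(tmp, quiet=True)


if __name__ == "__main__":
    loud, quiet = compare_quiet()
    print("without --quiet:", loud.returncode, repr(loud.stderr))
    print("with --quiet:   ", quiet.returncode, repr(quiet.stderr))
```

Both invocations fail, but only the one without `--quiet` writes an error message to stderr, so a caller that relies on the exit status alone loses nothing by passing it.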

Member Author

@mih mih Aug 6, 2020

Arrgh, thx, will fix.

Member Author

@mih mih Aug 6, 2020

Force-pushed an update

kyleam approved these changes Aug 6, 2020
@mih mih merged commit 010636a into datalad:maint Aug 6, 2020
3 of 4 checks passed
@mih mih deleted the bf-4801-1 branch Aug 6, 2020