Slow Performance on file history for large repos #20764

Closed
zeripath opened this issue Aug 11, 2022 · 1 comment · Fixed by #20765
Labels
type/enhancement An improvement of existing functionality
Milestone

Comments

@zeripath
Contributor

Using git log --follow to fetch file history gets noticeably slower as repositories grow, and it can make the commit count incorrect.

Example URL:

This is essentially:

git rev-list --count $REVISION -- $FILE_PATH

followed by:

git log $REVISION --follow --pretty=format:%H -- $FILE_PATH

The second command is much slower than the first, and because of the --follow it can report a different number of commits than the first command counted. The --follow appears to be the cause of most of the slowdown.
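The mismatch between the two commands can be checked directly. The following is a minimal sketch, not Gitea code: the function name compare_history_counts and the GIT override variable are made up for illustration. It runs both commands for one file inside a working copy and prints the two counts side by side. For a file that was renamed at some point, the second number would be expected to be larger, because only --follow walks back past the rename.

```shell
# Hypothetical helper, not part of Gitea: print how many commits each
# command reports for one file.
#   $1  revision (for example HEAD or a branch name)
#   $2  path of the file, relative to the repository root
# Set GIT to use a different git binary; it defaults to "git".
compare_history_counts() {
    # Count commits touching the path, without rename detection.
    without_follow=$("${GIT:-git}" rev-list --count "$1" -- "$2")
    # List commit hashes with rename detection and count the non-empty lines.
    # grep -c is used because format:%H leaves no newline after the last hash.
    with_follow=$("${GIT:-git}" log "$1" --follow --pretty=format:%H -- "$2" | grep -c .)
    echo "rev-list: $without_follow, log --follow: $with_follow"
}
```

Run it from inside a repository, for example as compare_history_counts HEAD path/to/file.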

Without --follow we could use git rev-list for both of these calls, and --skip and --max-count would come for free (in contrast to the current system, where the skip doesn't work).
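As a rough illustration of what paging with rev-list could look like, here is a small sketch. The function name file_history_page, its argument order, and the GIT override are assumptions made for this example and are not taken from Gitea. It turns a 1-based page number and a page size into the --skip and --max-count options of git rev-list.

```shell
# Hypothetical helper, not part of Gitea: print one page of commit hashes
# for a file, newest first.
#   $1  revision
#   $2  file path
#   $3  page number, starting at 1
#   $4  page size
# Set GIT to use a different git binary; it defaults to "git".
file_history_page() {
    # Commits on earlier pages are skipped by git itself.
    skip=$(( ($3 - 1) * $4 ))
    "${GIT:-git}" rev-list --max-count="$4" --skip="$skip" "$1" -- "$2"
}
```

Setting GIT=echo before calling the function prints the arguments instead of running git, which is a cheap way to inspect the command that would be issued.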

Looking at the history for this line, I can't find any stated reasoning for adding --follow; my guess is that it was simply nice to have.

So a simple speed improvement here is to drop --follow and switch to rev-list for these calls.

An additional speed improvement would be to add a deferrable route, as on the commit infos page.

Originally posted by @zeripath in #19812 (comment)

zeripath added a commit to zeripath/gitea that referenced this issue Aug 11, 2022
The use of `--follow` makes getting these commits very slow on large repositories
as it results in searching the whole commit tree for a blob.

Now as nice as the results of `--follow` are, I am uncertain whether it is really
of sufficient importance to keep around.

Fix go-gitea#20764

Signed-off-by: Andrew Thornton <art27@cantab.net>
techknowlogick added a commit that referenced this issue Aug 15, 2022
Fix #20764

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
vsysoev pushed a commit to IntegraSDL/gitea that referenced this issue Aug 28, 2022
Fix go-gitea#20764

Signed-off-by: Andrew Thornton <art27@cantab.net>

Co-authored-by: techknowlogick <techknowlogick@gitea.io>
@zeripath zeripath added this to the 1.18.0 milestone Nov 30, 2022
@zeripath zeripath added the type/enhancement An improvement of existing functionality label Nov 30, 2022
@bvp

bvp commented Feb 15, 2023

This breaks compatibility with small repositories that contain renamed files.
