Blame is very slow #14
Comments
I would have helped by forking and implementing it using diff, but I may not get time until May. While looking at the code, I realized that FilePatches does not give me a proper 'rename'. I was expecting that "from, to" would be correct in the case of a rename, but it is not. That is critical, as blame depends on it. |
type FilePatch interface { But what if it is a rename? |
I am not the original author of this part, and it has been a long time since I touched it. From what I remember, renames are not handled because rename detection is not part of the diff process itself: it is a heuristic with a threshold, applied after the diff has been computed. |
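To make the "heuristic with a threshold" idea concrete, here is a minimal sketch of similarity-based rename detection run over the deleted and added files of a finished diff. It is not go-git's implementation and not Git's exact scoring: the names `similarity`, `detectRenames`, and `Rename`, the line-overlap score, and the 0.6 threshold are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// similarity returns the share of lines common to a and b (counted as a
// multiset), divided by the larger line count. The result is in [0, 1].
// This scoring is an illustrative assumption, not Git's algorithm.
func similarity(a, b string) float64 {
	la, lb := strings.Split(a, "\n"), strings.Split(b, "\n")
	counts := make(map[string]int, len(la))
	for _, l := range la {
		counts[l]++
	}
	shared := 0
	for _, l := range lb {
		if counts[l] > 0 {
			counts[l]--
			shared++
		}
	}
	longest := len(la)
	if len(lb) > longest {
		longest = len(lb)
	}
	return float64(shared) / float64(longest)
}

// Rename pairs a deleted path with the added path judged to be its new name.
type Rename struct {
	From, To string
	Score    float64
}

// detectRenames matches each deleted file to the most similar added file
// whose score reaches threshold. Both maps go from path to file content.
// Matching is greedy, so with several deletions the pairing can depend on
// map iteration order; a real implementation would rank all candidates.
func detectRenames(deleted, added map[string]string, threshold float64) []Rename {
	var out []Rename
	used := map[string]bool{}
	for from, oldContent := range deleted {
		best := Rename{From: from}
		for to, newContent := range added {
			if used[to] {
				continue
			}
			if s := similarity(oldContent, newContent); s >= threshold && s > best.Score {
				best.To, best.Score = to, s
			}
		}
		if best.To != "" {
			used[best.To] = true
			out = append(out, best)
		}
	}
	return out
}

func main() {
	deleted := map[string]string{"old.go": "a\nb\nc\nd"}
	added := map[string]string{"new.go": "a\nb\nc\nx", "other.go": "z"}
	for _, r := range detectRenames(deleted, added, 0.6) {
		fmt.Printf("%s -> %s (%.2f)\n", r.From, r.To, r.Score)
	}
}
```

Because the pairing happens after the diff, a patch that only reports "deleted" and "added" entries cannot carry a correct "from, to" pair unless a step like this runs first, which is why blame loses history across a rename without it.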
I noticed this as well and ended up shelling out to |
I did exactly the same using exec, but that becomes a problem too, as the output then needs to be parsed... |
Is there any way to speed this up? |
To help us keep things tidy and focus on the active tasks, we've introduced a stale bot to spot issues/PRs that haven't had any activity in a while. This particular issue hasn't had any updates or activity in the past 90 days, so it's been labeled as 'stale'. If it remains inactive for the next 30 days, it'll be automatically closed. We understand everyone's busy, but if this issue is still important to you, please feel free to add a comment or make an update to keep it active. Thanks for your understanding and cooperation! |
The blame feature was rewritten (#789) since the last activity on this issue, which introduced some performance improvements, so I am closing this for the time being. If the issue persists, we can either reopen this or create a new one. |
Sadly, I believe this may still be an issue. For small repositories that contain no more than a few hundred commits, the time to get the blame for a given file is nothing to complain about, but for larger repositories with a few thousand commits and a complex history, the process grinds to a halt. Quite where the thresholds are, I can't be sure, but I do get the sense that the overall complexity of the repository has an excessive impact on the performance of the blame functionality. This is critical to a piece of work I'm currently doing, so I'll start to take a look at the blame source and see what I can find. I thought it would be best to post here now, though, just to give the community a heads-up. Edit: For transparency, and so that a more competent person can call me out if I am simply doing something wrong, this is the function that I have found causes trouble in my project:
// GetFileContents will attempt to read the contents of a file and return each
// line with the most recent blame information.
func (c *Commit) GetFileContents(filepath string) ([]*git.Line, error) {
file, err := git.Blame(c.ptr, path.Clean(filepath))
if err != nil {
return []*git.Line{}, &FileNotFoundError{
Filepath: filepath,
}
}
return file.Lines, nil
} |
@iainjreid can you please confirm what version of go-git you are using? |
@pjbgf
|
After some more testing using the in-memory storage (which is immeasurably faster, though the bottleneck still persists), I believe this issue is simply made more apparent, and worse, by my SQL-backed storage. My gut feeling is that, as the commit history is walked, the round-trip cost of whichever storage mechanism is in use becomes an overwhelmingly high overhead. I'll run some tests locally and report back 👍 |
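If the round-trip hypothesis holds, one way to test it is to put a memoizing layer in front of the slow backend and count how many reads still reach it. The sketch below is only that experiment in miniature: `ObjectStore` is a hypothetical interface and not go-git's storer API, `countingStore` stands in for the SQL-backed storage, and the cache is unbounded.

```go
package main

import (
	"fmt"
	"sync"
)

// ObjectStore is a hypothetical stand-in for a storage backend.
// It is NOT go-git's storer interface.
type ObjectStore interface {
	Get(hash string) ([]byte, error)
}

// countingStore simulates a slow backend and records every round trip.
type countingStore struct {
	objects map[string][]byte
	calls   int
}

func (s *countingStore) Get(hash string) ([]byte, error) {
	s.calls++
	data, ok := s.objects[hash]
	if !ok {
		return nil, fmt.Errorf("object %s not found", hash)
	}
	return data, nil
}

// cachedStore memoizes successful reads so that repeated lookups of the
// same object during a history walk hit the backend only once.
type cachedStore struct {
	backend ObjectStore
	mu      sync.Mutex
	cache   map[string][]byte
}

func newCachedStore(backend ObjectStore) *cachedStore {
	return &cachedStore{backend: backend, cache: map[string][]byte{}}
}

// Get holds the lock across the backend call, which keeps the sketch simple
// but serializes concurrent misses.
func (c *cachedStore) Get(hash string) ([]byte, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if data, ok := c.cache[hash]; ok {
		return data, nil
	}
	data, err := c.backend.Get(hash)
	if err != nil {
		return nil, err
	}
	c.cache[hash] = data
	return data, nil
}

func main() {
	backend := &countingStore{objects: map[string][]byte{"c1": []byte("commit one")}}
	store := newCachedStore(backend)
	for i := 0; i < 3; i++ {
		if _, err := store.Get("c1"); err != nil {
			panic(err)
		}
	}
	fmt.Println("backend round trips:", backend.calls)
}
```

A large gap between the number of lookups requested and the number that reach the backend would support the idea that the walk re-reads the same objects, and that the per-read latency of the storage dominates the blame time.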
I tested Blame, and compared to the CLI it is very slow: the CLI takes less than 1 second, while go-git takes 35 seconds.