Improve blame performance #10840
Future work: During my investigation, I saw that For example, like I said in PR description when no avatar is in cache, blame on
But a quick test replacing the use of I have not replaced the code because I don't know how to reproduce the same proxy configuration as with using
I have been using this commit for some months and did some measurements with the change.
+2 LGTM
I have been using this commit for some months
Me, too.
(approved but agree with mstv, should be changed)
Branch updated from 25c1314 to 35d337e.
Done.
Can you please point me to where we're doing this?
I think as a first step we can provide an implementation without a proxy, and then bolt on proxy support. Unless, of course, you're behind a proxy and need it :) A quick search for how to configure a proxy with HttpClient yields these: https://stackoverflow.com/a/58560252/2338036 and https://stackoverflow.com/a/55242207/2338036.
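As a rough illustration of the "bolt on proxy support" step, here is a minimal sketch of configuring a proxy on `HttpClient` through `HttpClientHandler`. The class name `AvatarHttp`, its methods, and the choice to use default credentials are assumptions made for this example and are not taken from the Git Extensions code base.

```csharp
using System;
using System.Net;
using System.Net.Http;

// Hypothetical helper, not part of Git Extensions.
public static class AvatarHttp
{
    // Builds a handler that uses an explicit proxy when an address is given.
    // With no address, the handler is left at its defaults, which on modern
    // .NET means the system proxy settings are used.
    public static HttpClientHandler CreateHandler(Uri proxyAddress)
    {
        HttpClientHandler handler = new HttpClientHandler();
        if (proxyAddress != null)
        {
            handler.Proxy = new WebProxy(proxyAddress)
            {
                // Assumption: authenticate to the proxy as the current user.
                UseDefaultCredentials = true
            };
            handler.UseProxy = true;
        }

        return handler;
    }

    // The client owns the handler and disposes it together with itself.
    public static HttpClient CreateClient(Uri proxyAddress = null)
        => new HttpClient(CreateHandler(proxyAddress), disposeHandler: true);
}
```

A single long-lived client created this way could then be shared by all avatar downloads; whether that matches the existing proxy behaviour would still need to be checked behind a real proxy.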
👍
GitCommands/Git/GitModule.cs (outdated)
@@ -3378,7 +3378,7 @@ internal GitBlame ParseGitBlame(string output, Encoding encoding)
 // is a blank line, and third is an introductory paragraph about the project.

 Dictionary<ObjectId, GitBlameCommit> commitByObjectId = new();
-List<GitBlameLine> lines = new(capacity: 256);
+List<GitBlameLine> lines = new(capacity: Math.Min(Math.Max(256, output.Length / 120), 1000));
What are these magic numbers?
It's a heuristic for sizing the list up front (to avoid repeated reallocations) based on the size of the content to blame, with limits so that the list is neither too small (256, as before) nor too big (1000).
To estimate the number of lines, I picked a per-line length larger than the average source line length in nearly any project, which guarantees that we will not over-allocate (because we blame source code nearly 90% of the time, right!?!): 120 characters...
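To make the magic numbers easier to read, the heuristic can be spelled out with named constants. This is only a sketch: the class `BlameCapacity` and the constant names are hypothetical, and `Math.Clamp` is swapped in for the nested `Math.Min`/`Math.Max` from the diff (the result is the same for these bounds).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; the values 256, 1000 and 120 come from the diff under review.
public static class BlameCapacity
{
    private const int MinCapacity = 256;         // previous fixed capacity, kept as the lower bound
    private const int MaxCapacity = 1000;        // upper bound to avoid a large up-front allocation
    private const int AssumedCharsPerLine = 120; // deliberately generous, so the line count is underestimated

    // Estimates the number of blame lines from the length of the text to parse.
    public static int Estimate(int outputLength)
        => Math.Clamp(outputLength / AssumedCharsPerLine, MinCapacity, MaxCapacity);

    // Creates a list presized with the estimate; it still grows on demand if the estimate is too low.
    public static List<T> CreateList<T>(int outputLength)
        => new List<T>(capacity: Estimate(outputLength));
}
```

For example, an input of 60,000 characters gives an initial capacity of 500, while very small and very large inputs fall back to 256 and 1000 respectively.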
It would be good to add some hints about this.
I remember staring at this line for some minutes to understand it when I tried this commit out.
(Not affecting approval.)
I have added a comment...
I have removed the code, to go into another PR as @gerhardol suggested, and updated the description just a few minutes before your comment...
I will have a look, but it will be part of the other PR on avatar handling, where I already fix or improve some other things...
* cache author lines instead of building them every time
(because multiple lines often belong to the same commit)
* Use TryGetValue() to get the commit and avatar image
* better heuristic for initial line capacity, based on an evaluation of multiple repositories of how many characters are needed on average in a git blame to "describe" a line of a text file (surprisingly consistent around 120 characters!)
Branch updated from 35d337e to 4f25331.
Proposed changes
Some small improvements on blame data generation:
* cache author lines instead of building them every time
(because multiple lines often belong to the same commit)
* Use TryGetValue to get the commit and avatar image
* better heuristic for initial line capacity
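The first two points can be illustrated together. The following is a minimal sketch, not the code from this PR: `BlameCommit`, `AuthorLineCache`, and the author-line format are hypothetical stand-ins for the real blame types. It shows a per-commit cache that is queried with a single `TryGetValue` lookup, so the author line is built once per commit rather than once per blamed line.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical stand-in for the real commit type; only the fields needed here.
public sealed record BlameCommit(string Id, string Author, DateTime AuthorTime);

// Hypothetical cache keyed by commit id.
public sealed class AuthorLineCache
{
    private readonly Dictionary<string, string> _lineByCommitId = new Dictionary<string, string>();

    // Number of times an author line was actually built (for demonstration only).
    public int BuildCount { get; private set; }

    public string GetAuthorLine(BlameCommit commit)
    {
        // One dictionary lookup instead of ContainsKey followed by the indexer.
        if (_lineByCommitId.TryGetValue(commit.Id, out string cached))
        {
            return cached;
        }

        BuildCount++;
        string date = commit.AuthorTime.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
        string line = $"{commit.Author} - {date}";
        _lineByCommitId.Add(commit.Id, line);
        return line;
    }
}
```

When many consecutive lines of a file come from the same commit, every call after the first returns the cached string without rebuilding it.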
Test methodology
Test environment(s)
Merge strategy
Merge commit
✒️ I contribute this code under The Developer Certificate of Origin.