forked from git-for-windows/git
commit: remove parse_commit_no_graph() #168
Merged
derrickstolee
merged 1 commit into
microsoft:vfs-2.22.0
from
derrickstolee:commit-graph-fast
Aug 6, 2019
(I notice of course that this will fail the test that he added for this case... I will probably disable it for now.) Edit: I just did a revert (and adjusted the conflicts around changes on top of that commit).
532c2aa to
913fd06
Compare
jeffhostetler
approved these changes
Aug 6, 2019
The parse_commit_no_graph() method was added in 43d3561 ("commit-graph write: don't die if the existing graph is corrupt" 2019-03-25) as a way to avoid persisting bad data across commit-graph files. That is, if the commit-graph file has undetected corrupt data -- such as a flipped bit in a parent int-id value -- then that data will persist to the next commit-graph file. The parse_commit_no_graph() method was used to always use the pack data directly instead.

Unfortunately, this comes at a significant performance cost. Parsing from pack files costs much more time and memory than parsing from the commit-graph file. In a repository with 4.5 million commits, this can lead to Git taking up to 11 GB of memory to rewrite the file.

Now that the incremental commit-graph file format exists, we can rely on the quality of the commit-graph file if we follow the two-step pattern of (1) write a commit-graph with "--split" and (2) run "git commit-graph verify --shallow" to verify the tip file.

This reverts commit 43d3561.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
913fd06 to
a405834
Compare
derrickstolee
commented
Aug 6, 2019
derrickstolee
added a commit
to derrickstolee/VFSForGit
that referenced
this pull request
Aug 6, 2019
See microsoft/git#168 for full details. v2.22.0 had a change that made writing commit-graph files much slower by always parsing from pack-files instead of from the commit-graph file. For some users of the OS repo, this caused super-slow performance and high memory usage. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
to derrickstolee/VFSForGit
that referenced
this pull request
Aug 6, 2019
See microsoft/git#168 for full details. Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
derrickstolee
added a commit
to microsoft/VFSForGit
that referenced
this pull request
Aug 6, 2019
See microsoft/git#168 for full details. v2.22.0 had a change that made writing commit-graph files much slower by always parsing from pack-files instead of from the commit-graph file. For some users of the OS repo, this caused super-slow performance and high memory usage.
derrickstolee
added a commit
to microsoft/VFSForGit
that referenced
this pull request
Aug 6, 2019
This PR does two things:

1. Updates Git to include the slow commit-graph write fix from microsoft/git#168. Also see #1420 for the M153 hotfix.
2. Removes the multi-pack-index writes from the `PostFetchStep` and instead only writes the multi-pack-index during the `PackfileMaintenanceStep`. In order to get the most out of the step, we need to ensure we have a multi-pack-index before running the `expire` and `repack` steps. Also, we can only expire the packs from that day if they are contained in the multi-pack-index.

If we agree that we should send (2) as a hotfix to the M155 release (in addition to (1), which is necessary), then I'll create a hotfix PR to that branch.
derrickstolee
added a commit
to microsoft/VFSForGit
that referenced
this pull request
Aug 7, 2019
This PR does two things:

1. Updates Git to include the slow commit-graph write fix from microsoft/git#168. Also see #1420 for the M153 hotfix.
2. Removes the multi-pack-index writes from the PostFetchStep and instead only writes the multi-pack-index during the PackfileMaintenanceStep. In order to get the most out of the step, we need to ensure we have a multi-pack-index before running the expire and repack steps. Also, we can only expire the packs from that day if they are contained in the multi-pack-index.

If we agree that we should send (2) as a hotfix to the M155 release (in addition to (1), which is necessary), then I'll create a hotfix PR to that branch.
The parse_commit_no_graph() method was added in 43d3561 ("commit-graph
write: don't die if the existing graph is corrupt" 2019-03-25) as a way
to avoid persisting bad data across commit-graph files. That is, if the
commit-graph file has undetected corrupt data -- such as a flipped bit
in a parent int-id value -- then that data will persist to the next
commit-graph file. The parse_commit_no_graph() method was used to always
use the pack data directly instead.
Unfortunately, this comes at a significant performance cost. Parsing
from pack files costs much more time and memory than parsing from the
commit-graph file. In a repository with 4.5 million commits, this can
lead to Git taking up to 11 GB of memory to rewrite the file.
Now that the incremental commit-graph file format exists, we can rely
on the quality of the commit-graph file if we follow the two-step
pattern of (1) write a commit-graph with "--split" and (2) run "git
commit-graph verify --shallow" to verify the tip file.
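
For reference, the two-step pattern above could look roughly like the sketch below. The function name `write_and_verify_commit_graph` and the choice of `--reachable` as the commit source are assumptions made for illustration; the description only specifies `--split` for the write and `git commit-graph verify --shallow` for the check. It also assumes a Git build that supports incremental commit-graph files.

```shell
#!/bin/sh
# Sketch of the two-step pattern: write an incremental commit-graph layer,
# then verify only the tip file of the chain.
# The function name is hypothetical; --reachable is an assumed commit source.
write_and_verify_commit_graph() {
    repo="$1"

    # Step 1: write a new incremental layer on top of any existing chain.
    git -C "$repo" commit-graph write --reachable --split || return 1

    # Step 2: verify only the tip file, not the base layers beneath it.
    git -C "$repo" commit-graph verify --shallow
}
```

Calling `write_and_verify_commit_graph /path/to/repo` returns non-zero if either the write or the verification of the tip file fails, so a caller can decide whether to discard the freshly written layer.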
@jrbriggs, @jeffhostetler, @jamill and others: this change should resolve
the performance problems we are seeing with the M153 release. I will
test this carefully after generating a Windows installer. I'm not sure
whether this change is something we can send upstream, but I
will start a conversation about it on the list.