Comparing changes
Open a pull request
base repository: git/git
base: 670b81a890388c60b7032a4f5b879f2ece8c4558
head repository: git/git
compare: e5ca291076a8a936283bb2c57433c4393d3f80c2
- 16 commits
- 10 files changed
- 1 contributor
Commits on Jul 14, 2021
-
sparse-index: skip indexes with unmerged entries
The sparse-index format is designed to be compatible with merge conflicts, even those outside the sparse-checkout definition. The reason is that when converting a full index to a sparse one, a cache entry with nonzero stage will not be collapsed into a sparse directory entry. However, this behavior was not tested, and a different behavior within convert_to_sparse() fails in this scenario. Specifically, cache_tree_update() will fail when unmerged entries exist. convert_to_sparse_rec() uses the cache-tree data to recursively walk the tree structure, but also to compute the OIDs used in the sparse-directory entries. Add an index scan to convert_to_sparse() that will detect if these merge conflict entries exist and skip the conversion before trying to update the cache-tree. This is marked as NEEDSWORK because this can be removed with a suitable update to cache_tree_update() or a similar method that can construct a cache-tree with invalid nodes, but still allow creating the nodes necessary for creating sparse directory entries. It is possible that in the future we will not need to make such an update, since if we do not expand a sparse-index into a full one, this conversion does not need to happen. Thus, this can be deferred until the merge machinery is made to integrate with the sparse-index. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
sparse-index: include EXTENDED flag when expanding
When creating a full index from a sparse one, we create cache entries for every blob within a given sparse directory entry. These are correctly marked with the CE_SKIP_WORKTREE flag, but the CE_EXTENDED flag is not included. The CE_EXTENDED flag would exist if we loaded a full index from disk with these entries marked with CE_SKIP_WORKTREE, so we can add the flag here to be consistent. This allows us to directly compare the flags present in cache entries when testing the sparse-index feature, but has no significance to its correctness in the user-facing functionality. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
t1092: replace incorrect 'echo' with 'cat'
This fixes the test data shape to be as expected, allowing rename detection to work properly now that the 'larger-content' file actually has meaningful lines. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
t1092: expand repository data shape
As more features integrate with the sparse-index feature, more and more special cases arise that require different data shapes within the tree structure of the repository in order to demonstrate those cases. Add several interesting special cases all at once instead of sprinkling them across several commits. The interesting cases being added here are: * Add sparse-directory entries on both sides of directories within the sparse-checkout definition. * Add directories outside the sparse-checkout definition who have only one entry and are the first entry of a directory with multiple entries. * Add filenames adjacent to a sparse directory entry that sort before and after the trailing slash. Later tests will take advantage of these shapes, but they also deepen the tests that already exist. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
t1092: add tests for status/add and sparse files
Before moving to update 'git status' and 'git add' to work with sparse indexes, add an explicit test that ensures the sparse-index works the same as a normal sparse-checkout when the worktree contains directories and files outside of the sparse cone. Specifically, 'folder1/a' is a file in our test repo, but 'folder1' is not in the sparse cone. When 'folder1/a' is modified, the file is not shown as modified and adding it will fail. This is new behavior as of a20f704 (add: warn when asked to update SKIP_WORKTREE entries, 2021-04-08). Before that change, these adds would be silently ignored. Untracked files are fine: adding new files both with 'git add .' and 'git add folder1/' works just as in a full checkout. This may not be entirely desirable, but we are not intending to change behavior at the moment, only document it. A future change could alter the behavior to be more sensible, and this test could be modified to satisfy the new expected behavior. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
unpack-trees: preserve cache_bottom
The cache_bottom member of 'struct unpack_trees_options' is used to track the range of index entries corresponding to a node of the cache tree. While recursing with traverse_by_cache_tree(), this value is preserved on the call stack using a local and then restored as that method returns. The mark_ce_used() method normally modifies the cache_bottom member when it refers to the marked cache entry. However, sparse directory entries are stored as nodes in the cache-tree data structure as of 2de37c5 (cache-tree: integrate with sparse directory entries, 2021-03-30). Thus, the cache_bottom will be modified as the cache-tree walk advances. Do not update it as well within mark_ce_used(). Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
unpack-trees: compare sparse directories correctly
As we further integrate the sparse-index into unpack-trees, we need to ensure that we compare sparse directory entries correctly with other entries. This affects searching for an exact path as well as sorting index entries. Sparse directory entries contain the trailing directory separator. This is important for the sorting, in particular. Thus, within do_compare_entry() we stop using S_IFREG in all cases, since sparse directories should use S_IFDIR to indicate that the comparison should treat the entry name as a dirctory. Within compare_entry(), it first calls do_compare_entry() to check the leading portion of the name. When the input path is a directory name, we could match exactly already. Thus, we should return 0 if we have an exact string match on a sparse directory entry. The final check is a length comparison between the strings. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
unpack-trees: rename unpack_nondirectories()
In the next change, we will use this method to unpack a sparse directory entry, so change the name to unpack_single_entry() so these entries apply. The new name reflects that we will not recurse into trees in order to resolve the conflicts. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
unpack-trees: unpack sparse directory entries
During unpack_callback(), index entries are compared against tree entries. These are matched according to names and types. One goal is to decide if we should recurse into subtrees or simply operate on one index entry. In the case of a sparse-directory entry, we do not want to recurse into that subtree and instead simply compare the trees. In some cases, we might want to perform a merge operation on the entry, such as during 'git checkout <commit>' which wants to replace a sparse tree entry with the tree for that path at the target commit. We extend the logic within unpack_single_entry() to create a sparse-directory entry in this case, and then that is sent to call_unpack_fn(). There are some subtleties in this process. For instance, we need to update find_cache_entry() to allow finding a sparse-directory entry that exactly matches a given path. Use the new helper method sparse_dir_matches_path() for this. We also need to ignore conflict markers in the case that the entries correspond to directories and we already have a sparse directory entry. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
dir.c: accept a directory as part of cone-mode patterns
When we have sparse directory entries in the index, we want to compare that directory against sparse-checkout patterns. Those pattern matching algorithms are built expecting a file path, not a directory path. This is especially important in the "cone mode" patterns which will match files that exist within the "parent directories" as well as the recursive directory matches. If path_matches_pattern_list() is given a directory, we can add a fake filename ("-") to the directory and get the same results as before, assuming we are in cone mode. Since sparse index requires cone mode patterns, this is an acceptable assumption. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> -
diff-lib: handle index diffs with sparse dirs
While comparing an index to a tree, we may see a sparse directory entry. In this case, we should compare that portion of the tree to the tree represented by that entry. This could include a new tree which needs to be expanded to a full list of added files. It could also include an existing tree, in which case all of the changes inside are important to describe, including the modifications, additions, and deletions. Note that the case where the tree has a path and the index does not remains identical to before: the lack of a cache entry is the same with a sparse index. Use diff_tree_oid() appropriately to compute the diff. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
status: skip sparse-checkout percentage with sparse-index
'git status' began reporting a percentage of populated paths when sparse-checkout is enabled in 051df3c (wt-status: show sparse checkout status as well, 2020-07-18). This percentage is incorrect when the index has sparse directories. It would also be expensive to calculate as we would need to parse trees to count the total number of possible paths. Avoid the expensive computation by simplifying the output to only report that a sparse checkout exists, without the percentage. This change is the reason we use 'git status --porcelain=v2' in t1092-sparse-checkout-compatibility.sh. We don't want to ensure that this message is equal across both modes, but instead just the important information about staged, modified, and untracked files are compared. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
status: use sparse-index throughout
By testing 'git -c core.fsmonitor= status -uno', we can check for the simplest index operations that can be made sparse-aware. The necessary implementation details are already integrated with sparse-checkout, so modify command_requires_full_index to be zero for cmd_status(). In refresh_index(), we loop through the index entries to refresh their stat() information. However, sparse directories have no stat() information to populate. Ignore these entries. This allows 'git status' to no longer expand a sparse index to a full one. This is further tested by dropping the "-uno" option and adding an untracked file into the worktree. The performance test p2000-sparse-checkout-operations.sh demonstrates these improvements: Test HEAD~1 HEAD ----------------------------------------------------------------------------- 2000.2: git status (full-index-v3) 0.31(0.30+0.05) 0.31(0.29+0.06) +0.0% 2000.3: git status (full-index-v4) 0.31(0.29+0.07) 0.34(0.30+0.08) +9.7% 2000.4: git status (sparse-index-v3) 2.35(2.28+0.10) 0.04(0.04+0.05) -98.3% 2000.5: git status (sparse-index-v4) 2.35(2.24+0.15) 0.05(0.04+0.06) -97.9% Note that since HEAD~1 was expanding the sparse index by parsing trees, it was artificially slower than the full index case. Thus, the 98% improvement is misleading, and instead we should celebrate the 0.34s to 0.05s improvement of 85%. This is more indicative of the peformance gains we are expecting by using a sparse index. Note: we are dropping the assignment of core.fsmonitor here. This is not necessary for the test script as we are not altering the config any other way. Correct integration with FS Monitor will be validated in later changes. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
wt-status: expand added sparse directory entries
It is difficult, but possible, to get into a state where we intend to add a directory that is outside of the sparse-checkout definition. Add a test to t1092-sparse-checkout-compatibility.sh that demonstrates this using a combination of 'git reset --mixed' and 'git checkout --orphan'. This test failed before because the output of 'git status --porcelain=v2' would not match on the lines for folder1/: * The sparse-checkout repo (with a full index) would output each path name that is intended to be added. * The sparse-index repo would only output that "folder1/" is staged for addition. The status should report the full list of files to be added, and so this sparse-directory entry should be expanded to a full list when reaching it inside the wt_status_collect_changes_initial() method. Use read_tree_at() to assist. Somehow, this loop over the cache entries was not guarded by ensure_full_index() as intended. Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
fsmonitor: integrate with sparse index
If we need to expand a sparse-index into a full one, then the FS Monitor bitmap is going to be incorrect. Ensure that we start fresh at such an event. While this is currently a performance drawback, the eventual hope of the sparse-index feature is that these expansions will be rare and hence we will be able to keep the FS Monitor data accurate across multiple Git commands. These tests are added to demonstrate that the behavior is the same across a full index and a sparse index, but also that file modifications to a tracked directory outside of the sparse cone will trigger ensure_full_index(). Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-
t1092: document bad sparse-checkout behavior
There are several situations where a repository with sparse-checkout enabled will act differently than a normal repository, and in ways that are not intentional. The test t1092-sparse-checkout-compatibility.sh documents some of these deviations, but a casual reader might think these are intentional behavior changes. Add comments on these tests that make it clear that these behaviors should be updated. Using 'NEEDSWORK' helps contributors find that these are potential areas for improvement. Helped-by: Elijah Newren <newren@gmail.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
This comparison is taking too long to generate.
Unfortunately it looks like we can’t render this comparison for you right now. It might be too big, or there might be something weird with your repository.
You can try running this command locally to see the comparison on your machine:
git diff 670b81a890388c60b7032a4f5b879f2ece8c4558...e5ca291076a8a936283bb2c57433c4393d3f80c2