Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sparse-checkout: modify 'git add', 'git rm', and 'git mv' behavior #1018

Commits on Sep 23, 2021

  1. t3705: test that 'sparse_entry' is unstaged

    The tests in t3705-add-sparse-checkout.sh check to see how 'git add'
    behaves with paths outside the sparse-checkout definition. These
    currently check to see if a given warning is present but not that the
    index is not updated with the sparse entries. Add a new
    'test_sparse_entry_unstaged' helper to be sure 'git add' is behaving
    correctly.
    
    We need to modify setup_sparse_entry to actually commit the sparse_entry
    file so it exists at HEAD and as an entry in the index, but its exact
    contents are not staged in the index.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 23, 2021
    Configuration menu
    Copy the full SHA
    642b05f View commit details
    Browse the repository at this point in the history

Commits on Sep 24, 2021

  1. t1092: behavior for adding sparse files

    Add some tests to demonstrate the current behavior around adding files
    outside of the sparse-checkout cone. Currently, untracked files are
    handled differently from tracked files. A future change will make these
    cases be handled the same way.
    
    Further expand checking that a failed 'git add' does not stage changes
    to the index.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 24, 2021
    Configuration menu
    Copy the full SHA
    58389ed View commit details
    Browse the repository at this point in the history
  2. dir: select directories correctly

    When matching a path against a list of patterns, the ones that require a
    directory match previously did not work when a filename is specified.
    This was fine when all pattern-matching was done within methods such as
    unpack_trees() that check a directory before recursing into the
    contained files. However, other commands will start matching individual
    files against pattern lists without that recursive approach.
    
    The last_matching_pattern_from_list() logic performs some checks on the
    filetype of a path within the index when the PATTERN_FLAG_MUSTBEDIR flag
    is set. This works great when setting SKIP_WORKTREE bits within
    unpack_trees(), but doesn't work well when passing an arbitrary path
    such as a file within a matching directory.
    
    We extract the logic around determining the file type, but attempt to
    avoid checking the filesystem if the parent directory already matches
    the sparse-checkout patterns. The new path_matches_dir_pattern() method
    includes a 'path_parent' parameter that is used to store the parent
    directory of 'pathname' between multiple pattern matching tests. This is
    loaded lazily, only on the first pattern it finds that has the
    PATTERN_FLAG_MUSTBEDIR flag.
    
    If we find that a path has a parent directory, we start by checking to
    see if that parent directory matches the pattern. If so, then we do not
    need to query the index for the type (which can be expensive). If we
    find that the parent does not match, then we still must check the type
    from the index for the given pathname.
    
    Note that this does not affect cone mode pattern matching, but instead
    the more general -- and slower -- full pattern set. Thus, this does not
    affect the sparse index.
    
    Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 24, 2021
    Configuration menu
    Copy the full SHA
    2ebaf8e View commit details
    Browse the repository at this point in the history
  3. dir: fix pattern matching on dirs

    Within match_pathname(), one successful matching category happens when
    the pattern is equal to its non-wildcard prefix. At this point, we have
    checked that the input 'pathname' matches the pattern up to the prefix
    length, and then we subtraced that length from both 'patternlen' and
    'namelen'.
    
    In the case of a directory match, this prefix match should be
    sufficient. However, the success condition only cared about _exact_
    equality here. Instead, we should allow any path that agrees on this
    prefix in the case of PATTERN_FLAG_MUSTBEDIR.
    
    This case was not tested before because of the way unpack_trees() would
    match a parent directory before visiting the contained paths. This
    approach is changing, so we must change this comparison.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 24, 2021
    Configuration menu
    Copy the full SHA
    24bffda View commit details
    Browse the repository at this point in the history
  4. add: fail when adding an untracked sparse file

    The add_files() method in builtin/add.c takes a set of untracked files
    that are being added by the input pathspec and inserts them into the
    index. If these files are outside of the sparse-checkout cone, then they
    gain the SKIP_WORKTREE bit at some point. However, this was not checked
    before inserting into the index, so these files are added even though we
    want to avoid modifying the index outside of the sparse-checkout cone.
    
    Add a check within add_files() for these files and write the advice
    about files outside of the sparse-checkout cone.
    
    This behavior change modifies some existing tests within t1092. These
    tests intended to document how a user could interact with the existing
    behavior in place. Many of these tests need to be marked as expecting
    failure. A future change will allow these tests to pass by adding a flag
    to 'git add' that allows users to modify index entries outside of the
    sparse-checkout cone.
    
    The 'submodule handling' test is intended to document what happens to
    directories that contain a submodule when the sparse index is enabled.
    It is not trying to say that users should be able to add submodules
    outside of the sparse-checkout cone, so that test can be modified to
    avoid that operation.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 24, 2021
    Configuration menu
    Copy the full SHA
    e3a749e View commit details
    Browse the repository at this point in the history
  5. add: skip tracked paths outside sparse-checkout cone

    When 'git add' adds a tracked file that is outside of the
    sparse-checkout cone, it checks the SKIP_WORKTREE bit to see if the file
    exists outside of the sparse-checkout cone. This is usually correct,
    except in the case of a merge conflict outside of the cone.
    
    Modify add_pathspec_matched_against_index() to be more careful about
    paths by checking the sparse-checkout patterns in addition to the
    SKIP_WORKTREE bit. This causes 'git add' to no longer allow files
    outside of the cone that removed the SKIP_WORKTREE bit due to a merge
    conflict.
    
    With only this change, users will only be able to add the file after
    adding the file to the sparse-checkout cone. A later change will allow
    users to force adding even though the file is outside of the
    sparse-checkout cone.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 24, 2021
    Configuration menu
    Copy the full SHA
    2c5c834 View commit details
    Browse the repository at this point in the history
  6. add: implement the --sparse option

    We previously modified 'git add' to refuse updating index entries
    outside of the sparse-checkout cone. This is justified to prevent users
    from accidentally getting into a confusing state when Git removes those
    files from the working tree at some later point.
    
    Unfortunately, this caused some workflows that were previously possible
    to become impossible, especially around merge conflicts outside of the
    sparse-checkout cone. These were documented in tests within t1092.
    
    We now re-enable these workflows using a new '--sparse' option to 'git
    add'. This allows users to signal "Yes, I do know what I'm doing with
    these files," and accept the consequences of the files leaving the
    worktree later.
    
    We delay updating the advice message until implementing a similar option
    in 'git rm' and 'git mv'.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 24, 2021
    Configuration menu
    Copy the full SHA
    430ab44 View commit details
    Browse the repository at this point in the history
  7. add: update --chmod to skip sparse paths

    We added checks for path_in_sparse_checkout() to portions of 'git add'
    that add warnings and prevent staging a modification, but we skipped the
    --chmod mode. Update chmod_pathspec() to ignore cache entries whose path
    is outside of the sparse-checkout cone (unless --sparse is provided).
    Add a test in t3705.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 24, 2021
    Configuration menu
    Copy the full SHA
    4f7b5cd View commit details
    Browse the repository at this point in the history
  8. add: update --renormalize to skip sparse paths

    We added checks for path_in_sparse_checkout() to portions of 'git add'
    that add warnings and prevent stagins a modification, but we skipped the
    --renormalize mode. Update renormalize_tracked_files() to ignore cache
    entries whose path is outside of the sparse-checkout cone (unless
    --sparse is provided). Add a test in t3705.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 24, 2021
    Configuration menu
    Copy the full SHA
    30ec609 View commit details
    Browse the repository at this point in the history
  9. rm: add --sparse option

    As we did previously in 'git add', add a '--sparse' option to 'git rm'
    that allows modifying paths outside of the sparse-checkout definition.
    The existing checks in 'git rm' are restricted to tracked files that
    have the SKIP_WORKTREE bit in the current index. Future changes will
    cause 'git rm' to reject removing paths outside of the sparse-checkout
    definition, even if they are untracked or do not have the SKIP_WORKTREE
    bit.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 24, 2021
    Configuration menu
    Copy the full SHA
    99d5092 View commit details
    Browse the repository at this point in the history
  10. rm: skip sparse paths with missing SKIP_WORKTREE

    If a path does not match the sparse-checkout cone but is somehow missing
    the SKIP_WORKTREE bit, then 'git rm' currently succeeds in removing the
    file. One reason a user might be in this situation is a merge conflict
    outside of the sparse-checkout cone. Removing such a file might be
    problematic for users who are not sure what they are doing.
    
    Add a check to path_in_sparse_checkout() when 'git rm' is checking if a
    path should be considered for deletion. Of course, this check is ignored
    if the '--sparse' option is specified, allowing users who accept the
    risks to continue with the removal.
    
    This also removes a confusing behavior where a user asks for a directory
    to be removed, but only the entries that are within the sparse-checkout
    definition are removed. Now, 'git rm <dir>' will fail without '--sparse'
    and will succeed in removing all contained paths with '--sparse'.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 24, 2021
    Configuration menu
    Copy the full SHA
    47a1444 View commit details
    Browse the repository at this point in the history
  11. mv: refuse to move sparse paths

    Since cmd_mv() does not operate on cache entries and instead directly
    checks the filesystem, we can only use path_in_sparse_checkout() as a
    mechanism for seeing if a path is sparse or not. Be sure to skip
    returning a failure if '-k' is specified.
    
    To ensure that the advice around sparse paths is the only reason a move
    failed, be sure to check this as the very last thing before inserting
    into the src_for_dst list.
    
    The tests cover a variety of cases such as whether the target is tracked
    or untracked, and whether the source or destination are in or outside of
    the sparse-checkout definition.
    
    Helped-by: Matheus Tavares Bernardino <matheus.bernardino@usp.br>
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 24, 2021
    Configuration menu
    Copy the full SHA
    28e703d View commit details
    Browse the repository at this point in the history
  12. advice: update message to suggest '--sparse'

    The previous changes modified the behavior of 'git add', 'git rm', and
    'git mv' to not adjust paths outside the sparse-checkout cone, even if
    they exist in the working tree and their cache entries lack the
    SKIP_WORKTREE bit. The intention is to warn users that they are doing
    something potentially dangerous. The '--sparse' option was added to each
    command to allow careful users the same ability they had before.
    
    To improve the discoverability of this new functionality, add a message
    to advice.updateSparsePath that mentions the existence of the option.
    
    The previous set of changes also modified the purpose of this message to
    include possibly a list of paths instead of only a list of pathspecs.
    Make the warning message more clear about this new behavior.
    
    Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
    derrickstolee committed Sep 24, 2021
    Configuration menu
    Copy the full SHA
    9fbc88e View commit details
    Browse the repository at this point in the history