Apply lfs.fetchexclude filter to previous commits when pruning #4968

Merged
chrisd8088 merged 2 commits into git-lfs:main from prune-filter-prev-shas on Apr 27, 2022

@chrisd8088 commented Apr 26, 2022

In PR #2851 the git lfs prune command was changed to respect the lfs.fetchexclude configuration option such that objects would always be pruned if they were referenced by files whose paths matched one of the patterns in the configuration option (unless they were referenced by an unpushed commit).

However, while this filter was applied to files referenced by recent refs (including HEAD), it was not applied to files referenced only by recent commits that precede a recent ref. As a result, an object in HEAD may be pruned because it is referenced only by a file that matches the lfs.fetchexclude filter, while a recent previous version of the same object is unexpectedly retained even though it, too, is referenced only by that file.

We therefore add equivalent filtering to the logPreviousSHAs() internal function which is called by the ScanPreviousVersions() GitScanner method used by pruneTaskGetPreviousVersionsOfRef() in the git lfs prune command's main phase.
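
To illustrate the intent of this change, here is a minimal, self-contained sketch in Go. It is not the code from this PR: the `pointer` type, the prefix-based `excludeFilter`, and the `retainPrevious()` helper are all hypothetical stand-ins for the real scanner and path filter. The idea it shows is only that pointers found in commits previous to a recent ref are skipped (and so left eligible for pruning) when their path matches an exclude pattern.

```go
package main

import (
	"fmt"
	"strings"
)

// pointer is a hypothetical stand-in for a Git LFS pointer found by a log scan.
type pointer struct {
	Name string // path of the file in the commit
	Oid  string // Git LFS object ID
}

// excludeFilter is a simplified, hypothetical filter. A path is rejected when
// it starts with one of the configured directory prefixes. Real
// lfs.fetchexclude patterns are richer than plain prefixes.
type excludeFilter struct {
	prefixes []string
}

// Allows reports whether the path is NOT matched by any exclude prefix.
func (f excludeFilter) Allows(name string) bool {
	for _, p := range f.prefixes {
		if strings.HasPrefix(name, p) {
			return false
		}
	}
	return true
}

// retainPrevious returns the OIDs of previous versions that should be kept.
// Pointers whose paths are excluded are skipped, so they remain prunable.
func retainPrevious(prev []pointer, f excludeFilter) []string {
	var oids []string
	for _, p := range prev {
		if !f.Allows(p.Name) {
			continue
		}
		oids = append(oids, p.Oid)
	}
	return oids
}

func main() {
	prev := []pointer{
		{Name: "foo/old.dat", Oid: "aaa"},
		{Name: "bar/old.dat", Oid: "bbb"},
	}
	f := excludeFilter{prefixes: []string{"foo/"}}
	fmt.Println(retainPrevious(prev, f)) // prints [bbb]
}
```

In this sketch the object for `foo/old.dat` is not retained, which mirrors how objects for excluded paths in recent refs were already being treated.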

(Note that in PR #1743 the Git log scanning functions were refactored, and a common parseScannerLogOutput() internal function was created to be used when performing log scans during git lfs prune commands. This replaced the original logPreviousSHAs() function, which had support for path-based filtering; however, that functionality was apparently never used.)

We also add a test which confirms that git lfs prune respects the lfs.fetchexclude configuration option insofar as it prunes Git LFS objects for files whose paths match a pattern in the filter, both when they appear in commits directly referenced by a recent ref and when they only appear in commits previous to those.

Note that we explicitly use a gitattributes(5)-compatible form of pattern match (i.e., /foo/**) for the lfs.fetchexclude options in this new test because otherwise we will see failures as per #4945. When that issue is addressed in a future PR we will revise this test to use the /foo pattern match form in order to demonstrate that gitignore(5)-style matching is being performed.
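
For readers unfamiliar with the difference between the two pattern forms, the following Go sketch may help. It is a deliberately simplified toy, not the matcher used by Git or Git LFS, and both function names are hypothetical. It assumes only the behaviour described in the Git documentation: under gitattributes(5) rules a pattern that matches a directory does not recursively match paths inside it, whereas under gitignore(5) rules it does.

```go
package main

import (
	"fmt"
	"strings"
)

// matchAttributesStyle is a toy model of gitattributes(5)-style matching for
// anchored patterns only: "/foo" matches just the path "foo", while "/foo/**"
// matches everything beneath the "foo" directory.
func matchAttributesStyle(pattern, path string) bool {
	p := strings.TrimPrefix(pattern, "/")
	if strings.HasSuffix(p, "/**") {
		// "foo/**" becomes the prefix "foo/".
		return strings.HasPrefix(path, strings.TrimSuffix(p, "**"))
	}
	return path == p
}

// matchIgnoreStyle is a toy model of gitignore(5)-style matching: in addition
// to the cases above, a pattern naming a directory also matches its contents.
func matchIgnoreStyle(pattern, path string) bool {
	if matchAttributesStyle(pattern, path) {
		return true
	}
	p := strings.TrimPrefix(pattern, "/")
	return strings.HasPrefix(path, p+"/")
}

func main() {
	fmt.Println(matchAttributesStyle("/foo", "foo/a.dat"))    // false
	fmt.Println(matchAttributesStyle("/foo/**", "foo/a.dat")) // true
	fmt.Println(matchIgnoreStyle("/foo", "foo/a.dat"))        // true
}
```

Under this model, `/foo/**` behaves the same way in both styles, which is why it is the safe choice for the new test until #4945 is resolved.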

Note too that we add one extra check to the existing "prune unreferenced and old" test for consistency with our new test.

/cc @larsxschneider as author of #2851.

In commit d2221dc of PR git-lfs#2851
the "git lfs prune" command was changed to respect the
"lfs.fetchexclude" configuration option such that objects would
always be pruned if they were referenced by files whose paths
matched one of the patterns in the configuration option (unless
they were referenced by an unpushed commit).

However, while this filter was applied to files referenced by
recent refs (including HEAD), it was not applied to files referenced
only by recent commits previous to a recent ref.  This can result in
the unexpected consequence that an object in HEAD is pruned because
it is referenced only by a file that matches the "lfs.fetchexclude"
filter, but a recent previous version of the object is not pruned
despite also only being referenced by the same file.

We therefore add equivalent filtering to the logPreviousSHAs()
internal function which is called by the ScanPreviousVersions()
GitScanner method used by pruneTaskGetPreviousVersionsOfRef()
in the "git lfs prune" command's main phase.

(Note that in PR git-lfs#1743 the Git log scanning functions were refactored,
and a common parseScannerLogOutput() internal function was created to
be used when performing log scans during "git lfs prune" commands.
This replaced the original logPreviousSHAs() function, which had
support for path-based filtering; however, that functionality was
apparently never used.)

We also add a test which confirms that "git lfs prune" respects
the "lfs.fetchexclude" configuration option insofar as it prunes
Git LFS objects for files whose paths match a pattern in the filter,
both when they appear in commits directly referenced by a recent ref
and when they only appear in commits previous to those.

Note too that we add one extra check to the existing "prune
unreferenced and old" test for consistency with our new test.
@chrisd8088 requested a review from a team as a code owner April 26, 2022 17:24
@chrisd8088 merged commit 0df26c1 into git-lfs:main Apr 27, 2022
@chrisd8088 deleted the prune-filter-prev-shas branch April 27, 2022 22:48