Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't traverse gitignored dirs for gitignore files #397

Conversation

robotdana
Copy link
Contributor

@robotdana robotdana commented May 26, 2023

this seems to be an upstream issue in go-git, and i'll prepare a PR for them soon, but for now this copies in the affected function and fixes it by checking the accumulated patterns while walking the fs looking for gitignore files

fixes: #389

this seems to be an upstream issue in go-git, and i'll prepare a PR for
them soon, but for now this copies in the affected method and fixes it
by checking the accumulated patterns while walking the fs looking for
gitignore files

fixes: google#389
@robotdana robotdana force-pushed the dont-traverse-gitignored-dirs-for-gitignore-files branch from 96e1d81 to f32a631 Compare May 26, 2023 05:42
@another-rex
Copy link
Collaborator

Thank you! The gitignore while traversing fix looks good, though we still should ignore sub-directories when the -r flag is not being used in osv-scanner. (Probably requires making readIgnoreFile a public function)

@robramsaynz
Copy link
Contributor

robramsaynz commented Jan 31, 2024

Thank you! The gitignore while traversing fix looks good, though we still should ignore sub-directories when the -r flag is not being used in osv-scanner. (Probably requires making readIgnoreFile a public function)

Hi @another-rex. Do you mean

  1. we need to tweak the recursion system that's part of the of filepath.WalkDir() call

or

  1. you want to have functions in the gitignore package that only read a single .gitignore file at the root of the repo. I'm assuming we'd pass the recursive bool into osvscanner.parseGitIgnores() to handle this (ref, ref). That would then find any enclosing git-repo, but not only read the .gitignore at its root once it had done that. Did you also want to exclude .git/info/exclude in that case?

@another-rex
Copy link
Collaborator

Thank you! The gitignore while traversing fix looks good, though we still should ignore sub-directories when the -r flag is not being used in osv-scanner. (Probably requires making readIgnoreFile a public function)

Hi @another-rex. Do you mean

  1. we need to tweak the recursion system that's part of the of filepath.WalkDir() call

or

  1. you want to have functions in the gitignore package that only read a single .gitignore file at the root of the repo. I'm assuming we'd pass the recursive bool into osvscanner.parseGitIgnores() to handle this (ref, ref). That would then find any enclosing git-repo, but not only read the .gitignore at its root once it had done that. Did you also want to exclude .git/info/exclude in that case?

I think I meant the second option. If the recursive option is not passed in, we shouldn't look in sub-directories for gitignore files (since it'll never apply to what is being scanned).

So the only .gitignores that should apply is every .gitignore in the current and each parent/ancestor directory until we reach the root of the repository. (Or if we are not in a repository, just what is in the current directory). I am trying to match the search behavior to other tools like https://github.com/BurntSushi/ripgrep/blob/master/GUIDE.md#automatic-filtering.

Did you also want to exclude .git/info/exclude in that case?

Huh TIL about .git/info/exclude. I think we should respect the .git/info/exclude as well, since we are trying to match git behavior.

@robramsaynz
Copy link
Contributor

Thanks for the clarification @another-rex . So you preference is for the command

osv-scanner ~/projects/_git_repo/dir_a/dir_b

to pick up:

  • ~/projects/git_repo/.gitignore
  • ~/projects/git_repo/dir_a/.gitignore
  • ~/projects/git_repo/dir_a/dir_b/.gitignore

but not

  • ~/projects/git_repo/dir_a/dir_b/subdir/.gitignore
  • ~/projects/git_repo/not_in_original_path/.gitignore

Is that about right?

@robramsaynz
Copy link
Contributor

Huh TIL about .git/info/exclude. I think we should respect the .git/info/exclude as well, since we are trying to match git behavior

There's also a /etc/gitconfig file, and user-profile setting of the core.excludesfile property in ~/.gitconfig. The upstream lib has parsing functions for these:

which we haven't imported into this code.

@another-rex
Copy link
Collaborator

Thanks for the clarification @another-rex . So you preference is for the command

osv-scanner ~/projects/_git_repo/dir_a/dir_b

to pick up:

  • ~/projects/git_repo/.gitignore
  • ~/projects/git_repo/dir_a/.gitignore
  • ~/projects/git_repo/dir_a/dir_b/.gitignore

but not

  • ~/projects/git_repo/dir_a/dir_b/subdir/.gitignore
  • ~/projects/git_repo/not_in_original_path/.gitignore

Is that about right?

Yep, specifically to not even traverse the subdirs to find those .gitignore files.

@another-rex
Copy link
Collaborator

Huh TIL about .git/info/exclude. I think we should respect the .git/info/exclude as well, since we are trying to match git behavior

There's also a /etc/gitconfig file, and user-profile setting of the core.excludesfile property in ~/.gitconfig. The upstream lib has parsing functions for these:

which we haven't imported into this code.

Huh... Let's not worry about those ignores for now until we can update upstream then. I originally thought we got that for free with the go git library, but since we are copying it in, happy to just focus on .gitignore files for now.

another-rex added a commit that referenced this pull request Mar 4, 2024
Updated version of #397.

> this seems to be an upstream issue in go-git, and i'll prepare a PR
for them soon, but for now this copies in the affected function and
fixes it by checking the accumulated patterns while walking the fs
looking for gitignore files
> 
> fixes: #389

- is located github.com/**ackama**/osv-scanner, rather than
github.com/**robotdana**/osv-scanner
- has tests
- has extra changes to handle
#397 (comment)
(handles mid-tree dirs, handles non-git dirs, handles recursive flag)

---------

Co-authored-by: Dana Sherson <robot@dana.sh>
Co-authored-by: Rex P <106129829+another-rex@users.noreply.github.com>
Co-authored-by: Gareth Jones <Jones258@Gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gitignore parsing iterates through all directories unnecessarily
4 participants