Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add --prune flag #658

Merged
merged 7 commits into from
Oct 25, 2020
Merged

Add --prune flag #658

merged 7 commits into from
Oct 25, 2020

Conversation

reima
Copy link
Contributor

@reima reima commented Oct 3, 2020

This PR introduces the --prune flag and fixes #535.

The previous PR #546 for this issue has been closed. I'd like to continue the work and discussion on this issue in this PR.

The implementation proposed in this PR is the simplest one I could come up with (which is roughly the initial implementation from PR #546): If a directory matches (regardless of which options caused the match), and the --prune flag is set, do not traverse further into the directory.

To start off the discussion, I'd like to address the concerns raised by @sharkdp in the previous PR:

  • This is unlikely to break any existing features, as pruning strictly takes place after all the matching logic and does not alter it in any way.
  • The performance impact when not using the --prune flag should be negligible, as it's just one additional flag to check for each matching entry. I can still perform benchmarks if needed.
  • The interaction with other features is pretty straightforward, as pruning only applies to matching entries.

@reima reima changed the title Add --prune flag (closes #535) Add --prune flag Oct 3, 2020
@sharkdp
Copy link
Owner

sharkdp commented Oct 5, 2020

Thank you very much for your contribution!

That is a very simple implementation indeed! 👍 😄

The performance impact when not using the --prune flag should be negligible, as it's just one additional flag to check for each matching entry. I can still perform benchmarks if needed.

It would be great if you could run them just to be sure - but I agree that it does not look like it could have a significant impact in non-pruned searches.

This is unlikely to break any existing features, as pruning strictly takes place after all the matching logic and does not alter it in any way.

The interaction with other features is pretty straightforward, as pruning only applies to matching entries.

That sounds reasonable, but I would appreciate if we could think about this in detail (and add tests if necessary). Are there any options that could interfere with --prune? Should we mark some of them as conflicting with --prune?

I have listed a few options here: #546 (review)

@sharkdp
Copy link
Owner

sharkdp commented Oct 5, 2020

I would also ask you to please update the man page and ideally (if you feel comfortable with that) the newly added zsh completions.

Also, could you please add an entry to the "unreleased" section in CHANGELOG.md? The format for entries is:

- Description what has been changed, see #123 (@user)

where #123 links to the bug ticket and/or the PR and user is your username.

Thank you!

@reima
Copy link
Contributor Author

reima commented Oct 10, 2020

The performance impact when not using the --prune flag should be negligible, as it's just one additional flag to check for each matching entry. I can still perform benchmarks if needed.

It would be great if you could run them just to be sure - but I agree that it does not look like it could have a significant impact in non-pruned searches.

I ran release builds of fd from the master branch (fd-master.exe) and this PR branch (fd-prune.exe) on a directory with 127,635 files and 14,127 directories, and saw no significant performance difference without pruning:

Benchmark #1: fd-master.exe . E:\Development\rust
  Time (mean ± σ):      1.122 s ±  0.012 s    [User: 5.6 ms, System: 4.9 ms]
  Range (min … max):    1.104 s …  1.138 s    10 runs

Benchmark #2: fd-prune.exe . E:\Development\rust
  Time (mean ± σ):      1.125 s ±  0.018 s    [User: 2.8 ms, System: 1.0 ms]
  Range (min … max):    1.094 s …  1.158 s    10 runs

Summary
  'fd-master.exe . E:\Development\rust' ran
    1.00 ± 0.02 times faster than 'fd-prune.exe . E:\Development\rust'

Are there any options that could interfere with --prune? Should we mark some of them as conflicting with --prune?

I have listed a few options here: #546 (review)

The behavior of --prune (as implemented and documented right now) is: if a directory matches (i.e. it is printed), none of its descendants will be considered. This naturally means that if you prevent any directory from matching (e.g. with --type file), --prune won't have any effect. So there are no hard conflicts per se, as --prune will work as advertised regardless.

However, there are some options that will definitely make --prune superfluous, because either directories will never be matched, or traversal is already restricted by other means:

  • One or multiple instances of --type are present, but none of them are --type d/--type directory or --type l/--type symlink: will never match a directory
  • --type x/--type executable is present: will never match a directory
  • --type e/--type empty is present: will never match a directory with descendants
  • --size is present: will never match a directory
  • --exact-depth is present: traversal is already restricted
  • -1/--max-results 1 is present: exits after first match, so pruning makes no difference

Should we detect these combinations? If so, should fd refuse to run at all, or should it print a warning and continue?

I would also ask you to please update the man page and ideally (if you feel comfortable with that) the newly added zsh completions.

I've updated the man page and zsh completions. The completions are untested, though, as I'm not familiar with zsh.

Also, could you please add an entry to the "unreleased" section in CHANGELOG.md?

Done!

@sharkdp
Copy link
Owner

sharkdp commented Oct 13, 2020

  • One or multiple instances of --type are present, but none of them are --type d/--type directory or --type l/--type symlink: will never match a directory

  • --type x/--type executable is present: will never match a directory

  • --type e/--type empty is present: will never match a directory with descendants

I think we could safely just ignore all of these interactions.

--size is present: will never match a directory

That seems like a minor problem. A user could theoretically run a --size-based search and want --prune in addition. But --prune would have no effect since directories would not be matched. I think we should mark --size and --prune as conflicting. Would something similar be true for datetime-based searches?

-1/--max-results 1 is present: exits after first match, so pruning makes no difference

can be safely ignored, I guess.

I would also ask you to please update the man page and ideally (if you feel comfortable with that) the newly added zsh completions.

I've updated the man page and zsh completions. The completions are untested, though, as I'm not familiar with zsh.

Also, could you please add an entry to the "unreleased" section in CHANGELOG.md?

Done!

Thank you!

@reima
Copy link
Contributor Author

reima commented Oct 22, 2020

That seems like a minor problem. A user could theoretically run a --size-based search and want --prune in addition. But --prune would have no effect since directories would not be matched. I think we should mark --size and --prune as conflicting.

I agree. I've updated the PR accordingly.

Would something similar be true for datetime-based searches?

The time based filters can also match directories, so I don't think there's any interaction with pruning here.

You didn't mention --exact-depth in your response, did you miss that? Should that option also be marked as conflicting?

@sharkdp
Copy link
Owner

sharkdp commented Oct 24, 2020

You didn't mention --exact-depth in your response, did you miss that? Should that option also be marked as conflicting?

Yes, let's also mark it as conflicting. Would the same be true for --min-depth? I would imagine that --max-depth is fine, but the other two are not? I haven't thought about it in detail though.

@reima
Copy link
Contributor Author

reima commented Oct 24, 2020

--prune can be combined with both --min-depth and --max-depth to achieve different results:

$ tree
.
├── foo
│   ├── bar
│   │   └── x
│   │       └── x
│   └── x
│       └── x
└── x

7 directories, 0 files

$ fd x --min-depth 2
foo/bar/x
foo/bar/x/x
foo/x
foo/x/x

$ fd x --min-depth 2 --prune
foo/bar/x
foo/x

$ fd x --max-depth 3
foo/bar/x
foo/x
foo/x/x
x

$ fd x --max-depth 3 --prune
foo/bar/x
foo/x
x

$ fd x --min-depth 2 --max-depth 3
foo/bar/x
foo/x
foo/x/x

$ fd x --min-depth 2 --max-depth 3 --prune
foo/bar/x
foo/x

Copy link
Owner

@sharkdp sharkdp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the updates and the explanation!

@sharkdp sharkdp merged commit ec4cc98 into sharkdp:master Oct 25, 2020
@sharkdp sharkdp mentioned this pull request Oct 25, 2020
@vvilhonen
Copy link

Any chance to cut a release from master? I could use the flag

chenrui333 added a commit to chenrui333/homebrew-core that referenced this pull request Dec 6, 2020
@sharkdp
Copy link
Owner

sharkdp commented Dec 6, 2020

released in v8.2.0.

BrewTestBot pushed a commit to Homebrew/homebrew-core that referenced this pull request Dec 6, 2020
* fd 8.2.0
* fd: fix zsh_completion location
  relates to sharkdp/fd#658

Closes #66345.

Signed-off-by: FX Coudert <fxcoudert@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

-prune options
3 participants