
Without pattern 'find' ~8x faster than 'fd' #191

Closed

dtcyganok opened this issue Dec 5, 2017 · 33 comments

@dtcyganok

dtcyganok commented Dec 5, 2017

OS: Linux 4.14.3-1-ARCH
FS: ext4
fd: 6.0.0

Simple test:

> time find > /dev/null
find > /dev/null  0.00s user 0.06s system 97% cpu 0.065 total
> time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.75s user 0.10s system 133% cpu 0.631 total

Also, fd with a pattern is ~8x faster than fd without a pattern:

> time fd --color never -HI '.zshrc' > /dev/null
fd --color never -HI '.zshrc' > /dev/null  0.18s user 0.06s system 325% cpu 0.073 total
> time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.74s user 0.06s system 132% cpu 0.608 total
@alaymari

alaymari commented Dec 5, 2017

OS: Debian testing
FS: ext4
fd: 6.0.0

Installed fd and tried it on my camera folder. The results are here:

find . -iname '*.jpg' 0.08s user 0.06s system 66% cpu 0.210 total

fd '.*\.jpg$' . 0.41s user 0.16s system 95% cpu 0.602 total

The time taken and CPU usage are much higher than expected.

@sharkdp
Owner

sharkdp commented Dec 5, 2017

Interesting, thank you for reporting this.

How many entries are found (without the search pattern)?

Also, how many cores does your machine have?

@dtcyganok
Author

Directory size: 27G
Count of entries: 35809
Count of cores: 4

@alaymari

alaymari commented Dec 6, 2017

Machine details:

System:    Host: brahman Kernel: 4.14.0-2.3-liquorix-amd64 x86_64 bits: 64 Desktop: Openbox 3.6.1
           Distro: Debian GNU/Linux buster/sid
Machine:   Device: desktop Mobo: ASUSTeK model: M4A78LT-M-LE v: Rev X.0x serial: N/A
           BIOS: American Megatrends v: 0801 date: 01/17/2011
CPU:       Quad core AMD Phenom II X4 960T (-MCP-) cache: 2048 KB
           clock speeds: max: 3000 MHz 1: 1600 MHz 2: 3000 MHz 3: 800 MHz 4: 3000 MHz

Directory size: 261 GiB
Number of files:

> find . -name "*.jpg" | wc -l
16049
> fd ".*\.jpg$" . | wc -l     
17006
> find . -iname "*.jpg" | wc -l
17061
> find . -iregex ".*\.jpg$" | wc -l
17061

Even the numbers reported are different.

@sharkdp
Owner

sharkdp commented Dec 6, 2017

@dtcyganok Thanks. Are you using the fd version from the official Arch repositories? Or did you build it yourself?

@alaymari

Even the numbers reported are different.

Please make sure to run fd with the -I and -H options. Otherwise you cannot expect the results to be the same. Also, just as a remark, you can use fd -e jpg to perform searches for a particular file extension. So you could try:

> fd -HI -e jpg | wc -l

@dtcyganok
Author

I am using fd from the official Arch repositories.

@alaymari

alaymari commented Dec 6, 2017

@sharkdp

> fd -HI -e jpg | wc -l
17061

No change :-) This is just FYI.

Hope you come up with a solution for this problem soon.

@partounian

So fd is slower than find when you are not using any pattern? Any way to avoid that?

@sharkdp
Owner

sharkdp commented Dec 14, 2017

I might have an idea what's causing this.

@dtcyganok Could you please run your initial benchmark and add --max-buffer-time 0 as an additional option?

@sharkdp
Owner

sharkdp commented Jan 3, 2018

This should be fixed now. It would be great if someone could confirm this.

@jakwings
Contributor

jakwings commented Jan 3, 2018

That fix looks like a bit of a hack. Is the buffer size the main cause of the slowness? Are you sure the buffering mode does not end too early, before the timeout is reached, in most cases (with the default buffer time)?

@sharkdp
Owner

sharkdp commented Jan 3, 2018

Is the buffer size the main cause of the slowness? Are you sure the buffering mode does not end too early, before the timeout is reached, in most cases (with the default buffer time)?

For the cases mentioned in this ticket -- yes. If there is no search pattern and the search tree is huge, results will accumulate fast (100 ms is enough to gather ~100,000 entries). In my test case (no search pattern, ~130,000 results), adding the buffer size limitation speeds up the search by a factor of 10.
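
For illustration, here is a minimal Rust sketch of that kind of output handling: buffer results until either a time limit or a size limit is reached, then fall back to plain streaming. This is not fd's actual code; the constants, names and channel setup are made up.

use std::io::{self, Write};
use std::sync::mpsc::{channel, Receiver};
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical limits; fd's real values and option handling differ.
const MAX_BUFFER_TIME: Duration = Duration::from_millis(100);
const MAX_BUFFER_LENGTH: usize = 1000;

// Receive paths from the search threads. Buffer them so that fast
// searches can be printed in sorted order, but switch to plain
// streaming as soon as the time limit or the size limit is hit.
fn print_results(rx: Receiver<String>) -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = stdout.lock();

    let start = Instant::now();
    let mut buffer: Vec<String> = Vec::with_capacity(MAX_BUFFER_LENGTH);
    let mut streaming = false;

    for path in rx {
        if streaming {
            writeln!(out, "{}", path)?;
            continue;
        }

        buffer.push(path);

        // Capping the buffer size means a huge result set does not keep
        // growing (and re-allocating) the buffer for hundreds of
        // thousands of entries before anything is printed.
        if buffer.len() >= MAX_BUFFER_LENGTH || start.elapsed() > MAX_BUFFER_TIME {
            for p in buffer.drain(..) {
                writeln!(out, "{}", p)?;
            }
            streaming = true;
        }
    }

    // The whole search finished within both limits: print sorted results.
    buffer.sort();
    for p in &buffer {
        writeln!(out, "{}", p)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let (tx, rx) = channel();
    thread::spawn(move || {
        for i in 0..10 {
            tx.send(format!("./file_{}.txt", i)).unwrap();
        }
    });
    print_results(rx)
}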

@jakwings
Contributor

jakwings commented Jan 3, 2018

@sharkdp Hmm, I think time::Instant::now() is the main cause. But now I also think it makes sense to limit the buffer size, as it is much slower to print the results to the terminal.

@sharkdp
Owner

sharkdp commented Jan 3, 2018

Hmm, I think time::Instant::now() is the main cause.

That could very well be the cause, thanks! Another potential cause could be the dynamic re-allocation for the buffer.

@jakwings
Contributor

jakwings commented Jan 3, 2018

If 1000 is a good limit for the buffer size in all cases, then --max-buffer-time can be deprecated. But I'm still not sure whether it makes for a nice visual effect: you may get the first 1000 paths sorted, then the remaining 1000+ paths unordered (~2000 results in total). In my fork, I just throttle the time check by only checking it every 500 iterations. (My logic was in a tangle.)
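
As an illustration of that throttled time check (only a sketch, not the actual code from the fork; the constants are invented):

use std::time::{Duration, Instant};

const MAX_BUFFER_TIME: Duration = Duration::from_millis(100);
const CHECK_INTERVAL: usize = 500;

fn main() {
    let start = Instant::now();
    let mut buffer: Vec<String> = Vec::new();

    for (i, path) in (0..1_000_000).map(|i| format!("./file_{}", i)).enumerate() {
        buffer.push(path);

        // Instant::now()/elapsed() is comparatively expensive when called
        // once per entry, so only look at the clock every 500 entries.
        if i % CHECK_INTERVAL == 0 && start.elapsed() > MAX_BUFFER_TIME {
            println!("time limit reached after {} entries", buffer.len());
            break;
        }
    }
}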

@jakwings
Contributor

jakwings commented Jan 3, 2018

Another potential cause could be the dynamic re-allocation for the buffer.

And another potential cause: sorting

@sharkdp
Owner

sharkdp commented Jan 3, 2018

If 1000 is a good limit for buffer size for all cases, then --max-buffer-time can be deprecated.

I don't think so. If there are fewer than 1000 search results, --max-buffer-time is still relevant. It ensures that we show results (almost) immediately on a long-running search.

But I'm still uncertain about if it makes a nice change to the visual effect: You may get 1000 paths sorted, then the last 1000+ paths unordered (~2000 results in total).

That's not how it works. If we reach the time or size limit we just start streaming all results to the console (unsorted).

@jakwings
Contributor

jakwings commented Jan 3, 2018

I don't think so. If there are fewer than 1000 search results, --max-buffer-time is still relevant. It ensures that we show results (almost) immediately on a long-running search.

Ah, my logic was in a tangle in the previous comment. Searching for 1000 results takes only ~20 ms (less than 100 ms) on my machine. Isn't the buffer time limit used for displaying sorted results when time allows?

@sharkdp
Owner

sharkdp commented Jan 3, 2018

Isn't the buffer time limit used for displaying sorted results when time allows?

Correct. If the full search finishes faster than the max-buffer-time (t < T_max), the results are printed in order.

I think with the new mechanism (N_max = MAX_BUFFER_LENGTH) there are now four cases (a small illustrative sketch follows the list):

  • fast search (t <= T_max), few search results (n <= N_max):

    As before, results will be printed in a sorted way almost immediately.

  • fast search (t <= T_max), many search results (n > N_max):

    In contrast to before, search results will not be sorted but printing to the console should be much faster.

  • slow search (t > T_max), few search results (n < N_max):

    No change to before. Results will be printed as soon as they are found (except for the short period until T_max).

  • slow search (t > T_max), many search results (n > N_max):

    No change to before except that we might start streaming to the console even earlier.
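
Put differently (purely as an illustration, with made-up names, not fd's internal structure), the four cases collapse into one rule: output is buffered and sorted only if both limits are respected; otherwise unsorted streaming starts as soon as the first limit is hit.

use std::time::Duration;

// Illustration only; fd's internals are structured differently.
#[derive(Debug)]
enum OutputMode {
    SortedBuffer, // whole result set printed sorted, almost immediately
    Streaming,    // entries printed unsorted as soon as a limit is hit
}

fn output_mode(elapsed: Duration, t_max: Duration, n_results: usize, n_max: usize) -> OutputMode {
    if elapsed <= t_max && n_results <= n_max {
        OutputMode::SortedBuffer
    } else {
        OutputMode::Streaming
    }
}

fn main() {
    let t_max = Duration::from_millis(100);
    let n_max = 1000;
    println!("{:?}", output_mode(Duration::from_millis(20), t_max, 900, n_max));  // SortedBuffer
    println!("{:?}", output_mode(Duration::from_millis(20), t_max, 5000, n_max)); // Streaming
}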

@jakwings
Contributor

jakwings commented Jan 3, 2018

Yeah, for a fast search (~1000 results: < 20 ms for gathering & sorting results, and < 300 ms total run time), most of the time is spent on printing, not gathering. The buffer time is negligible, I think.

UPDATE: Ah, a slow search can also mean that there are not enough matching files, not just that the I/O is too slow. So the time limit is still needed.

@alaymari

alaymari commented Jan 5, 2018

Tried the new version. Looks like there is no change. The results:

$ time find > /dev/null
find > /dev/null  0.02s user 0.01s system 110% cpu 0.027 total

$ fd --version                                    
fd 6.2.0

$ time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.18s user 0.02s system 111% cpu 0.185 total

$ time fd '.*\.jpg$' > /dev/null
fd '.*\.jpg$' > /dev/null  0.12s user 0.03s system 254% cpu 0.059 total

@sharkdp
Owner

sharkdp commented Jan 5, 2018

@alaymari Thank you for taking a look at this again.

A couple of comments:

  • Please always run a "warm up" command (e.g. simply run find a few times) that fills the caches immediately before measuring any runtimes.
  • Please average the runtimes over multiple runs (ideally, use a tool like bench)
  • Your execution times are very low and possibly even limited by some initialization processes (which could be optimized as well, of course!). Could you try the same benchmark on a bigger search tree?

@alaymari

alaymari commented Jan 6, 2018

OK. Here is what I did:

$ find > /dev/null
$ find > /dev/null
$ find > /dev/null
$ find > /dev/null
$ find > /dev/null
$ time find > /dev/null
find > /dev/null  0.01s user 0.02s system 114% cpu 0.030 total
$ time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.14s user 0.03s system 242% cpu 0.072 total

$ time fd '.*\.jpg$' > /dev/null                                              
fd '.*\.jpg$' > /dev/null  0.14s user 0.03s system 333% cpu 0.050 total

$ time fd '.*\.jpg$' > /dev/null
fd '.*\.jpg$' > /dev/null  0.13s user 0.03s system 345% cpu 0.048 total

$ time fd '.*\.jpg$' > /dev/null       
fd '.*\.jpg$' > /dev/null  0.15s user 0.02s system 351% cpu 0.047 total

$ time fd '.*\.jpg$' > /dev/null
fd '.*\.jpg$' > /dev/null  0.14s user 0.02s system 355% cpu 0.046 total

$ time fd '.*\.jpg$' > /dev/null
fd '.*\.jpg$' > /dev/null  0.14s user 0.03s system 352% cpu 0.046 total

$ time fd '.*\.jpg$' > /dev/null
fd '.*\.jpg$' > /dev/null  0.14s user 0.02s system 358% cpu 0.046 total

I will try later on a bigger set. Right now:

$ find . -name "*.jpg" | wc -l
16406

$ find . | wc -l              
37455

@partounian

partounian commented Jan 6, 2018

I ran the bench tool in the git [lfs] folder of a relatively big repo. It contains many 3D file assets. I can of course give some more general details upon request.

NOTE: This was run on a macOS 10.13.2 system; I am not sure whether its find differs from the one on other systems.

benchmarking bench/find .
time                 448.4 ms   (439.6 ms .. 459.3 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 446.8 ms   (444.2 ms .. 448.4 ms)
std dev              2.510 ms   (0.0 s .. 2.897 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking bench/fd
time                 102.0 ms   (99.34 ms .. 104.5 ms)
                     0.999 R²   (0.997 R² .. 1.000 R²)
mean                 102.1 ms   (101.2 ms .. 103.5 ms)
std dev              1.757 ms   (1.284 ms .. 2.386 ms)

And here is another set, with --color never for fd:

benchmarking bench/find .
time                 450.8 ms   (431.1 ms .. 465.4 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 474.3 ms   (468.8 ms .. 477.9 ms)
std dev              5.328 ms   (0.0 s .. 6.137 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking bench/fd --color never
time                 108.7 ms   (103.5 ms .. 114.6 ms)
                     0.996 R²   (0.991 R² .. 1.000 R²)
mean                 107.4 ms   (105.4 ms .. 109.6 ms)
std dev              3.253 ms   (2.507 ms .. 4.248 ms)

This final set is a 1-to-1 comparison.

benchmarking bench/find .
time                 444.5 ms   (415.2 ms .. 481.7 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 454.0 ms   (444.8 ms .. 460.0 ms)
std dev              8.970 ms   (0.0 s .. 10.34 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking bench/fd --color never -HI
time                 196.4 ms   (141.8 ms .. 223.5 ms)
                     0.961 R²   (0.893 R² .. 1.000 R²)
mean                 241.1 ms   (214.9 ms .. 285.3 ms)
std dev              39.80 ms   (19.16 ms .. 49.98 ms)
variance introduced by outliers: 48% (moderately inflated)

@sharkdp
Owner

sharkdp commented Jan 6, 2018

@alaymari Thanks!

This issue is really about running fd without a pattern, so let's focus on this case. It seems like fd is indeed slower in your benchmark (although you did not average over multiple runs for both find and fd without a pattern). Is there anything special about the directory you are running these tests from? Do you get the same results from both find and fd? These execution times are still really fast for both programs. In my opinion, things get interesting for times (or time differences) larger than 100 ms because a human will be able to recognize the delay (and we are typically not running fd or find multiple times in a sequence).

@partounian Thanks! A comparison like this is not really fair because fd will not traverse hidden and ignored directories. Could you try fd -HI --color never?

Here is how I run this particular type of benchmark: https://gist.github.com/sharkdp/f2dda4ea0af1563a3dbdae4e14d9496a

@partounian

@sharkdp the last example is actually what you've asked for :)

@sharkdp
Owner

sharkdp commented Jan 6, 2018

Oh, thanks! I missed that. Good to know that it works for you.

@alaymari

alaymari commented Jan 11, 2018

Sorry for the late reply. Would this help?

$ time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.15s user 0.03s system 316% cpu 0.054 total

$ time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.15s user 0.02s system 300% cpu 0.056 total

$ time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.14s user 0.04s system 308% cpu 0.057 total

$ time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.14s user 0.03s system 312% cpu 0.055 total

$ time find . > /dev/null              
find . > /dev/null  0.02s user 0.02s system 119% cpu 0.032 total

$ time find . > /dev/null
find . > /dev/null  0.02s user 0.02s system 116% cpu 0.028 total

$ time find . > /dev/null
find . > /dev/null  0.01s user 0.02s system 109% cpu 0.029 total

$ time find . > /dev/null
find . > /dev/null  0.02s user 0.02s system 112% cpu 0.030 total

I am not seeing any difference with or without the pattern for fd.

@partounian

partounian commented Jan 11, 2018

@alaymari Could you please install bench and run this script?

EDIT: Here is my run of the script

e.sh
Running warmup ..... done

benchmarking bench/find '/Users/patrick' || true
time                 24.73 s    (23.18 s .. 26.71 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 24.21 s    (23.61 s .. 24.56 s)
std dev              539.7 ms   (0.0 s .. 598.1 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking bench/fd --hidden --no-ignore '' '/Users/patrick'
time                 5.239 s    (2.937 s .. NaN s)
                     0.923 R²   (0.886 R² .. 1.000 R²)
mean                 3.729 s    (3.042 s .. 4.325 s)
std dev              965.2 ms   (0.0 s .. 1.033 s)
variance introduced by outliers: 72% (severely inflated)

WARNING: There were differences between the search results of fd and find!
Run 'diff /tmp/results.fd /tmp/results.find'.

@alaymari

alaymari commented Jan 11, 2018

Installed haskell-stack.

# apt-get install haskell-stack

On trying to set it up, I got this error. I know the problem does not belong here, but I am posting the error message anyway. I am completely clueless about Haskell :-(

$ stack setup
Writing implicit global project config file to: /home/mas/.stack/global-project/stack.yaml
Note: You can change the snapshot via the resolver field there.
Using latest snapshot resolver: lts-10.3
Downloaded lts-10.3 build plan.    
AesonException "Error in $.packages.cassava.constraints.flags['bytestring--lt-0_10_4']: Invalid flag name: \"bytestring--lt-0_10_4\""

So, I am stuck there trying to install bench.

@partounian

Sorry, I cannot help, as I am spoiled by Homebrew. Maybe Linuxbrew could help you.

@gbaz

gbaz commented Jan 15, 2018

If you upgrade your stack version to the latest, you can use the latest resolver.

@sharkdp
Owner

sharkdp commented Jan 15, 2018

If you still want to run the benchmark, you can also use my new tool hyperfine (in case that is easier to install). It features a --warmup option:

> export base_path=...
> hyperfine --warmup 5 'fd -HI "" $base_path' 'find $base_path'

[screenshot: hyperfine benchmark output comparing fd -HI and find]
