
Without pattern 'find' ~8x faster than 'fd' #191

Closed

dtcyganok opened this issue Dec 5, 2017 · 33 comments

@dtcyganok

dtcyganok commented Dec 5, 2017

OS: Linux 4.14.3-1-ARCH
FS: ext4
fd: 6.0.0

Simple test:

> time find > /dev/null
find > /dev/null  0.00s user 0.06s system 97% cpu 0.065 total
> time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.75s user 0.10s system 133% cpu 0.631 total

Also, fd with a pattern is ~8x faster than fd without a pattern:

> time fd --color never -HI '.zshrc' > /dev/null
fd --color never -HI '.zshrc' > /dev/null  0.18s user 0.06s system 325% cpu 0.073 total
> time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.74s user 0.06s system 132% cpu 0.608 total
@alaymari

alaymari commented Dec 5, 2017

OS: Debian testing
FS: ext4
fd: 6.0.0

Installed fd and tried it on my camera folder. The results are here:

find . -iname '*.jpg' 0.08s user 0.06s system 66% cpu 0.210 total

fd '.*\.jpg$' . 0.41s user 0.16s system 95% cpu 0.602 total

The time taken and CPU usage are much higher than expected.

@sharkdp
Owner

sharkdp commented Dec 5, 2017

Interesting, thank you for reporting this.

How many entries are found (without the search pattern)?

Also, how many cores does your machine have?

@dtcyganok
Author

Directory size: 27G
Count of entries: 35809
Count of cores: 4

@alaymari

alaymari commented Dec 6, 2017

Machine details:

System:    Host: brahman Kernel: 4.14.0-2.3-liquorix-amd64 x86_64 bits: 64 Desktop: Openbox 3.6.1
           Distro: Debian GNU/Linux buster/sid
Machine:   Device: desktop Mobo: ASUSTeK model: M4A78LT-M-LE v: Rev X.0x serial: N/A
           BIOS: American Megatrends v: 0801 date: 01/17/2011
CPU:       Quad core AMD Phenom II X4 960T (-MCP-) cache: 2048 KB
           clock speeds: max: 3000 MHz 1: 1600 MHz 2: 3000 MHz 3: 800 MHz 4: 3000 MHz

Directory size: 261 GiB
Number of files:

> find . -name "*.jpg" | wc -l
16049
> fd ".*\.jpg$" . | wc -l     
17006
> find . -iname "*.jpg" | wc -l
17061
> find . -iregex ".*\.jpg$" | wc -l
17061

Even the numbers reported are different.

@sharkdp
Owner

sharkdp commented Dec 6, 2017

@dtcyganok Thanks. Are you using the fd version from the official Arch repositories? Or did you build it yourself?

@alaymari

Even the numbers reported are different.

Please make sure to run fd with the -I and -H options. Otherwise you cannot expect the results to be the same. Also, just as a remark, you can use fd -e jpg to perform searches for a particular file extension. So you could try:

> fd -HI -e jpg | wc -l

@dtcyganok
Author

I am using fd from the official Arch repositories.

@alaymari

alaymari commented Dec 6, 2017

@sharkdp

> fd -HI -e jpg | wc -l
17061

No change :-) This is just FYI.

Hope you come up with a solution for this problem soon.

@partounian

So fd is slower than find when you are not using any pattern? Any way to avoid that?

@sharkdp
Owner

sharkdp commented Dec 14, 2017

I might have an idea what's causing this.

@dtcyganok Could you please run your initial benchmark and add --max-buffer-time 0 as an additional option?

@sharkdp
Owner

sharkdp commented Jan 3, 2018

This should be fixed now. It would be great if someone could confirm this.

@jakwings
Contributor

jakwings commented Jan 3, 2018

That fix looks like a bit of a hack. Is the buffer size the main cause of the slowness? Are you sure the buffering mode does not end too early, before the timeout is reached, in most cases (with the default buffer time)?

@sharkdp
Owner

sharkdp commented Jan 3, 2018

Is the buffer size the main cause of the slowness? Are you sure the buffering mode does not end too early, before the timeout is reached, in most cases (with the default buffer time)?

For the cases mentioned in this ticket -- yes. If there is no search pattern and the search tree is huge, results will accumulate fast (100 ms is enough to gather ~100,000 entries). In my test case (no search pattern, ~130,000 results), adding the buffer size limitation speeds up the search by a factor of 10.
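
For illustration, here is a minimal Rust sketch of that kind of output handling: buffer results until either a time limit or a size limit is reached, then fall back to plain streaming. This is not fd's actual code; the constants, names and channel setup are made up.

use std::io::{self, Write};
use std::sync::mpsc::{channel, Receiver};
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical limits; fd's real values and option handling differ.
const MAX_BUFFER_TIME: Duration = Duration::from_millis(100);
const MAX_BUFFER_LENGTH: usize = 1000;

// Receive paths from the search threads. Buffer them so that fast
// searches can be printed in sorted order, but switch to plain
// streaming as soon as the time limit or the size limit is hit.
fn print_results(rx: Receiver<String>) -> io::Result<()> {
    let stdout = io::stdout();
    let mut out = stdout.lock();

    let start = Instant::now();
    let mut buffer: Vec<String> = Vec::with_capacity(MAX_BUFFER_LENGTH);
    let mut streaming = false;

    for path in rx {
        if streaming {
            writeln!(out, "{}", path)?;
            continue;
        }

        buffer.push(path);

        // Capping the buffer size means a huge result set does not keep
        // growing (and re-allocating) the buffer for hundreds of
        // thousands of entries before anything is printed.
        if buffer.len() >= MAX_BUFFER_LENGTH || start.elapsed() > MAX_BUFFER_TIME {
            for p in buffer.drain(..) {
                writeln!(out, "{}", p)?;
            }
            streaming = true;
        }
    }

    // The whole search finished within both limits: print sorted results.
    buffer.sort();
    for p in &buffer {
        writeln!(out, "{}", p)?;
    }
    Ok(())
}

fn main() -> io::Result<()> {
    let (tx, rx) = channel();
    thread::spawn(move || {
        for i in 0..10 {
            tx.send(format!("./file_{}.txt", i)).unwrap();
        }
    });
    print_results(rx)
}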

@jakwings
Contributor

jakwings commented Jan 3, 2018

@sharkdp Hmm, I think time::Instant::now() is the main cause. But now I also think it makes sense to limit the buffer size, as it is much slower to print the results to the terminal.

@sharkdp
Owner

sharkdp commented Jan 3, 2018

Hmm, I think time::Instant::now() is the main cause.

That could very well be the cause, thanks! Another potential cause could be the dynamic re-allocation for the buffer.

@jakwings
Contributor

jakwings commented Jan 3, 2018

If 1000 is a good limit for the buffer size in all cases, then --max-buffer-time can be deprecated. But I'm still not sure whether it makes for a nice visual effect: you may get the first 1000 paths sorted, then the remaining 1000+ paths unordered (~2000 results in total). In my fork, I just throttle the time check by only checking it every 500 iterations. (My logic was in a tangle.)
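
As an illustration of that throttled time check (only a sketch, not the actual code from the fork; the constants are invented):

use std::time::{Duration, Instant};

const MAX_BUFFER_TIME: Duration = Duration::from_millis(100);
const CHECK_INTERVAL: usize = 500;

fn main() {
    let start = Instant::now();
    let mut buffer: Vec<String> = Vec::new();

    for (i, path) in (0..1_000_000).map(|i| format!("./file_{}", i)).enumerate() {
        buffer.push(path);

        // Instant::now()/elapsed() is comparatively expensive when called
        // once per entry, so only look at the clock every 500 entries.
        if i % CHECK_INTERVAL == 0 && start.elapsed() > MAX_BUFFER_TIME {
            println!("time limit reached after {} entries", buffer.len());
            break;
        }
    }
}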

@jakwings
Contributor

jakwings commented Jan 3, 2018

Another potential cause could be the dynamic re-allocation for the buffer.

And another potential cause: sorting

@sharkdp
Owner

sharkdp commented Jan 3, 2018

If 1000 is a good limit for buffer size for all cases, then --max-buffer-time can be deprecated.

I don't think so. If there are fewer than 1000 search results, --max-buffer-time is still relevant. It ensures that we show results (almost) immediately on a long-running search.

But I'm still uncertain about if it makes a nice change to the visual effect: You may get 1000 paths sorted, then the last 1000+ paths unordered (~2000 results in total).

That's not how it works. If we reach the time or size limit we just start streaming all results to the console (unsorted).

@jakwings
Contributor

jakwings commented Jan 3, 2018

I don't think so. If there are fewer than 1000 search results, --max-buffer-time is still relevant. It ensures that we show results (almost) immediately on a long-running search.

Ah, my logic was in a tangle in the previous comment. Searching for 1000 results takes only ~20 ms (less than 100 ms) on my machine. Isn't the buffer time limit used for displaying sorted results when time allows?

@sharkdp
Owner

sharkdp commented Jan 3, 2018

Isn't the buffer time limit used for displaying sorted results when time allows?

Correct. If the full search finishes faster than the max-buffer-time (t < T_max), the results are printed in order.

I think with the new mechanism (N_max = MAX_BUFFER_LENGTH) there are now four cases (a small illustrative sketch follows the list):

  • fast search (t <= T_max), few search results (n <= N_max):

    As before, results will be printed in a sorted way almost immediately.

  • fast search (t <= T_max), many search results (n > N_max):

    In contrast to before, search results will not be sorted but printing to the console should be much faster.

  • slow search (t > T_max), few search results (n < N_max):

    No change to before. Results will be printed as soon as they are found (except for the short period until T_max).

  • slow search (t > T_max), many search results (n > N_max):

    No change to before except that we might start streaming to the console even earlier.
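
Put differently (purely as an illustration, with made-up names, not fd's internal structure), the four cases collapse into one rule: output is buffered and sorted only if both limits are respected; otherwise unsorted streaming starts as soon as the first limit is hit.

use std::time::Duration;

// Illustration only; fd's internals are structured differently.
#[derive(Debug)]
enum OutputMode {
    SortedBuffer, // whole result set printed sorted, almost immediately
    Streaming,    // entries printed unsorted as soon as a limit is hit
}

fn output_mode(elapsed: Duration, t_max: Duration, n_results: usize, n_max: usize) -> OutputMode {
    if elapsed <= t_max && n_results <= n_max {
        OutputMode::SortedBuffer
    } else {
        OutputMode::Streaming
    }
}

fn main() {
    let t_max = Duration::from_millis(100);
    let n_max = 1000;
    println!("{:?}", output_mode(Duration::from_millis(20), t_max, 900, n_max));  // SortedBuffer
    println!("{:?}", output_mode(Duration::from_millis(20), t_max, 5000, n_max)); // Streaming
}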

@jakwings
Contributor

jakwings commented Jan 3, 2018

Yeah, for a fast search (~1000 results: < 20 ms for gathering & sorting results, and < 300 ms total run time), most of the time is spent on printing, not gathering. The buffer time is negligible, I think.

UPDATE: Ah, a slow search can also mean that there are not enough matching files, not just that the I/O is too slow. So the time limit is still needed.

@alaymari

alaymari commented Jan 5, 2018

Tried the new version. Looks like there is no change. The results:

$ time find > /dev/null
find > /dev/null  0.02s user 0.01s system 110% cpu 0.027 total

$ fd --version                                    
fd 6.2.0

$ time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.18s user 0.02s system 111% cpu 0.185 total

$ time fd '.*\.jpg$' > /dev/null
fd '.*\.jpg$' > /dev/null  0.12s user 0.03s system 254% cpu 0.059 total

@sharkdp
Owner

sharkdp commented Jan 5, 2018

@alaymari Thank you for taking a look at this again.

A couple of comments:

  • Please always run a "warm up" command (e.g. simply run find a few times) that fills the caches immediately before measuring any runtimes.
  • Please average the runtimes over multiple runs (ideally, use a tool like bench)
  • Your execution times are very low and possibly even limited by some initialization processes (which could be optimized as well, of course!). Could you try the same benchmark on a bigger search tree?

@alaymari

alaymari commented Jan 6, 2018

OK. Here is what I did:

$ find > /dev/null
$ find > /dev/null
$ find > /dev/null
$ find > /dev/null
$ find > /dev/null
$ time find > /dev/null
find > /dev/null  0.01s user 0.02s system 114% cpu 0.030 total
$ time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.14s user 0.03s system 242% cpu 0.072 total

$ time fd '.*\.jpg$' > /dev/null                                              
fd '.*\.jpg$' > /dev/null  0.14s user 0.03s system 333% cpu 0.050 total

$ time fd '.*\.jpg$' > /dev/null
fd '.*\.jpg$' > /dev/null  0.13s user 0.03s system 345% cpu 0.048 total

$ time fd '.*\.jpg$' > /dev/null       
fd '.*\.jpg$' > /dev/null  0.15s user 0.02s system 351% cpu 0.047 total

$ time fd '.*\.jpg$' > /dev/null
fd '.*\.jpg$' > /dev/null  0.14s user 0.02s system 355% cpu 0.046 total

$ time fd '.*\.jpg$' > /dev/null
fd '.*\.jpg$' > /dev/null  0.14s user 0.03s system 352% cpu 0.046 total

$ time fd '.*\.jpg$' > /dev/null
fd '.*\.jpg$' > /dev/null  0.14s user 0.02s system 358% cpu 0.046 total

I will try later on a bigger set. Right now:

$ find . -name "*.jpg" | wc -l
16406

$ find . | wc -l              
37455

@partounian

partounian commented Jan 6, 2018

I ran the bench tool in the git [lfs] folder of a relatively big repo. It contains many 3D file assets. I can of course give some more general details upon request.

NOTE: This was run on a macOS 10.13.2 system; I am not sure whether its find differs from the one on other systems.

benchmarking bench/find .
time                 448.4 ms   (439.6 ms .. 459.3 ms)
                     1.000 R²   (1.000 R² .. 1.000 R²)
mean                 446.8 ms   (444.2 ms .. 448.4 ms)
std dev              2.510 ms   (0.0 s .. 2.897 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking bench/fd
time                 102.0 ms   (99.34 ms .. 104.5 ms)
                     0.999 R²   (0.997 R² .. 1.000 R²)
mean                 102.1 ms   (101.2 ms .. 103.5 ms)
std dev              1.757 ms   (1.284 ms .. 2.386 ms)

And here is another set, with --color never for fd:

benchmarking bench/find .
time                 450.8 ms   (431.1 ms .. 465.4 ms)
                     1.000 R²   (0.999 R² .. 1.000 R²)
mean                 474.3 ms   (468.8 ms .. 477.9 ms)
std dev              5.328 ms   (0.0 s .. 6.137 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking bench/fd --color never
time                 108.7 ms   (103.5 ms .. 114.6 ms)
                     0.996 R²   (0.991 R² .. 1.000 R²)
mean                 107.4 ms   (105.4 ms .. 109.6 ms)
std dev              3.253 ms   (2.507 ms .. 4.248 ms)

This final set is a 1-to-1 comparison.

benchmarking bench/find .
time                 444.5 ms   (415.2 ms .. 481.7 ms)
                     0.999 R²   (0.999 R² .. 1.000 R²)
mean                 454.0 ms   (444.8 ms .. 460.0 ms)
std dev              8.970 ms   (0.0 s .. 10.34 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking bench/fd --color never -HI
time                 196.4 ms   (141.8 ms .. 223.5 ms)
                     0.961 R²   (0.893 R² .. 1.000 R²)
mean                 241.1 ms   (214.9 ms .. 285.3 ms)
std dev              39.80 ms   (19.16 ms .. 49.98 ms)
variance introduced by outliers: 48% (moderately inflated)

@sharkdp
Owner

sharkdp commented Jan 6, 2018

@alaymari Thanks!

This issue is really about running fd without a pattern, so let's focus on this case. It seems like fd is indeed slower in your benchmark (although you did not average over multiple runs for both find and fd without a pattern). Is there anything special about the directory you are running these tests from? Do you get the same results from both find and fd? These execution times are still really fast for both programs. In my opinion, things get interesting for times (or time differences) larger than 100 ms because a human will be able to recognize the delay (and we are typically not running fd or find multiple times in a sequence).

@partounian Thanks! A comparison like this is not really fair because fd will not traverse hidden and ignored directories. Could you try fd -HI --color never?

Here is how I run this particular type of benchmark: https://gist.github.com/sharkdp/f2dda4ea0af1563a3dbdae4e14d9496a

@partounian

@sharkdp the last example is actually what you've asked for :)

@sharkdp
Owner

sharkdp commented Jan 6, 2018

Oh, thanks! I missed that. Good to know that it works for you.

@alaymari

alaymari commented Jan 11, 2018

Sorry for the late reply. Would this help?

$ time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.15s user 0.03s system 316% cpu 0.054 total

$ time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.15s user 0.02s system 300% cpu 0.056 total

$ time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.14s user 0.04s system 308% cpu 0.057 total

$ time fd --color never -HI > /dev/null
fd --color never -HI > /dev/null  0.14s user 0.03s system 312% cpu 0.055 total

$ time find . > /dev/null              
find . > /dev/null  0.02s user 0.02s system 119% cpu 0.032 total

$ time find . > /dev/null
find . > /dev/null  0.02s user 0.02s system 116% cpu 0.028 total

$ time find . > /dev/null
find . > /dev/null  0.01s user 0.02s system 109% cpu 0.029 total

$ time find . > /dev/null
find . > /dev/null  0.02s user 0.02s system 112% cpu 0.030 total

I am not seeing any difference with or without the pattern for fd.

@partounian

partounian commented Jan 11, 2018

@alaymari Could you please install bench and run this script?

EDIT: Here is my run of the script

e.sh
Running warmup ..... done

benchmarking bench/find '/Users/patrick' || true
time                 24.73 s    (23.18 s .. 26.71 s)
                     0.999 R²   (0.998 R² .. 1.000 R²)
mean                 24.21 s    (23.61 s .. 24.56 s)
std dev              539.7 ms   (0.0 s .. 598.1 ms)
variance introduced by outliers: 19% (moderately inflated)

benchmarking bench/fd --hidden --no-ignore '' '/Users/patrick'
time                 5.239 s    (2.937 s .. NaN s)
                     0.923 R²   (0.886 R² .. 1.000 R²)
mean                 3.729 s    (3.042 s .. 4.325 s)
std dev              965.2 ms   (0.0 s .. 1.033 s)
variance introduced by outliers: 72% (severely inflated)

WARNING: There were differences between the search results of fd and find!
Run 'diff /tmp/results.fd /tmp/results.find'.

@alaymari

alaymari commented Jan 11, 2018

Installed haskell-stack.

# apt-get install haskell-stack

On trying to set it up, I got this error. I know the problem does not belong here, but I am posting the error message anyway. I am completely clueless about Haskell :-(

$ stack setup
Writing implicit global project config file to: /home/mas/.stack/global-project/stack.yaml
Note: You can change the snapshot via the resolver field there.
Using latest snapshot resolver: lts-10.3
Downloaded lts-10.3 build plan.    
AesonException "Error in $.packages.cassava.constraints.flags['bytestring--lt-0_10_4']: Invalid flag name: \"bytestring--lt-0_10_4\""

So, I am stuck there trying to install bench.

@partounian

Sorry, I cannot help, as I am spoiled by Homebrew. Maybe Linuxbrew could help you.

@gbaz

gbaz commented Jan 15, 2018

If you upgrade your stack version to the latest, you can use the latest resolver.

@sharkdp
Owner

sharkdp commented Jan 15, 2018

If you still want to run the benchmark, you can also use my new tool hyperfine (in case that is easier to install). It features a --warmup option:

> export base_path=...
> hyperfine --warmup 5 'fd -HI "" $base_path' 'find $base_path'

[screenshot: hyperfine benchmark output comparing fd -HI and find]
