benchmark time on very small corpora #33
BurntSushi added the enhancement label Sep 23, 2016
BurntSushi referenced this issue Sep 25, 2016: rg doesn't stop after first match when -q is used #77 (Closed)
BurntSushi
Sep 25, 2016
Owner
#77 is another report of grep beating rg on a single file because of startup time.
BurntSushi
Sep 26, 2016
Owner
Someone ran microbenchmarks on infinitesimally sized input. grep is twice as fast.
BurntSushi
Nov 20, 2016
Owner
OK, I think I can call this one done. I think a variety of small improvements have mostly fixed this:
- Switch from Docopt to clap (probably the biggest contributor).
- Permit ignore handling/parsing to reuse previous results when applicable.
- Use a parallel recursive directory iterator.
In particular, on the previously linked microbenchmark:
$ python -m timeit -n 1 -r 10 -v -s 'import os' 'os.system("cat /etc/group | grep 1000 > /dev/null")'
raw times: 0.0115 0.0104 0.0103 0.0104 0.00902 0.00929 0.00842 0.00919 0.00792 0.00861
1 loops, best of 10: 7.92 msec per loop
$ python -m timeit -n 1 -r 10 -v -s 'import os' 'os.system("cat /etc/group | rg 1000 > /dev/null")'
raw times: 0.0103 0.0103 0.00946 0.00881 0.00871 0.00809 0.00802 0.00764 0.00722 0.00714
1 loops, best of 10: 7.14 msec per loop
Compare this with ripgrep 0.2.6:
$ python -m timeit -n 1 -r 10 -v -s 'import os' 'os.system("cat /etc/group | rg-0.2.6 1000 > /dev/null")'
raw times: 0.0261 0.0251 0.0257 0.0217 0.0234 0.0239 0.0237 0.0235 0.022 0.0245
1 loops, best of 10: 21.7 msec per loop
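For anyone who wants to repeat this kind of measurement outside of `timeit`, here is a minimal sketch of the same "best of N" idea, assuming Python 3. The helper name `best_of` is hypothetical and not part of ripgrep or this thread. Like `os.system`, it goes through a shell, so shell startup is included in every sample.

```python
import subprocess
import time


def best_of(command, repeats=10):
    """Run a shell command `repeats` times.

    Returns (best, samples): the fastest wall-clock time in seconds and
    the list of all samples. Output of the command is discarded.
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # shell=True so that pipelines such as "cat ... | rg ..." work.
        subprocess.run(
            command,
            shell=True,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return min(samples), samples
```

Calling `best_of("cat /etc/group | rg 1000")` and `best_of("cat /etc/group | grep 1000")` would give numbers comparable in spirit to the ones above, though absolute values will differ by machine.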
I've also done some testing on small repos. This repo qualifies as quite small. There is a ton of variance. The following were run in succession:
[andrew@Cheetah ripgrep] time rg ripgrep | wc -l
305
real 0m0.033s
user 0m0.257s
sys 0m0.007s
[andrew@Cheetah ripgrep] time rg ripgrep | wc -l
305
real 0m0.025s
user 0m0.157s
sys 0m0.023s
[andrew@Cheetah ripgrep] time rg ripgrep | wc -l
305
real 0m0.013s
user 0m0.017s
sys 0m0.030s
[andrew@Cheetah ripgrep] time rg ripgrep | wc -l
305
real 0m0.040s
user 0m0.270s
sys 0m0.007s
ag seems to have slightly less variance:
[andrew@Cheetah ripgrep] time ag ripgrep | wc -l
306
real 0m0.029s
user 0m0.007s
sys 0m0.010s
[andrew@Cheetah ripgrep] time ag ripgrep | wc -l
306
real 0m0.022s
user 0m0.007s
sys 0m0.010s
[andrew@Cheetah ripgrep] time ag ripgrep | wc -l
306
real 0m0.023s
user 0m0.020s
sys 0m0.003s
[andrew@Cheetah ripgrep] time ag ripgrep | wc -l
306
real 0m0.016s
user 0m0.013s
sys 0m0.003s
But, if we fix both programs to use a single thread, then variance becomes almost non-existent:
[andrew@Cheetah ripgrep] time rg -j1 ripgrep | wc -l
305
real 0m0.011s
user 0m0.003s
sys 0m0.007s
[andrew@Cheetah ripgrep] time rg -j1 ripgrep | wc -l
305
real 0m0.011s
user 0m0.007s
sys 0m0.007s
[andrew@Cheetah ripgrep] time rg -j1 ripgrep | wc -l
305
real 0m0.011s
user 0m0.000s
sys 0m0.010s
[andrew@Cheetah ripgrep] time rg -j1 ripgrep | wc -l
305
real 0m0.010s
user 0m0.003s
sys 0m0.003s
and ag:
[andrew@Cheetah ripgrep] time ag --workers 1 ripgrep | wc -l
306
real 0m0.018s
user 0m0.010s
sys 0m0.000s
[andrew@Cheetah ripgrep] time ag --workers 1 ripgrep | wc -l
306
real 0m0.018s
user 0m0.010s
sys 0m0.000s
[andrew@Cheetah ripgrep] time ag --workers 1 ripgrep | wc -l
306
real 0m0.018s
user 0m0.010s
sys 0m0.003s
[andrew@Cheetah ripgrep] time ag --workers 1 ripgrep | wc -l
306
real 0m0.021s
user 0m0.007s
sys 0m0.003s
In other words, on small corpora, ripgrep (and, to a lesser extent, ag) seems to be quite susceptible to the overhead of creating threads, which also seems to introduce large variance.
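To put rough numbers on that spread, here is a small sketch that summarizes the "real" times from the runs above. The sample values are copied from this comment; the `summarize` helper is hypothetical, and four samples per tool is far too few for anything more than a sanity check.

```python
import statistics

# Wall-clock ("real") times in seconds, copied from the runs above.
runs = {
    "rg": [0.033, 0.025, 0.013, 0.040],
    "ag": [0.029, 0.022, 0.023, 0.016],
    "rg -j1": [0.011, 0.011, 0.011, 0.010],
    "ag --workers 1": [0.018, 0.018, 0.018, 0.021],
}


def summarize(samples):
    """Return (mean, sample standard deviation, max - min) in seconds."""
    return (
        statistics.mean(samples),
        statistics.stdev(samples),
        max(samples) - min(samples),
    )


for name, samples in runs.items():
    mean, stdev, spread = summarize(samples)
    print(f"{name:16s} mean={mean * 1000:6.2f}ms "
          f"stdev={stdev * 1000:5.2f}ms range={spread * 1000:5.2f}ms")
```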
I'm not sure there's anything we can really do about it. The difference in this particular example, at least, is nominal. From a user's perspective, we don't really care about the difference between 10ms and 30ms. However, the place where this can hurt us is if folks use ripgrep like it was grep, for example, in an xargs pipeline:
$ time find ./ -name '*.[ch]' -print0 | xargs -0 -P8 grep PM_RESUME | wc -l
10
real 0m0.213s
user 0m0.423s
sys 0m0.330s
$ time find ./ -name '*.[ch]' -print0 | xargs -0 -P8 rg PM_RESUME | wc -l
10
real 0m0.402s
user 0m4.750s
sys 0m0.460s
The overhead of ripgrep launching threads ends up making it twice as slow. Of course, in this case, ripgrep launching threads doesn't really make much sense since xargs is doing it for us. But this isn't something that a user will obviously know. Of course, forcing ripgrep to use a single thread brings it back down to grep speeds:
$ time find ./ -name '*.[ch]' -print0 | xargs -0 -P8 rg -j1 PM_RESUME | wc -l
10
real 0m0.206s
user 0m0.507s
sys 0m0.487s
The other case to consider is using xargs without -P and having ripgrep handle parallelism:
$ time find ./ -name '*.[ch]' -print0 | xargs -0 rg PM_RESUME | wc -l
10
real 0m0.463s
user 0m1.297s
sys 0m0.597s
While worse than using xargs -P, compare with grep:
$ time find ./ -name '*.[ch]' -print0 | xargs -0 grep PM_RESUME | wc -l
10
real 0m0.655s
user 0m0.477s
sys 0m0.347s
I'm not sure there's really too much we can do here. An end user can use ripgrep in an xargs pipeline optimally, but it definitely requires a bit of intuition on the user's part to realize that ripgrep is already multithreaded whereas grep is not.
(This is yet another example proving that benchmarking these tools is ridiculously hard.)
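The rule of thumb from the xargs runs above (let only one layer do the parallelism) can be written down as a tiny sketch. The function name `xargs_rg_argv` is hypothetical; it only builds the argument list using the `-0`, `-P` and `-j1` flags already shown in this comment and does not run anything.

```python
def xargs_rg_argv(pattern, xargs_jobs=None):
    """Build the argv for the `xargs ... rg ...` stage of a find pipeline.

    If xargs is asked to run several processes (-P), force each rg
    process to a single thread with -j1. Otherwise let rg pick its own
    thread count.
    """
    argv = ["xargs", "-0"]
    rg = ["rg"]
    if xargs_jobs is not None:
        argv.append(f"-P{xargs_jobs}")
        rg.append("-j1")
    return argv + rg + [pattern]


print(" ".join(xargs_rg_argv("PM_RESUME", xargs_jobs=8)))
print(" ".join(xargs_rg_argv("PM_RESUME")))
```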
BurntSushi commented Sep 23, 2016
An end user reports that rg isn't as fast as ag on very small repositories. While it seems trivial, if this is because of startup time, then it's worth investigating and fixing.