experimental and very fast implementation of a grep
C++ Makefile
Switch branches/tags
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Failed to load latest commit information.


grab - simple but very fast grep

This is my own, experimental, parallel version of grep so I can test various strategies to speed up access to large directory trees. On SSD's you can easily outsmart common greps by up to 100%.


 -O     -- print file offset of match
 -l     -- do not print the matching line (Useful if you want
           to see _all_ offsets; if you also print the line, only
           the first match in the line counts)
 -s     -- single match; dont search file further after first match
           (similar to grep on a binary)
 -L     -- machine has low mem; half chunk-size (default 1GB)
           may be used multiple times
 -I     -- enable highlighting of matches
 -c <n> -- Use n cores in parallel (useless and even slower in most situations)
           n <= 1 uses single-core
 -r     -- recurse on directory
 -R     -- same as -r

grab uses the pcre library, so basically its equivalent to a grep -P -a

Why is it faster?

grab is using mmap(2) and matches the whole file blob without counting newlines (which grep is doing even if there is no match) which is a lot faster than reading the file in chunks and counting the newlines. If available, grab also uses the PCRE JIT feature. However, speedups are only measurable on fast HDD's or SSD's. In the later case, the speedup can be really drastically (even up to 100%) if matching recursively. So clearly, the storage is the bottleneck, and parallelizing the search is in most cases even slower, as the seeking takes more time than just doing stuff in linear; even on SSD's.

Additionally, grab is skipping files which are too small to contain the regular expression. For larger regex's in a recursive search, this can skip quite good amount of files without even opening them.

A quite new pcre lib is required, on some older systems the build can fail due to PCRE_INFO_MINLENGTH and pcre_study().

Files are mmaped and matched in chunks of 1Gig. For files which are larger, the last 4096 byte (1 page) of a chunk are overlapped, so that matches on a 1 Gig boundary can be found. In this case, you see the match doubled (but with the same offset).

If you measure grep vs. grab, keep in mind to drop the dentry and page caches between each run: echo 3 > /proc/sys/vm/drop_caches

grab was made to quickly grep through large directory trees. The original grep has by far a more complete option-set. The speedup for a single file match is very small, if at all (stdin cannot be mmapped and I am too lazy to add a pread() workaround just for this useless case)

For SSD's, the multicore option can make sense. For HDD's it doesnt since the head has to be positioned back and forth between the threads, which kills performance.