bloom

When you need to search for a string in a large number of files, bloom can index these files (using the Bloom filter data structure) and then efficiently filter out the files that cannot contain the searched string.

Compressed files are automatically decompressed before indexing.
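To illustrate why a Bloom filter can rule files out but never confirm a match, here is a minimal sketch in Python. It is not bloom's actual implementation: the class name `BloomFilter`, the helper `index_text`, the filter size, the number of hashes, and the choice to index whitespace-separated tokens are all assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; illustrative only, not bloom's internal format."""

    def __init__(self, size_bits=8192, num_hashes=4):
        # size_bits is assumed to be a multiple of 8.
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def index_text(text):
    """Build a filter over the whitespace-separated tokens of one file's text."""
    bf = BloomFilter()
    for token in text.split():
        bf.add(token)
    return bf
```

A file whose filter answers `False` for the searched token can be skipped entirely. A `True` answer may be a false positive, which is why the example further down still pipes the remaining files through a real search.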

bloom accepts a list of files on standard input and the searched phrase(s) as command-line arguments. The list of possibly matching files is printed to standard output. If the --verbose (or -v) option is passed on the command line, debug and statistics messages are printed to standard error.

$ ls -1
mail.log
mail.log.1
mail.log.2.gz
$ find -type f | xargs xzegrep -H -c B3BC8A1F1D
./mail.log:0
./mail.log.1:5
./mail.log.2.gz:0
$ find -type f | bloom B3BC8A1F1D | xargs xzegrep -H -c B3BC8A1F1D
mail.log.1:5

The index data is stored in a SQLite database file. The default path is ~/.cache/bloom/db; it can be changed with the command-line option --db or the environment variable BLOOM_DB.
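The lookup of the database path could be sketched as follows. The function name `resolve_db_path` is hypothetical, and the precedence shown (the --db option first, then BLOOM_DB, then the default) is an assumption; the README does not say which one wins when both are set.

```python
import os
from pathlib import Path

# Default location stated in the README.
DEFAULT_DB_PATH = "~/.cache/bloom/db"


def resolve_db_path(cli_db=None, environ=None):
    """Pick the index database path.

    Assumed precedence (not confirmed by the README):
    --db option, then the BLOOM_DB environment variable, then the default.
    """
    environ = os.environ if environ is None else environ
    chosen = cli_db or environ.get("BLOOM_DB") or DEFAULT_DB_PATH
    # Expand a leading "~" to the current user's home directory.
    return Path(chosen).expanduser()
```

For example, `resolve_db_path(None, {})` falls back to the default under the user's home directory, while `resolve_db_path("/tmp/index.db", {})` returns the explicitly given path.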

Installation

$ python3 -m pip install https://github.com/messa/bloom/archive/main.zip