flow-indexer indexes flows Usage: flow-indexer [command] Available Commands: compact Compact the database daemon Start daemon expandcidr Expand a CIDR range from those seen in the database index Index flows search Search flows help Help about any command Flags: --dbpath="flows.db": Database path -h, --help[=false]: help for flow-indexer Use "flow-indexer [command] --help" for more information about a command.
$ export GOPATH=~/go $ go get github.com/JustinAzoff/flow-indexer
$ cp ~/go/src/github.com/JustinAzoff/flow-indexer/example_config.json config.json $ vi config.json # Adjust log paths and database paths.
The indexer configuration is as follows:
- name - The name of the indexer. Keep this short and lowercase, as you will use it as an http query param.
- backend - The backend log ip extractor to use. Choices: bro, bro_json, nfdump, syslog.
- file_glob - The shell globbing pattern that should match all of your log files.
- recent_file_glob - The strftime+shell globbing pattern that should match todays log files.
- filename_to_database_regex - A regular expression applied to each filename used to extract information used to name the database.
- database_root - Where databases will be written to. Should be indexer specific.
- datapath_path - The name of an individual database. This can contain $variables set in filename_to_database_regex.
The deciding factor for how to partition the databases is how many unique ips you see per day. I suggest starting with monthly indexes. If the indexing performance takes a huge hit by the end of the month, switch to daily indexes.
Run initial index
indexall command will expand file_glob and index any log file that matches.
$ ~/go/bin/flow-indexer indexall
Once the initial index is complete, start the daemon. Starting the daemon will expand recent_file_glob and index any recently created log file that matches.
$ ~/go/bin/flow-indexer daemon
It will do this in a 60 second loop to keep itself up to date.
$ curl -s 'localhost:8080/search?i=conn&q=18.104.22.168/24' $ curl -s 'localhost:8080/stats?i=conn&q=22.214.171.124/24'
Running flow-indexer as a service
To run flow-indexer as a service on a system using systemd, you can use the provided flow-indexer.service file.
If you are planning to run flow-indexer as a service on a system that uses upstart, you may want to consider a conf file like the following in order to properly syslog stdout and stderr from flow-indexer, and to run as a non-root user.
# flow-indexer - Flow Indexer # # flow-indexer is a service that indexes and allows retrieval of flows using bro logs description "Flow Indexer Daemon" start on runlevel  stop on runlevel [!345] setuid flowindexer setgid flowindexer exec /path/to/bin/flow-indexer daemon --config /path/to/flow-indexer/config.json 2>&1 | logger -t flow-indexer
In order to avoid
too many open files errors, you may want to increase the
number of open files you allow the user that flow-indexer runs as to have
access to. This can be done by changing your
nofile setting in
/etc/security/limits.conf as shown below.
flowindexer soft nofile 65535 flowindexer hard nofile 65535
Lower level commands example
./flow-indexer --dbpath /tmp/f/flows.db index /tmp/f/conn* 2016/02/06 23:36:51 /tmp/f/conn.00:00:00-01:00:00.log.gz: Read 4260 lines in 24.392765ms 2016/02/06 23:36:51 /tmp/f/conn.00:00:00-01:00:00.log.gz: Wrote 281 unique ips in 2.215219ms 2016/02/06 23:36:51 /tmp/f/conn.01:00:00-02:00:00.log.gz: Read 4376 lines in 24.186168ms 2016/02/06 23:36:51 /tmp/f/conn.01:00:00-02:00:00.log.gz: Wrote 310 unique ips in 1.495277ms [...] 2016/02/06 23:36:51 /tmp/f/conn.22:00:00-23:00:00.log.gz: Read 7799 lines in 18.350788ms 2016/02/06 23:36:51 /tmp/f/conn.22:00:00-23:00:00.log.gz: Wrote 775 unique ips in 5.155262ms 2016/02/06 23:36:51 /tmp/f/conn.23:00:00-00:00:00.log.gz: Read 5255 lines in 15.296847ms 2016/02/06 23:36:51 /tmp/f/conn.23:00:00-00:00:00.log.gz: Wrote 400 unique ips in 2.910344ms
./flow-indexer --dbpath /tmp/f/flows.db index /tmp/f/conn* 2016/02/06 23:37:36 /tmp/f/conn.00:00:00-01:00:00.log.gz Already indexed 2016/02/06 23:37:36 /tmp/f/conn.01:00:00-02:00:00.log.gz Already indexed 2016/02/06 23:37:36 /tmp/f/conn.02:00:00-03:00:00.log.gz Already indexed 2016/02/06 23:37:36 /tmp/f/conn.03:00:00-04:00:00.log.gz Already indexed [...] 2016/02/06 23:37:36 /tmp/f/conn.20:00:00-21:00:00.log.gz Already indexed 2016/02/06 23:37:36 /tmp/f/conn.21:00:00-22:00:00.log.gz Already indexed 2016/02/06 23:37:36 /tmp/f/conn.22:00:00-23:00:00.log.gz Already indexed 2016/02/06 23:37:36 /tmp/f/conn.23:00:00-00:00:00.log.gz Already indexed
Expand CIDR Range
./flow-indexer --dbpath /tmp/f/flows.db expandcidr 126.96.36.199/24 188.8.131.52 184.108.40.206 220.127.116.11 18.104.22.168 22.214.171.124 126.96.36.199 188.8.131.52 184.108.40.206 220.127.116.11 18.104.22.168 22.214.171.124 126.96.36.199
./flow-indexer --dbpath /tmp/f/flows.db search 188.8.131.52/24 /tmp/f/conn.03:00:00-04:00:00.log.gz /tmp/f/conn.04:00:00-05:00:00.log.gz /tmp/f/conn.06:00:00-07:00:00.log.gz /tmp/f/conn.14:00:00-15:00:00.log.gz /tmp/f/conn.18:00:00-19:00:00.log.gz /tmp/f/conn.20:00:00-21:00:00.log.gz /tmp/f/conn.22:00:00-23:00:00.log.gz