Python command line tool for parsing raw firewall logs to a simple CSV or JSON representation. (With automated check against Threatfox listed IPs)
All text files can be processed. Works with gz or xz compressed files too.
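As a rough illustration of how plain, gz and xz files can be read through one code path, here is a minimal sketch using only the Python standard library. The helper name `open_log` and the choice to pick the opener from the file suffix are assumptions for this example, not necessarily how fwParser does it.

```python
import gzip
import lzma
from pathlib import Path


def open_log(path):
    """Open a plain-text, .gz or .xz log file in text mode (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".gz":
        return gzip.open(path, "rt", encoding="utf-8", errors="replace")
    if suffix == ".xz":
        return lzma.open(path, "rt", encoding="utf-8", errors="replace")
    # Anything else is treated as an uncompressed text file.
    return open(path, "r", encoding="utf-8", errors="replace")
```

Each opener returns a text-mode file object, so the calling code can iterate over lines without caring about the compression.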
Speeds up checking firewall data in incident response when no SIEM is available and the FW data format sucks (as it does most of the time)...
This parser guides you through the file format you want to process. It asks you for the delimiter and the field positions, and lets you work with replaces (they are sometimes needed because the firewall logs of some vendors do not have a fixed position for relevant keys).
After you have defined the delimiter and the positions of:
- Source IP
- Destination IP
- Source Port
- Destination Port
- Date
- Time
and created the needed replaces, you can save your specifications for the format in a config file, so you do not have to repeat the whole process for new data.
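To make the idea of a saved format description concrete, here is a minimal sketch of writing and reading such a description as JSON. The dictionary layout and the helper names `save_config` and `load_config` are assumptions for illustration only; the real config file written by fwParser may look different.

```python
import json

# Hypothetical format description; the layout is illustrative only.
example_config = {
    "delimiter": ",",
    "source_ip": 0,
    "dest_ip": 1,
    "source_port": 2,
    "dest_port": 3,
    "pos_date": 4,
    "pos_time": 5,
    "replaces": [],
}


def save_config(config, path):
    """Write the format description to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(config, fh, indent=2)


def load_config(path):
    """Read a previously saved format description."""
    with open(path, "r", encoding="utf-8") as fh:
        return json.load(fh)
```

Keeping the description in a plain JSON file means it can be edited by hand later, for example to add replaces.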
-d DIR, --dir DIR
Use this to parse a whole directory. Make sure only valid text, gz or xz files are in there. (either -d or -f is needed)
-f FILE, --file FILE
Use this to parse from a single file (either this or -d is needed)
-t DELIMITER, --delimiter DELIMITER
Use this to specify the delimiter. If omitted, you will be asked for it. It can also be specified in the config file.
-o OUTPUT, --output OUTPUT
The path where the output files will be stored; cwd if not specified.
-n NAME, --name NAME
Use this if you want to parse everything into a single file (name without file extension).
-ip FILTER_IP, --filter-ip FILTER_IP
- 'threatfox' for IPs listed in ThreatFox https://threatfox.abuse.ch/export/ (default 30 days; set -days for a custom value);
- 'public' for only entries having a public IP in source or destination;
- single IP: e.g. '192.168.0.1';
- list of IPs: e.g. '192.168.0.1,192.168.0.5';
- range of IPs: e.g. '192.168.0.1-192.168.0.5'; you can specify multiple ranges separated by a ','
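The filter forms listed above (single IP, list, ranges, and 'public') could be interpreted roughly as in the following sketch, built on the standard `ipaddress` module. The function names are hypothetical, and treating "public" as `is_global` is an assumption; fwParser may implement this differently.

```python
import ipaddress


def parse_ip_filter(spec):
    """Turn '1.2.3.4', 'a,b' or 'a-b,c-d' into a list of (low, high) address pairs."""
    ranges = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = (ipaddress.ip_address(p.strip()) for p in part.split("-", 1))
        else:
            low = high = ipaddress.ip_address(part)
        if low > high:
            raise ValueError(f"range start is above range end: {part}")
        ranges.append((low, high))
    return ranges


def ip_matches(ip, ranges):
    """True if ip falls into any of the parsed ranges (same IP version only)."""
    addr = ipaddress.ip_address(ip)
    return any(
        addr.version == low.version and low <= addr <= high
        for low, high in ranges
    )


def is_public(ip):
    """Assumption: 'public' means globally routable."""
    return ipaddress.ip_address(ip).is_global
```

Storing single addresses as a range with identical bounds keeps the matching logic to one comparison.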
-days THREATFOX_DAYS, --threatfox-days THREATFOX_DAYS
Defines how far back in time to take ThreatFox https://threatfox.abuse.ch/export/ IPs. Setting it to 0 takes the whole list. If not specified, the default is 30 days.
-p FILTER_PORT, --filter-port FILTER_PORT
- single port: e.g. '53'
- list of ports: e.g. '53,443'
- range of ports: e.g. '1-1024'; you can specify multiple ranges separated by a ','
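The port filter forms can be expanded into a plain set of integers, as in this minimal sketch. The function name `parse_port_filter` is hypothetical and the range check is an added assumption.

```python
def parse_port_filter(spec):
    """Expand '53', '53,443' or '1-1024,8000-8080' into a set of port numbers."""
    ports = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(p) for p in part.split("-", 1))
        else:
            start = end = int(part)
        if not (0 <= start <= end <= 65535):
            raise ValueError(f"invalid port or range: {part}")
        ports.update(range(start, end + 1))
    return ports
```

A set gives constant-time membership checks, which matters when every log line is tested against the filter.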
-c CONFIG, --config CONFIG
path to a config .json file
-b BATCH_SIZE, --batch-size BATCH_SIZE
By default, 10000 lines are processed per batch. You should not go below 1000. A higher value uses more RAM but is likely quicker.
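The trade-off between batch size and memory can be seen in a small sketch of batched reading. The generator name `read_batches` is hypothetical; it only shows the general technique of pulling a fixed number of lines at a time from an iterator.

```python
from itertools import islice


def read_batches(lines, batch_size=10000):
    """Yield lists of at most batch_size items from any iterable of lines."""
    iterator = iter(lines)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

Only one batch is held in memory at a time, so a larger `batch_size` means fewer iterations but a larger list per iteration.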
-x {csv,json}, --format {csv,json}
specify csv or json as output format
-z, --disable-validation
Disable the IP validation. This is only recommended for processing e.g. DNS or proxy logs where the destination or source is not an IP.
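For context, an IP check of this kind can be written with the standard `ipaddress` module, as in the sketch below. The helper name `is_valid_ip` is hypothetical and this is not necessarily the check fwParser performs; it does show why host names from DNS or proxy logs would be rejected.

```python
import ipaddress


def is_valid_ip(value):
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value.strip())
    except ValueError:
        return False
    return True
```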
-m, --connection-map
Outputs a connection map as a JSON file: for each source, a dict of destinations; for each destination, a dict of destination ports; for each destination port, a list of timestamps.
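The nesting described above (source, then destination, then destination port, then timestamps) can be built as in this sketch. The function name and the record tuple layout are assumptions for illustration; ports are stored as strings because JSON object keys are always strings.

```python
import json
from collections import defaultdict


def build_connection_map(records):
    """records: iterable of (source_ip, dest_ip, dest_port, timestamp) tuples."""
    conn_map = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))
    for src, dst, dport, timestamp in records:
        conn_map[src][dst][str(dport)].append(timestamp)
    return conn_map


# Hypothetical sample records, not taken from the test data.
sample = [
    ("10.0.0.1", "10.0.0.2", 443, "2024-01-01 12:00:00"),
    ("10.0.0.1", "10.0.0.2", 443, "2024-01-01 12:00:05"),
    ("10.0.0.1", "10.0.0.3", 53, "2024-01-01 12:01:00"),
]
connection_map_json = json.dumps(build_connection_map(sample), indent=2)
```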
-v, --verbose
see more output on the console
-u, --debug
set logging level to debug and verbose
-s SKIP_FILES, --skip-files SKIP_FILES
Number of files to skip in the list. Negative values count from the end of the (directory) list and leave that number of files out.
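One possible reading of this option, expressed as list slicing, is sketched below. This is an interpretation of the description above, not confirmed behavior: a positive value drops files from the start of the list and a negative value drops files from the end. The function name `skip_files` is hypothetical.

```python
def skip_files(files, skip):
    """Drop `skip` files from the start (positive) or from the end (negative)."""
    if skip >= 0:
        return files[skip:]
    return files[:skip]
```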
--help
for list of arguments
python .\fwParser.py -f test_data\test_data.foo
You will be shown the first line and asked to specify a delimiter.
Type , and press Enter.
You will be asked for strings to replace. test_data.foo contains examples; we will come back to this a bit later since we do not know them yet.
We leave this blank since we do not yet know whether it is needed.
You will then see another line, already split by your specified delimiter and with replaces applied, for visual confirmation.
You will be shown each position together with an example string.
You will be asked to specify the position for:
- source_ip
  - 0
- dest_ip
  - 1
- source_port (the ports can sometimes follow directly after the IP addresses; if so, you can specify the delimiter instead of a position)
  - 2
- dest_port (will not be asked if a delimiter is specified at the source_port prompt)
  - 3
- pos_date
  - 4
- pos_time (just leave blank if the time is in the same field as the date)
  - 5
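The position prompts above boil down to splitting a line and picking fields by index. The sketch below illustrates that, including the case where the port is attached to the IP with its own delimiter. The function name, the `positions` dictionary and the sample lines are assumptions for illustration, not the parser's actual code or data.

```python
def extract_fields(line, delimiter, positions, port_delimiter=None):
    """Split a log line and pick the IPs and ports by position."""
    parts = [p.strip() for p in line.split(delimiter)]
    src = parts[positions["source_ip"]]
    dst = parts[positions["dest_ip"]]
    if port_delimiter:
        # Ports follow directly after the IPs, e.g. '10.0.0.1:1234'.
        src, sport = src.rsplit(port_delimiter, 1)
        dst, dport = dst.rsplit(port_delimiter, 1)
    else:
        sport = parts[positions["source_port"]]
        dport = parts[positions["dest_port"]]
    return {"source_ip": src, "dest_ip": dst, "source_port": sport, "dest_port": dport}


positions = {"source_ip": 0, "dest_ip": 1, "source_port": 2, "dest_port": 3}
row = extract_fields("10.0.0.1,10.0.0.2,1234,443,2024-01-01,12:00:00", ",", positions)
```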
You will be asked if you want to validate the data. Since this only goes through the first batch of data, it is not recommended if you work with filters: chances are high that the first batch yields no results because of the filter, which then throws an error.
You will be asked to save the config. I recommend doing so, since manipulating the config file is less effort than going through all the steps again if you encounter any issues (replaces are a good example).
Press y, confirm with Enter, and enter a config name without file extension.
Now the parser does its job.
Looks good so far... let's check the logs to see whether the parser encountered any errors.
Open the log file whose timestamp matches the execution.
The first line is just the header, so it is not an issue that it has not been processed.
The next two errors are not okay for us: there is relevant data in those lines, but (as happens with some firewall vendors) there is extra stuff in the log lines that breaks the positions :( Let's fix that...
Let's check the positions of the second line:
It seems like we should get rid of the part "omethin_destroying_the_format_for_replace, " and then our positions would fit the schema again.
Let's check the third line:
Here it seems like we have to replace "this is not good here too...," and "somethin__more_destroying_the_format_for_replace," to get our correct positions.
Currently our config looks like this:
Let's add our replaces:
Since we do not want to insert anything, we just replace with "". (In some cases it could be necessary to insert placeholders with more delimiters in between to fit the schema.)
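To show why replaces restore the positions, here is a minimal sketch that applies replacements before splitting. The function name and the sample line are hypothetical (the line is not copied from test_data.foo); only the two replace strings are taken from the walkthrough above.

```python
def apply_replaces(line, replaces):
    """Apply (old, new) string replacements in order before the line is split."""
    for old, new in replaces:
        line = line.replace(old, new)
    return line


replaces = [
    ("this is not good here too...,", ""),
    ("somethin__more_destroying_the_format_for_replace,", ""),
]
# Hypothetical log line with two extra fields that shift the positions.
broken = (
    "10.0.0.1,this is not good here too...,10.0.0.2,"
    "somethin__more_destroying_the_format_for_replace,1234,443"
)
fixed = apply_replaces(broken, replaces)
```

After the replaces, splitting `fixed` by `,` puts the source IP, destination IP and ports back at positions 0 to 3.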
python .\fwParser.py -f test_data\test_data.foo -c .\testFW_config.json
Check the log file for errors:
Sweet - only the header threw an error!
Let's check the output file:
Seems like our files have been processed correctly! :)
You can filter for known malicious connections by specifying the threatfox filter:
python .\fwParser.py -ip threatfox -days 3 -f .\test_data\test_data.foo --debug -c .\testFW_config.json
This automatically downloads the ThreatFox IP IOC file and adds all IPs within the time frame to the IP filter list.
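The time-frame selection can be pictured with the following offline sketch. It assumes each IOC row carries a first-seen timestamp and an `ip:port` value; the row layout, the function name `recent_ips` and the sample rows (documentation addresses) are assumptions for illustration, and the download step is left out entirely. As described for -days, a value of 0 keeps the whole list.

```python
from datetime import datetime, timedelta


def recent_ips(rows, days, now):
    """Collect IPs from (first_seen, 'ip:port') rows seen within the last `days` days."""
    cutoff = None if days == 0 else now - timedelta(days=days)
    ips = set()
    for first_seen, ioc in rows:
        seen = datetime.strptime(first_seen, "%Y-%m-%d %H:%M:%S")
        if cutoff is None or seen >= cutoff:
            # Strip the port so only the IP ends up in the filter list.
            ips.add(ioc.rsplit(":", 1)[0])
    return ips


# Hypothetical sample rows, not real ThreatFox data.
rows = [
    ("2024-01-09 08:00:00", "203.0.113.5:443"),
    ("2023-12-01 00:00:00", "198.51.100.7:8080"),
]
now = datetime(2024, 1, 10)
```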
You will see additional information from ThreatFox about the IP that has been found: