redesign PCAP processing pipeline #80
mmguero added the enhancement (New feature or request), moloch, zeek (Relating to Malcolm's use of Zeek), upload (Relating to PCAP and/or Zeek log ingestion), and capture (Relating to pcap-capture container) labels on Nov 13, 2019
working on this in topic/dynamic-pipelines (merged)
mmguero added a commit that referenced this issue on Nov 15, 2019
Handling issue #80 and issue #78.
* redesign PCAP processing pipeline so that there is [one service](/idaholab/Malcolm/tree/development/moloch/scripts/pcap_watcher.py) that watches the `/data/pcap/processed` directory and publishes to a ØMQ topic; [other services](/idaholab/Malcolm/tree/development/moloch/scripts/pcap_moloch_and_zeek_processor.py) can then subscribe to that topic and do what they want with the PCAP information they receive. This will make it much easier to add future PCAP processors, and also increases the parallelism of the code.
* move common Logstash enrichments to a separate pipeline. I've made the [pipelines](/idaholab/Malcolm/tree/development/logstash/pipelines) used for processing Logstash events more modular, and more extensible by having the [startup script](/idaholab/Malcolm/tree/development/logstash/scripts/logstash-start.sh) dynamically detect and configure new pipelines on the fly. This will make it easier to add new parsers in the future (need to document how to do that in the [readme](/idaholab/Malcolm/tree/development/README.md) though).
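The watcher half of this design can be illustrated with a small polling loop. This is only a sketch under stated assumptions, not the code in `pcap_watcher.py`: the function name `scan_once`, the `.pcap` suffix filter, the message fields, and the `publish` callback (standing in for a send on a ØMQ PUB socket) are all hypothetical.

```python
import os
from typing import Callable, Dict, Set


def scan_once(directory: str, seen: Set[str], publish: Callable[[Dict], None]) -> int:
    """Publish one message per PCAP file that no earlier scan has reported.

    `seen` is kept by the caller between scans so a file is announced only once.
    `publish` is a hypothetical stand-in for sending on a pub/sub topic.
    """
    published = 0
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip non-PCAP entries, directories, and files already announced.
        if not name.endswith(".pcap") or not os.path.isfile(path) or path in seen:
            continue
        seen.add(path)
        publish({"path": path, "size": os.path.getsize(path)})
        published += 1
    return published
```

Calling `scan_once` repeatedly with the same `seen` set reports each file once; subscribers never touch the directory themselves, which is what lets new processors be added without changing the watcher.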
mmguero added a commit that referenced this issue on Nov 20, 2019
Topic/dynamic pipelines (#81) (Handling issue #80 and issue #78)
* redesign PCAP processing pipeline so that there is [one service](/idaholab/Malcolm/tree/development/moloch/scripts/pcap_watcher.py) that watches the `/data/pcap/processed` directory and publishes to a ØMQ topic; [other services](/idaholab/Malcolm/tree/development/moloch/scripts/pcap_moloch_and_zeek_processor.py) can then subscribe to that topic and do what they want with the PCAP information they receive. This will make it much easier to add future PCAP processors, and also increases the parallelism of the code.
* move common Logstash enrichments to a separate pipeline. I've made the [pipelines](/idaholab/Malcolm/tree/development/logstash/pipelines) used for processing Logstash events more modular, and more extensible by having the [startup script](/idaholab/Malcolm/tree/development/logstash/scripts/logstash-start.sh) dynamically detect and configure new pipelines on the fly. This will make it easier to add new parsers in the future (need to document how to do that in the [readme](/idaholab/Malcolm/tree/development/README.md) though).
* bump version for 1.7.1 release
* set opencontainers-compatible labels on docker containers
* fix path issue with fuser for the filebeat prune cronjob
* fix issue #82, OUI vendor names used by Logstash don't match those used by Moloch
* clean up unused code
* split pcap-monitor into its own image
* break out moloch and zeek docker containers into their own images
* make sure things run as the right users in new containers
* fix issue with duplicate files not being detected by pcap_watcher.py
* documentation fix
* fix missing geoip section ids
* clean up dockerfiles
* decrease verbosity of moloch-capture since we're not seeing it anyway
* allow specifying PCAP_PIPELINE_IGNORE_PREEXISTING in order to check and (if needed) reprocess PCAP files that didn't get finished before shutdown. Default is 'false', which means to do the check; 'true' means ignore anything in there before the container starts.
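The semantics of `PCAP_PIPELINE_IGNORE_PREEXISTING` described in the last item might look roughly like the following. This is a minimal sketch, not Malcolm's implementation: the helper names, the case-insensitive parsing, and the idea that finished files are tracked in a `finished` set are assumptions made for illustration.

```python
import os
from typing import Iterable, List, Mapping, Set


def ignore_preexisting(env: Mapping[str, str] = os.environ) -> bool:
    """Read the flag; anything other than 'true' (assumed case-insensitive) is false."""
    value = env.get("PCAP_PIPELINE_IGNORE_PREEXISTING", "false")
    return value.strip().lower() == "true"


def startup_backlog(
    existing: Iterable[str],
    finished: Set[str],
    env: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the files present at startup that should be (re)processed.

    `finished` is a hypothetical record of files that completed processing
    before shutdown; how that is determined is outside this sketch.
    """
    if ignore_preexisting(env):
        # 'true': ignore anything that was there before the container started.
        return []
    # 'false' (default): check each file and reprocess the unfinished ones.
    return [path for path in existing if path not in finished]
```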
Currently PCAP files are processed first by moloch-capture and then by zeek. This is not very extensible. A more elegant approach would be to have a PCAP topic that is published to (similar to this) and have the other processors subscribe to and pull from it.
We'd have to work out things like queue size, persistence, workers, etc., but it shouldn't be too complicated.
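To make the queue-size and worker questions concrete, here is a rough in-process sketch of the idea. It deliberately uses Python's standard `queue` and `threading` modules as a stand-in for a real message bus, so it says nothing about persistence; `PcapTopic`, `run_processor`, and the `STOP` sentinel are hypothetical names.

```python
import queue
import threading
from typing import Any, Callable, List

STOP = object()  # sentinel telling a processor's workers to shut down


class PcapTopic:
    """In-process stand-in for a publish/subscribe topic (not a real message bus)."""

    def __init__(self, queue_size: int = 16) -> None:
        self._queue_size = queue_size
        self._subscribers: List[queue.Queue] = []

    def subscribe(self) -> queue.Queue:
        """Give each processor its own bounded inbox."""
        inbox: queue.Queue = queue.Queue(maxsize=self._queue_size)
        self._subscribers.append(inbox)
        return inbox

    def publish(self, message: Any) -> None:
        """Fan the message out; blocks while any subscriber's inbox is full."""
        for inbox in self._subscribers:
            inbox.put(message)


def run_processor(inbox: queue.Queue, handle: Callable[[Any], None],
                  workers: int = 2) -> List[threading.Thread]:
    """Start `workers` threads that pull from one inbox and call `handle`."""

    def loop() -> None:
        while True:
            message = inbox.get()
            if message is STOP:
                inbox.put(STOP)  # pass the sentinel on to sibling workers
                break
            handle(message)

    threads = [threading.Thread(target=loop, daemon=True) for _ in range(workers)]
    for thread in threads:
        thread.start()
    return threads
```

Each processor gets every message independently of the others, and a bounded inbox means a slow processor eventually slows the publisher instead of growing memory without limit. Whether blocking or dropping is the right policy is one of the things that would still have to be worked out.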