Feature Request: Add Post-Decoding Phase #14751

mthbrown · 2022-08-27T02:28:03Z

Hi,

I'm not sure whether anyone has suggested this before, but I think it would be useful to have a post-decoding phase that gives users flexibility by letting them run Lua scripts.

For example, a post-decoding phase could populate the message before it reaches the rules phase in order to:

- add GeoIP data (this is currently done post rules by Filebeat, and I know that you can re-compile it yourself to add this)
- add threat intel
- concatenate arguments, in auditd for example, since currently you can only write rules that search a specific field
- easily integrate with third-party resources such as Redis. I'm not sure how a cluster-based setup shares frequency data, but if it doesn't, this might be one option
- have more dynamic CDBs
- write rules that express relationships between events, such as what you get from EQL
- look for things such as entropy in strings or frequency analysis
- possibly run ML models
- etc.

The problem with just using Logstash or some other tier to do this is that you basically have to rewrite Wazuh's ruleset if you go down this path. The above architecture would mean you would just tweak the rules slightly when needed. Thanks.
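To make two of the ideas above concrete (concatenating auditd arguments and measuring string entropy), here is a minimal Python sketch of what a post-decoding hook chain might look like. It is not Wazuh code and Wazuh exposes no such API today. The flat field names `a0`, `a1`, ..., the output fields `command_line` and `command_line_entropy`, and the `run_post_decoding` function are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]
Hook = Callable[[Event], Event]


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def join_auditd_args(event: Event) -> Event:
    """Concatenate a0, a1, ... into one field (hypothetical field names)."""
    args = []
    index = 0
    while f"a{index}" in event:
        args.append(str(event[f"a{index}"]))
        index += 1
    if args:
        event["command_line"] = " ".join(args)
    return event


def add_entropy(event: Event) -> Event:
    """Attach the entropy of the joined command line, if one exists."""
    if "command_line" in event:
        event["command_line_entropy"] = shannon_entropy(event["command_line"])
    return event


def run_post_decoding(event: Event, hooks: List[Hook]) -> Event:
    """Apply each hook in order to an already decoded event."""
    for hook in hooks:
        event = hook(event)
    return event


# Hypothetical decoded auditd EXECVE event, flattened for the example.
decoded = {"type": "EXECVE", "a0": "curl", "a1": "-s", "a2": "http://example.test/x"}
enriched = run_post_decoding(dict(decoded), [join_auditd_args, add_entropy])
print(enriched["command_line"])
```

With something like this in place, a rule could match on the single `command_line` field or on an entropy threshold instead of on each argument field separately.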

juliancnn · 2022-08-28T20:53:43Z

Hi @mthbrown ,

Thank you for your contributions to the Wazuh community.

We know that these features would be useful in Wazuh, and I believe that almost everything you list is covered, or rather could be satisfied, by the requirements in issue #11334 for a new log analysis engine. That engine is currently under development and, if it meets expectations, could replace analysisd in the future.
I encourage you, if you have time, to watch the issues and share your opinion; this is very valuable for development.

Thank you!

mthbrown · 2022-08-29T01:05:41Z

Thanks @juliancnn. I had a quick look at it. Wouldn't it also be better to have a layer before the rules engine (this wasn't clear in the link you referenced)? Personally, I think having something like Fluentd or Logstash receive all logs first, convert them to JSON, do enrichment, etc., and then pass them on to the rules engine might make for a better design (so the decoders would basically be added to these log aggregators if they aren't currently supported). This would allow Wazuh to capitalize on all the work that has already been done in traditional ELK and EFK setups (and all the plugins, etc. that these log aggregators already have) while still providing its own unique features. It also means that the rules engine only has to support JSON and can focus more on adding missing features such as Sigma rule support.

juliancnn · 2022-09-06T17:07:43Z

Hi @mthbrown, sorry for the late reply; the email got mixed up among others from GitHub in the inbox 😞.

The new engine tries (as you say) to handle JSON: when it receives any kind of log, it adds it to a JSON field. This event then enters the chain of operations defined by the 'Assets' (these would be like filters, decoders and rules). The assets will be defined by the user in YAML documents and will perform the task of enriching and normalizing the event.
The engine seeks to be as transparent as possible, receiving the events as they are collected by the agent and inserting them into a chain of operations that is simple for users to define, read and manipulate. We want there to be no "hidden" or unavoidable manipulations, such as the predecoder of wazuh-analysisd.
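As a rough illustration of that flow (raw log wrapped in a JSON field, then passed through a user-defined chain), here is a small Python sketch. It is only a stand-in: the real assets are meant to be YAML documents whose schema is not described here, so the function names (`wrap_raw_log`, `drop_empty`, `decode_key_values`, `flag_root_user`, `run_chain`) and the field names (`raw`, `fields`, `alert`) are all hypothetical.

```python
from typing import Any, Callable, Dict, List, Optional

Event = Dict[str, Any]
# An asset takes an event and returns it (possibly modified), or None to drop it.
Asset = Callable[[Event], Optional[Event]]


def wrap_raw_log(raw: str) -> Event:
    """Place the untouched log line in a JSON-like field (hypothetical name)."""
    return {"raw": raw}


def drop_empty(event: Event) -> Optional[Event]:
    """Filter-like asset: discard events whose raw log is blank."""
    return event if event["raw"].strip() else None


def decode_key_values(event: Event) -> Optional[Event]:
    """Decoder-like asset: parse key=value tokens into event['fields']."""
    fields = {}
    for token in event["raw"].split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    event["fields"] = fields
    return event


def flag_root_user(event: Event) -> Optional[Event]:
    """Rule-like asset: annotate events produced by the root user."""
    if event.get("fields", {}).get("user") == "root":
        event["alert"] = "root activity"
    return event


def run_chain(raw: str, assets: List[Asset]) -> Optional[Event]:
    """Run every asset in order; nothing happens outside the listed assets."""
    event: Optional[Event] = wrap_raw_log(raw)
    for asset in assets:
        event = asset(event)
        if event is None:
            return None
    return event


chain = [drop_empty, decode_key_values, flag_root_user]
result = run_chain("user=root action=login", chain)
print(result)
```

The point of the sketch is the transparency goal: every manipulation of the event is an explicit entry in `chain`, so there is no hidden step comparable to a fixed predecoder.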

I understand that the document is not easy to read (it is very long) and that I am only conveying the spirit of what we are looking for (and maybe this is not enough). That is why I invite you to follow the progress of #11334; I am sure that in the coming months there will be progress and information that is easier to digest.

On the other hand, your contribution to this issue is very important, so I would like to leave it open for future analysis.

juliancnn added the reporter/community, type/enhancement/feature, and module/analysis labels Aug 28, 2022

juliancnn added the module/engine label Sep 6, 2022

vikman90 added the type/enhancement label May 8, 2023
