Make low-level analyzers pluggable #248
Comments
Work on this is going to start soon at KIT. |
@J-Gras what's the state here? |
The interface design and benchmarks are complete, while the implementation is still a work in progress. The master's thesis should be finished by the end of January. |
@J-Gras Are there any updates on this? |
The thesis is finished and a POC can be found here: https://github.com/peet1993/zeek/tree/llpoc |
I rebased and merged the entire thing onto a local branch on the main repo, under topic/timw/peet1993-llpoc. I'll dig into it next week. |
Here are my notes on the code. I have a lot of questions. TL;DR: The overall design seems good. It follows pretty closely how we do plugins, especially the L3 analyzers. It just has some quirks that keep it from working currently, and places where the code can be cleaned up. I marked the one big question mark where I think it all falls apart.
|
Thanks a lot for having a look! I will try to explain some of the design aspects.
I agree. A set of common analyzers should be included in Zeek, similar to the static plugins at the application layer.
I'll leave the details to Peter. Likely, the branch still contains both paths, as Peter did some measurements to compare his architecture and the old path.
The architecture is designed to support multiple layers and tunneling (without maintaining the context so far). Figure 4.3 b) in the thesis shows an example.
Here the thesis ran out of time. The thesis explains an approach using dedicated configuration files. Another option would be to use script-land.
That must be a path issue. I had to install zeek and the plugin to make it work. I will have a look to see whether I can clean this up.
Correct. That was the research part of the thesis. The document contains the numbers and elaborates on the setup of Peter's benchmarks.
Here I cannot follow. What would be the three copies?
Again, I'll leave that to Peter.
Unfortunately, this hasn't been implemented so far, as described above. The general idea is: Zeek looks for analyzers in a special path, each analyzer comes with a configuration file that is loaded by Zeek, and the tree is built based on that (Section 4.3).
I guess this needs a major cleanup. To get the POC running I installed zeek and the demo analyzer that wraps the original stack. To make btest work, paths need to be handled properly.
This is thoroughly discussed in the thesis. The architecture was designed to support multiple layers, hence the tree. For each protocol, we expected a sparse set of identifiers to be dispatched. The thesis evaluated different data structures with respect to time complexity and memory consumption. Although we are well aware of the root of all evil, this was the research part of the thesis.
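To make the "sparse set of identifiers" point concrete, here is a minimal sketch of one possible dispatcher layout: a sorted vector searched with binary search. This is an assumption for illustration only. The class name `SortedDispatcher` and its methods are hypothetical and do not come from the llpoc branch, and the sketch does not claim to be the data structure the thesis settled on.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: none of these names come from the llpoc branch.
// Maps sparse protocol identifiers (for example EtherTypes) to the name of
// the analyzer for the next layer, using a sorted vector and binary search.
class SortedDispatcher {
public:
    // Registers a mapping; returns false if the identifier is already taken.
    bool Register(uint32_t identifier, std::string analyzer) {
        auto it = std::lower_bound(table.begin(), table.end(), identifier, Less);
        if (it != table.end() && it->first == identifier)
            return false;
        // Inserting at the lower bound keeps the vector sorted.
        table.emplace(it, identifier, std::move(analyzer));
        return true;
    }

    // Returns the analyzer name for the identifier, or nullptr if unmapped.
    const std::string* Lookup(uint32_t identifier) const {
        auto it = std::lower_bound(table.begin(), table.end(), identifier, Less);
        if (it == table.end() || it->first != identifier)
            return nullptr;
        return &it->second;
    }

private:
    using Entry = std::pair<uint32_t, std::string>;

    // Comparator for std::lower_bound: compares a table entry to a raw identifier.
    static bool Less(const Entry& entry, uint32_t identifier) {
        return entry.first < identifier;
    }

    std::vector<Entry> table;
};
```

A lookup costs O(log n) and the table only uses memory for identifiers that are actually registered, which is the trade-off a sparse identifier space invites. A direct-indexed array would be faster per lookup but would waste memory on the unused identifiers.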
|
I'll admit I hadn't read the thesis yet and dove straight into the code from the technical side. Let me read through it today and I'll get back to my comments tomorrow. |
I think Peter moved the original zeek code for processing lower layers in the Ethernet llanalyzer for testing purposes. I moved that code into zeek like it's done for the application layer analyzers and renamed it to |
Hey everyone, first of all, sorry that it took me so long to answer, I didn't get around to it earlier. I hope I can clear up all questions. Most of them are already answered precisely by Jan, so not as much left for me :)
To be honest, I'm not either. Initially, I analyzed how you are doing plugins already and tried to build on that. I copied the hooks-plugin example from the existing code, checked how it works and modified it to work for my use case. This results in all plugins being dynamic and being built separately, like a plugin author would do. Included plugins should definitely be static, but I didn't invest the time to change that as time was scarce already. I like the example implementation that Jan posted a few days ago; this should also solve the problem that building the PAS didn't work for you.
As Jan already said, it is really just a copy of what Zeek is already doing internally. I did that to analyze how usage of the plugin interface compared to internal code affects the monitor performance by running the layer 2 analysis once with the internal code and once with the plugin-based code. In both approaches, the packet was handed to the internal layer 3 analysis of Zeek afterwards. I found that the impact of the plugin interface is measurable but small for this simple case. You can find the details in section 6.2 of the thesis.
That is not entirely correct. The code is flexible enough to allow pluggable layer 3 analyzers too, so the internal layer 3 analysis of Zeek can be replaced by plugins as well. This also makes it possible to integrate the included ARP analyzer better into the analysis flow; right now it is basically treated as a special case during layer 3 protocol detection (Sessions.cc:172). I demonstrated this with a PPP-session trace, producing a logfile of all metadata for each packet (MACs, IP addresses etc.) using the analyzers provided in the llpoc-analyzers plugin. These analyzers output information into the debug.log if "-B llpoc" is enabled. It looks like this:
To make this work, you'll have to uncomment all "configuration.addMapping" lines in llanalyzer::Manager and swap the commented-out method "Ethernet::analyze" (Ethernet.cc:180) with the currently-not-commented-out one.
This has to be a path issue as Jan already mentioned. This process worked for me:
The reason you are only seeing a warning, instead of an error and Zeek refusing to run, is that I forgot to check whether the created dispatching tree actually contains a root analyzer; the code only checks whether a dispatcher object was created (ProtocolAnalyzerSet.cc:23). When there are no analyzers (because the plugin cannot be found), a dispatching object is created without any content. Sorry about that :(
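As a rough illustration of the missing check, the sketch below distinguishes "a dispatcher object exists" from "the dispatcher actually has a root analyzer". The `Dispatcher` struct and the `IsUsable` function are hypothetical stand-ins, not the types used in ProtocolAnalyzerSet.cc.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the dispatching object; not the real llpoc type.
struct Dispatcher {
    std::string root_analyzer;                 // empty when no analyzer was loaded
    std::map<uint32_t, std::string> children;  // identifier -> next analyzer
};

// Returns true only if the dispatcher exists AND has a root analyzer.
// Checking the pointer alone would accept an empty dispatcher, which is
// the situation described above when the plugin cannot be found.
bool IsUsable(const Dispatcher* dispatcher) {
    if (dispatcher == nullptr)
        return false;
    return !dispatcher->root_analyzer.empty();
}
```

With a check along these lines, startup code could report a hard error when `IsUsable` returns false instead of continuing with an empty tree.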
I only see two "copies": The Config object containing the configured mapping and the PAS, which is created from the configuration. It therefore also is not really a copy, as the Config object is just an internal representation of the user-provided configuration and the PAS is the actually usable system with dispatchers. The PAS gets recursively created from the configuration. Can you elaborate what you mean?
These are separate classes because I derived my code from the already existing code for application analyzer plugins, which uses the Manager-based approach. I actually noticed that the Manager doesn't really do much in this context, but found it useful to have a separation between "machine" and "operator" if that analogy makes sense. The Manager builds the PAS from the configuration and then "uses" the PAS (which is basically a finite state machine) for the dispatching-and-analysis process through processPacket().
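The split between configuration, PAS and Manager described here can be sketched roughly as follows. Only the names `addMapping` and `processPacket` appear in this thread; their signatures, the `Node` type and `BuildTree` are assumptions made for illustration. As a simplification, the packet is represented by the list of per-layer identifiers, whereas real analyzers would extract each identifier from the packet bytes.

```cpp
#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; signatures and helper types are assumed, not taken
// from the llpoc branch.
struct Mapping {
    std::string parent;
    uint32_t identifier;
    std::string child;
};

// Internal representation of the user-provided configuration.
class Config {
public:
    void addMapping(std::string parent, uint32_t identifier, std::string child) {
        mappings.push_back({std::move(parent), identifier, std::move(child)});
    }
    const std::vector<Mapping>& Mappings() const { return mappings; }

private:
    std::vector<Mapping> mappings;
};

// One state of the dispatching tree (the "machine").
struct Node {
    std::string analyzer;
    std::map<uint32_t, std::unique_ptr<Node>> next;
};

// Recursively builds the subtree rooted at the given analyzer.
// Assumes the configuration is acyclic; a cycle would recurse forever.
std::unique_ptr<Node> BuildTree(const Config& config, const std::string& analyzer) {
    auto node = std::make_unique<Node>();
    node->analyzer = analyzer;
    for (const auto& m : config.Mappings())
        if (m.parent == analyzer)
            node->next[m.identifier] = BuildTree(config, m.child);
    return node;
}

// The "operator": builds the tree once, then walks it for each packet.
class Manager {
public:
    Manager(const Config& config, const std::string& root)
        : tree(BuildTree(config, root)) {}

    // Follows one identifier per layer and returns the analyzers visited.
    // Stops at the first identifier that has no mapping.
    std::vector<std::string> processPacket(const std::vector<uint32_t>& identifiers) const {
        std::vector<std::string> chain{tree->analyzer};
        const Node* node = tree.get();
        for (uint32_t id : identifiers) {
            auto it = node->next.find(id);
            if (it == node->next.end())
                break;
            node = it->second.get();
            chain.push_back(node->analyzer);
        }
        return chain;
    }

private:
    std::unique_ptr<Node> tree;
};
```

In this sketch the Config is only consulted while the tree is built, so the tree is the single structure used on the per-packet path, which matches the "machine" versus "operator" analogy.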
Yeah, as Jan said, implementation of the configuration was sadly not possible because of time constraints. I can work something out if this helps, but it is probably a good idea to talk about the concept I worked out first, if you like it or if you would handle it differently.
It does :) I stuck to the formatting I was used to. I hope this could clear up some of the confusion. I'm now also part of the Zeek Slack, if you have any quick questions just shoot me a DM (Peter Oettig) :) |
Thanks for the updates @peet1993! @rsmmr and I talked about this for a while on our call last week, and I think the first steps for this are:
From there I can work on how to rework the configuration tree stuff to be automatically built as analyzers are added, which seems to be the bulk of the work left here. |
In case you mean making it a "static plugin", this commit moves the existing code to a static one.
Am I right that the first step would be to switch to the dynamic interface without introducing additional features? One example of a "new feature" might be how to deal with information collected by ll-analyzers. For example, at the moment L2 addresses are handed to the upper layers in a way that might not work for new analyzers.
I think all we need is a table per ll-analyzer. I looked into implementing this using the configuration framework. In case the "include mechanism" that was mentioned in the thesis is not of high priority, this should be straightforward. |
No, I mean moving the plugin code out of
You're probably right here. The real trick is going to be sorting out how to go from, for example, the IP4 llanalyzer down into the As a starting point, just making the IP4/IP6 analyzers call the Sessions analyzers directly is easiest to prove it works, and then go from there in adding Sessions to the analyzer chain directly.
Yep, shouldn't be too bad. |
That's what I did? I guess my wording is just confusing. I moved the plugins to |
I looked at the commit and it basically just moved the code from Let's discuss this more tomorrow on Slack and see if we can nail down a good path forward for all of it. |
Does it mean that 802.11 is already supported? |
@lealog There aren't packet analyzers written for 802.11 yet, but the framework merged in this work enables someone to write them. Unfortunately, the core Zeek team probably doesn't have the bandwidth to work on those right now. It'd be an interesting project for someone from the community to pick up though. |
Implement a plugin interface for low-level analyzers like GOOSE (see #76). Requirements will be collected on the zeek-dev mailing list.