Remove zeek::Batch and simplify API #64
The Zeek-side batching was implemented in response to observing performance degradation with Broker compared to the old communication backend. This seems no longer necessary, so we can remove the opaque data as well as the batching messages in Broker (see zeek/broker#64). Instead of serializing into opaque buffers, we simply convert the threading values directly to `broker::data` and vice versa.
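To illustrate the direct conversion described above, here is a minimal sketch. It does not use the real Zeek or Broker types: `ThreadValue`, `Data`, `DataVector`, `to_data`, `from_data`, and `to_row` are hypothetical stand-ins for the threading values and `broker::data`, chosen only to show the idea of mapping each field to a typed value instead of packing it into an opaque buffer.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>
#include <vector>

// Hypothetical stand-in for a threading value: a type tag plus payload fields.
enum class Tag { Bool, Count, Double, String };

struct ThreadValue {
  Tag tag = Tag::Bool;
  bool b = false;
  std::uint64_t count = 0;
  double d = 0.0;
  std::string str;
};

// Hypothetical stand-in for broker::data: a variant over a few basic types.
using Data = std::variant<bool, std::uint64_t, double, std::string>;
using DataVector = std::vector<Data>;

// Convert one value to its typed representation (no opaque byte buffer).
Data to_data(const ThreadValue& v) {
  switch (v.tag) {
    case Tag::Bool:   return Data{v.b};
    case Tag::Count:  return Data{v.count};
    case Tag::Double: return Data{v.d};
    case Tag::String: return Data{v.str};
  }
  return Data{false}; // not reached for the tags listed above
}

// The reverse direction: recover the tagged value from the variant.
ThreadValue from_data(const Data& x) {
  ThreadValue out;
  if (auto b = std::get_if<bool>(&x)) {
    out.tag = Tag::Bool;
    out.b = *b;
  } else if (auto c = std::get_if<std::uint64_t>(&x)) {
    out.tag = Tag::Count;
    out.count = *c;
  } else if (auto d = std::get_if<double>(&x)) {
    out.tag = Tag::Double;
    out.d = *d;
  } else if (auto s = std::get_if<std::string>(&x)) {
    out.tag = Tag::String;
    out.str = *s;
  }
  return out;
}

// Convert a whole log row field by field.
DataVector to_row(const std::vector<ThreadValue>& fields) {
  DataVector row;
  row.reserve(fields.size());
  for (const auto& f : fields)
    row.push_back(to_data(f));
  return row;
}
```

Because every field stays a typed value in this sketch, a subscriber could inspect individual fields without knowing any private serialization format.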
@jsiwek I also had to update the Python bindings (7627657). It seems like throwing exceptions is fine with |
@Neverlord nice job, the changes look good conceptually, but before I do the merge, can you look into why the Travis build fails for this Broker PR? And after that, can you update your |
Yeah, this is neat. I was hoping we could benchmark it a bit more before merging, but not sure when that can happen. We don't have the new testbed in place yet, and not sure if we can find somebody who'd be able to run old vs new in a production environment. However, I'm a bit reluctant to just go ahead, as it would be difficult to back this out again in case it did unexpectedly cause trouble.
It failed because of the documentation examples. Fixed.
Done. The CI isn't running automatically for me, but when it eventually runs I'll address any build issues in Zeek.
Me too, for the exact reasons you're pointing out. I think we should merge it only after having confidence in it. Also, this PR removes the opaque blobs Zeek is currently sending around and thus gives Broker subscribers full access to Zeek log messages. I'm sure there are some cool use cases for doing analysis using that data source in real time. We shouldn't merge it, get (some) people excited and then swap it back out again.
1b0e43e to 8190895
As mentioned in zeek/zeek#644, removal of Zeek-side batching is on hold until after CAF 0.18. The followup issue to track that: zeek/zeek#771
The Zeek-side batching was implemented in response to observing performance degradation with Broker compared to the old communication backend. Batching multiple Zeek messages into a single Broker message and making the content opaque to Broker by exchanging binary data (`serial_data`) mitigated the performance issues. Since then, we made several performance improvements. Most notably, Broker now uses copy-on-write messaging in order to avoid unnecessary copy overheads.

Latest benchmark results all seem to indicate that we no longer need the Zeek-side batching. Hence, this commit essentially undoes the batching:

- Remove `broker::zeek::Message::Batch` and `broker::zeek::Batch`
- […] `valid` functions and make them static
- […] `valid` function checks whether it's safe to create an object in the first place
- […] `serial_data` members in favor of using Broker's `data::vector` directly

This API change also requires a patch in Zeek (see accompanying PR in `zeek/zeek`).
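The idea of a static `valid` check that runs before an object is created can be sketched as follows. This is not Broker's actual API: `Field`, `Message`, and `EventView` are hypothetical names, and the message layout (a name string followed by arguments) is an assumption made only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical field and message types standing in for Broker's data vector.
using Field = std::variant<std::uint64_t, std::string>;
using Message = std::vector<Field>;

// Hypothetical view over a message assumed to look like [name, args...].
class EventView {
public:
  // Static, so callers can test a message before any EventView exists.
  static bool valid(const Message& msg) {
    return !msg.empty() && std::holds_alternative<std::string>(msg[0]);
  }

  // Precondition: valid(msg) returned true.
  explicit EventView(Message msg) : msg_(std::move(msg)) {}

  const std::string& name() const { return std::get<std::string>(msg_[0]); }

  std::size_t num_args() const { return msg_.size() - 1; }

private:
  Message msg_;
};
```

A caller would check `EventView::valid(msg)` first and only then construct the view, so the accessors never need to handle a malformed message.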