-
-
Notifications
You must be signed in to change notification settings - Fork 91
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement the scaffolding for new logical operators #3004
Conversation
924c769
to
3604108
Compare
88a0de2
to
dab4b3a
Compare
This is because GCC 10 on our CI does not implement P2162.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First few review comments to make them visible. No reason to wait until I'm through to the end.
Coroutines are not always driven to the first suspension point. This meant that `logical_pipeline::make_local_executor` was prone to UAF. In contrast, the free function captures it by-value immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The operator abstraction should be able to handle all cases, at least for vast exec
. I expect the implementation to hold up for now, but we may discover that not all our assumptions hold when we implement hosted pipelines.
This PR makes some changes to the implementation of RFC-001 (#3004). The main goal is to simplify operator definitions and internal execution details. - The instantiation state is now only stored by the instantiated generator. - There is no more `physical_operator`/`logical_operator` separation in the code base, because we have only one intermediate representation for both (though we can still use these terms in explanations). - Operators can be copied without a `parse(to_string())` round-trip. - Schema-specific dispatching is an implementation detail of the operator, but we provide a class that makes this convenient. This drastically simplifies the execution engine. - Polymorphic operators are now possible. For instance, `head` could be made to work on both event and byte streams. - The empty pipeline is therefore now also correctly typed, same as `pass`. Pipeline serialization is left for a follow-up PR.
In a three-day Hackathon with @jachris, we've implemented the new scaffolding for logical operators as designed in RFC-001, and implemented a local executor that works without spinning up a new thread.
The goal for this PR is to provide the scaffolding such that follow-up PRs can make the following changes:
/query
endpoint, theimport
andexport
commands, and the partition transformer.For reviewing, we recommend the following order of files:
element_type.hpp
to get an understanding of element types and batches, and their type-erased versionlogical_operator.hpp
to get an understanding of the logical operator API.logical_pipeline.hpp
to get an understanding of how logical operators compose into a logical pipeline.physical_operator.hpp
,where.cpp
, andhead.cpp
to get an understanding of how operators transform data (including two examples).logical_pipeline.cpp
to see how we connect producers to be able to execute pipelines.During review, please pay attention to the desired destruction order of producer, physical operator, logical operator, and operator control plane.
We recommend pairing with us for the review, especially for
logical_pipeline::make_local_executor()
.