feat(pipelines transform): load and handle pipelines transforms #9733
Signed-off-by: Jérémie Drouet <jeremie.drouet@gmail.com>
Signed-off-by: Jérémie Drouet <jeremie.drouet@datadoghq.com>
Overall, this seems reasonable. 👍🏻
My biggest complaint is just a lack of documentation which made it hard to understand what is going on when looking only at the code. It's not a blocker, but I do think some more documentation should be added either in this PR or a subsequent one.
// This is a hack around the issue of cloning
// trait objects. So instead to clone the config
// we first serialize it into JSON, then back from
// JSON. Originally we used TOML here but TOML does not
// support serializing `None`.
This should already be possible, since `TransformConfig` requires `dyn_clone::DynClone` as a supertrait, and we use the `dyn_clone::clone_trait_object!` macro to define a `Clone` impl for `Box<dyn TransformConfig>`:
vector/lib/vector-core/src/transform/config.rs
Lines 45 to 69 in d849e22
pub trait TransformConfig: core::fmt::Debug + Send + Sync + dyn_clone::DynClone {
    async fn build(&self, globals: &TransformContext)
        -> crate::Result<crate::transform::Transform>;

    fn input_type(&self) -> DataType;

    fn output_type(&self) -> DataType;

    fn named_outputs(&self) -> Vec<String> {
        Vec::new()
    }

    fn transform_type(&self) -> &'static str;

    /// Allows a transform configuration to expand itself into multiple "child"
    /// transformations to replace it. This allows a transform to act as a macro
    /// for various patterns.
    fn expand(
        &mut self,
    ) -> crate::Result<Option<(IndexMap<String, Box<dyn TransformConfig>>, ExpandType)>> {
        Ok(None)
    }
}

dyn_clone::clone_trait_object!(TransformConfig);
Should be as simple as deriving `Clone` on `PipelineConfig`. Does doing that return a specific error?
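To illustrate why deriving `Clone` should work once boxed trait objects are themselves `Clone`, here is a minimal std-only sketch of the same idea. It hand-rolls what `dyn_clone` automates rather than using the crate, and `StepConfig`, `Rename`, `Pipeline`, and `clone_box` are hypothetical names, not Vector's types.

```rust
use std::fmt::Debug;

// Hypothetical stand-in for a config trait; this is not Vector's `TransformConfig`.
trait StepConfig: Debug {
    fn name(&self) -> String;

    // Object-safe cloning hook: each implementor returns a boxed copy of itself.
    fn clone_box(&self) -> Box<dyn StepConfig>;
}

// `Box<dyn StepConfig>` becomes `Clone` by delegating to `clone_box`.
impl Clone for Box<dyn StepConfig> {
    fn clone(&self) -> Self {
        self.clone_box()
    }
}

#[derive(Debug, Clone)]
struct Rename {
    to: String,
}

impl StepConfig for Rename {
    fn name(&self) -> String {
        format!("rename:{}", self.to)
    }

    fn clone_box(&self) -> Box<dyn StepConfig> {
        Box::new(self.clone())
    }
}

// With the impl above, a struct holding boxed trait objects can simply
// derive `Clone`; no serialization round trip is needed.
#[derive(Debug, Clone)]
struct Pipeline {
    steps: Vec<Box<dyn StepConfig>>,
}
```

Calling `.clone()` on a `Pipeline` then copies each boxed step through `clone_box`, which is the behaviour the JSON round trip was standing in for.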
It's how it is currently done in the `ConfigBuilder` struct. I didn't want to reinvent the wheel on this.
Maybe we should create an issue related to this and handle it in a separate PR.
I created this issue #9898
Echoing @tobz: it'd be useful to add some additional commentary around some of the struct changes and functions, to provide context for future readers.
I found myself jumping around the code a little, trying to figure out why a change was brought in, particularly around the `Noop` transform and the properties added to `ExpandType`.
I'd also like sign-off from @lukesteensen before merging.
Definitely would echo the desire for some docs on how this all fits together. This takes our macro expansion quite a bit further than anything else, and it's difficult to keep straight how all of the new things fit together to result in the chart in the PR description.
In particular, with things like `Expander`, `EventRouterConfig`, and `EventFilterConfig`, it seems like there are things that exist purely as intermediate expansions. I would like very much to simplify this by directly representing more of these concepts in the topology, but it would currently be difficult to refactor without worrying that something isn't going to be expanded in exactly the same way. Documentation and tests for how things are meant to expand would help that quite a bit.
Overall though, this is very neat and seems like it should work!
/// This way of expanding will duplicate the inputs for every expanded node.
/// If `aggregates` is set to `true`, then a `Noop` transform will be added
/// so that you can use the original component name as an input.
This makes sense! If I were to tweak the wording, I'd say something like:
> Duplicate the inputs onto every expanded node, fanning out so that each node receives inputs in parallel. If `aggregates` is set to `true`, then a `Noop` transform will be added such that each expanded node's output is fanned back in to pass through that node, which can then be used as an input for other components.
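The fan-out and fan-in described here can be sketched as follows. This is only an illustration under stated assumptions, not the PR's implementation: `Node` and `expand_parallel` are hypothetical, and naming children `parent.child` is an assumption based on names such as `my_pipelines.logs` that appear elsewhere in this review.

```rust
// Hypothetical, simplified topology node: a name plus the names it reads from.
#[derive(Debug, Clone, PartialEq)]
struct Node {
    name: String,
    inputs: Vec<String>,
}

// Sketch of a parallel expansion: every child receives a copy of the parent's
// inputs. When `aggregates` is true, a pass-through node named after the
// parent reads from every child, so downstream components can keep using the
// original component name as an input.
fn expand_parallel(
    parent: &str,
    inputs: &[String],
    children: &[&str],
    aggregates: bool,
) -> Vec<Node> {
    let mut nodes: Vec<Node> = children
        .iter()
        .map(|child| Node {
            // Assumed naming scheme: `parent.child`.
            name: format!("{}.{}", parent, child),
            inputs: inputs.to_vec(),
        })
        .collect();

    if aggregates {
        // Fan back in: the aggregating node reads from every expanded child.
        let fan_in: Vec<String> = nodes.iter().map(|node| node.name.clone()).collect();
        nodes.push(Node {
            name: parent.to_string(),
            inputs: fan_in,
        });
    }

    nodes
}
```

With `aggregates` set to `false`, only the children are returned and nothing carries the parent's name.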
/// This way of expanding will take all the components and chain them in order.
/// The first node will be renamed `component_name.0` and so on.
/// If `alias` is set to `true`, then a `Noop` transform will be added as the
/// last component and named `component_name` so that it can be used as an input.
Same here, just some small tweaks:
> Chain components together one after another. Components will be named according to this order (e.g. `component_name.0` and so on). If `alias` is set to `true`, then a `Noop` transform will be added as the last component and given the raw `component_name` identifier so that it can be used as an input for other components.
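A rough sketch of the chaining described here, again under assumptions: `expand_serial` is a hypothetical helper that returns `(name, inputs)` pairs and only models the wiring, not the real transform configs.

```rust
// Sketch of a serial expansion, returning `(name, inputs)` pairs.
// The first node reads the parent's inputs; each later node reads the node
// before it. When `alias` is true, a final pass-through node carries the
// parent's own name so other components can use it as an input.
fn expand_serial(
    parent: &str,
    inputs: &[String],
    count: usize,
    alias: bool,
) -> Vec<(String, Vec<String>)> {
    let mut nodes = Vec::new();
    let mut previous = inputs.to_vec();

    for index in 0..count {
        // Nodes are named after their position: `parent.0`, `parent.1`, ...
        let name = format!("{}.{}", parent, index);
        nodes.push((name.clone(), previous));
        previous = vec![name];
    }

    if alias {
        // The alias reads from the last node in the chain.
        nodes.push((parent.to_string(), previous));
    }

    nodes
}
```

For two chained components with `alias` enabled, this yields `parent.0`, then `parent.1` reading from `parent.0`, then `parent` reading from `parent.1`.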
src/transforms/pipelines/mod.rs
Outdated
/// This represent the configuration of a single pipeline,
/// not the pipelines transform itself.
s/represent/represents/ and I'd say "not the pipelines transform itself, which can contain multiple individual pipelines" just to be more clear.
@@ -43,6 +45,7 @@ impl PipelineConfig {
    }
}

/// This represent an ordered list of pipelines depending on the event type.
This could use a bit more elaboration. From the description alone I'm still not sure what exactly it does.
@@ -1,3 +1,61 @@
/// This pipelines transform is a bit complex and needs a simple example.
///
/// If we take the following example in consideration
I would word this:
If we consider the following example:
/// The pipelines transform will first expand into 2 parallel transforms for `logs` and
/// `metrics`. A `Noop` transform will also be added to aggregate `logs` and `metrics`
/// into a single transform and to be able to use the transform name (`my_pipelines`) as an input.
///
/// Then the `logs` group of pipelines will be expanded into an `EventFilter` followed by
/// a series of `PipelineConfig` via the `EventRouter` transform. At the end, a `Noop` alias is added
/// to be able to refer to `logs` as `my_pipelines.logs`.
/// Same thing for the `metrics` group of pipelines.
///
/// Each pipeline will then be expanded into a list of its transforms and at the end of each
/// expansion, a `Noop` transform will be added to use the `pipeline` name as an alias
/// (`my_pipelines.logs.transforms.foo`).
This is really helpful! Thanks for adding this.
examples: [
    {
        title: "Filter by log level and reformat"
        configuration: """
Just noting, this should be structured data, not a string. @jdrouet was there precedent in another component for using strings here? I can't find one.
Resources
How it works
The pipelines transform will expand into the following graph
What's left to do
What's left to be done (other PR)
For those who want to change the chart