Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: Auto-configure middleware pipelines via priority and pipeline type properties #6660

Open
berndverst opened this issue Jul 13, 2023 · 19 comments · May be fixed by #7299
Open

Proposal: Auto-configure middleware pipelines via priority and pipeline type properties #6660

berndverst opened this issue Jul 13, 2023 · 19 comments · May be fixed by #7299

Comments

@berndverst
Copy link
Member

berndverst commented Jul 13, 2023

This is a proposal to add a new mechanism for configuring Middlewares, directly from the component YAML instead of the Configuration CRD.

Each middleware would support the optional priority property via which the ordering can be determined in which a middleware components should be applied to a pipeline.

Another property should be pipelineType with values httpPipeline and appHttpPipeline at this time.

This also will allow scoping of middleware components to individual apps in Kubernetes without having to rely on a global configuration which may be invalid for an individual app.

@yaron2
Copy link
Member

yaron2 commented Jul 13, 2023

Middleware does not target components, so I'm not clear on why or how a component would be used to specify a middleware.

As a refresher, middleware apply to incoming HTTP requests to the Dapr API and outgoing HTTP requests to the app's API.

As an aside, Configurations are not global.

@ItalyPaleAle
Copy link
Contributor

I like this proposal which I think makes using middlewares simpler. The current 2-step process can be confusing: it requires coordinating what's in the Component resource (defining the middleware and its spec) with the Configuration resource (actually enabling the middleware).

A few questions:

  1. How would this new approach work with the old one? To ensure backwards compatibility?
  2. Let's clarify how "priority" works. I assume, the lower, the earlier the middleware is loaded?

@yaron2
Copy link
Member

yaron2 commented Jul 13, 2023

Is the goal here to simplify the setting of a middleware? It'd be great to have a visualization of a priority chain with 2 middlewares with example YAMLs.

If this is the intended proposal:

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: routerchecker1
spec:
  type: middleware.http.routerchecker
  version: v1
  metadata:
  - name: rule
    value: "^[A-Za-z0-9/._-]+$"
  - name: pipelineType
    value: appHttpPipeline
  - name: priority
    value: 1
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: routerchecker2
spec:
  type: middleware.http.routerchecker
  version: v1
  metadata:
  - name: rule
    value: "^[A-Za-z0-9/._-]+$"
  - name: pipelineType
    value: appHttpPipeline
  - name: priority
    value: 2

And the intended outcome is that routerchecker1 will execute first followed by routerchecker2 when Dapr is calling the app via HTTP, then I think this is a valid proposal.

The issues I have are mainly around backwards compat and that priority will become a non-typed reserved field name that will prohibit being used by individual middleware components, even though we have precedents for that.

Also, what happens if there are two middleware components with the same priority? this leads to the following question - if developers/operators want to change priority, how many and which steps would be required in order to do that? In the Configuration method, the call chain is visualized explicitly and changing priorities requires touching a single resource.

@berndverst
Copy link
Member Author

Is the goal here to simplify the setting of a middleware? It'd be great to have a visualization of a priority chain with 2 middlewares with example YAMLs.

If this is the intended proposal:

apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: routerchecker1
spec:
  type: middleware.http.routerchecker
  version: v1
  metadata:
  - name: rule
    value: "^[A-Za-z0-9/._-]+$"
  - name: pipelineType
    value: appHttpPipeline
  - name: priority
    value: 1
apiVersion: dapr.io/v1alpha1
kind: Component
metadata:
  name: routerchecker2
spec:
  type: middleware.http.routerchecker
  version: v1
  metadata:
  - name: rule
    value: "^[A-Za-z0-9/._-]+$"
  - name: pipelineType
    value: appHttpPipeline
  - name: priority
    value: 2

And the intended outcome is that routerchecker1 will execute first followed by routerchecker2 when Dapr is calling the app via HTTP, then I think this is a valid proposal.

The issues I have are mainly around backwards compat and that priority will become a non-typed reserved field name that will prohibit being used by individual middleware components, even though we have precedents for that.

Also, what happens if there are two middleware components with the same priority? this leads to the following question - if developers/operators want to change priority, how many and which steps would be required in order to do that? In the Configuration method, the call chain is visualized explicitly and changing priorities requires touching a single resource.

Yes the goal is to simplify the activation of middleware and their ordering. This can be done via the middleware components themselves. It does not need to live in the Configuration CRD - but that can be maintained for backwards compatibility.

Of course we can choose different reserved metadata fields. Keep in mind consumerID is such a field. We could even look for these fields (whatever they are called) at the runtime level without having to update the middleware components at all, similarly to the way consumerID is handled for PubSub components.

@yaron2
Copy link
Member

yaron2 commented Jul 13, 2023

  1. What happens if there are two middleware components with the same priority?
  2. If developers/operators want to change priority, how many and which steps would be required in order to do that? In the Configuration method, the call chain is visualized explicitly and changing priorities requires touching a single resource.

It does not need to live in the Configuration CRD - but that can be maintained for backwards compatibility.

If both methods are defined, who wins?

@berndverst
Copy link
Member Author

berndverst commented Jul 13, 2023

Middleware does not target components, so I'm not clear on why or how a component would be used to specify a middleware.

They are components. I know they don't target components. The idea is that config can shift into each components so that by looking at all components we can auto-generate / load / configure what today lives in the Configuration object for each app.

As a refresher, middleware apply to incoming HTTP requests to the Dapr API and outgoing HTTP requests to the app's API.

I always get confused about the ordering where it is applied. So thanks for the correction.

As an aside, Configurations are not global.

Ah, I forgot about the annotation dapr.io/config. Still, I find this unnecessarily complex for configuring the HTTP middlewares. So the idea is to simplify the pipeline configurations!

@ItalyPaleAle
Copy link
Contributor

The issues I have are mainly around backwards compat

Agree. We need to make sure that:

  1. Dapr can continue to use the current method of defining pipelines in a Configuration resource. We may deprecate that, but it should still be supported for a long time.
  2. Point 1 also means that we shouldn't enable a middleware in a pipeline unless there's the "priority" field for example (or something else we decide).
  3. Another thing to decide is what to do if we have both components that include a "priority" (so are enabled "automatically") and those that are loaded via a Configuration. This can get messy. Maybe we disable the use of Configuration for enabling middlewares if there at least one middleware compoent with a "priority"? Or maybe we say that anything defined in the Configuration resource is always loaded before middlewares with a "priority"?

priority will become a non-typed reserved field name that will prohibit being used by individual middleware components, even though we have precedents for that.

Yes we have lots of precedents, including "actorStateStore", "direction" (for bindings), etc.

What happens if there are two middleware components with the same priority?

My recommendation here is to show a warning and randomize the order they are loaded. We need to randomize them so users don't rely on an order "always" being returned.

Also worth highlighting that a pattern is to allow priorities to be non-sequential. So users can define priorities "-100, -200, 0, 100, 200, 300" etc

@yaron2
Copy link
Member

yaron2 commented Jul 13, 2023

So the idea is to simplify the pipeline configurations!

I like the simplicity offered here, for sure.

@berndverst
Copy link
Member Author

berndverst commented Jul 13, 2023

  1. What happens if there are two middleware components with the same priority?

Can we add an error to the components CRD controller for this and fail to apply the component? Or we can show a warning and apply a random ordering as @ItalyPaleAle mentions.

For legacy reason let's not use a default priority value - if no priority is defined the middleware component isn't using the new approach.

  1. If developers/operators want to change priority, how many and which steps would be required in order to do that? In the Configuration method, the call chain is visualized explicitly and changing priorities requires touching a single resource.

It does not need to live in the Configuration CRD - but that can be maintained for backwards compatibility.

If both methods are defined, who wins?

Since priority do not need to be sequential the best practice would be to leave adequate spacing. This is not unlike DNS record priority best practices! You then only need to update a single component with the new priority value.

If remember which component has which priority is a struggle (though hopefully you keep your component definitions in code), then maybe we can also update the METADATA API to show the effective ordering (and priorities) for each middleware pipeline for the given app.

@berndverst
Copy link
Member Author

berndverst commented Jul 13, 2023

I think whenever an app uses the explicit Configuration CRD way to configure middlewares we should disregard the priority approach.

The Configuration CRD is most explicit and therefore should always win, even if this is not the recommended approach (because it isn't the easiest approach) in the future.

Only if the Configuration CRD doesn't configure the pipelines then we should inspect the priorities and build the pipeline automatically.

What do you think @yaron2 @ItalyPaleAle ?

@ItalyPaleAle
Copy link
Contributor

I think whenever an app uses the explicit Configuration CRD way to configure middlewares we should disregard the priority approach.

The Configuration CRD is most explicit and therefore should always win, even if this is not the recommended approach (because it isn't the easiest approach) in the future.

Only if the Configuration CRD doesn't configure the pipelines then we should inspect the priorities and build the pipeline automatically.

What do you think @yaron2 @ItalyPaleAle ?

I don't have a strong opinion either way.

To recap the options for when there's a conflict (both a pipeline defined in the Configuration CRD, and one or more middleware(s) with the pipelineType and priority properties):

  1. Only follow the Configuration CRD and disregard the components that are not included there
  2. Load the components defined in the Configuration CRD first (in the order they are defined), and then load the components that "autoload" in the order of the priority they define.
  3. Show an error and refuse to initialize Dapr

IMHO they are all valid options (maybe 3 is the least desirable one, but still valid).

@berndverst
Copy link
Member Author

berndverst commented Jul 13, 2023

Conceivably a given component could be used in both pipelines at the same time.

So maybe it should be:

  - name: pipelineType
    value: appHttpPipeline,httpPipeline
  - name: priority
    value: 2

or what about a way to configure distinct priorities for the different pipeline types - this would also support future pipeline types:

  - name: priority
    value: appHttpPipeline:2
  - name: priority
    value: httpPipeline:5

Or that could be combined as:

  - name: priority
    value: appHttpPipeline:2,httpPipeline:5

@ItalyPaleAle
Copy link
Contributor

Conceivably a given component could be used in both pipelines at the same time.

IMHO if you need to do that, create 2 separate components. One with pipelineType=httpPipeline and one with pipelineType=appHttpPipeline.

@yaron2
Copy link
Member

yaron2 commented Jul 13, 2023

Conceivably a given component could be used in both pipelines at the same time.

IMHO if you need to do that, create 2 separate components. One with pipelineType=httpPipeline and one with pipelineType=appHttpPipeline.

Yes, that's preferable to different string variations.

Re: conflict behavior, I support letting Configuration win if pipelines are present.

@dapr-bot
Copy link
Collaborator

This issue has been automatically marked as stale because it has not had activity in the last 60 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.

@dapr-bot dapr-bot added the stale Issues and PRs without response label Sep 11, 2023
@ItalyPaleAle
Copy link
Contributor

keep-alive

@dapr-bot dapr-bot removed the stale Issues and PRs without response label Sep 11, 2023
@JoshVanL JoshVanL mentioned this issue Oct 24, 2023
31 tasks
@ItalyPaleAle
Copy link
Contributor

Since this is in the 1.13 milestone, I'm recapping here the final design.

Goal: Simplify how middleware pipelines are defined, so users don't need to work with 2 separate kinds of resources (Component and Configuration). The current method will continue to work for backwards-compatibility and there's no plan to deprecate it at the moment (more on that later).

Defining pipelines in Component metadata

  • We are adding 2 new metadata options (all keys and values are handled as case-insensitive):
    • pipeline can be set to httpPipeline or appHttpPipeline (we'll have aliases including http-pipeline and http for the first; app-http-pipeline and app for the second)
    • priority is an integer from -MaxInt32 to +MaxInt32
  • Both metadata options are required for the component to be enabled in a pipeline. For backwards-compatibility, there's no default priority.

Option We could use a single metadata property (one of 2, or both):

  • appHttpPipeline -> value is the priority for this pipeline
  • httpPipeline -> value is the priority for this pipeline
  • When using this approach, only 1 new metadata option is required (and both can be set to configure the MW on both pipelines)

This wasn't discussed above (although there were hints to that), but I almost like this one more as it doesn't require specifying 2 options

The "priority" indicates the order of the component in the pipeline.

  • Components with lower priorities are executed first
  • Priorities don't necessarily need to be sequential. In the docs, we will document as best practice the use of values like "100", "200", "300", etc - so changing the order is less complex
  • If 2 components have the same priority value, they are both loaded but the order is randomized every time Dapr starts up. Randomization is necessary otherwise users may take a dependency on an order that appears to be stable, but isn't guaranteed. We may want to consider printing a warning in this case.
  • When Dapr starts up, we print a log indicating the pipelines that were loaded, with the components in order. This can be used for debugging

As per usual, Dapr only loads components that are scoped to that app ID.

Interactions with the pipeline specified in the Configuration CRD

If the Configuration CRD contains the definition of a middleware pipeline, Dapr uses the pipelines defined in there, ignoring pipelines that may be defined in Components (we will print a warning if we find that both are defined). This allows preserving backwards-compatibility.

Using the Configuration CRD to define pipelines continues to be a supported method and we have no plans to deprecate this. We will continue to have tests to ensure this is not broken.

However, this will become the "legacy" method. The new, simplified method is what we'll guide new users towards, and what we'll include in docs as the recommended approach.

@JoshVanL JoshVanL added this to the v1.13 milestone Nov 27, 2023
@JoshVanL JoshVanL added the P1 label Nov 27, 2023
@sadath-12
Copy link
Contributor

/assign

@dapr-bot
Copy link
Collaborator

This issue has been automatically marked as stale because it has not had activity in the last 60 days. It will be closed in the next 7 days unless it is tagged (pinned, good first issue, help wanted or triaged/resolved) or other activity occurs. Thank you for your contributions.

@dapr-bot dapr-bot added the stale Issues and PRs without response label Feb 11, 2024
@ItalyPaleAle ItalyPaleAle added pinned and removed stale Issues and PRs without response labels Feb 11, 2024
@mukundansundar mukundansundar modified the milestones: v1.13, v1.14 Feb 26, 2024
@mikeee mikeee mentioned this issue Mar 18, 2024
38 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Assigned - P1
Development

Successfully merging a pull request may close this issue.

7 participants