Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPOE next generation #2502

Open
capflam opened this issue Mar 25, 2024 · 10 comments
Open

SPOE next generation #2502

capflam opened this issue Mar 25, 2024 · 10 comments
Labels
type: feature This issue describes a feature request / wishlist.

Comments

@capflam
Copy link
Member

capflam commented Mar 25, 2024

Your Feature Request

The current implementation of the SPOE was deprecated for the 3.0 (https://www.mail-archive.com/haproxy@formilux.org/msg44680.html). Idea was not to remove the feature but to rework it. The engine is now too outdated to make it evolve and a new one must be implemented. Mainly to be based on recently implemented core features (muxes, connection sharing...) but also to be adapted to real usage. When it was developed, the SPOE was designed with the payload streaming in mind. But it was never implemented because of the feature's complexity. Because of that, the current design is overkill and make the SPOE even harder to maintain.

With this feature request, we will try to describe the base of the next generation of the SPOE.

What are you trying to do?

First of all, as far as possible, the protocol (SPOP) should be kept compatible with existing agent implementations. At first glance, it seems feasible. Then the streaming must be kept out of the scope of the study. This way we will be able to simplify the design and probably the agent implementations. Finally the asynchronous mode (the ability to send a request on a connection to eventually receive the response on another one) will be removed. It was most probably never used and it is too complex to be kept in the new design.

For the 3.0, the current implementation will be kept as deprecated with a warning to notify users some configuration changes should be expected in future versions. In same time, a SPOP mode for backends will be introduced. This should ease the introduction of a SPOP multiplexer in the 3.1. Fragmented frames will no longer be used by HAProxy. Without the payload streaming, it is overkill and almost useless. Some new configuration parameters will probably be introduced to prepare the refactoring but hard to be specific for now.

For the 3.1, the connection sharing mechanism must be reviewed to support SPOP connections (or more generally any type of connection ?). For now, only HTTP connections are concerned. And a SPOP multiplexer must then be implemented. It is still foggy but the HTX should be evaluated to represent SPOP messages exchanged between the SPOE applet and the SPOP multiplexer. It may be a good way to support other protocols with the same engine (gRPC for instance). The SPOE applet must be rewritten. It is of course a quick sum up of what should be done. Only the main milestones are mentioned. But it is a good start for discussion.

Output of haproxy -vv

HAProxy version 3.0-dev5-8674f6-59 2024/03/25 - https://haproxy.org/
Status: development branch - not safe for use in production.
Known bugs: https://github.com/haproxy/haproxy/issues?q=is:issue+is:open
Running on: Linux 6.7.6-200.fc39.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Feb 23 18:27:29 UTC 2024 x86_64
@capflam capflam added the type: feature This issue describes a feature request / wishlist. label Mar 25, 2024
@wlallemand wlallemand pinned this issue Mar 26, 2024
@capflam capflam unpinned this issue Mar 26, 2024
@capflam capflam pinned this issue Mar 26, 2024
@git001
Copy link
Contributor

git001 commented Mar 28, 2024

Let me add some feature request for the redesign phase.

It would be very helpful to be able to scan the body of a request via SPOE. Such a request comes from time to time and this is one of the limiting factor for the current implementation, from my point of view.

@wtarreau
Copy link
Member

That was definitely part of the initial design and it requires many more changes and added a huge amount of complexity for something that was never completed. One of the approaches that we'll follow on the redesign is to make the engine better, more reliable, faster, observable etc and this passes by first being simpler. Simpler definitely implies getting rid of the streaming attempts.

I know it can sound a bit sad but let's be honest, it has had very very little traction over the last 7-8 years and is responsible for most of the complexity. Maybe that would be revisited in the future but for now one of the plans are to not waste precious time and effort on this low-demand feature.

@git001
Copy link
Contributor

git001 commented Mar 28, 2024

Well I don't agree with that statement.

I know it can sound a bit sad but let's be honest, it has had very very little traction over the last 7-8 years and is responsible for most of the complexity. Maybe that would be revisited in the future but for now one of the plans are to not waste precious time and effort on this low-demand feature.

There was and is still are use cases where the body is very helpful for modules like the WAF's with mod_security. There is already a solution for that Coraza SPOA - HAProxy Web Application Firewall which could benefit from the feature to scan also the Body.

Maybe the streaming of the body could be limited to some size and do not stream the body but send some amount of data with a fixed size.

What I want to highlight is that at this redesign there should be not a design decision which makes the stream of the body almost impossible for the future.

I'm looked into this line and asked my self:

  • Could this body handling reliable work?
  • Is this already a working solution?

https://github.com/DropMorePackets/haproxy-go/blob/master/spop/example/header-to-body/engine.cfg#L13

    args id=unique-id src-ip=src method=method path=path query=query version=req.ver headers=req.hdrs body=req.body

@wtarreau
Copy link
Member

From my understanding Coraza can already scan the body contained in the first buffer. I'm not saying this is the panacea but if they developed it as an SPOA it means they do see some value in doing this. We've also seen two other WAFs working this way and being used (mod_security and mod_defender).

What you mean by "not stream the body but send some amount of data with a fixed size" is exactly what is currently being done and what will be re-implemented. There's no reason to change this.

What is a total pain is to stream the data before header processing, hence before a routing decision is taken, and re-inject them in return and pass them to the next stage -- all that using the same buffer for in and out. Even the limited forwarding that we originally envisioned inline (e.g. image recompression) is a pain to deal with: how do you adjust a response header to match what you're doing with the contents for example. All of these initial intents significantly contributed to the complexity of the current implementation while still making it insufficient.

And I'm not even speaking about all the design discussions about applet architecture simplification etc that often ended up with "wait wait we can't do that, there's SPOE". My feeling was that it was almost unused and that we could just get rid of it, even if it annoys a few users, because in the end everyone pays the consequences of maintaining it by delayed features and more complex bugs in the core. Given that there has been concerns and objections we discussed this and came to an agreement to prioritize a re-implementation but now the primary focus must be maintainability, and the immediate second one is to address all the design choices that were causing trouble (including the painful idle connection management that didn't support inter-thread sharing, and the LB that was only applying to connection creation instead of messages).

I think we'll get the best of both worlds: same filtering/processing features as right now, lower resource usage for agents, much better load balancing, higher reliability, higher performance and no more maintenance hell (i.e. no longer dream of getting rid of it).

@git001
Copy link
Contributor

git001 commented Mar 29, 2024

What is a total pain is to stream the data before header processing, hence before a routing decision is taken, and re-inject them in return and pass them to the next stage -- all that using the same buffer for in and out. Even the limited forwarding that we originally envisioned inline (e.g. image recompression) is a pain to deal with: how do you adjust a response header to match what you're doing with the contents for example. All of these initial intents significantly contributed to the complexity of the current implementation while still making it insufficient.

Ah okay, got you. I haven't thought that this would anybody need, but I'm in my own bubble 😄 .

To clarify it a bit about the wording.

  • The "streaming Body" means here to re-inject the body after some processing, right?
  • The "handling Body" means here to send the body do SPOA do with that what every wanted and set some variables for further HAProxy processing, right?

@wtarreau
Copy link
Member

The "streaming body" would mean that you'd have to send all of it to the agent with the agent sending it back. In terms of communication it would have the form of a "T", with client->ha, ha->agent, agent->ha, ha->server.

What's currently supported is what you mentioned, req.body is sent to the agent without having to leave the internal buffers, and the agent sends its verdict, variables etc. And we intend to keep this, of course!

@agirot
Copy link

agirot commented Apr 17, 2024

@wtarreau 👋 I am just curious, why rework SPOE and not try to implement proxy-wasm (https://github.com/proxy-wasm/spec) which now seems to be the standard way to add plugin/hook in proxy ecosystem (added in envoy, kong, traefik,ats...). Haproxy could take advantage of plugins already created for other proxies, it's seem pretty cool ?

#1482

@wtarreau
Copy link
Member

It seems totally orthogonal. I would like on the long term to support what others also support, to try to unify the ecosystems. But here we observed some resistance from some users who had invested a lot of time on their SPOA and it just seems that we underestimated the amount of code already deployed. In fact it seems that for many of them it works well enough that they never communicate about it. And when you think about it it makes sense, they implement their own stuff internally using it, and only have to discuss them publicly if the engine doesn't work, or when we mark it deprecated 4 months after announcing it.

Thus we discussed and decided that OK we're not going to leave them in the dust, but since we can't really pursue on the existing architecture either, we'll dedicate some of the core team's time to reimplement a better and more durable architecture that will not only stop always being in the way when we try to fix stuff, but will also bring them the performance and reliability that's currently missing in the aging architecture.

Then after that it will still be time to study what is commonly accepted as "the norm" for other projects. For me what's being described here as proxy-wasm is how proxies should interact with their own wasm virtual machine, just as we have with Lua right now. And of course, if we'd eventually go down that route it would be super important to be compatible with that. But we may also one day have to support existing external agents that are compatible with other proxies. Some stuff might already work on HTTP or gRPC nowadays for example. As an illustration, before SPOP we had a thought about IMAP which used to be the norm 20 years ago. What's important is to make sure that if others unite around a technology, we're not the only ones not using it. It causes double work for implementers, and fragmentation for users, that's not great.

@zino7825
Copy link

Spoa is a really necessary function, so you need to think about next negation.

We are using so many different types of spoa.

Authentication module, mirroring, mod_security, etc..
I think it's a necessary function.

I tried implementing it with lua, but the performance was not good.

@wtarreau
Copy link
Member

That's exactly why we're going to reimplement a new SPOE that will remain compatible with your existing SPOA.

haproxy-mirror pushed a commit that referenced this issue May 22, 2024
This reverts commits 885e404 and
dff9807.

We decided to spend some time to refactor and rationnalize the SPOE for the
3.1. Thus there is no reason to still consider it as deprecated for the
3.0. Compatibility between the both versions will be maintained.

See #2502 for more info.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: feature This issue describes a feature request / wishlist.
Projects
None yet
Development

No branches or pull requests

5 participants