Skip to content

Correlated request/response data API#93

Open
robobario wants to merge 3 commits intokroxylicious:mainfrom
robobario:request-response-api
Open

Correlated request/response data API#93
robobario wants to merge 3 commits intokroxylicious:mainfrom
robobario:request-response-api

Conversation

@robobario
Copy link
Copy Markdown
Member

Adds a proposal for a framework API to enable Filters to stash some Contextual data for a Request, that it can then access later when processing the Response.

@robobario robobario requested a review from a team as a code owner April 8, 2026 21:49
Copy link
Copy Markdown
Member

@tombentley tombentley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Rob.

I left a few comments.

I also have a slightly larger point:

Currently we assume all filters are stateful (in the sense of keeping conversational state in their fields). We've sometimes talked about the value of being able to declare a filter as being stateless. And when we knew a filter was stateless we would be able to use a @Stateless (in the Netty sense) version of FilterHandler to invoke it. Having those mechanics would mean stateless filters would require constant memory, rather than memory proportional to the number of connections.

One thing compromise we might be making by doing what is proposed is that we have no way of knowing whether a filter will actually use this new mechanism or not. The proposal compromises the type safety of the builder API a bit for supporting such explicitly stateless filters in the future: Such filters would have to not use this API (in addition to not having any fields and so on). From a purely API-focussed PoV it would be a little bit nicer if we could have used static typing to avoid stateless filters having access to request builders with these stateful methods at all.

I don't really think that's feasible with the current filter API, however. Therefore what you're proposing here is OK. But it might be a good idea to call out this tradeoff explicitly in the proposal text.

Comment thread proposals/xxx-correlated-request-response-data-api.md
Comment thread proposals/xxx-contextual-request-response-data-api.md Outdated

### Implementation Considerations
* **Scoping:** The data is strictly scoped to the filter instance that created it. It is not shared with other filters in the chain to avoid implicit coupling.
* **Lifecycle Management:** The framework becomes responsible for "popping" the data. If a request is a zero-ack Produce request, the framework will proactively prevent or discard the stashed data to prevent leaks. If the Response Filter does not access the data, the framework will handle discarding the stashed data.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

proactively prevent or discard

which is it?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated it to discard. If we made it eagerly throw an exception on trying to stash data for a zero-ack Produce request, then the author (presumably unaware of zero-ack's special-ness) might not encounter this problem until they've put their Filter out there in the world. It feels like discarding will cause less heartache.

```java
return context.requestFilterResultBuilder()
.forwardRequest(header, request)
.withContextualData(myStateObject)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the fluent API, but I'm not keen on the method name. I don't think a typical developer, even one with some knowledge of the Kafka protocol, would know what "context" meant without looking at the javadoc.

Using the word "correlation" or "correlated" would at least be a hint that this was something to do with correlationId. So perhaps something like withCorrelatedData is better?

But even that doesn't really say what it does clearly. However, that following that path suggests something like savingStateForResponse, which seems like a bit of a mouthful.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have updated to correlated language

Signed-off-by: Robert Young <robertyoungnz@gmail.com>
Signed-off-by: Robert Young <robertyoungnz@gmail.com>
@robobario robobario force-pushed the request-response-api branch from 6cbf6f0 to 0aef499 Compare April 13, 2026 00:52
@robobario robobario changed the title Contextual request/response data API Correlated request/response data API Apr 13, 2026
Copy link
Copy Markdown
Member

@tombentley tombentley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @robobario, I think this looks good.

Comment thread proposals/xxx-correlated-request-response-data-api.md Outdated
Co-authored-by: Tom Bentley <tombentley@users.noreply.github.com>
Signed-off-by: Robert Young <robertyoungnz@gmail.com>
## Motivation

Manual state management for correlated data is error-prone and introduces several risks:
* **Memory Leaks:** If a response never returns or a filter fails to "pop" the data, the internal map grows indefinitely.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even the io.kroxylicious.proxy.filter.ResponseFilter#shouldHandleResponse could even say no


### Implementation Considerations
* **Scoping:** The data is strictly scoped to the filter instance that created it. It is not shared with other filters in the chain to avoid implicit coupling.
* **Lifecycle Management:** The framework becomes responsible for "popping" the data. If a request is a zero-ack Produce request, the framework will proactively discard the stashed data to prevent leaks. If the Response Filter does not access the data, the framework will handle discarding the stashed data.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this approach, but did you consider making it an error to try to store state in a zero-ack produce-request. should it be a rejected alternative?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a bad idea because when a filter developer forgets to gets the acks=0 case (and we've been guilty of that ourselves!) they release something which will break with an exception.

But there is a deeper question there: We're assuming that this state that we're keeping around for the response is only really using memory. But what if there are other resources in there, that are invisible to the runtime. Like a socket/connection/file descriptor? So I wonder if we should require that the data they associate should actually be Closable. That way we can guarantee that the runtime will fully clean it up even if the filter developer has forgotten that acks=0 is a thing.

Wdyt @robobario ?

### Implementation Considerations
* **Scoping:** The data is strictly scoped to the filter instance that created it. It is not shared with other filters in the chain to avoid implicit coupling.
* **Lifecycle Management:** The framework becomes responsible for "popping" the data. If a request is a zero-ack Produce request, the framework will proactively discard the stashed data to prevent leaks. If the Response Filter does not access the data, the framework will handle discarding the stashed data.
* **Data Structure:** The API will support a single state object. Filter authors can define a custom `Record` or `POJO` to hold multiple values, a complex Key-Value map is deemed unnecessary.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to say anything about stashing the request itself? A filter may stash the request, but there is no guarantee that request will carry any modifications made by filters upsteam of this one.

**Response Path:**
The context made available to the response filter will provide access to the stashed object:
```java
Optional<MyStateObject> data = context.correlatedData();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if the request filter calls this method? is it an error?

Copy link
Copy Markdown
Member

@k-wall k-wall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple of comments, but otherwise LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants