Virtual connections in Kafka server #16658

Merged

@mmaslankaprv (Member) commented Feb 21, 2024

Implemented virtualization of Kafka connections on top of one physical
TCP connection. Requests processed in a virtual connection context are
independent of requests in other virtual connections. This way, requests
from different virtual connections cannot block each other.

Virtual connections are only available if the first request handled by
the connection has the __redpanda_mpx client id set and Redpanda MPX
extensions are enabled.
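
To illustrate the independence property described above, here is a minimal C++ sketch. It is not taken from the PR: it assumes that responses only need to be written in request order within a single virtual connection, and it uses plain standard-library containers in place of Seastar and Redpanda types. The names `virtual_connection_mux`, `on_request`, and `on_done` are hypothetical.

```cpp
#include <cstdint>
#include <deque>
#include <map>
#include <string>
#include <vector>

// Hypothetical model for illustration; not Redpanda's connection_context.
struct pending_request {
    std::int32_t correlation_id;
    bool done;
};

class virtual_connection_mux {
public:
    // Record a request that arrived on virtual connection `vc`.
    void on_request(const std::string& vc, std::int32_t correlation_id) {
        _queues[vc].push_back(pending_request{correlation_id, false});
    }

    // Mark a request as finished and return the correlation ids whose
    // responses may now be written: the finished prefix of that one queue.
    // Queues of other virtual connections are never consulted.
    std::vector<std::int32_t>
    on_done(const std::string& vc, std::int32_t correlation_id) {
        std::vector<std::int32_t> ready;
        auto it = _queues.find(vc);
        if (it == _queues.end()) {
            return ready;
        }
        auto& queue = it->second;
        for (auto& p : queue) {
            if (p.correlation_id == correlation_id) {
                p.done = true;
                break;
            }
        }
        while (!queue.empty() && queue.front().done) {
            ready.push_back(queue.front().correlation_id);
            queue.pop_front();
        }
        return ready;
    }

private:
    std::map<std::string, std::deque<pending_request>> _queues;
};
```

In this model a slow request on virtual connection "a" delays only the later responses of "a". A request on "b" that finishes first can be answered immediately.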

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.3.x
  • v23.2.x
  • v23.1.x

Release Notes

  • none

@dotnwat (Member) left a comment

largely looks ok to me.

I had imagined that instead of adding complexity to / generalizing connection_context, we would instead leave connection_context as the logical connection largely untouched (except for pulling out the bits that grab the requests off the wire), and virtualize above it.

Comment on lines +323 to +347
    is_first_request()
    && h->client_id == multi_proxy_initial_client_id)) {

Member

so what happens if the first request doesn't have this magic client id, and then later the connection starts behaving like it's virtualized?

Member Author

No, in this case the connection is not virtualized.
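
The exchange above implies that the decision is made once, on the first request, and never revisited. A minimal sketch of that rule follows. It assumes standard-library types only; the class `connection_mode` and its methods are hypothetical, while the constant name and value come from the diff fragment and the PR description.

```cpp
#include <optional>
#include <string_view>

// Value from the PR description; the constant name mirrors the diff.
constexpr std::string_view multi_proxy_initial_client_id = "__redpanda_mpx";

// Hypothetical helper: tracks whether a physical connection is virtualized.
class connection_mode {
public:
    explicit connection_mode(bool mpx_extensions_enabled)
      : _mpx_enabled(mpx_extensions_enabled) {}

    // Call once per request header, in arrival order.
    void observe(std::optional<std::string_view> client_id) {
        if (_seen_first_request) {
            return; // the decision was made on the first request and is final
        }
        _seen_first_request = true;
        _virtualized = _mpx_enabled && client_id.has_value()
                       && *client_id == multi_proxy_initial_client_id;
    }

    bool virtualized() const { return _virtualized; }

private:
    bool _mpx_enabled;
    bool _seen_first_request = false;
    bool _virtualized = false;
};
```

Under this sketch, a connection whose first request carries an ordinary client id stays non-virtualized even if a later request sends the magic id.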

@mmaslankaprv (Member Author)

> largely looks ok to me.
>
> I had imagined that instead of adding complexity to / generalizing connection_context, we would instead leave connection_context as the logical connection largely untouched (except for pulling out the bits that grab the requests off the wire), and virtualize above it.

Connection context has a lot of state that is valid only for the physical connection, like _mtls and auth state. Therefore I decided to do it inside the connection context, to keep the virtualized state as small as possible.

@graphcareful (Contributor) left a comment

Nice job! I like how the diff is pretty small because the abstractions themselves implement process_request

@mmaslankaprv (Member Author)

/dt

Signed-off-by: Michal Maslanka <michal@redpanda.com>
Signed-off-by: Michal Maslanka <michal@redpanda.com>
@mmaslankaprv (Member Author)

/dt

@mmaslankaprv (Member Author)

/dt

@mmaslankaprv marked this pull request as ready for review February 26, 2024 16:49
@mmaslankaprv (Member Author)

/ci-repeat 1

@michael-redpanda (Contributor) left a comment

Looks nice, just a couple of questions

Comment on lines 57 to 81
bytes parse_virtual_connection_id(const ss::temporary_buffer<char>& buffer) {
    // TODO: should we use vcluster_id here ?
    return bytes{
      reinterpret_cast<const uint8_t*>(buffer.begin()), buffer.size()};
}

Contributor

question: So once we get the 'magic' mpx value, each subsequent request will have the virtual connection ID in the client_id. Is there any format to that value? Does it need any additional validation or once we know we're a virtual connection, any value for client ID is accepted?

Member Author

We haven't yet agreed with the team on the format, so for now I made it accept any string. The plan is to add a schema for it in a future PR.
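
As a rough picture of what "accept any string" means here, the following sketch mirrors the shape of `parse_virtual_connection_id` from the diff using standard-library stand-ins. It assumes `std::string_view` in place of `ss::temporary_buffer<char>` and a `std::vector<std::uint8_t>` alias in place of Redpanda's `bytes` type; neither substitution comes from the PR.

```cpp
#include <cstdint>
#include <string_view>
#include <vector>

// Stand-in for Redpanda's bytes type (assumption for this sketch).
using bytes = std::vector<std::uint8_t>;

// Copies the raw client_id bytes as-is. There is no format check, so any
// value, including an empty one, becomes a virtual connection id.
bytes parse_virtual_connection_id(std::string_view client_id_buffer) {
    const auto* first = reinterpret_cast<const std::uint8_t*>(
      client_id_buffer.data());
    return bytes(first, first + client_id_buffer.size());
}
```

Because the id is an opaque byte string, two client ids map to the same virtual connection only if they are byte-for-byte equal.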

Comment on lines 611 to 612
auto v_connection_id = parse_virtual_connection_id(
  rctx.header().client_id_buffer);

Contributor

question: Is there any requirement/need to handle authentication messages differently than other requests? Like, only handle authn requests if it has the magic mpx header value? As RP now supports re-authentication should that be disallowed in a virtual connection? How does the MPX system authenticate with Redpanda? Is it via SASL/SCRAM or will it use mTLS?

Member

Maybe worth noting that reauth is performed automagically (transparently) by modern enough Kafka clients. If the MPX system needs to mediate requests in some way to conform to the virtual connection protocol (and if reauthentication is desirable), this might require additional handling/care.

Member Author

I went over the code and it seems that no additional handling is required for authentication/reauth requests. They do not have any special handling of client id. I think in this case we should rely on the client to order the requests correctly.

Contributor

> They do not have any special handling of client id

The question I was trying to pose was whether authentication for virtual connections should be handled differently based on client ID (e.g. only honor auth requests from the mpx magic client).

> I think in this case we should rely on the client to order the requests correctly.

I think that's fine. I'm just posing the question because any change of authentication will affect all virtual connections.
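
The concern raised in this thread can be pictured with a small sketch. It assumes, as the discussion suggests, that authentication state belongs to the physical connection while each virtual connection keeps only minimal state of its own. All type and member names below are hypothetical and are not Redpanda's classes.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical types for illustration only.
struct physical_connection_state {
    std::string principal; // established once per TCP connection (SASL or mTLS)
};

struct virtual_connection_state {
    int in_flight = 0; // per-virtual-connection bookkeeping only
};

class connection {
public:
    // Authentication or reauthentication updates the shared physical state.
    void authenticate(std::string principal) {
        _physical.principal = std::move(principal);
    }

    // Every virtual connection resolves its principal through the shared state.
    const std::string& principal_for(const std::string& vc_id) {
        _virtual[vc_id]; // create the virtual connection state on first use
        return _physical.principal;
    }

    std::size_t virtual_count() const { return _virtual.size(); }

private:
    physical_connection_state _physical;
    std::map<std::string, virtual_connection_state> _virtual;
};
```

In this model a reauthentication issued through any request immediately changes the principal seen by every virtual connection, which is why the ordering of auth requests is left to the client.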

Implemented virtualization of Kafka connections on top of one physical
TCP connection. Requests processed in a virtual connection context are
independent of requests in other virtual connections. This way, requests
from different virtual connections cannot block each other.

Virtual connections are only available if the first request handled by
the connection has the `__redpanda_mpx` client id set and Redpanda MPX
extensions are enabled.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
Added a test that validates that requests executed in different
connection contexts do not block each other. The test uses a
`kafka-python` client. Unfortunately we need to access the client
internals to change client ids and expected response sequences, but this
lets us exercise the virtual connection handling code without pulling
MPX into ducktape.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
@michael-redpanda (Contributor) left a comment

LGTM

@mmaslankaprv merged commit 3839e4c into redpanda-data:dev Mar 1, 2024
17 checks passed
@mmaslankaprv deleted the multiplexing-connections branch March 1, 2024 10:15