-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(kernel): add listeners to services #258
Conversation
4978925
to
b2259cc
Compare
This lets us save the name `Connection` to represent the _established_ connection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made some notes, happy to merge this and follow up. Thank you!
/// data from the client, this type can be set to [`()`]. | ||
// XXX(eliza): ideally, we could default `Hello` to () and `ConnectError` to | ||
// `Infallible`...do we want to do that? it requires a nightly feature. | ||
type Hello: 'static; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, we might want to give this a name like "Introduction" or "ConnectDetails" or something, but we can bikeshed later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i stole the name from TLS handshakes, which start with a ClientHello
message followed by a ServerHello
message. But, if you don't like it, i'm happy to change it later...
NotFound, | ||
/// The remote [`RegisteredDriver`] rejected the connection. | ||
Rejected(D::ConnectError), | ||
/// The remote [`RegisteredDriver`] has been registered, but the service |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we could check this in the registry (awaiting channel closure?), tho we can't remove services anyway yet.
UserRequest<'_>, | ||
&ErasedKProducer, | ||
&bbq::MpscProducer, | ||
ServiceId, | ||
ClientId, | ||
) -> Result<(), UserHandlerError>; | ||
|
||
type ErasedHandshake = unsafe fn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, if we want to just yank all of the userspace stuff for now, we've already removed the k2u/u2k rings, and this will all be redesigned to be interface-like later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my motivation for actually writing the userspace code was mostly that i wanted to validate that the new design didn't completely break our type erasure mechanism, and that it was possible to do something like this for a serialization based service. i'm happy to remove this code and add something similar back in later, but i have a slight preference for keeping it so that we can build on top of it when integrating the new serializing channel. fundamentally up to you, though.
The registry emits a bunch of `tracing` diagnostics, but they're not *that* helpful in most cases. In particular, when we get a service, we log its UUID, but this isn't super readable for someone consuming the logs --- you have to then take that UUID and look it up against the table of potential UUIDs to figure out which service was being looked up. Also, some of the errors introduced in #258 look pretty lousy in `fmt::Debug` output. This branch fixes both of those issues. I've changing the logs emitted by the registry to also include the type name of the service, which makes it easier to tell what's going on without having to cross-reference UUIDs against the actual source code. Furthermore, I've improved the readability of some of the error types. Example log output: <details> <summary>Before</summary> ```rust running 1 test INFO registry::tests::user_connect Registry::register{uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635}: kernel::registry: Registered uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635 service_id=0 INFO registry::tests::user_connect kernel::comms::bbq: Creating new mpsc BBQueue channel capacity=256 INFO registry::tests::user_connect kernel::comms::bbq: Channel created successfully INFO registry::tests::user_connect user_connect{hello=TestMessage(1)}:Registry::connect_userspace_with_hello{user_hello=[1] uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635}: kernel::registry: Got KernelHandle from Registry uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635 service_id=0 client_id=1 INFO registry::tests::user_connect user_request{req=TestMessage(1)}: kernel::registry::tests: sending user request req=TestMessage(1) INFO registry::tests::user_connect kernel::registry::tests: received request val=1 INFO registry::tests::user_connect user_request{req=TestMessage(1)}: kernel::registry::tests: return=Ok(TestMessage(2)) ERROR registry::tests::user_connect user_connect{hello=TestMessage(2)}: kernel::registry::tests: error=UserConnectError::Connect(ConnectError::<kernel::registry::tests::TestService>::Rejected(TestMessage(666))) INFO registry::tests::user_connect user_connect{hello=TestMessage(1)}:Registry::connect_userspace_with_hello{user_hello=[1] uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635}: kernel::registry: Got KernelHandle from Registry uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635 service_id=0 client_id=2 INFO registry::tests::user_connect user_request{req=TestMessage(2)}: kernel::registry::tests: sending user request req=TestMessage(2) INFO registry::tests::user_connect kernel::registry::tests: received request val=2 INFO registry::tests::user_connect user_request{req=TestMessage(2)}: kernel::registry::tests: return=Ok(TestMessage(3)) INFO registry::tests::user_connect user_request{req=TestMessage(3)}: kernel::registry::tests: sending user request req=TestMessage(3) INFO registry::tests::user_connect kernel::registry::tests: received request val=3 INFO registry::tests::user_connect user_request{req=TestMessage(3)}: kernel::registry::tests: return=Ok(TestMessage(4)) test registry::tests::user_connect ... ok ``` </details> <details> <summary>After</summary> ```rust running 1 test INFO registry::tests::user_connect Registry::register{svc=kernel::registry::tests::TestService}: kernel::registry: Registered svc=kernel::registry::tests::TestService uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635 service_id=0 INFO registry::tests::user_connect kernel::comms::bbq: Creating new mpsc BBQueue channel capacity=256 INFO registry::tests::user_connect kernel::comms::bbq: Channel created successfully INFO registry::tests::user_connect user_connect{hello=TestMessage(1)}:Registry::connect_userspace_with_hello{user_hello=[1] svc=kernel::registry::tests::TestService}: kernel::registry: Got KernelHandle from Registry svc=kernel::registry::tests::TestService uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635 service_id=0 client_id=1 INFO registry::tests::user_connect user_request{req=TestMessage(1)}: kernel::registry::tests: sending user request req=TestMessage(1) INFO registry::tests::user_connect kernel::registry::tests: received request val=1 INFO registry::tests::user_connect user_request{req=TestMessage(1)}: kernel::registry::tests: return=Ok(TestMessage(2)) ERROR registry::tests::user_connect user_connect{hello=TestMessage(2)}: kernel::registry::tests: error=Connect(Rejected { error: TestMessage(666), svc: kernel::registry::tests::TestService }) INFO registry::tests::user_connect user_connect{hello=TestMessage(1)}:Registry::connect_userspace_with_hello{user_hello=[1] svc=kernel::registry::tests::TestService}: kernel::registry: Got KernelHandle from Registry svc=kernel::registry::tests::TestService uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635 service_id=0 client_id=2 INFO registry::tests::user_connect user_request{req=TestMessage(2)}: kernel::registry::tests: sending user request req=TestMessage(2) INFO registry::tests::user_connect kernel::registry::tests: received request val=2 INFO registry::tests::user_connect user_request{req=TestMessage(2)}: kernel::registry::tests: return=Ok(TestMessage(3)) INFO registry::tests::user_connect user_request{req=TestMessage(3)}: kernel::registry::tests: sending user request req=TestMessage(3) INFO registry::tests::user_connect kernel::registry::tests: received request val=3 INFO registry::tests::user_connect user_request{req=TestMessage(3)}: kernel::registry::tests: return=Ok(TestMessage(4)) test registry::tests::user_connect ... ok ````
Depends on #267 Currently, the `Registry::connect_*` methods return an error immediately if the requested service is not found in the registry. Most client types implement a `from_registry` method which retry in a loop when the requested service is not found in the registry. These methods will wait for a fixed (short) amount of time and then try again until the registry returns the service. This approach is quite inefficient, as we have to run a bunch of retry loops that keep trying to access a service that may not be there. This may happen several times before the service actually is registered, especially when registering a service requires connecting to another service. This branch improves the efficiency of waiting for a service to be registered. Now, rather than retrying with a fixed-duration sleep, we instead have the `Registry` own a `WaitCell` which is woken whenever a new service is registered. This wakes all takes potentially waiting to connect, allowing them to re-check whether the service they want is in the registry. This idea was initially proposed by @jamesmunns in a [comment] on PR #259 Connections are now established using either `Registry::connect`, which retries whenever a new service is registered, or `Registry::try_connect` which never retries. Additionally, we now have the capacity to indicate that a service is not found *and* that the registry is full, by closing the `WaitCell`. In this case, retrying will never succeed, because the registry is full and if the service isn't already there, it will never be added. In this case, the retrying methods will also return an error, rather than never completing, so we avoid a potential task leak. In order to make this change, we need to move the `RwLock` from being around the entire `Registry` to being inside the registry, around `items`. This allows the `WaitCell` to be accessed regardless. It also allows us to shorten the duration for which the lock is held. This requires changing all methods on `Registry` to take `&self`. Therefore, I've removed the wrapper methods on `Kernel` for connecting and registering, since they can now just be called on `kernel.registry` without a bunch of extra boilerplate for lock management. I've also simplified the API surface of the registry a bit by removing the `connect` methods that don't take a `Hello`, and just using `Registry::connect(())` in those cases. IMO, those methods weren't really pulling their weight, and they required us to have a method named `Registry::try_connect_userspace_with_hello` if we were going to add a non-retrying `connect` variant. Now, we can just have `Registry::try_connect_userspace`, `Registry::connect_userspace`, `Registry::connect`, and `Registry::try_connect`, which feels much less egregious. [comment]: #258 (comment)
Depends on #267 Currently, the `Registry::connect_*` methods return an error immediately if the requested service is not found in the registry. Most client types implement a `from_registry` method which retry in a loop when the requested service is not found in the registry. These methods will wait for a fixed (short) amount of time and then try again until the registry returns the service. This approach is quite inefficient, as we have to run a bunch of retry loops that keep trying to access a service that may not be there. This may happen several times before the service actually is registered, especially when registering a service requires connecting to another service. This branch improves the efficiency of waiting for a service to be registered. Now, rather than retrying with a fixed-duration sleep, we instead have the `Registry` own a `WaitCell` which is woken whenever a new service is registered. This wakes all takes potentially waiting to connect, allowing them to re-check whether the service they want is in the registry. This idea was initially proposed by @jamesmunns in a [comment] on PR #259 Connections are now established using either `Registry::connect`, which retries whenever a new service is registered, or `Registry::try_connect` which never retries. Additionally, we now have the capacity to indicate that a service is not found *and* that the registry is full, by closing the `WaitCell`. In this case, retrying will never succeed, because the registry is full and if the service isn't already there, it will never be added. In this case, the retrying methods will also return an error, rather than never completing, so we avoid a potential task leak. In order to make this change, we need to move the `RwLock` from being around the entire `Registry` to being inside the registry, around `items`. This allows the `WaitCell` to be accessed regardless. It also allows us to shorten the duration for which the lock is held. This requires changing all methods on `Registry` to take `&self`. Therefore, I've removed the wrapper methods on `Kernel` for connecting and registering, since they can now just be called on `kernel.registry` without a bunch of extra boilerplate for lock management. I've also simplified the API surface of the registry a bit by removing the `connect` methods that don't take a `Hello`, and just using `Registry::connect(())` in those cases. IMO, those methods weren't really pulling their weight, and they required us to have a method named `Registry::try_connect_userspace_with_hello` if we were going to add a non-retrying `connect` variant. Now, we can just have `Registry::try_connect_userspace`, `Registry::connect_userspace`, `Registry::connect`, and `Registry::try_connect`, which feels much less egregious. [comment]: #258 (comment)
Depends on #267 Currently, the `Registry::connect_*` methods return an error immediately if the requested service is not found in the registry. Most client types implement a `from_registry` method which retry in a loop when the requested service is not found in the registry. These methods will wait for a fixed (short) amount of time and then try again until the registry returns the service. This approach is quite inefficient, as we have to run a bunch of retry loops that keep trying to access a service that may not be there. This may happen several times before the service actually is registered, especially when registering a service requires connecting to another service. This branch improves the efficiency of waiting for a service to be registered. Now, rather than retrying with a fixed-duration sleep, we instead have the `Registry` own a `WaitCell` which is woken whenever a new service is registered. This wakes all takes potentially waiting to connect, allowing them to re-check whether the service they want is in the registry. This idea was initially proposed by @jamesmunns in a [comment] on PR #259 Connections are now established using either `Registry::connect`, which retries whenever a new service is registered, or `Registry::try_connect` which never retries. Additionally, we now have the capacity to indicate that a service is not found *and* that the registry is full, by closing the `WaitCell`. In this case, retrying will never succeed, because the registry is full and if the service isn't already there, it will never be added. In this case, the retrying methods will also return an error, rather than never completing, so we avoid a potential task leak. In order to make this change, we need to move the `RwLock` from being around the entire `Registry` to being inside the registry, around `items`. This allows the `WaitCell` to be accessed regardless. It also allows us to shorten the duration for which the lock is held. This requires changing all methods on `Registry` to take `&self`. Therefore, I've removed the wrapper methods on `Kernel` for connecting and registering, since they can now just be called on `kernel.registry` without a bunch of extra boilerplate for lock management. I've also simplified the API surface of the registry a bit by removing the `connect` methods that don't take a `Hello`, and just using `Registry::connect(())` in those cases. IMO, those methods weren't really pulling their weight, and they required us to have a method named `Registry::try_connect_userspace_with_hello` if we were going to add a non-retrying `connect` variant. Now, we can just have `Registry::try_connect_userspace`, `Registry::connect_userspace`, `Registry::connect`, and `Registry::try_connect`, which feels much less egregious. [comment]: #258 (comment)
Currently, when a task (or userspace) wishes to connect to a
RegisteredDriver
,it performs a registry lookup, which returns a handle for sending requests to
that service. These handles are essentially producers for a single
KChannel
which is associated with the service when it's registered, and the service can
use this channel's
KConsumer
to receive a stream of incoming requests.This approach works fine, but it limits what a service can do with its incoming
request stream. Because all requests arrive on the same
KConsumer
, the servermay not spawn a separate task to handle requests from each client. Similarly, it cannot
route requests to different worker tasks based on which resource the request is
associated with.
This branch changes the way clients communicate with servers by introducing a
handshake to this process. Rather than registering a service by giving the
registry a
KProducer
for a single request channel, the registry now owns aproducer for a channel of incoming connections. These connections contain a
Hello
message sent by the client, of a type defined by theRegisteredDriver
implementation for the service. This may be used by the service to determine
what resource the client wants to access, or otherwise route connections to
different tasks. This handshake is completed by the service responding to the
incoming connection with a
KProducer
for a request channel, which is thenreturned to the client.
The returned
KProducer
may still point to a single global request channelfor that service, allowing us to continue to implement the existing behavior.
However, the service may alternatively spawn a new task for that connection and
create a new channel for that worker. Or, it may return a different
KProducer
for a different request channel depending on the
Hello
message, allowing theserver to have separate worker tasks for different resources managed by the same
service, and route the connection to one of those workers depending on which
resource was requested in the
Hello
message.An adapter,
into_request_stream
, is provided, to make it easier for serviceswhich want the current behavior of one global stream of requests to have that.
Currently, all the existing services use this, but in the future, many of our
existing services can be rewritten to use the new handshake mechanism more
effectively. I'd also like to do some additional refactoring, but I wanted to
get a working PR up first.