feat(kernel): add listeners to services #258

hawkw · 2023-08-28T19:36:49Z

Currently, when a task (or userspace) wishes to connect to a RegisteredDriver,
it performs a registry lookup, which returns a handle for sending requests to
that service. These handles are essentially producers for a single KChannel
which is associated with the service when it's registered, and the service can
use this channel's KConsumer to receive a stream of incoming requests.

This approach works fine, but it limits what a service can do with its incoming
request stream. Because all requests arrive on the same KConsumer, the server
may not spawn a separate task to handle requests from each client. Similarly, it cannot
route requests to different worker tasks based on which resource the request is
associated with.

This branch changes the way clients communicate with servers by introducing a
handshake to this process. Rather than registering a service by giving the
registry a KProducer for a single request channel, the registry now owns a
producer for a channel of incoming connections. These connections contain a
Hello message sent by the client, of a type defined by the RegisteredDriver
implementation for the service. This may be used by the service to determine
what resource the client wants to access, or otherwise route connections to
different tasks. This handshake is completed by the service responding to the
incoming connection with a KProducer for a request channel, which is then
returned to the client.

The returned KProducer may still point to a single global request channel
for that service, allowing us to continue to implement the existing behavior.
However, the service may alternatively spawn a new task for that connection and
create a new channel for that worker. Or, it may return a different KProducer
for a different request channel depending on the Hello message, allowing the
server to have separate worker tasks for different resources managed by the same
service, and route the connection to one of those workers depending on which
resource was requested in the Hello message.

An adapter, into_request_stream, is provided, to make it easier for services
which want the current behavior of one global stream of requests to have that.
Currently, all the existing services use this, but in the future, many of our
existing services can be rewritten to use the new handshake mechanism more
effectively. I'd also like to do some additional refactoring, but I wanted to
get a working PR up first.

This lets us save the name `Connection` to represent the _established_ connection.

jamesmunns

Made some notes, happy to merge this and follow up. Thank you!

platforms/beepy/src/i2c_puppet.rs

jamesmunns · 2023-09-03T16:14:57Z

source/kernel/src/registry/mod.rs

+    /// data from the client, this type can be set to [`()`].
+    // XXX(eliza): ideally, we could default `Hello` to () and `ConnectError` to
+    // `Infallible`...do we want to do that? it requires a nightly feature.
+    type Hello: 'static;


Nit, we might want to give this a name like "Introduction" or "ConnectDetails" or something, but we can bikeshed later.

i stole the name from TLS handshakes, which start with a ClientHello message followed by a ServerHello message. But, if you don't like it, i'm happy to change it later...

source/kernel/src/registry/mod.rs

jamesmunns · 2023-09-03T16:16:40Z

source/kernel/src/registry/mod.rs

+    NotFound,
+    /// The remote [`RegisteredDriver`] rejected the connection.
+    Rejected(D::ConnectError),
+    /// The remote [`RegisteredDriver`] has been registered, but the service


I wonder if we could check this in the registry (awaiting channel closure?), tho we can't remove services anyway yet.

jamesmunns · 2023-09-03T16:17:37Z

source/kernel/src/registry/mod.rs

    UserRequest<'_>,
    &ErasedKProducer,
    &bbq::MpscProducer,
    ServiceId,
    ClientId,
 ) -> Result<(), UserHandlerError>;

+type ErasedHandshake = unsafe fn(


Honestly, if we want to just yank all of the userspace stuff for now, we've already removed the k2u/u2k rings, and this will all be redesigned to be interface-like later.

my motivation for actually writing the userspace code was mostly that i wanted to validate that the new design didn't completely break our type erasure mechanism, and that it was possible to do something like this for a serialization based service. i'm happy to remove this code and add something similar back in later, but i have a slight preference for keeping it so that we can build on top of it when integrating the new serializing channel. fundamentally up to you, though.

The registry emits a bunch of `tracing` diagnostics, but they're not *that* helpful in most cases. In particular, when we get a service, we log its UUID, but this isn't super readable for someone consuming the logs --- you have to then take that UUID and look it up against the table of potential UUIDs to figure out which service was being looked up. Also, some of the errors introduced in #258 look pretty lousy in `fmt::Debug` output. This branch fixes both of those issues. I've changing the logs emitted by the registry to also include the type name of the service, which makes it easier to tell what's going on without having to cross-reference UUIDs against the actual source code. Furthermore, I've improved the readability of some of the error types. Example log output: <details> <summary>Before</summary> ```rust running 1 test INFO registry::tests::user_connect Registry::register{uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635}: kernel::registry: Registered uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635 service_id=0 INFO registry::tests::user_connect kernel::comms::bbq: Creating new mpsc BBQueue channel capacity=256 INFO registry::tests::user_connect kernel::comms::bbq: Channel created successfully INFO registry::tests::user_connect user_connect{hello=TestMessage(1)}:Registry::connect_userspace_with_hello{user_hello=[1] uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635}: kernel::registry: Got KernelHandle from Registry uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635 service_id=0 client_id=1 INFO registry::tests::user_connect user_request{req=TestMessage(1)}: kernel::registry::tests: sending user request req=TestMessage(1) INFO registry::tests::user_connect kernel::registry::tests: received request val=1 INFO registry::tests::user_connect user_request{req=TestMessage(1)}: kernel::registry::tests: return=Ok(TestMessage(2)) ERROR registry::tests::user_connect user_connect{hello=TestMessage(2)}: kernel::registry::tests: error=UserConnectError::Connect(ConnectError::<kernel::registry::tests::TestService>::Rejected(TestMessage(666))) INFO registry::tests::user_connect user_connect{hello=TestMessage(1)}:Registry::connect_userspace_with_hello{user_hello=[1] uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635}: kernel::registry: Got KernelHandle from Registry uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635 service_id=0 client_id=2 INFO registry::tests::user_connect user_request{req=TestMessage(2)}: kernel::registry::tests: sending user request req=TestMessage(2) INFO registry::tests::user_connect kernel::registry::tests: received request val=2 INFO registry::tests::user_connect user_request{req=TestMessage(2)}: kernel::registry::tests: return=Ok(TestMessage(3)) INFO registry::tests::user_connect user_request{req=TestMessage(3)}: kernel::registry::tests: sending user request req=TestMessage(3) INFO registry::tests::user_connect kernel::registry::tests: received request val=3 INFO registry::tests::user_connect user_request{req=TestMessage(3)}: kernel::registry::tests: return=Ok(TestMessage(4)) test registry::tests::user_connect ... ok ``` </details> <details> <summary>After</summary> ```rust running 1 test INFO registry::tests::user_connect Registry::register{svc=kernel::registry::tests::TestService}: kernel::registry: Registered svc=kernel::registry::tests::TestService uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635 service_id=0 INFO registry::tests::user_connect kernel::comms::bbq: Creating new mpsc BBQueue channel capacity=256 INFO registry::tests::user_connect kernel::comms::bbq: Channel created successfully INFO registry::tests::user_connect user_connect{hello=TestMessage(1)}:Registry::connect_userspace_with_hello{user_hello=[1] svc=kernel::registry::tests::TestService}: kernel::registry: Got KernelHandle from Registry svc=kernel::registry::tests::TestService uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635 service_id=0 client_id=1 INFO registry::tests::user_connect user_request{req=TestMessage(1)}: kernel::registry::tests: sending user request req=TestMessage(1) INFO registry::tests::user_connect kernel::registry::tests: received request val=1 INFO registry::tests::user_connect user_request{req=TestMessage(1)}: kernel::registry::tests: return=Ok(TestMessage(2)) ERROR registry::tests::user_connect user_connect{hello=TestMessage(2)}: kernel::registry::tests: error=Connect(Rejected { error: TestMessage(666), svc: kernel::registry::tests::TestService }) INFO registry::tests::user_connect user_connect{hello=TestMessage(1)}:Registry::connect_userspace_with_hello{user_hello=[1] svc=kernel::registry::tests::TestService}: kernel::registry: Got KernelHandle from Registry svc=kernel::registry::tests::TestService uuid=05bcd4b7-dd81-434a-a958-f18ee84f8635 service_id=0 client_id=2 INFO registry::tests::user_connect user_request{req=TestMessage(2)}: kernel::registry::tests: sending user request req=TestMessage(2) INFO registry::tests::user_connect kernel::registry::tests: received request val=2 INFO registry::tests::user_connect user_request{req=TestMessage(2)}: kernel::registry::tests: return=Ok(TestMessage(3)) INFO registry::tests::user_connect user_request{req=TestMessage(3)}: kernel::registry::tests: sending user request req=TestMessage(3) INFO registry::tests::user_connect kernel::registry::tests: received request val=3 INFO registry::tests::user_connect user_request{req=TestMessage(3)}: kernel::registry::tests: return=Ok(TestMessage(4)) test registry::tests::user_connect ... ok ````

@jamesmunns

Depends on #267 Currently, the `Registry::connect_*` methods return an error immediately if the requested service is not found in the registry. Most client types implement a `from_registry` method which retry in a loop when the requested service is not found in the registry. These methods will wait for a fixed (short) amount of time and then try again until the registry returns the service. This approach is quite inefficient, as we have to run a bunch of retry loops that keep trying to access a service that may not be there. This may happen several times before the service actually is registered, especially when registering a service requires connecting to another service. This branch improves the efficiency of waiting for a service to be registered. Now, rather than retrying with a fixed-duration sleep, we instead have the `Registry` own a `WaitCell` which is woken whenever a new service is registered. This wakes all takes potentially waiting to connect, allowing them to re-check whether the service they want is in the registry. This idea was initially proposed by @jamesmunns in a [comment] on PR #259 Connections are now established using either `Registry::connect`, which retries whenever a new service is registered, or `Registry::try_connect` which never retries. Additionally, we now have the capacity to indicate that a service is not found *and* that the registry is full, by closing the `WaitCell`. In this case, retrying will never succeed, because the registry is full and if the service isn't already there, it will never be added. In this case, the retrying methods will also return an error, rather than never completing, so we avoid a potential task leak. In order to make this change, we need to move the `RwLock` from being around the entire `Registry` to being inside the registry, around `items`. This allows the `WaitCell` to be accessed regardless. It also allows us to shorten the duration for which the lock is held. This requires changing all methods on `Registry` to take `&self`. Therefore, I've removed the wrapper methods on `Kernel` for connecting and registering, since they can now just be called on `kernel.registry` without a bunch of extra boilerplate for lock management. I've also simplified the API surface of the registry a bit by removing the `connect` methods that don't take a `Hello`, and just using `Registry::connect(())` in those cases. IMO, those methods weren't really pulling their weight, and they required us to have a method named `Registry::try_connect_userspace_with_hello` if we were going to add a non-retrying `connect` variant. Now, we can just have `Registry::try_connect_userspace`, `Registry::connect_userspace`, `Registry::connect`, and `Registry::try_connect`, which feels much less egregious. [comment]: #258 (comment)

@jamesmunns

Depends on #267 Currently, the `Registry::connect_*` methods return an error immediately if the requested service is not found in the registry. Most client types implement a `from_registry` method which retry in a loop when the requested service is not found in the registry. These methods will wait for a fixed (short) amount of time and then try again until the registry returns the service. This approach is quite inefficient, as we have to run a bunch of retry loops that keep trying to access a service that may not be there. This may happen several times before the service actually is registered, especially when registering a service requires connecting to another service. This branch improves the efficiency of waiting for a service to be registered. Now, rather than retrying with a fixed-duration sleep, we instead have the `Registry` own a `WaitCell` which is woken whenever a new service is registered. This wakes all takes potentially waiting to connect, allowing them to re-check whether the service they want is in the registry. This idea was initially proposed by @jamesmunns in a [comment] on PR #259 Connections are now established using either `Registry::connect`, which retries whenever a new service is registered, or `Registry::try_connect` which never retries. Additionally, we now have the capacity to indicate that a service is not found *and* that the registry is full, by closing the `WaitCell`. In this case, retrying will never succeed, because the registry is full and if the service isn't already there, it will never be added. In this case, the retrying methods will also return an error, rather than never completing, so we avoid a potential task leak. In order to make this change, we need to move the `RwLock` from being around the entire `Registry` to being inside the registry, around `items`. This allows the `WaitCell` to be accessed regardless. It also allows us to shorten the duration for which the lock is held. This requires changing all methods on `Registry` to take `&self`. Therefore, I've removed the wrapper methods on `Kernel` for connecting and registering, since they can now just be called on `kernel.registry` without a bunch of extra boilerplate for lock management. I've also simplified the API surface of the registry a bit by removing the `connect` methods that don't take a `Hello`, and just using `Registry::connect(())` in those cases. IMO, those methods weren't really pulling their weight, and they required us to have a method named `Registry::try_connect_userspace_with_hello` if we were going to add a non-retrying `connect` variant. Now, we can just have `Registry::try_connect_userspace`, `Registry::connect_userspace`, `Registry::connect`, and `Registry::try_connect`, which feels much less egregious. [comment]: #258 (comment)

@jamesmunns

Depends on #267 Currently, the `Registry::connect_*` methods return an error immediately if the requested service is not found in the registry. Most client types implement a `from_registry` method which retry in a loop when the requested service is not found in the registry. These methods will wait for a fixed (short) amount of time and then try again until the registry returns the service. This approach is quite inefficient, as we have to run a bunch of retry loops that keep trying to access a service that may not be there. This may happen several times before the service actually is registered, especially when registering a service requires connecting to another service. This branch improves the efficiency of waiting for a service to be registered. Now, rather than retrying with a fixed-duration sleep, we instead have the `Registry` own a `WaitCell` which is woken whenever a new service is registered. This wakes all takes potentially waiting to connect, allowing them to re-check whether the service they want is in the registry. This idea was initially proposed by @jamesmunns in a [comment] on PR #259 Connections are now established using either `Registry::connect`, which retries whenever a new service is registered, or `Registry::try_connect` which never retries. Additionally, we now have the capacity to indicate that a service is not found *and* that the registry is full, by closing the `WaitCell`. In this case, retrying will never succeed, because the registry is full and if the service isn't already there, it will never be added. In this case, the retrying methods will also return an error, rather than never completing, so we avoid a potential task leak. In order to make this change, we need to move the `RwLock` from being around the entire `Registry` to being inside the registry, around `items`. This allows the `WaitCell` to be accessed regardless. It also allows us to shorten the duration for which the lock is held. This requires changing all methods on `Registry` to take `&self`. Therefore, I've removed the wrapper methods on `Kernel` for connecting and registering, since they can now just be called on `kernel.registry` without a bunch of extra boilerplate for lock management. I've also simplified the API surface of the registry a bit by removing the `connect` methods that don't take a `Hello`, and just using `Registry::connect(())` in those cases. IMO, those methods weren't really pulling their weight, and they required us to have a method named `Registry::try_connect_userspace_with_hello` if we were going to add a non-retrying `connect` variant. Now, we can just have `Registry::try_connect_userspace`, `Registry::connect_userspace`, `Registry::connect`, and `Registry::try_connect`, which feels much less egregious. [comment]: #258 (comment)

hawkw force-pushed the eliza/listener branch from 50fd376 to 6d71446 Compare August 30, 2023 16:16

hawkw added 5 commits September 1, 2023 08:54

WIP: start on service listeners

ac04d9b

add RequestStream utility

31909e2

wip

30a38b1

wow whoa

cdd2210

start updating platform svcs

b2259cc

hawkw force-pushed the eliza/listener branch from 4978925 to b2259cc Compare September 1, 2023 15:54

hawkw added 10 commits September 1, 2023 10:30

port remaining svcs

48ad1fa

unused imports

4a65b2a

rename Connection to Handshake

9f13d43

This lets us save the name `Connection` to represent the _established_ connection.

docs embetterment

d0f356d

more docs

612f1d3

rename next to handshake

47b07c9

must_use all the things

37f473c

rename Registry::get to Registry::connect

7c43406

Extremely Gross hopefully working user handshake

0592b7b

error types

08cc2a2

hawkw changed the title ~~WIP: start on service listeners~~ feat(kernel): add listeners to services Sep 2, 2023

hawkw requested a review from jamesmunns September 2, 2023 17:45

hawkw marked this pull request as ready for review September 2, 2023 17:45

hawkw added 3 commits September 2, 2023 11:14

docs and error type improvement

c004b91

add register helpers

db18127

actually add real life tests!!!

56818c0

jamesmunns approved these changes Sep 3, 2023

View reviewed changes

fix comment

1abd474

hawkw merged commit 25443a2 into main Sep 3, 2023
7 checks passed

hawkw deleted the eliza/listener branch September 3, 2023 17:26

hawkw mentioned this pull request Sep 3, 2023

feat(kernel): improve registry tracing #266

Merged

hawkw mentioned this pull request Sep 3, 2023

feat(kernel): wake tasks on service registration #268

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(kernel): add listeners to services #258

feat(kernel): add listeners to services #258

hawkw commented Aug 28, 2023 •

edited

Loading

jamesmunns left a comment

jamesmunns Sep 3, 2023

hawkw Sep 3, 2023

jamesmunns Sep 3, 2023

jamesmunns Sep 3, 2023

hawkw Sep 3, 2023

feat(kernel): add listeners to services #258

feat(kernel): add listeners to services #258

Conversation

hawkw commented Aug 28, 2023 • edited Loading

jamesmunns left a comment

Choose a reason for hiding this comment

jamesmunns Sep 3, 2023

Choose a reason for hiding this comment

hawkw Sep 3, 2023

Choose a reason for hiding this comment

jamesmunns Sep 3, 2023

Choose a reason for hiding this comment

jamesmunns Sep 3, 2023

Choose a reason for hiding this comment

hawkw Sep 3, 2023

Choose a reason for hiding this comment

hawkw commented Aug 28, 2023 •

edited

Loading