Add ability to store request context #10

Open · wants to merge 31 commits into base: master
Conversation

@axos88 axos88 commented Feb 10, 2023

Resolves google#392

If a transport is created that sinks/streams (C, Item/SinkItem) pairs, and that transport is used to create the BaseChannel, the transport will be able to generate a context value (c: C) in the stream that is opaquely forwarded to the Sink.

This can be used, for example, to distinguish exactly how a request arrived at the transport, such as when the transport's channel is aggregating multiple clients. It will be used mainly for MQTT, where the response topic is received together with the Request, and the Sink needs to know where to publish the response.
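The shape of this mechanism can be sketched without tarpc, using plain std channels. Here `Ctx`, `Request`, and `Response` are illustrative stand-ins (tarpc's real types differ); the point is that the context produced on the stream side is echoed back, uninspected, on the sink side:

```rust
use std::sync::mpsc;

// Illustrative stand-ins; tarpc's actual types differ.
type Ctx = String; // e.g. an MQTT response topic
type Request = u64;
type Response = u64;

// The "server" receives (ctx, request) pairs from the stream side and
// must echo the same opaque ctx back when sinking the response.
fn serve(rx: mpsc::Receiver<(Ctx, Request)>, tx: mpsc::Sender<(Ctx, Response)>) {
    for (ctx, req) in rx {
        let resp = req + 1; // dummy handler
        // ctx is forwarded untouched: the server never inspects it.
        tx.send((ctx, resp)).unwrap();
    }
}

fn main() {
    let (req_tx, req_rx) = mpsc::channel();
    let (resp_tx, resp_rx) = mpsc::channel();
    req_tx.send(("devices/42/response".to_string(), 7)).unwrap();
    drop(req_tx); // close the stream so serve() terminates
    serve(req_rx, resp_tx);
    let (topic, resp) = resp_rx.recv().unwrap();
    assert_eq!(topic, "devices/42/response");
    assert_eq!(resp, 8);
}
```

In the MQTT case, the context would carry the response topic, so the transport's sink knows where to publish each response.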

tikue and others added 7 commits November 21, 2022 21:52
Also adds a Client stub trait alias for each generated service.

Now that generic associated types are stable, it's almost possible to
define a trait for Channel that works with async fns on stable. `impl
trait in type aliases` is still necessary (and unstable), but we're
getting closer.

As a proof of concept, three more implementations of Stub are provided:

1. A load balancer that round-robins requests between different stubs.
2. A load balancer that selects a stub based on a request hash, so that
   the same requests go to the same stubs.
3. A stub that retries requests based on a configurable policy.

   The "serde/rc" feature is added to the "full" feature because the Retry
   stub wraps the request in an Arc, so that the request is reusable for
   multiple calls.
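The round-robin balancer (item 1 above) can be sketched with a simplified, synchronous analogue of a stub trait. tarpc's real Stub trait is async; the names and shapes here are illustrative only:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Simplified stand-in for tarpc's async Stub trait.
trait Stub {
    fn call(&self, req: u64) -> u64;
}

// Rotates through the underlying stubs, one request at a time.
struct RoundRobin<S> {
    stubs: Vec<S>,
    next: AtomicUsize,
}

impl<S: Stub> Stub for RoundRobin<S> {
    fn call(&self, req: u64) -> u64 {
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.stubs.len();
        self.stubs[i].call(req)
    }
}

// A toy stub that tags responses so we can see which one handled the call.
struct Tagged(u64);
impl Stub for Tagged {
    fn call(&self, req: u64) -> u64 {
        self.0 * 100 + req
    }
}

fn main() {
    let lb = RoundRobin {
        stubs: vec![Tagged(1), Tagged(2)],
        next: AtomicUsize::new(0),
    };
    assert_eq!(lb.call(7), 107); // first stub
    assert_eq!(lb.call(7), 207); // second stub
    assert_eq!(lb.call(7), 107); // wraps around
}
```

The hash-based balancer (item 2) is the same structure with the index derived from a hash of the request instead of a counter.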

   Server implementors commonly need to operate generically across all
   services or request types. For example, a server throttler may want to
   return errors telling clients to back off, which is not specific to any
   one service.
This allows plugging in horizontal functionality, such as authorization,
throttling, or latency recording, that should run before and/or after
execution of every request, regardless of the request type.

The tracing example is updated to show off both client stubs as well as
server hooks.

As part of this change, there were some changes to the Serve trait:

1. Serve's output type is now a Result<Response, ServerError>.
   Serve previously did not allow returning ServerErrors, which
   prevented using Serve for horizontal functionality like throttling or
   auth. Now, Serve's output type is Result<Resp, ServerError>, making
   Serve a more natural integration point for horizontal capabilities.
2. Serve's generic Request type changed to an associated type. The
   primary benefit of the generic type is that it allows one type to
   impl a trait multiple times (for example, u64 impls TryFrom<usize>,
   TryFrom<u128>, etc.). In the case of Serve impls, while it is
   theoretically possible to contrive a type that could serve multiple
   request types, in practice I don't expect that to be needed.  Most
   users will use the Serve impl generated by #[tarpc::service], which
   only ever serves one type of request.
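The generic-parameter vs. associated-type trade-off described in point 2 can be shown in miniature. These traits are illustrative, not tarpc's actual Serve trait:

```rust
// With a generic parameter, one type can implement the trait for several
// request types (just as u64 implements both TryFrom<usize> and TryFrom<u128>):
trait ServeGeneric<Req> {
    fn serve(&self, req: Req) -> String;
}

struct Multi;
impl ServeGeneric<u32> for Multi {
    fn serve(&self, req: u32) -> String { format!("u32: {req}") }
}
impl ServeGeneric<&str> for Multi {
    fn serve(&self, req: &str) -> String { format!("str: {req}") }
}

// With an associated type, each implementing type picks exactly one request
// type, which matches how a #[tarpc::service]-generated impl is used:
trait ServeAssoc {
    type Req;
    fn serve(&self, req: Self::Req) -> String;
}

struct Single;
impl ServeAssoc for Single {
    type Req = u32;
    fn serve(&self, req: u32) -> String { format!("only u32: {req}") }
}

fn main() {
    assert_eq!(Multi.serve(1u32), "u32: 1");
    assert_eq!(Multi.serve("x"), "str: x");
    assert_eq!(Single.serve(2), "only u32: 2");
}
```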
While using unstable feature type_alias_impl_trait.
Add helper fn to server::incoming module for spawning.
mem::forget is a dangerous tool, and it was being used carelessly for
things that have safer alternatives. There was at least one bug where a
cloned tokio::sync::mpsc::UnboundedSender used for request cancellation
was being leaked on every successful server response, so its refcounts
were never decremented. Because these are atomic refcounts, they'll wrap
around rather than overflow when reaching the maximum value, so I don't
believe this could lead to panics or unsoundness.
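The refcount-leak mechanism described above can be demonstrated with a plain Arc standing in for the cloned UnboundedSender: mem::forget skips Drop, so the strong count is never decremented.

```rust
use std::mem;
use std::sync::Arc;

fn main() {
    // Stand-in for the cloned tokio::sync::mpsc::UnboundedSender.
    let sender = Arc::new("pretend this is an UnboundedSender");
    let clone = Arc::clone(&sender);
    assert_eq!(Arc::strong_count(&sender), 2);

    // Dropping the clone decrements the refcount, as intended...
    drop(clone);
    assert_eq!(Arc::strong_count(&sender), 1);

    // ...but mem::forget skips Drop entirely, so the count never comes
    // back down. Done once per response, the count grows without bound.
    let leaked = Arc::clone(&sender);
    mem::forget(leaked);
    assert_eq!(Arc::strong_count(&sender), 2); // permanently inflated
}
```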
…uestContexts.

To do this, create a Transport with a Sink/Stream of (C, Item/SinkItem). A C created in the stream will be opaquely sent back when sinking the response on the server side.
@axos88 (Author) commented Feb 10, 2023

This currently breaks existing code; needs revision.

@@ -0,0 +1,326 @@
use crate::{
Author
This file is more or less a duplicate of the base channel :(

#[derive(Debug, Default)]
pub struct InFlightRequests {
request_data: FnvHashMap<u64, RequestData>,
#[derive(Debug)]
Author
Cannot derive Default anymore, because C is not necessarily debug

/// An implementation of RequestSequencer, to provide a unique series of request ids.
/// The default implementation generates 0,1,2,3,4,5,..., but this option can be leveraged
/// to generate less predictable results, using a block cipher for example.
pub request_sequencer: Arc<dyn RequestSequencer>
Author
Abstract away the way we generate request ids.
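A hypothetical shape for this RequestSequencer abstraction, based on the doc comment above (the PR's actual trait may differ): the default is a plain monotonic counter, and anything less predictable can be swapped in behind the same trait object.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

// Hypothetical trait matching the quoted field's doc comment.
trait RequestSequencer: Send + Sync {
    fn next_id(&self) -> u64;
}

// Default implementation: generates 0, 1, 2, 3, ...
#[derive(Default)]
struct Monotonic(AtomicU64);

impl RequestSequencer for Monotonic {
    fn next_id(&self) -> u64 {
        self.0.fetch_add(1, Ordering::Relaxed)
    }
}

fn main() {
    let seq: Arc<dyn RequestSequencer> = Arc::new(Monotonic::default());
    assert_eq!(seq.next_id(), 0);
    assert_eq!(seq.next_id(), 1);
    assert_eq!(seq.next_id(), 2);
    // A less predictable sequencer (e.g. a block cipher applied to the
    // counter) could be substituted behind the same Arc<dyn RequestSequencer>.
}
```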

axos88 and others added 12 commits February 13, 2023 08:19
/// An implementation of RequestSequencer, to provide a unique series of request ids.
/// The default implementation generates 0,1,2,3,4,5,..., but this option can be leveraged
/// to generate less predictable results, using a block cipher for example.
pub request_sequencer: Arc<dyn RequestSequencer>
Owner
I think the problem here is that you have set up multiple tarpc clients, but on the server side you have them all multiplexed on a single server channel. tarpc server channels only support a single client. The right way to do this would be to have multiple server channels all using the same underlying mqtt transport. That way, request IDs can overlap on the transport, but they are unique when they reach the server channel.

I think this is a fundamental difference between MQTT—which natively supports many-to-one relationships between client and server—and tarpc. I think we need to bridge the gap here, but I don't think we should do that by changing tarpc to support many-to-one server channels.

Does the MQTT protocol even have a way for the server to know if a message came from a different client? It seems like the entire notion of a client is kind of made up in this case. In MQTT, it's just seeing a client publishing a message with a different response topic, right? But for tarpc-over-MQTT, we need a way to uniquely identify clients. Can we use the correlation data for this purpose? I feel like something needs to map correlation data to unique clients, and to me, the transport feels most appropriate here.

Author
Does the MQTT protocol even have a way for the server to know if a message came from a different client? It seems like the entire notion of a client is kind of made up in this case. In MQTT, it's just seeing a client publishing a message with a different response topic, right? But for tarpc-over-MQTT, we need a way to uniquely identify clients. Can we use the correlation data for this purpose? I feel like something needs to map correlation data to unique clients, and to me, the transport feels most appropriate here.

No, it doesn't. MQTT doesn't really have the notion of server and client either. MQTTv5 does spec out request-response, but it's not really aimed at servers and clients; they are more or less equal peers.
What we CAN do is define at the tarpc level that one needs to publish requests to /request/. That way the server can use the topic to distinguish between clients. But I think we can get away with this at the transport level by simply modifying the request ids with some cryptography/hashing to map the (ClientId, RequestId) pair into a globally unique RequestId. It is not a perfect solution, but it should be unique enough in practice. I don't see an easy way to define a separate channel for each client.
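The (ClientId, RequestId) folding idea can be sketched with std's DefaultHasher. This is only an illustration of the mapping; a real implementation might use a keyed block cipher, as suggested, rather than a plain hash:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Fold (ClientId, RequestId) into a single u64 that is unique enough
// in practice, so overlapping per-client request ids don't collide on
// the shared server channel.
fn global_request_id(client_id: &str, request_id: u64) -> u64 {
    let mut h = DefaultHasher::new();
    (client_id, request_id).hash(&mut h);
    h.finish()
}

fn main() {
    let a = global_request_id("client-a", 0);
    let b = global_request_id("client-b", 0);
    // The same local request id from two clients maps to different globals.
    assert_ne!(a, b);
    // The mapping is deterministic per (client, request) pair, so the
    // transport can invert it by remembering which client sent what.
    assert_eq!(a, global_request_id("client-a", 0));
}
```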

About correlation data: again, it can be meaningful for the requestor side only, to pair up a response with the original request. The server needs to be able to respond to any correlation data without inspecting it, including the case where all clients always generate 0x0000000000 for the CD.

I think we discussed the matter enough for me to be able to prepare another version of this PR, give me a week to get to it. We're mid-renovations, so it's a bit hard to get some free time.

@tikue (Owner) commented Apr 18, 2023

one needs to publish requests to /request/

Would this be a topic prefix?

I don't see an easy way to define a separate channel for each client.

I was thinking the transport could be responsible for establishing new connections based on a /client/<uid>/ prefix to the topic. I threw together a prototype here.

About correlation data: again, it can be meaningful for the requestor side only

Right, of course!

We're mid-renovations, so it's a bit hard to get some free time.

No worries at all, there's absolutely no time pressure on any of this :)

@tikue changed the title from "Add abitiy to store request context" to "Add ability to store request context" Apr 17, 2023
@tikue tikue force-pushed the master branch 29 times, most recently from 1306b23 to 30067db Compare February 4, 2024 03:43
@tikue (Owner) commented Feb 4, 2024

Hey @axos88 sorry for missing your last comment. It has been a while indeed! It looks like this branch is out of date with the latest master.

More generally, I've completely forgotten the details since we last discussed them! I remember there being a lot of details important to understanding the changes being made. I just now briefly took a look at the changes, and I saw a lot of stuff I didn't really expect to see (not to say the changes were wrong — just unexpected — and again, I've forgotten everything we previously discussed). Is there a way this change could be broken into smaller, more easily reviewed PRs? 31 commits is a lot to digest all at once...

Successfully merging this pull request may close these issues.

How to save transport context?
2 participants