Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

middleware refactoring #793

Merged
merged 34 commits into from
Jul 6, 2022
Merged

middleware refactoring #793

merged 34 commits into from
Jul 6, 2022

Conversation

niklasad1
Copy link
Member

@niklasad1 niklasad1 commented Jun 13, 2022

Closing #758, #747 and #699

It's a quite significant refactoring where the sink is only used for subscriptions as the rest of the calls just returns MethodResponse which is a type that has the final JSON-RPC response and status flag whether the calls was successful or not.
In order to intercept the messages to log the full responses in the middleware.

Further, it also modifies the SubscriptionMethod to be a future which returns when the subscription call has been answered which in turn forces plenty noise i.e, I had to change lots of code.

In addition I noticed that SubscriptionSink::send(&mut self, msg: M) is not needed so I changed it to SubscriptionSink::send(&self, msg: M) also then pipe_from_stream doesn't need &mut self anymore.
rustc may benefit from that to some optimizations....

It splits the middleware into two traits for WS and HTTP.

/// Defines a middleware specifically for HTTP requests with callbacks during the RPC request life-cycle.
/// The primary use case for this is to collect timings for a larger metrics collection solution.
///
/// See [`HttpServerBuilder::set_middleware`](../../jsonrpsee_http_server/struct.HttpServerBuilder.html#method.set_middleware) method
/// for examples.
pub trait HttpMiddleware: Send + Sync + Clone + 'static {
	/// Intended to carry timestamp of a request, for example `std::time::Instant`. How the middleware
	/// measures time, if at all, is entirely up to the implementation.
	type Instant: std::fmt::Debug + Send + Sync + Copy;

	/// Called when a new JSON-RPC request comes to the server.
	fn on_request(&self, remote_addr: SocketAddr, headers: &Headers) -> Self::Instant;

	/// Called on each JSON-RPC method call, batch requests will trigger `on_call` multiple times.
	fn on_call(&self, method_name: &str, params: Params);

	/// Called on each JSON-RPC method completion, batch requests will trigger `on_result` multiple times.
	fn on_result(&self, method_name: &str, success: bool, started_at: Self::Instant);

	/// Called once the JSON-RPC request is finished and response is sent to the output buffer.
	fn on_response(&self, result: &str, _started_at: Self::Instant);
}

/// Defines a middleware specifically for WebSocket connections with callbacks during the RPC request life-cycle.
/// The primary use case for this is to collect timings for a larger metrics collection solution.
///
/// See the [`WsServerBuilder::set_middleware`](../../jsonrpsee_ws_server/struct.WsServerBuilder.html#method.set_middleware)
/// for examples.
pub trait WsMiddleware: Send + Sync + Clone + 'static {
	/// Intended to carry timestamp of a request, for example `std::time::Instant`. How the middleware
	/// measures time, if at all, is entirely up to the implementation.
	type Instant: std::fmt::Debug + Send + Sync + Copy;

	/// Called when a new client connects
	fn on_connect(&self, remote_addr: SocketAddr, headers: &Headers);

	/// Called when a new JSON-RPC request comes to the server.
	fn on_request(&self) -> Self::Instant;

	/// Called on each JSON-RPC method call, batch requests will trigger `on_call` multiple times.
	fn on_call(&self, method_name: &str, params: Params);

	/// Called on each JSON-RPC method completion, batch requests will trigger `on_result` multiple times.
	fn on_result(&self, method_name: &str, success: bool, started_at: Self::Instant);

	/// Called once the JSON-RPC request is finished and response is sent to the output buffer.
	fn on_response(&self, result: &str, started_at: Self::Instant);

	/// Called when a client disconnects
	fn on_disconnect(&self, remote_addr: std::net::SocketAddr);
}

@niklasad1 niklasad1 marked this pull request as ready for review June 18, 2022 20:28
@niklasad1 niklasad1 requested a review from a team as a code owner June 18, 2022 20:28
@niklasad1 niklasad1 self-assigned this Jun 20, 2022
@niklasad1 niklasad1 changed the title WIP: middleware refactoring middleware refactoring Jun 20, 2022
/// Called when a new JSON-RPC comes to the server.
fn on_request(&self) -> Self::Instant;
/// Called when a new JSON-RPC request comes to the server.
fn on_request(&self, remote_addr: std::net::SocketAddr, headers: &http::HeaderMap) -> Self::Instant;
Copy link
Member Author

@niklasad1 niklasad1 Jun 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit weird as the WS will have the same remote addr and headers for the entire connection

Then for HTTP it will be useful for each "request".

So what we could do here:_

  • on_connect(&self, remote_addr: std::net::SocketAddr, headers: &http::HeaderMap)
  • on_request(&self)

Then justify the documentation that on_connect for can be used to inspect headers and so on for stateless protocols, we could also move these to different traits as I could be useful to separate middleware impls because HTTP/WS is quite different.


Copy link
Collaborator

@jsdw jsdw Jun 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been mulling this over; I'm personally leaning towards separate middleware for Http and Ws, since it feels like we have to hack it around to have functions which map nicely to both.

Pros of separate middlware:

  • precise traits that better express what's actually happening. (http middleware doesn't need on_connect, ws middleware doesn't need headers in on_request, that sort of thing)
  • (maybe this helps perf or something?)

Cons:

  • we need to have two similar traits and sets of blanket impls.
  • users/we can't write generic middleware that applies to both (though I doubt that's an issue, and one could always write some wrapper GenericMiddleware thing that implements both where there is overlap if needed)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, I agree it's better to have two different traits.

You most likely want to handle them differently.

core/src/server/helpers.rs Outdated Show resolved Hide resolved
core/src/server/rpc_module.rs Outdated Show resolved Hide resolved
@@ -427,252 +468,79 @@ async fn background_task(

tracing::debug!("recv {} bytes", data.len());

let request_start = middleware.on_request();
let request_start = middleware.on_request(remote_addr, &headers);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, it looks like those headers will only ever contain Host and Origin from above; what about any custom headers?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

paritytech/soketto#60,

it's missing from soketto but I don't want to make this PR any larger.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha; makes sense!

@@ -304,8 +324,8 @@ where
}
}

async fn background_task(
server: SokettoServer<'_, BufReader<BufWriter<Compat<tokio::net::TcpStream>>>>,
struct BackgroundTask<'a, M> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea to make this a struct!

where
M: Middleware,
{
async fn handshake<M: Middleware>(socket: tokio::net::TcpStream, mode: HandshakeResponse<'_, M>) -> Result<(), Error> {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably out of scope of this PR but it feels like having an accept_handshake and reject_handshake call (or similar) would make more sense, since this call basically just switches on an enum and has very little shared between them?

@@ -37,22 +39,22 @@
pub trait Middleware: Send + Sync + Clone + 'static {
/// Intended to carry timestamp of a request, for example `std::time::Instant`. How the middleware
/// measures time, if at all, is entirely up to the implementation.
type Instant: Send + Copy;
type Instant: std::fmt::Debug + Send + Sync + Copy;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another offtopic thought (just noting for future reference); before 1.0 is it worth making this more generic than Instant so that it can, if needed, be a way for users to pass context between middleware calls and such?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you elaborate a bit what you mean with an example?

It's possible to copy it between calls and stuff right now

core/src/server/helpers.rs Outdated Show resolved Hide resolved
fn on_response(&self, result: &str, started_at: Self::Instant);

/// Called when a client disconnects
fn on_disconnect(&self, remote_addr: std::net::SocketAddr);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is silly but I reckon that it might be useful to log when peers disconnect

fn on_request(&self, remote_addr: SocketAddr, headers: &Headers) -> Self::Instant;

/// Called on each JSON-RPC method call, batch requests will trigger `on_call` multiple times.
fn on_call(&self, method_name: &str, params: Params);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want method kind here as well i.e, if the call is a "subscription", "method call" or "notification"?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it easily accessbile? I guess it wouldn't do any harm to have!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't do it because it requires to lookup the method name in the HashMap.

should be quite easy to do.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm, if it's a performance hit to do so (even when no middleware) then I'm happy not at the mo!

Copy link
Member Author

@niklasad1 niklasad1 Jul 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we do this lookup when doing the call so I didn't want to do it because the timing might be off.

but that it is probably picking hairs and I can do on_call after looking up the method

}

#[derive(Debug, Clone)]
struct CallData<'a, M: Middleware> {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be cheap to copy

use jsonrpsee::ws_client::WsClientBuilder;
use jsonrpsee::ws_server::{RpcModule, WsServerBuilder};

#[derive(Clone)]
struct Timings;

impl middleware::Middleware for Timings {
impl middleware::WsMiddleware for Timings {
Copy link
Collaborator

@jsdw jsdw Jul 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love the middleware split!

I'm still wondering about type Instant. If I want to share some state between middleware calls, or capture the timings between say on_connect and on_disconnect or some random other thing, currently we can't do it.

But we could if we did something like:

trait WsMiddleware {
    type State: Default + Clone;
    fn on_connect(&Self, state: &mut State, ...otherParams);
    fn on_request(&self, state: &mut State, ...otherParams);
    //... and so on; mutable per-connection/request state given to each call 
}

Then we can set/read the state somewhat arbitrarily in any call that's interested, eg for timings:

struct ConnectionTimings;
struct ConnectionTimingsState(Option<Instant>);

impl WsMiddleware for ConnectionTimings {
    type State = ConnectionTimingsState;
    fn on_connect(&self, state: &mut State, ...) {
        *state = Some(Instant::now());
    }
    fn on_disconnect(&self, state: &mut State, ...) {
        println!("Disconnected; connection open for {}", state.0.unwrap().elapsed());
    }
}

Or something that logs the time of each call and prints them all off at the end or whatever else.

A default "no middleware" impl can have State = () which should be free.

Just a thought, since we are expanding middleware to be more useful; what do you think?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(this could very well be a separate issue too if you liked the idea; no need to do it all here necessarily).

Copy link
Member Author

@niklasad1 niklasad1 Jul 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds reasonable.

I guess it would be flexible as users can "measure" whatever they want as all methods on middleware traits are supposed to have &mut State such how long a connection lives, individual calls, batches and so on takes?

So why not.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my take on it yeah; it'd just give the flexibility to keep track of whatever was desired across the lifetime of a single connection :)

Copy link
Collaborator

@jsdw jsdw Jul 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we agreed, let's think about this in a followup issue/pr!

Edit: issue added: #816

Copy link
Contributor

@lexnv lexnv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 👍 As a follow-up we could extend the middleware with method_kind and custom user context/state to extend the type Instant

Copy link
Collaborator

@jsdw jsdw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants