Service layering #307

ximon18 · 2024-04-30T21:05:35Z

Status update: After internal discussion the conclusion now is make this PR ready for review and to promote this PR out of DRAFT status and merge it as-is, with the intent that the design is a work-in-progress and will continue to be refined as we go on. We may depart in future from the approach taken in this PR and earlier code, namely we may no longer keep support for both single request/response and single request/streaming response together in a single Service trait, but for now we will keep it like that as there is a substantial amount of work in the xfr branch that is built on the design in this PR. There are further changes to the Service interface in the xfr branch but we do not plan to merge that into this PR at this point, rather this PR should be merged as a stepping stone to the xfr PR #335, and this PR should thus not necessarily be seen as a "complete" chunk of finished functionality.

Summary of changes:

Transaction, MiddlewareBuilder, MiddlewareChain & MiddlewareProcessor have been removed.
Services are now async and return Future<Output = Stream> instead of Transaction.
Middleware are now implemented as Service impls that take a "next service" as input.
Removed the incomplete AXFR support in serve-zone.rs (to be replaced by the xfr branch work).

Previous PR description follows below.

This is a DRAFT PR intended for design review, feedback and discussion. It may be adopted and merged, or it may not.

This PR proposes an alternate design for services and middleware to move away from custom solutions to more standard approaches.

Background

In the code prior to this PR:

MiddlewareBuilder, MiddlewareChain, MiddlewareProcessor and CommonFlow together implement a means of flowing requests down a pipeline aka chain of request pre-processors to the actual application logic service, and back again through the corresponding post-processors, permitting early abort at any pre-processing stage such that only the post-processors up to that point will be run. The pre/post processing chain is walked explicitly tracking from which point to back track on early abort. The pre-processing chain was also limited to sync calls which doesn't work if you have say an XFR middleware that needs to query a zone and the zone backend is async (e.g. a database).
Returning a Transaction from Service impls requires implementers to understand this new concept of what is a transaction in domain, and prevents use of the various Stream adaptors that are already available, and the call() function returned a Transaction immediately causing complexity if the service impl needed to determine the stream content in a lazy/delayed fashion as it would have to return a Transaction that yielded a future which yielded a stream.

Best practice

If one looks at existing best practice ala the Hyper Service approach it uses a layering approach to services and a nested Service approach to middleware. This has the nice side effect that no logic is required to step up and down the chain, you just call the next Service impl and that's it.

Previous attempts

This was attempted during the development of the middleware chain but not completed and abandoned due to lack of time.

This PR

This PR returns to the service layering idea by removing all of the existing middleware types and re-implementing the middleware processors as middleware services that take an "upstream" service to pass requests on to.

Differences to Hyper

Unlike Hyper, the Service trait proposed by this PR is explicitly Stream based. This was sort of the case before but rather than Stream based it was Transaction based. This PR also removes Transaction and instead defines services in terms of standard Rust Stream types. We have to include the notion of Stream in the Service trait because the handling of a future that resolves to a single result is different than the handling of a future that resolves to many results. This is because the Rust Stream trait yields futures but is not itself a Future. Coding for possibly one or the other makes the code quite a bit more complex (and didn't fully work in earlier attempts to implement it). Both the code and the concepts are simpler if we always refer to a stream of results even in the case of a single result.

Boxing

This PR tries to avoid Boxing Service impls and instead tries to reference concrete Future and Stream types rather than Pin<Box<dyn Future... or Box<dyn Stream.

Overhead

One concern with this approach is the overhead added for a stream of one result but, if the application logic service only ever returns a ready(once(ready(item))) immediately ready stream of immediately ready single values of immediately ready futures (phew), the overhead is minimal as that translates in code to a few structs with no actual behaviour involved. And the annoyance of having to construct a ready(once(ready(item)))) return value from a Service impl can be smoothed over for the user by the service_fn() function (also ala Hyper).

Type complexity

There is a downside to the layered service approach and that is that the nesting of Service impls can cause very long Rust type names to appear in compilation error messages, they can be quite daunting if you're not used to them and not easy to work with even then.

Runtime dynamic configuration

Building a chain of services programmatically and altering that chain is harder with a layered stack of services, as one always has to start with the top application logic service and pass it as an argument to the next service during its construction and so on down the tree. You can't remove an item from the middle of the stack or insert an item or change their order, instead you have to build the chain anew. It can also be difficult to make types match, e.g. two branches of an if else that optionally add in a particular service layer must both return the same or compatible type. One way way is to erase the types through Box'ing but as said above we want to avoid Box'ing. This is possible as shown by fn server_tests() in net-server.rs, but there are challenges at present around trait bound matching, visible in the strange bounds on the finish_svc() fn in net-server.rs. Hopefully this can be improved/simplified.

Stream result mapping

Flowing requests one at a time up a chain of services and passing a single result per request back down again, with possible modification at each layer in either direction, is easy. Flowing requests one at a time up a chain of services which at any point could result in a stream of responses for the single request is harder, especially when the result is a future. This PR introduces the MiddlewareStream type which is used by middleware service impls to return a stream that will either return the upstream response unmodified (a so-called Identity transform hence MiddlewareStream::Identity), or returns a MiddlewareStream::Map stream that may post-process the stream as it passes through the middleware layer, or returns a MiddlewareStream::Result which produces the response at that layer instead of allowing the request to propagate further along the chain.

The mapping case is complicated by the fact that a service layer can't just inspect and possibly modify the result from the upstream service as it must first await the upstream future and then post-process each of the upstream stream results, and do so in asynchronous fashion (i.e. we don't want each middleware layer to collect all upstream results together then process them and pass them then back down to the next layer which may do the same again). The common logic needed to do this is provided by PostprocessingStream.

What's the point of all this?

Two-fold:

We wanted to try the layered service design, and it more closely matches the way the client transport code nests client transport impls.
We wanted to implement an XFR (AXFR and IXFR) middleware layer that can yield a stream of responses and do asynchronous zone access while doing so. This PR doesn't include it but is the base for another PR that uses this functionality to implement that AXFR/IXFR middleware layer.

The changes bring additional benefits such as simplifying the pre/post-processing flow, removing CommonFlow which abuses a trait to provide common logic, allows less forced use of async in response writing (currently problematic due to CommonFlow being a trait and wanting to avoid Pin<Box> and not being able to async fn in the trait), and simplifies propagating errors back to the server logic when an error occurs in the final stage of what was previously CommonFlow (as that invoked a non-self sync fn in the server impl). For example #309 is easier to fix in the new code than in the original code.

An MyAsyncStreamingService example has been added in this PR to server-transports.rs which shows a little bit of what can be done with the future stream based approach, though for simplicity it uses boxing.

The StatsMiddlewareProcessor in server-transports.rs has been replaced by StatsMiddlewareSvc which shows stream post-processing by middleware in action.

build_middleware_chain() in server-transports.rs shows how middleware is now constructed. One other upside of the new middleware approach is that the add_middleware_chain() functions on the Dgram and Stream server Config types are no longer needed as servers are now completely unaware of middleware, they just invoke a service and have no notion of whether they are invoking the final service or the first service in a chain of services. There is currently however no equivalent in this PR to the MiddlewareBuilder::standard() and MiddlewareBuilder::minimal() fns that previously existed, though we could easily add a function like build_middleware_chain to provide such a standard set of middleware for the user.

Final thoughts

Something like the Tower Layer and ServiceBuilder may make service composition easier.

Given that the Hyper service trait returns a Future, one could perhaps leverage it to return a Future<Output = Stream> and do as is done in this PR, and then also leverage the Hyper and Tower ecosystem... but I'm not sure it's an exact match and I know this was considered a long time ago already, plus it would cause us to have dependencies on some/a lot of 3rd party code and ideally we keep the dependency tree as slim as possible (though not-invented-here syndrome should also be avoided).

One thing missing from this PR is a replacement for the previous ability to swap middleware at runtime. This should be possible by placing the Service used by a Dgram or Stream/Connection server into an ArcSwap.

End monologue.

…are disabled.

…ith stream server and other middleware disabled.

…with stream server re-enabled but still no other middleware.

…a) being actually used and (b) with Box<Dyn>-less post-processing via new `PostprocessingMap` type based on stream::Map, with stream server re-enabled but still no other middleware.

…so fixes bugs in the Cookie middleware relating to adding the COOKIE option more than once, and not preserving OPT header values when appending OPT options.

…er, none of it is needed in the new middleware-as-layered-services world.

…and a stream preceeded by its length doesn't already need its length prepended to it.

- Return Future of Stream from Service::call() instead of the custom Transaction type. - Relax trait bounds/move trait bounds nearer to the point of use. - Introduce PostprocessingStream impl of Stream for stream based upstream result post-processing. - Introduce ServiceResult aka io::Result. - Modify the serve_zone example to assume it is powered by (not included in these changes) XFR middleware.

…ng the domain server functionality.

…567_agent`".

…ce'".

…t the identity case.

partim · 2024-05-06T15:09:42Z

I like the approach in principle, but I do agree that that the trait bounds can seem a little unwieldy. One thing that I learned from using Hyper – which has similar issues – is that you won’t really notice them if you just implement a concrete service. They will only haunt you if you try to build generic code.

I wonder if we can improve them by experimenting a bit with the Service trait’s definition.

An extremely radical approach would be to drop nearly all bounds in the trait definition and move them to wherever the trait is used. This would result in a very clean trait and it would collect all the bounds in the place where you use the service.

Something like:

pub trait Service {
    type Request;
    type Response;
    type Error;
    type Stream: Stream<Item = Self::Response>;
    type Future: Future<Output = Result<Self::Stream, Self::Error>>;

    fn call(&self, request: Request) -> Self::Future;
}

Allowing defining a specific request type (or requiring it via some bounds) instead of using a concrete type generic over some octets has the advantage that a middleware can require specific additional information to be available (e.g., via an AsRef<TcpConnInfo> bound on a fictitious middleware that does some TCP connection-specific stuff). This also allows for middleware to require to be wrapped around some other middleware.

Similarly for responses which would allow feeding back information to outer middlewares.

Error handling is a bit of a weird one. I think all Hyper does if it receives an error is close the connection immediately. The error type is only there to allow frameworks and middleware to bubble up errors and eventually convert them into error responses. I think this is a good approach for us, too. If you want to send an error response back, you have to create that response message and return it as an Ok(_) value somewhere before reaching the transport. I think in this model it is better to have the error in the future output rather than in the stream items because at least for now, error handling in an async function is much easier than in a stream which you likely have to tape together using combinators and converters.

Whether this approach could work and perhaps even lead to more manageable trait bounds might be worth a try.

ximon18 · 2024-05-06T15:16:40Z

Regarding dropping all trait bounds in Service, I actually started that way, but was forced to reintroduce them, but maybe now that the code has settled into a draft PR state I can go back and see if I can remove them again.

ximon18 · 2024-05-08T13:09:34Z

Regarding Error, until now Error was not an associated type because the Server/middleware matches on the type of Error so needs to know what the type is, and the Error isn't used anywhere else, so what value does it have to or how would it work to allow the Error type to be specifiable?

partim · 2024-05-10T08:43:34Z

Basically, the server itself (i.e., the raw transports) don’t care about the error at all. What exactly should happen if they get one is up for discussion. Hyper just closes the connection without a response. That would be an option for connection-oriented DNS transports as well, but for UDP, it should probably just return an empty REFUSED answer.

If middle-ware needs a specific error type, it can put bounds on the error type of the service it wraps. Ideally, this should be in the form of Into<_> bound into its own error type. But perhaps it would be better if the standard middlewares would behave similarly to the transports: pass through any error they receive and only do response processing on an Ok<_> response.

The idea then would be that the things we provide do not use errors at all and always expect an actual response to work normally.

ximon18 · 2024-05-11T19:59:10Z

Regarding Error, until now Error was not an associated type because the Server/middleware matches on the type of Error so needs to know what the type is, and the Error isn't used anywhere else, so what value does it have to or how would it work to allow the Error type to be specifiable?

What I said here was incorrect, or at least not entirely correct. Yes the released code matches on the error but it only cares about one of the four possible variants, ServiceError::InternalError, and even then it only logs the error in that case, the other variants are not handled at all.

Basically, the server itself (i.e., the raw transports) don’t care about the error at all.

So my argument until now was: if the user supplied error type is never acted upon but only logged, why complicate the Service trait by making the type of that error user definable?

for UDP, it should probably just return an empty REFUSED answer.

I would have thought SERVFAIL was more appropriate in this case.

But perhaps it would be better if the standard middlewares would behave similarly to the transports: pass through any error they receive and only do response processing on an Ok<_> response.

I think it has to be this way because if an error occurred we are in an unexpected unknown state, we don't know which layer raised the error, and its unsafe to assume anything making it very unwise to do anything in the midldeware at that point.

…ed out code.

…es in this branch.

ximon18 · 2024-07-23T15:11:11Z

@partim: I've created #361 to link to the further design change ideas mentioned here and others we have had and will come up with, as they won't be pursued further in this PR.

Philip-NLnetLabs

This should be merged to make progress.

I adapted my query routing code without too much problems. It is good that everything, including the middleware now implements the Service interface.

Two negative points: the unpin bounds should be removed in the future. The MiddlewareStream type is extremely complex.

ximon18 added 21 commits April 16, 2024 10:35

WIP: Working Box<dyn Stream>> based Service but with stream & middlew…

e0006c0

…are disabled.

WIP: Working Box<dyn Stream> based Service andmandatory middleware, w…

4aec8bd

…ith stream server and other middleware disabled.

WIP: Working Box<dyn Stream> based Service and mandatory middleware, …

724d670

…with stream server re-enabled but still no other middleware.

Compilation fix.

4cc2bd5

Remove commented out code.

32b4623

WIP: Working Box<dyn Stream> based Service and mandatory middleware (…

c6682b6

…a) being actually used and (b) with Box<Dyn>-less post-processing via new `PostprocessingMap` type based on stream::Map, with stream server re-enabled but still no other middleware.

Factor MiddlewareStream out to new util.rs.

8a406d3

Adds Service style versions of the EDNS and Cookie middleware too. Al…

98fcb65

…so fixes bugs in the Cookie middleware relating to adding the COOKIE option more than once, and not preserving OPT header values when appending OPT options.

Remove the old middleware processors, pocessor trait, chain and build…

3dc28be

…er, none of it is needed in the new middleware-as-layered-services world.

Remove diagnostic trace message accidentally left behind.

0618b82

Test fixes: A single test stream should yield only a single message, …

20220a0

…and a stream preceeded by its length doesn't already need its length prepended to it.

cargo fmt.

9feccce

Merge branch 'main' into service-layering

3a09c12

FIX: DisconnectWithFlush ignored in Stream Connection handler.

592304b

Update syntax in broken doc tests.

f4a1e0b

Add a bare bones demo of an RFC 9567 monitoring agent implemented usi…

d236fff

…ng the domain server functionality.

Fix broken RustDoc references.

31e5d7c

Fix compilation error "main function not found in crate `serve_rfc9…

a45a09f

…567_agent`".

Fix compilation error "use of unstable library feature 'option_as_sli…

429de28

…ce'".

Merge branch 'main' into service-layering.

e4b9d58

ximon18 requested a review from a team April 30, 2024 21:05

ximon18 added 3 commits May 1, 2024 00:09

FIX: Cookie middleware doesn't do post-processing, so actually suppor…

ce0f029

…t the identity case.

Files missed from previous commit.

4313cfd

Minor cleanup.

e991c59

ximon18 added 10 commits June 12, 2024 11:32

Merge branch 'main' into service-layering

d98401e

Compilation fix.

7b765dd

Merge branch 'main' into service-layering

f5e39f7

Merge branch 'main' into service-layering

5dd9b0e

Merge branch 'main' into service-layering

558aa41

Merge branch 'main' into service-layering

eac8e22

RustDoc fixes and additions, minor code tweaks and removal of comment…

16f369b

…ed out code.

RustDoc fixes.

def12fd

Remove RFC 9567 example. as it has nothing to do with the other chang…

0513f0b

…es in this branch.

cargo fmt.

47a3a15

ximon18 marked this pull request as ready for review July 23, 2024 15:06

ximon18 mentioned this pull request Jul 23, 2024

Service trait evolution #361

Open

ximon18 added 3 commits July 23, 2024 20:41

Merge branch 'main' into service-layering

d18cd26

Merge branch 'main' into service-layering

21a7f0b

Merge branch 'main' into service-layering

c6ec451

Philip-NLnetLabs approved these changes Jul 29, 2024

View reviewed changes

partim merged commit 67f53ea into main Jul 29, 2024
26 checks passed

partim deleted the service-layering branch July 29, 2024 09:34

ximon18 mentioned this pull request Aug 7, 2024

Add and improve middleware related RustDocs. #370

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Service layering #307

Service layering #307

ximon18 commented Apr 30, 2024 •

edited

Loading

partim commented May 6, 2024

ximon18 commented May 6, 2024

ximon18 commented May 8, 2024

partim commented May 10, 2024

ximon18 commented May 11, 2024

ximon18 commented Jul 23, 2024

Philip-NLnetLabs left a comment

Service layering #307

Service layering #307

Conversation

ximon18 commented Apr 30, 2024 • edited Loading

Background

Best practice

Previous attempts

This PR

Differences to Hyper

Boxing

Overhead

Type complexity

Runtime dynamic configuration

Stream result mapping

What's the point of all this?

Final thoughts

partim commented May 6, 2024

ximon18 commented May 6, 2024

ximon18 commented May 8, 2024

partim commented May 10, 2024

ximon18 commented May 11, 2024

ximon18 commented Jul 23, 2024

Philip-NLnetLabs left a comment

Choose a reason for hiding this comment

ximon18 commented Apr 30, 2024 •

edited

Loading