title = "SIP 017 - Service Chaining" template = "main" date = "2024-02-21T23:00:00Z"


Summary: Allow in-process HTTP requests to application components without going via the network

Owner(s): ivan.towlson@fermyon.com

Created: Feb 21, 2024

## Requirements and scope

In Spin 2.2, a component can make an HTTP request with only a path (e.g. `/api/accounts`). Such a request is routed to the same application, but Spin does this by prepending the host where the application is running (e.g. `https://mysrsbizness.fermyon.app/api/accounts`). The consequence is that requests go out to the network and come back in, passing through load balancers and other infrastructure. This adds overhead, and in some environments could count against quota or incur bandwidth charges.

Service chaining refers to carrying out HTTP requests in memory, without going off the node, by invoking the receiving Spin component directly instead of via the HTTP server.

We have prototyped dispatching HTTP requests directly to the component handling the route, but only as an internal implementation detail of Spin. The requirement for this project is for an application to be able to declare that it requires dispatch to run locally on the node, and for hosts that cannot meet that prerequisite to reject the application. This allows developers to write applications as a network of chained microservices, without having to be concerned about what happens if the microservices are not co-located.

## User experience

### Expressing a requirement for chaining

Spin will infer that an application requires service chaining if any component includes a `spin.internal` subdomain (or subdomain wildcard) in its `allowed_outbound_hosts` - a manifest sketch follows the list below.

This has two implications:

  1. The "allow all" wildcard (*://*:*) will not, by itself, enable service chaining.
  2. A templated host (https://{{ accounts_service }}:*), with the variable set to a spin.internal subdomain, will not, by itself, allow service chaining.
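To make this concrete, here is a minimal manifest sketch. The application and component names (and the explicit `accounts` target) are assumptions for illustration only:

```toml
spin_manifest_version = 2

[application]
name = "chaining-example"

# Hypothetical component that chains to an "accounts" component in the
# same application. The explicit spin.internal entry is what triggers
# the chaining requirement; neither *://*:* nor a templated host would.
[component.checkout]
source = "checkout.wasm"
allowed_outbound_hosts = ["http://accounts.spin.internal"]
```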

Hosts must reject applications that ask for `spin.internal` if the host cannot provide service chaining. However, this will be achieved via the lock file rather than via `spin.toml`: see below for details.

### Requesting a component via chaining

There are two ways we could express an HTTP request to a chained component:

1. Use the existing self-request style. The host may chain self-requests, and must chain them if the "require chaining" setting is on. This is super easy for the application developer, at both the sending and receiving ends. However, it becomes more complex for the host, as the host now has to route the request - this drags in much more of the HTTP trigger plumbing. (The prototype did this and it was egregiously hairy.)

2. Address requests directly to the destination component by name. We still need a URL, so this would introduce a special host name such as `spin.internal`. Senders address individual components via subdomains: thus, the URL `http://accounts.spin.internal` would invoke the `accounts` component. This bypasses routing, which should simplify implementation. It also opens the door to private endpoints, where an internal microservice is not exposed to the network (does not have an HTTP route) and so is accessible only from within the application. However, chained components that parse URLs may need to take extra care to handle both routed and chaining URLs correctly.

The preferred user experience is option 2: `http://<component-name>.spin.internal/`
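As a sketch of what this looks like from the sending side, using the Rust SDK (the `accounts` component name and `/api/accounts` route are assumptions, and `http://accounts.spin.internal` must be in the sender's `allowed_outbound_hosts`):

```rust
use spin_sdk::http::{IntoResponse, Method, Request, Response};
use spin_sdk::http_component;

#[http_component]
async fn handle(_req: Request) -> anyhow::Result<impl IntoResponse> {
    // spin.internal addresses the target component directly, bypassing
    // the route table and never leaving the node.
    let chained = Request::builder()
        .method(Method::Get)
        .uri("http://accounts.spin.internal/api/accounts")
        .build();
    let resp: Response = spin_sdk::http::send(chained).await?;
    // Relay the chained component's response to our own caller.
    Ok(resp)
}
```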

### Incoming request headers

To avoid spoofing, Spin must strip any `Host: *.spin.internal` headers from routed requests.

Additionally, Spin will provide a `Host: <component>.spin.internal` header on the chained request, overwriting any `Host` header set by the sender.
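A sketch of the intended header handling, using `http` crate types; the helper functions are illustrative, not actual Spin APIs:

```rust
use http::{header::HOST, HeaderValue, Request};

/// Illustrative: before delivering a *routed* (network) request, drop
/// any spoofed Host header that claims a spin.internal address.
fn strip_spoofed_host<B>(req: &mut Request<B>) {
    let spoofed = req
        .headers()
        .get(HOST)
        .and_then(|v| v.to_str().ok())
        .map_or(false, |h| h.ends_with(".spin.internal"));
    if spoofed {
        req.headers_mut().remove(HOST);
    }
}

/// Illustrative: on a *chained* request, overwrite whatever Host header
/// the sender set with the canonical component address.
fn set_chained_host<B>(req: &mut Request<B>, component: &str) {
    let host = format!("{component}.spin.internal");
    req.headers_mut()
        .insert(HOST, HeaderValue::from_str(&host).expect("valid host header"));
}
```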

Other Spin headers such as `spin-full-url` should be set where appropriate (e.g. `spin-path-info`) and omitted where not (e.g. `spin-client-addr`). The HTTP executor may handle this for us - this is a matter for implementation.

### Permissions

The developer must enable chaining to a component by adding `http://<component-name>.spin.internal` (or `http://*.spin.internal` to allow any destination component) to the list of allowed outbound hosts, as in the manifest sketch above.

The `self` permission does not grant permission to make chained requests; nor, as noted above, does the wildcard permission.

## Runtime and framework considerations

### Chaining requirements in the lock file

The lock file format built into existing hosts is designed for extension, and so avoids using serde's "deny unknown fields" mode. Unfortunately, this means there is no way to add a prerequisites section to the lock file in such a way that existing hosts will reject it if present. For example, if someone deploys a chaining-required app onto an older version of the containerd shim, or even runs it in the CLI with an external trigger built with pre-chaining crates, the host will appear to accept the application but will not chain.

Unfortunately, I don't think there's anything we can do about this, except to introduce a new `spin_lock_version`. Doing this naively would mean existing hosts would reject the lock file whether or not it contains prerequisites. So we will define a v1 lock file format that lists host requirements, but smartly emit lock file version 0 if the list of host requirements is empty.

To mitigate future cases like this, the v1 lock file will add a `must_understand` section. This is an array of strings describing features that the host is expected to understand/support: if the host doesn't recognise a must-understand string, it must reject the application.
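For illustration, the top of a v1 lock file might look like the following sketch; the feature string and exact field layout are assumptions, not a final format:

```json
{
  "spin_lock_version": 1,
  "must_understand": ["local_service_chaining"]
}
```

A pre-chaining host that sees `spin_lock_version` 1 rejects the file outright; a v1-aware host additionally rejects any `must_understand` entry it does not recognise.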

## Spin command line implementation

### Where do invoked components run?

In Spin, all components of an app run within a trigger process (typically a child Spin process, though it can also be a plugin).

When the calling component is on the HTTP trigger, life is (relatively) easy, because the host process has all the HTTP components - that is, all possible invocation targets - loaded and ready to go.

When the calling component is on a non-HTTP trigger, things are more challenging. There are two options if we want to support this:

1. We must host a large chunk of the HTTP trigger engine (and load HTTP components) in the non-HTTP trigger process. Physically, this means moving the HTTP execution engine into a crate that can be shared by the `outbound-http` and `trigger-http` crates. However, this will still need a lot of the same infrastructure that the full trigger does, e.g. loading host components, permissions, etc. - I don't think it will omit much except the hooking up to the server/router. Having two different ways of loading HTTP components feels like an exciting area for innovating in subtle surprises, and this is without even thinking about external trigger processes embedding different versions of the HTTP engine...

  2. We somehow transfer the request to the HTTP trigger for handling. This is both difficult and unappealing. Let's not do it.

Additional context: Today, only HTTP components can make self-requests anyway (because non-HTTP components can't infer the host URL).

For the initial release we will allow chaining only from HTTP components to HTTP components. This does not preclude allowing chaining from non-HTTP components in a future drop.

### Streaming and concurrency

It's possible for a caller to be streaming a request to a callee while the callee streams its response at the same time. The implementation needs to allow for this: it may just fall out for free from everything on the host side being async, but we need to test and make sure the bytes keep flowing! Chained requests will not have hyper in the flow helpfully spawning tasks, so we may end up needing to do that manually - this can be deferred to implementation.
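A toy sketch of the hazard, using plain tokio rather than any actual Spin plumbing: each direction is driven on its own task so a bounded buffer on one side cannot stall the other:

```rust
use tokio::io::{duplex, AsyncReadExt, AsyncWriteExt};

#[tokio::main]
async fn main() -> std::io::Result<()> {
    // Toy stand-ins for the chained request and response byte streams.
    let (mut req_tx, mut req_rx) = duplex(64);
    let (mut resp_tx, mut resp_rx) = duplex(64);

    // The "callee": echoes request bytes back as its response while the
    // request is still streaming in.
    tokio::spawn(async move {
        let mut buf = [0u8; 64];
        loop {
            let n = req_rx.read(&mut buf).await.unwrap();
            if n == 0 {
                break; // request stream ended
            }
            resp_tx.write_all(&buf[..n]).await.unwrap();
        }
    });

    // The "caller" streams its request body from its own task...
    tokio::spawn(async move {
        for _ in 0..4 {
            req_tx.write_all(b"chunk").await.unwrap();
        }
        // req_tx is dropped here, ending the request stream.
    });

    // ...while the response is consumed concurrently here. If nothing
    // drove the request body on a separate task, a bounded buffer could
    // fill up and the bytes would stop flowing.
    let mut out = Vec::new();
    resp_rx.read_to_end(&mut out).await?;
    assert_eq!(out, b"chunk".repeat(4));
    Ok(())
}
```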

### Failures

The implementation must determine and document what happens if the callee fails - specifically, if it traps. We expect that the trap will be caught by the HTTP infrastructure and returned to the caller as a 500 HTTP response, rather than the trap taking down the caller as well. This behaviour differs from what would happen if the components were composed, but is acceptable.
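A minimal sketch of that contract at the host boundary; the `Trap` type and helper are hypothetical stand-ins, not Spin APIs:

```rust
use http::{Response, StatusCode};

/// Hypothetical stand-in for the error produced when the callee traps.
struct Trap;

/// A trap in the chained callee becomes a 500 response to the caller,
/// rather than propagating and taking the caller's instance down too.
fn trap_to_response(result: Result<Response<Vec<u8>>, Trap>) -> Response<Vec<u8>> {
    match result {
        Ok(resp) => resp,
        Err(Trap) => Response::builder()
            .status(StatusCode::INTERNAL_SERVER_ERROR)
            .body(Vec::new())
            .expect("static response parts are valid"),
    }
}
```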

## Other implementations

### Fermyon Cloud

Cloud must examine the lock file for must-understand prerequisites. Ideally this would happen at deployment time (so that rejection could be returned via the API), but I understand that this may not be practical.

The Cloud outbound HTTP implementation is significantly different from Spin's, so this feature may require a bit more than just porting.

### containerd shim

The containerd shim uses the Spin outbound HTTP implementation, so this should Just Work, but we may need to do some analysis to ensure that it meets the scheduling requirements (all components on the same node).

The shim links all triggers into a single process: I am not sure whether this has any implications for the "host the HTTP engine inside the non-HTTP trigger process" strategy.

## Future possibilities

  • In the "web of microservices," many chaining targets should be called only by other components, not by external entities. Such "private endpoints" should be planned for even if they don't happen in the first cut of the design.
  • Service chaining opens the door to structured middleware. A component acting as middleware would, instead of making a request to a specific endpoint, make a request to "the next component in the chain."
  • The initial proposal allows apps to express "this request be handled locally." This suggests a possible future enhancement of being able to express that a request must be handled "nearby," e.g. same rack, same datacentre, same region. This loosens scheduling while retaining at least some guarantees about performance and cost.