Canary Testing/Routing #175

@scheuchzer

Enhancement

We're going to implement canary deployments at my company based on the gateway. A first proof of concept showed that this can be done quite easily with some minor enhancements, provided we keep the canary logic out of the gateway (in a separate app, for example).

Why separate the canary logic from the gateway? Canary deployments need metrics, plus logic that interprets those metrics. Some metrics, like response times and response codes, might come from the gateway itself, but others, like log file analysis, come from elsewhere. We think the gateway should do the routing and not the thinking.

How it could look:

We define two routes for the same path. The first route is for the new application version. It is throttled so that it only takes a small share of the requests, say 1 percent of the traffic. The second route is configured without throttling. It represents the old version of the target application and receives everything the first route does not take.

@Bean
public RouteLocator routes(RouteLocatorBuilder builder, ThrottleRoutePredicateFactory throttle) {
    return builder.routes()
        .route(r -> r.path("/myapp").and().predicate(throttle.apply(0.01)).uri("http://my-app-v-2-0.org"))
        .route(r -> r.path("/myapp").uri("http://my-app-v-1-0.org"))           
        .build();
}

With a predicate shortcut it would look like this:

@Bean
public RouteLocator routes(RouteLocatorBuilder builder) {
    return builder.routes()
        .route(r -> r.path("/myapp").and().throttle(0.01).uri("http://my-app-v-2-0.org"))
        .route(r -> r.path("/myapp").uri("http://my-app-v-1-0.org"))           
        .build();
}

That's it for the routing itself. OK, this is a static example. In real life you would read these routes from a RouteDefinitionRepository such as the InMemoryRouteDefinitionRepository, and the canary logic would push routing updates to the gateway.
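To make the "two routes for the same path, updated from outside" idea concrete, here is a minimal plain-Java sketch. It does not use the gateway API: `SketchRoute` and `SketchRouteTable` are hypothetical names invented for illustration, and the sketch assumes that the first matching route in definition order wins, which is why the throttled route has to be listed before the unthrottled one.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Predicate;

/** Hypothetical stand-in for a gateway route: a predicate on the request path plus a target URI. */
final class SketchRoute {
    final Predicate<String> predicate;
    final String uri;

    SketchRoute(Predicate<String> predicate, String uri) {
        this.predicate = predicate;
        this.uri = uri;
    }
}

/** Hypothetical route table that external canary logic can replace at runtime. */
final class SketchRouteTable {
    private final AtomicReference<List<SketchRoute>> routes =
            new AtomicReference<>(List.of());

    /** Called by the canary logic to push a complete new set of routes. */
    void replaceRoutes(List<SketchRoute> newRoutes) {
        routes.set(List.copyOf(newRoutes));
    }

    /** Returns the URI of the first route whose predicate matches (assumed first-match-wins). */
    Optional<String> resolve(String path) {
        return routes.get().stream()
                .filter(r -> r.predicate.test(path))
                .map(r -> r.uri)
                .findFirst();
    }
}
```

With this shape the canary logic only ever swaps the whole route list, so a request sees either the old set or the new set, never a half-updated one.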

ThrottleRoutePredicateFactory

In our PoC this predicate contained a simple counter. To make this work across multiple instances we would need a distributed counter. This can be done the same way as with the rate limiter, with the help of Redis. On second thought, I feel that we can drop the distributed counter. The requirements are not as strict as with the rate limiter. If every instance does the throttling on its own, each instance lets through roughly the configured share of its own traffic, so the overall share should eventually be OK as well.
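A per-instance counter of this kind could be sketched roughly as follows. This is not our PoC code and not a gateway class: `CounterThrottle` is a hypothetical name, it is written against `java.util.function.Predicate` so that it runs standalone, and the parts-per-million rounding is an assumption made here to avoid floating-point drift. In the gateway the same logic would presumably sit behind a route predicate factory.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Predicate;

/** Hypothetical local (non-distributed) throttle: admits a fixed share of requests. */
final class CounterThrottle<T> implements Predicate<T> {
    // Resolution of the ratio: parts per million (an assumption of this sketch).
    private static final long SCALE = 1_000_000L;

    private final long partsPerMillion;
    private final AtomicLong counter = new AtomicLong();

    CounterThrottle(double ratio) {
        if (ratio < 0.0 || ratio > 1.0) {
            throw new IllegalArgumentException("ratio must be in [0, 1]");
        }
        this.partsPerMillion = Math.round(ratio * SCALE);
    }

    @Override
    public boolean test(T request) {
        // Position of this request inside a repeating window of SCALE requests (1..SCALE).
        long n = Math.floorMod(counter.getAndIncrement(), SCALE) + 1;
        // Admit exactly when the integer quota floor(n * ratio) grows at this request.
        return (n * partsPerMillion) / SCALE > ((n - 1) * partsPerMillion) / SCALE;
    }
}
```

With a ratio of 0.01 each instance admits every 100th request it sees. Two instances that each handle 10,000 requests admit 100 each, so the combined share is still 1 percent without any shared state.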

Feedback / Metrics

Collecting metrics could be part of the gateway. This can be done with some filters that collect:

  • response times
  • status codes

Metrics will be fed back to the canary logic through an interface. In our case this will probably be based on Redis.
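The feedback interface could be as small as a single method. The following is only a sketch under assumptions: `CanaryMetricsSink` and `InMemoryMetricsSink` are hypothetical names, the route id parameter is invented for illustration, and the in-memory maps stand in for whatever store (Redis in our case) would actually sit behind the interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical interface through which gateway filters report to the canary logic. */
interface CanaryMetricsSink {
    void record(String routeId, int statusCode, long responseTimeMillis);
}

/** In-memory sink for illustration; a Redis-backed sink could implement the same interface. */
final class InMemoryMetricsSink implements CanaryMetricsSink {
    private final Map<String, LongAdder> statusCounts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> totalMillis = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> requestCounts = new ConcurrentHashMap<>();

    @Override
    public void record(String routeId, int statusCode, long responseTimeMillis) {
        // One counter per (route, status code) pair, plus totals per route.
        statusCounts.computeIfAbsent(routeId + ":" + statusCode, k -> new LongAdder()).increment();
        totalMillis.computeIfAbsent(routeId, k -> new LongAdder()).add(responseTimeMillis);
        requestCounts.computeIfAbsent(routeId, k -> new LongAdder()).increment();
    }

    /** Number of responses with the given status code seen on the given route. */
    long count(String routeId, int statusCode) {
        LongAdder adder = statusCounts.get(routeId + ":" + statusCode);
        return adder == null ? 0 : adder.sum();
    }

    /** Mean response time in milliseconds for the route, or 0.0 if nothing was recorded. */
    double meanResponseMillis(String routeId) {
        LongAdder n = requestCounts.get(routeId);
        if (n == null || n.sum() == 0) {
            return 0.0;
        }
        return (double) totalMillis.get(routeId).sum() / n.sum();
    }
}
```

The canary logic would then compare, for example, the error count and mean response time of the new route against those of the old route before deciding whether to raise the throttle ratio.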

Discussion

I think that we can implement canary testing with spring-cloud-gateway with basically one additional predicate, which is itself independent of canary testing. The canary logic lives outside of the gateway. I think that the canary logic is very specific to each company and hard to generalize. If there turns out to be common ground, this could lead to a separate open source project, or maybe/probably one already exists :-)

  • Does this design fit into spring-cloud-gateway?
  • Naming suggestions?
  • What do you think?
