Kuadrant Rate Limiting

A Kuadrant RateLimitPolicy custom resource, often abbreviated "RLP":

Targets Gateway API networking resources such as HTTPRoutes and Gateways, using these resources to obtain additional context, i.e., which traffic workload (HTTP attributes, hostnames, user attributes, etc.) to rate limit.
Supports targeting subsets (sections) of a network resource to apply the limits to.
Abstracts the details of the underlying Rate Limit protocol and configuration resources, which have a much broader remit and surface area.
Enables cluster operators to set defaults that govern behavior at the lower levels of the network, until a more specific policy is applied.

How it works

Envoy's Rate Limit Service Protocol

Kuadrant's Rate Limit implementation relies on Envoy's Rate Limit Service (RLS) protocol. For each request, the workflow is:

On an incoming request, the gateway checks the matching rules for enforcing rate limits, as stated in the RateLimitPolicy custom resources and targeted Gateway API networking objects.
If the request matches, the gateway sends one RateLimitRequest to the external rate limiting service ("Limitador").
The external rate limiting service responds to the gateway with a RateLimitResponse carrying either an OK or OVER_LIMIT response code.

A RateLimitPolicy and its targeted Gateway API networking resource contain all the statements to configure both the ingress gateway and the external rate limiting service.
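To make the exchange more concrete, the two messages can be pictured roughly as below. This is only an illustrative YAML rendering of the protobuf messages defined by Envoy's RLS protocol (the real messages travel over gRPC). The domain and descriptor key are example values borrowed from the wasm-shim configuration shown under "Implementation details".

```yaml
# Illustrative rendering only: the real messages are protobuf over gRPC.

# RateLimitRequest sent by the gateway to Limitador
request:
  domain: default/app-rlp        # rate limit domain (example value)
  descriptors:
  - entries:
    - key: limit.toystore_assets_all_domains__8cfb7371   # example descriptor key
      value: "1"
  hits_addend: 1                 # number of hits to add to the matching counters

# RateLimitResponse returned by Limitador
response:
  overall_code: OVER_LIMIT       # or OK when the request is within the limits
```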

The RateLimitPolicy custom resource

Overview

The RateLimitPolicy spec has two main parts:

A reference to an existing Gateway API resource (spec.targetRef)
Limit definitions (spec.limits)

Each limit definition includes:

A set of rate limits (spec.limits.<limit-name>.rates[])
(Optional) A set of dynamic counter qualifiers (spec.limits.<limit-name>.counters[])
(Optional) A set of route selectors that further restrict the limit to specific routing rules (spec.limits.<limit-name>.routeSelectors[])
(Optional) A set of additional dynamic conditions to activate the limit (spec.limits.<limit-name>.when[])

Check out Kuadrant RFC 0002 to learn more about the Well-known Attributes that can be used to define counter qualifiers (counters) and conditions (when).

High-level example and field definition

apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: my-rate-limit-policy
spec:
  # reference to an existing networking resource to attach the policy to
  # it can be a Gateway API HTTPRoute or Gateway resource
  # it can only refer to objects in the same namespace as the RateLimitPolicy
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute / Gateway
    name: myroute / mygateway

  # the limits definitions to apply to the network traffic routed through the targeted resource
  limits:
    "my_limit":
      # the rate limits associated with this limit definition
      # e.g., to specify a 50rps rate limit, add `{ limit: 50, duration: 1, unit: second }`
      rates: […]

      # (optional) counter qualifiers
      # each dynamic value in the data plane starts a separate counter, combined with each rate limit
      # e.g., to define a separate rate limit for each user name detected by the auth layer, add `metadata.filter_metadata.envoy\.filters\.http\.ext_authz.username`
      # check out Kuadrant RFC 0002 (https://github.com/Kuadrant/architecture/blob/main/rfcs/0002-well-known-attributes.md) to learn more about the Well-known Attributes that can be used in this field
      counters: […]

      # (optional) further qualification of the specific HTTPRouteRules within the targeted HTTPRoute that should trigger the limit
      # each element contains a HTTPRouteMatch object that will be used to select HTTPRouteRules that include at least one identical HTTPRouteMatch
      # the HTTPRouteMatch part does not have to be fully identical, but what is stated in the selector must be identically stated in the HTTPRouteRule
      # do not use it on RateLimitPolicies that target a Gateway
      routeSelectors: […]

      # (optional) additional dynamic conditions to trigger the limit.
      # use it for filtering attributes not supported by HTTPRouteRule or with RateLimitPolicies that target a Gateway
      # check out Kuadrant RFC 0002 (https://github.com/Kuadrant/architecture/blob/main/rfcs/0002-well-known-attributes.md) to learn more about the Well-known Attributes that can be used in this field
      when: […]

Using the RateLimitPolicy

Targeting a HTTPRoute networking resource

When a RLP targets a HTTPRoute, the policy is enforced on all traffic routed according to the rules and hostnames specified in the HTTPRoute, across all Gateways referenced in the spec.parentRefs field of the HTTPRoute.

To enforce the policy on only a subset of the targeted HTTPRoute's rules and/or hostnames, specify the routeSelectors field of the limit definition.

Target a HTTPRoute by setting the spec.targetRef field of the RLP as follows:

apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: <RLP name>
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: <HTTPRoute Name>
  limits: {…}

Multiple HTTPRoutes with the same hostname

When multiple HTTPRoutes state the same hostname, these HTTPRoutes are usually all admitted and merged together by the gateway implementation in the same virtual host configuration of the gateway. Similarly, the Kuadrant control plane will also register all rate limit policies referencing the HTTPRoutes, activating the correct limits across policies according to the routing matching rules of the targeted HTTPRoutes.

Hostnames and wildcards

If a RLP targets a route defined for *.com and another RLP targets another route for api.com, the Kuadrant control plane will not merge these two RLPs. Rather, it will mimic the behavior of the gateway implementation, by which the "most specific hostname wins", thus enforcing only the corresponding applicable policies and limit definitions.

E.g., a request coming for api.com will be rate limited according to the rules from the RLP that targets the route for api.com; while a request for other.com will be rate limited with the rules from the RLP targeting the route for *.com.

Example with 3 RLPs and 3 HTTPRoutes:

RLP A → HTTPRoute A (a.toystore.com)
RLP B → HTTPRoute B (b.toystore.com)
RLP W → HTTPRoute W (*.toystore.com)

Expected behavior:

Request to a.toystore.com → RLP A will be enforced
Request to b.toystore.com → RLP B will be enforced
Request to other.toystore.com → RLP W will be enforced
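As a minimal sketch of two of the policies from this scenario, RLP A and RLP W could look like the following. The resource names, limit names, and rates are hypothetical and chosen only for illustration. The targeted HTTPRoutes are assumed to exist in the same namespace as the policies.

```yaml
# Hypothetical names and rates, for illustration only.
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: rlp-a
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: route-a        # assumed HTTPRoute with hostname a.toystore.com
  limits:
    "a-limit":
      rates:
      - limit: 10
        duration: 1
        unit: second
---
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: rlp-w
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: route-w        # assumed HTTPRoute with hostname *.toystore.com
  limits:
    "wildcard-limit":
      rates:
      - limit: 100
        duration: 1
        unit: second
```

Under the "most specific hostname wins" behavior described above, a request to a.toystore.com would be counted only against "a-limit", while a request to other.toystore.com would be counted against "wildcard-limit".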

Targeting a Gateway networking resource

When a RLP targets a Gateway, the policy will be enforced on all HTTP traffic hitting the gateway, unless a more specific RLP targeting a matching HTTPRoute exists.

Any new HTTPRoute referencing the gateway as parent will be automatically covered by the RLP that targets the Gateway, and so will changes to existing HTTPRoutes.

This effectively provides cluster operators with the ability to set defaults to protect the infrastructure against unplanned and malicious network traffic attempts, such as by setting preemptive limits for hostnames and hostname wildcards.

Target a Gateway by setting the spec.targetRef field of the RLP as follows:

apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: <RLP name>
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: <Gateway Name>
  limits: {…}

Overlapping Gateway and HTTPRoute RLPs

Gateway-targeted RLPs will serve as a default to protect all traffic routed through the gateway until a more specific HTTPRoute-targeted RLP exists, in which case the HTTPRoute RLP prevails.

Example with 4 RLPs, 3 HTTPRoutes and 1 Gateway (plus 2 HTTPRoutes and 2 Gateways without RLPs attached):

RLP A → HTTPRoute A (a.toystore.com) → Gateway G (*.com)
RLP B → HTTPRoute B (b.toystore.com) → Gateway G (*.com)
RLP W → HTTPRoute W (*.toystore.com) → Gateway G (*.com)
RLP G → Gateway G (*.com)

Expected behavior:

Request to a.toystore.com → RLP A will be enforced
Request to b.toystore.com → RLP B will be enforced
Request to other.toystore.com → RLP W will be enforced
Request to other.com (suppose a route exists) → RLP G will be enforced
Request to yet-another.net (suppose a route and gateway exist) → No RLP will be enforced
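For illustration, the Gateway-level default in this scenario (RLP G) might be written as below. This is a sketch under stated assumptions: the Gateway name, limit name, and rate are hypothetical, and the policy is assumed to live in the same namespace as the Gateway.

```yaml
# Hypothetical names and rate, for illustration only.
apiVersion: kuadrant.io/v1beta2
kind: RateLimitPolicy
metadata:
  name: rlp-g
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: gateway-g      # assumed Gateway with a listener for *.com
  limits:
    "gateway-default":
      rates:
      - limit: 1000
        duration: 1
        unit: minute
```

Traffic for routes that have their own RLP (A, B and W above) would not be counted against "gateway-default"; only routes without a more specific policy, such as the one serving other.com, would be.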

Limit definition

A limit will be activated whenever a request comes in and the request matches:

any of the route rules selected by the limit (via routeSelectors or implicit "catch-all" selector), and
all of the when conditions specified in the limit.

A limit can define:

counters that are qualified based on dynamic values fetched from the request, or
global counters (implicitly, when no qualified counter is specified)

A limit is composed of one or more rate limits.

E.g.

spec:
  limits:
    "toystore-all":
      rates:
      - limit: 5000
        duration: 1
        unit: second

    "toystore-api-per-username":
      rates:
      - limit: 100
        duration: 1
        unit: second
      - limit: 1000
        duration: 1
        unit: minute
      counters:
      - auth.identity.username
      routeSelectors:
      - hostnames:
        - api.toystore.com

    "toystore-admin-unverified-users":
      rates:
      - limit: 250
        duration: 1
        unit: second
      routeSelectors:
      - hostnames:
        - admin.toystore.com
      when:
      - selector: auth.identity.email_verified
        operator: eq
        value: "false"

| Request to | Rate limits enforced |
|---|---|
| `api.toystore.com` | 100rps/username or 1000rpm/username (whichever is reached first) |
| `admin.toystore.com` | 250rps |
| `other.toystore.com` | 5000rps |

Route selectors

Route selectors allow targeting sections of a HTTPRoute by specifying sets of HTTPRouteMatches and/or hostnames. The policy controller looks up compatible declarations within the HTTPRoute spec, selects the corresponding HTTPRouteRules and hostnames, and then builds conditions that activate the policy or policy rule.

Check out Route selectors for a full description, semantics and API reference.
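As a minimal sketch, assume the targeted HTTPRoute declares a rule that matches POST requests to the exact path /toys. The route name, path, limit name, and rate here are all hypothetical. A limit could then be scoped to that rule alone:

```yaml
# Hypothetical route, path, limit name and rate, for illustration only.
spec:
  targetRef:
    group: gateway.networking.k8s.io
    kind: HTTPRoute
    name: toystore
  limits:
    "create-toy":
      rates:
      - limit: 5
        duration: 1
        unit: minute
      routeSelectors:
      - matches:         # must be stated identically in a HTTPRouteRule of the route
        - method: POST
          path:
            type: Exact
            value: /toys
```

Requests handled by other rules of the same HTTPRoute would not activate this limit.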

`when` conditions

when conditions can be used to scope a limit (i.e. to filter the traffic to which a limit definition applies) without any coupling to the underlying network topology, i.e. without making direct references to HTTPRouteRules via routeSelectors.

Use when conditions to conditionally activate limits based on attributes that cannot be expressed in the HTTPRoutes' spec.hostnames and spec.rules.matches fields, or in general in RLPs that target a Gateway.

The selectors within the when conditions of a RateLimitPolicy are a subset of Kuadrant's Well-known Attributes (RFC 0002). Check out the reference for the full list of supported selectors.
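For example, a limit in a Gateway-targeted RLP, where routeSelectors should not be used, could be restricted to write operations with a when condition. This is a sketch only: the limit name and rate are hypothetical, and request.method is assumed to be among the supported selectors.

```yaml
# Hypothetical limit name and rate, for illustration only.
spec:
  limits:
    "write-operations":
      rates:
      - limit: 50
        duration: 1
        unit: second
      when:
      - selector: request.method   # assumed to be a supported Well-known Attribute
        operator: eq
        value: POST
```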

Examples

Check out the following user guides for examples of rate limiting services with Kuadrant:

Simple Rate Limiting for Application Developers
Authenticated Rate Limiting for Application Developers
Gateway Rate Limiting for Cluster Operators
Authenticated Rate Limiting with JWTs and Kubernetes RBAC

Known limitations

One HTTPRoute can only be targeted by one RLP.
One Gateway can only be targeted by one RLP.
RLPs can only target HTTPRoutes/Gateways defined in the same namespace as the RLP.

Implementation details

Because of limitations in how Istio injects configuration into the filter chains of ingress gateways, Kuadrant relies on Envoy's Wasm Network filter in the data plane, instead of the Rate Limit filter, to manage the integration with the rate limiting service ("Limitador").

Motivation: Multiple rate limit domains
The first limitation comes from having only one filter chain per listener. This often leads to one single global rate limiting filter configuration per gateway, and therefore to a shared rate limit domain across applications and policies. In a rate limit filter, the triggering of rate limit calls, via actions that build so-called "descriptors", can be defined at the level of the virtual host and/or a specific route rule. Even so, the overall rate limit configuration is only one, i.e., the rate limit domain is always the same for all calls to Limitador.

On the other hand, being able to configure and invoke the rate limit service for multiple domains depending on the context makes it possible to isolate groups of policy rules, as well as to optimize performance in the rate limit service, which can rely on the domain for indexing.

Motivation: Fine-grained matching rules
A second limitation of configuring the rate limit filter via Istio, particularly from Gateway API resources, is that rate limit descriptors at the level of a specific HTTP route rule require "named routes", which are defined only in an Istio VirtualService resource and referred to in an EnvoyFilter one. Gateway API HTTPRoute rules lack a "name" property¹. Moreover, in Istio's implementation of gateway configuration for Gateway API, VirtualService resources are only ephemeral in-memory data structures, where the names of individual route rules are auto-generated and cannot be referenced by users in a policy²³. As a result, rate limiting by attributes of the HTTP request (e.g., path, method, headers) would be very limited if it depended only on Envoy's Rate Limit filter.

Motivated by the desire to support multiple rate limit domains per ingress gateway, as well as fine-grained HTTP route matching rules for rate limiting, Kuadrant implements a wasm-shim that handles the rules to invoke the rate limiting service, complying with Envoy's Rate Limit Service (RLS) protocol.

The wasm module integrates with the gateway in the data plane via the Wasm Network filter, and parses a configuration that the Kuadrant control plane composes out of user-defined RateLimitPolicy resources. The rate limiting service ("Limitador"), in turn, remains an implementation of Envoy's RLS protocol that can be integrated either directly, via the Rate Limit extension, or by Kuadrant, via the wasm module for the Istio Gateway API implementation.

As a consequence of this design:

Users can define fine-grained rate limit rules that match their Gateway and HTTPRoute definitions, including for subsections of these.
Rate limit definitions are isolated, not leaking across unrelated policies or applications.
Conditions to activate limits are evaluated in the context of the gateway process, so gRPC calls to the external rate limiting service are made only when rate limit counters are known in advance to need checking or incrementing.
The rate limiting service can rely on indexing to look up groups of limit definitions and counters.
Components remain compliant with industry protocols and flexible for different integration options.

A Kuadrant wasm-shim configuration for a composition of RateLimitPolicy custom resources looks like the following and it is generated automatically by the Kuadrant control plane:

apiVersion: extensions.istio.io/v1alpha1
kind: WasmPlugin
metadata:
  name: kuadrant-istio-ingressgateway
  namespace: istio-system
  …
spec:
  phase: STATS
  pluginConfig:
    failureMode: deny
    rateLimitPolicies:
    - domain: istio-system/gw-rlp # allows isolating policy rules and improves performance of the rate limit service
      hostnames:
      - '*.website'
      - '*.io'
      name: istio-system/gw-rlp
      rules: # match rules from the gateway and according to conditions specified in the rlp
      - conditions:
        - allOf:
          - operator: startswith
            selector: request.url_path
            value: /
        data:
        - static: # tells which rate limit definitions and counters to activate
            key: limit.internet_traffic_all__593de456
            value: "1"
      - conditions:
        - allOf:
          - operator: startswith
            selector: request.url_path
            value: /
          - operator: endswith
            selector: request.host
            value: .io
        data:
        - static:
            key: limit.internet_traffic_apis_per_host__a2b149d2
            value: "1"
        - selector:
            selector: request.host
      service: kuadrant-rate-limiting-service
    - domain: default/app-rlp
      hostnames:
      - '*.toystore.website'
      - '*.toystore.io'
      name: default/app-rlp
      rules: # matches rules from a httproute and additional conditions specified in the rlp
      - conditions:
        - allOf:
          - operator: startswith
            selector: request.url_path
            value: /assets/
        data:
        - static:
            key: limit.toystore_assets_all_domains__8cfb7371
            value: "1"
      - conditions:
        - allOf:
          - operator: startswith
            selector: request.url_path
            value: /v1/
          - operator: eq
            selector: request.method
            value: GET
          - operator: endswith
            selector: request.host
            value: .toystore.website
          - operator: eq
            selector: auth.identity.username
            value: ""
        - allOf:
          - operator: startswith
            selector: request.url_path
            value: /v1/
          - operator: eq
            selector: request.method
            value: POST
          - operator: endswith
            selector: request.host
            value: .toystore.website
          - operator: eq
            selector: auth.identity.username
            value: ""
        data:
        - static:
            key: limit.toystore_v1_website_unauthenticated__3f9c40c6
            value: "1"
      service: kuadrant-rate-limiting-service
  selector:
    matchLabels:
      istio.io/gateway-name: istio-ingressgateway
  url: oci://quay.io/kuadrant/wasm-shim:v0.3.0

Footnotes

¹ https://github.com/kubernetes-sigs/gateway-api/pull/996
² https://github.com/istio/istio/issues/36790
³ https://github.com/istio/istio/issues/37346
