Skip to content
/ qms Public

A generic, horizontally-scalable, highly-available and fault-tolerant quota management service

License

Notifications You must be signed in to change notification settings

Blinkuu/qms

Repository files navigation

Quota Management Service

License: AGPL v3 test lint go report codecov

Overview

Quota Management Service (QMS) is a generic, horizontally-scalable, highly-available, and fault-tolerant system for managing quotas. QMS is different from other quota systems in that it natively supports both allocation and rate quotas.

Rate quotas are typically used to limit the number of requests to a shared resource, such as an API or service. Their key characteristic is that they reset after a specified time interval. QMS provides extensible support for this type of workload and can be configured to use various rate-limiting algorithms. Currently, the rate component supports only the memory storage backend.

Allocation quotas are commonly used to restrict the use of resources that do not have a usage rate. Common examples include limiting the amount of used cloud storage or instances deployed. An essential property of allocation quotas is that they do not reset over time and must be explicitly released when they are no longer needed. The alloc component supports three storage backends: memory, local, and raft.

Warning: QMS is not production ready. This project was developed as part of my Master's Thesis and requires further polishing to consider it stable.

Quick Start

This repository hosts all information needed to build and run QMS from source. There are two options to build and run QMS locally:

You have a working Go environment

# Clone QMS repository from GitHub
git clone https://github.com/Blinkuu/qms

# Change directory
cd qms

# Build QMS
make build

# Run QMS locally in a monolithic mode
./bin/qms

You have a working Tilt environment

# Clone QMS repository from GitHub
git clone https://github.com/Blinkuu/qms

# Change directory
cd qms

# Run QMS locally in microservices mode
tilt up microservices

Configuration

Quota Management Service leverages the YAML file format for configuration. Below is an example of the most basic configuration file that can be used to run QMS in monolithic mode on a local machine. For more advanced configuration examples, please refer to the configs directory.

target: all
otel_collector_target: agent:4317

server:
  http_port: 6789

memberlist:
  join_addresses:
    - 127.0.0.1:7946

proxy:
  alloc_addresses:
    - 127.0.0.1:6789
  rate_addresses:
    - 127.0.0.1:6789

rate:
  storage:
    backend: memory
  quotas:
    - namespace: namespace1
      resource: resource1
      strategy:
        algorithm: token-bucket
        unit: minute
        requests_per_unit: 120

alloc:
  storage:
    backend: memory
  quotas:
    - namespace: namespace1
      resource: resource1
      strategy:
        capacity: 10

Deployment

QMS has a microservices-based architecture and is designed to run as a horizontally scalable distributed system. There are three primary components: QMS Proxy, QMS Rate, and QMS Alloc. All those components can run separately and in parallel. Because of QMS’s architecture and implementation, all of the code compiles into a single binary. The target option controls the behavior of this single binary at runtime and determines which components will be run.

Monolithic mode

The monolithic mode runs all required components in a single process. This is the default mode and can be defined by specifying -target=all command-line flag or configuring the target parameter in the YAML config file. Monolithic mode is the simplest mode of operation, and it is useful for getting started quickly to experiment with QMS or setting up the system in the development environment.

Microservices mode

In microservices deployment mode, components are deployed as separate processes. Similarly to the monolithic mode, the target option specifies which component is run. There are three possible choices: proxy, rate, and alloc. Microservices mode is the recommended method for production deployment as it offers the greatest level of flexibility and control over failure domains.

In microservices mode, scaling is done on a per-component basis. As a result, each component can be scaled independently according to the needs. This flexibility comes at a cost. Microservices mode is a bit more complex to set up, deploy and operate than the monolithic mode. The recommended way to run QMS in this mode is to use Kubernetes.

API

QMS by default exposes a JSON over HTTP API.

Ping

Pings the instance. Can be used to check basic availability. For a more advanced liveness check, please refer to the health endpoint.

GET /ping

Example response

{
  "status": 1001,
  "msg": "ok",
  "result": {
    "msg": "pong"
  }
}

Ready

Check whether an instance is ready to accept traffic. This endpoint is designed to work with the Kubernetes readiness probe.

GET /ready

Example response

{
  "status": 1001,
  "msg": "ok"
}

Health

Check whether an instance is in a healthy state. This endpoint is designed to work with the Kubernetes liveness probe.

GET /healthz

Example response

{
  "status": 1001,
  "msg": "ok"
}

Metrics

Gets instance metrics in a Prometheus-compatible format.

GET /metrics

For response format, please refer to the Prometheus documentation.

Memberlist

Returns the current view of the cluster as seen by an instance.

GET /memberlist

Example response

{
  "status": 1001,
  "msg": "ok",
  "result": {
    "members": [
      {
        "service": "proxy",
        "hostname": "qms-proxy-5bfc6ccf44-tmwd9",
        "host": "10.1.54.225",
        "http_port": 6789,
        "gossip_port": 7946
      },
      {
        "service": "proxy",
        "hostname": "qms-proxy-5bfc6ccf44-bpjs2",
        "host": "10.1.54.224",
        "http_port": 6789,
        "gossip_port": 7946
      },
      {
        "service": "proxy",
        "hostname": "qms-proxy-5bfc6ccf44-7jqml",
        "host": "10.1.54.222",
        "http_port": 6789,
        "gossip_port": 7946
      }
    ]
  }
}

Allow

Checks whether a request to a particular resource can be allowed based on the definition of a concrete rate quota. This is the endpoint that can be used to implement rate limiting with QMS.

POST /api/v1/allow

Parameters

Name Type In Description
namespace string body Namespace where the resource resides.
resource string body Name of the resource.
tokens int body Amount of tokens to request.

Example response

{
  "status": 1001,
  "msg": "ok",
  "result": {
    "wait_time": 0,
    "ok": true
  }
}

View

Returns the current status of a particular allocation quota.

POST /api/v1/view

Parameters

Name Type In Description
namespace string body Namespace where the resource resides.
resource string body Name of the resource.

Example response

{
  "status": 1001,
  "msg": "ok",
  "result": {
    "allocated": 14,
    "capacity": 100,
    "version": 3
  }
}

Alloc

Acquires a certain amount of tokens from a particular allocation quota.

POST /api/v1/alloc

Parameters

Name Type In Description
namespace string body Namespace where the resource resides.
resource string body Name of the resource.
tokens int body Amount of tokens to request.
version int body Current version of the resource. If set to 0, no optimistic concurrency control check is performed.
{
  "msg": "ok",
  "result": {
    "remaining_tokens": 86,
    "current_version": 3,
    "ok": true
  }
}

Free

Releases a certain amount of tokens from a particular allocation quota.

POST /api/v1/free

Parameters

Name Type In Description
namespace string body Namespace where the resource resides.
resource string body Name of the resource.
tokens int body Amount of tokens to request.
version int body Current version of the resource. If set to 0, no optimistic concurrency control check is performed.

Example response

{
  "status": 1001,
  "msg": "ok",
  "result": {
    "remaining_tokens": 87,
    "current_version": 4,
    "ok": true
  }
}

Contributing

Contributions are very welcome! Either by reporting issues or submitting pull requests.

License

QMS is distributed under the AGPL-3.0 license.

About

A generic, horizontally-scalable, highly-available and fault-tolerant quota management service

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published