
Adaptive System Protection

Eric Zhao edited this page Dec 2, 2019 · 6 revisions

Overview

Adaptive system protection keeps system throughput high while preserving the reliability of the system.

The approach is inspired by TCP BBR. Rather than relying on a single metric (system load), we balance the requests that are allowed to pass against the requests that the system can actually handle. The ultimate goal is to maximize throughput while the system load stays within an appropriate range, not to force the load below a fixed threshold.

Sentinel uses load1 as the metric that triggers traffic control. Once triggered, the amount of traffic allowed to pass is determined by the system's capacity to process requests, which is estimated from the response time and the current QPS.

We've provided a demo for adaptive system protection: SystemGuardDemo.

Usage

There are several kinds of global protection items:

  • System load (load1)
  • System CPU usage
  • Global inbound QPS
  • Global average response time
  • Global max concurrency (of inbound traffic)

Note that system rules only take effect for inbound traffic (EntryType.IN).

Principle

Load Protection

Note: load protection only takes effect on Linux and other Unix-like operating systems.

[Figure: TCP BBR]

A request is blocked when both of the following conditions hold:

  • The current system load (load1) exceeds the threshold (highestSystemLoad);
  • The number of concurrent requests exceeds the estimated capacity (thread count > minRt * maxQps).
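
The two conditions can be sketched in a few lines of Java. This is only an illustration, not Sentinel's actual implementation: the class and method names are hypothetical, and it assumes that minRt is measured in milliseconds and maxQps in requests per second, so the product is divided by 1000 to obtain a number of in-flight requests.

```java
// Hypothetical sketch of the load protection decision; not Sentinel's own code.
final class LoadProtectionSketch {

    /**
     * Estimated number of requests the system can hold in flight.
     * Assumption: minRt is in milliseconds and maxQps is per second,
     * hence the division by 1000.
     */
    static double estimatedCapacity(double maxQps, double minRtMillis) {
        return maxQps * minRtMillis / 1000.0;
    }

    /** Returns true when a new inbound request should be blocked. */
    static boolean shouldBlock(double load1, double highestSystemLoad,
                               int threadCount, double maxQps, double minRtMillis) {
        // Condition 1: load1 must exceed the configured threshold.
        if (load1 <= highestSystemLoad) {
            return false;
        }
        // Condition 2: concurrency must exceed the estimated capacity.
        return threadCount > estimatedCapacity(maxQps, minRtMillis);
    }

    public static void main(String[] args) {
        // Capacity here is 200 * 50 / 1000 = 10 concurrent requests.
        System.out.println(shouldBlock(4.2, 3.0, 12, 200, 50)); // true: high load, over capacity
        System.out.println(shouldBlock(2.0, 3.0, 12, 200, 50)); // false: load below threshold
        System.out.println(shouldBlock(4.2, 3.0, 8, 200, 50));  // false: still within capacity
    }
}
```

The third call shows the point of the design: even when load1 is above the threshold, requests keep passing as long as concurrency stays within what the system has shown it can process.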

Global Metrics Protection

A global statistic node, ENTRY_NODE, records global metrics (e.g. inbound QPS, average RT, and thread count).
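
As a rough illustration of how such global metrics could be compared against configured limits, the sketch below checks a snapshot of inbound QPS, thread count, and average RT. All names here (Snapshot, Limits, firstViolation) are hypothetical, and the order of the checks and the use of strict comparisons are assumptions rather than a description of Sentinel's internals.

```java
import java.util.Optional;

// Hypothetical sketch of a global metrics check; not Sentinel's own code.
final class GlobalMetricsCheckSketch {

    /** Snapshot of global inbound metrics, as a global node might report them. */
    static final class Snapshot {
        final double qps;
        final double avgRtMillis;
        final int threadCount;

        Snapshot(double qps, double avgRtMillis, int threadCount) {
            this.qps = qps;
            this.avgRtMillis = avgRtMillis;
            this.threadCount = threadCount;
        }
    }

    /** Configured global limits (illustrative fields only). */
    static final class Limits {
        final double maxQps;
        final double maxAvgRtMillis;
        final int maxThread;

        Limits(double maxQps, double maxAvgRtMillis, int maxThread) {
            this.maxQps = maxQps;
            this.maxAvgRtMillis = maxAvgRtMillis;
            this.maxThread = maxThread;
        }
    }

    /** Returns the name of the first exceeded limit, or empty if the request may pass. */
    static Optional<String> firstViolation(Snapshot s, Limits l) {
        if (s.qps > l.maxQps) {
            return Optional.of("qps");
        }
        if (s.threadCount > l.maxThread) {
            return Optional.of("thread");
        }
        if (s.avgRtMillis > l.maxAvgRtMillis) {
            return Optional.of("rt");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Limits limits = new Limits(20, 10, 10);
        System.out.println(firstViolation(new Snapshot(25, 5, 4), limits));  // Optional[qps]
        System.out.println(firstViolation(new Snapshot(15, 5, 4), limits));  // Optional.empty
    }
}
```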
