
Cache layer

Chlins Zhang edited this page Sep 28, 2022 · 10 revisions

Background

Cloud-native technologies, represented by Kubernetes, have become a core driving force of enterprise digital transformation and a business amplifier. As one of the cornerstone technologies of the cloud-native ecosystem, Harbor plays an extremely important role in supporting flexible image distribution. With more and more applications and CI/CD pipelines being implemented, Harbor needs to be able to handle thousands or more requests at a time. To meet this increasing demand, the Harbor team has started making performance improvements for high-request scenarios, including the cache layer introduced in Harbor v2.6.

User Story

Some typical user stories for Harbor include:

  • As a developer/tester, I want to pull/push my business application to a registry for feature testing or bug validation.
  • As an operator, I want to maintain the registry by executing some daily scan/replication/retention/gc... jobs.
  • As a user, I want to deploy my applications by pulling images from the registry.

While seemingly straightforward, these scenarios create the need for a highly available and performant registry: as Harbor's integration points and user base grow, data requests arrive both transiently and periodically. Because the volume of concurrent requests is hard to predict from the outside, Harbor needs adaptive modules to improve its performance in high-utilization scenarios.

Design

This evaluation of Harbor's use cases led to the design and development of a cache layer to improve performance. The design centers on the most common and widely used scenario: massive image pull requests with high concurrency. This page will not cover the cache layer in depth; refer to the proposal for more details.

Architecture

cache layer

Note: Image blobs are not cached on the Harbor side; an external solution such as a CDN can be used to cache them if needed.

When a user makes a request to Harbor core:

  1. Harbor will check the Redis cache for the requested resource. If the resource is found in the cache, Harbor will respond with the cached copy.
  2. If the requested resource is not in the cache, Harbor will fall back to its normal retrieval process: it retrieves the resource from the Harbor database (for an artifact, project, project_metadata, or repository) or from the Harbor distribution component (for a manifest).
  3. The resource is then written back to the Redis cache for future requests, and the requested resource is returned to the user.
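The read-through flow above can be sketched in Go (Harbor's implementation language). This is an illustrative sketch only: the names `cache` and `fetchArtifact`, and the maps standing in for Redis and the Harbor database, are hypothetical and not Harbor's actual API.

```go
package main

import (
	"fmt"
	"time"
)

// cache is a toy stand-in for the Redis cache, with per-key expiration.
type cache struct {
	data   map[string]string
	expiry map[string]time.Time
	ttl    time.Duration
}

func newCache(ttl time.Duration) *cache {
	return &cache{data: map[string]string{}, expiry: map[string]time.Time{}, ttl: ttl}
}

// get returns the cached value, treating expired entries as misses.
func (c *cache) get(key string) (string, bool) {
	v, ok := c.data[key]
	if !ok || time.Now().After(c.expiry[key]) {
		return "", false
	}
	return v, true
}

func (c *cache) set(key, value string) {
	c.data[key] = value
	c.expiry[key] = time.Now().Add(c.ttl)
}

// fetchArtifact models steps 1-3: check the cache, fall back to the
// database on a miss, then write the result back for future requests.
// The second return value reports whether the cache served the request.
func fetchArtifact(c *cache, key string, db map[string]string) (string, bool) {
	if v, ok := c.get(key); ok { // 1. cache hit: return immediately
		return v, true
	}
	v, ok := db[key] // 2. cache miss: normal retrieval path
	if !ok {
		return "", false
	}
	c.set(key, v) // 3. write back to the cache
	return v, false
}

func main() {
	db := map[string]string{"library/alpine": "manifest-for-alpine"}
	c := newCache(24 * time.Hour) // default expiration window is 24 hours

	_, hit := fetchArtifact(c, "library/alpine", db)
	fmt.Println(hit) // first request misses the cache: false

	_, hit = fetchArtifact(c, "library/alpine", db)
	fmt.Println(hit) // second request is served from the cache: true
}
```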

The cache layer is disabled by default and must be enabled through a harbor.yml configuration setting for your Harbor instance. The cache expiration window can also be configured in harbor.yml; the default is 24 hours. See Harbor's documentation on how to Configure the Harbor YML File.
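As a sketch, the relevant harbor.yml stanza looks like the following; the field names are taken from the v2.6 configuration template, so verify them against the template shipped with your release:

```yaml
# Cache layer settings in harbor.yml (sketch; check your release's template)
cache:
  # The cache layer is disabled by default.
  enabled: true
  # How long cached resources are kept, in hours (default: 24).
  expire_hours: 24
```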

NOTICE

If you are deploying Harbor in HA mode, make sure that all Harbor instances behave the same way, with caching either enabled or disabled on all of them; otherwise it can lead to data inconsistency.

Comparison

Environment

| Type | Description |
| --- | --- |
| Hosts | 4 × AMD EPYC 7402P 24-Core Processor / 64GB |
| Kubernetes | 1 × master, 3 × worker (v1.23.6) |
| Harbor | https://github.com/goharbor/harbor-helm (standard deploy) |
| Storage | https://github.com/openebs/dynamic-nfs-provisioner |

Test Scenario

Pull the manifest with high concurrency and compare results across three dimensions (TPS / response time / success rate).

Manifest: library/alpine

Benchmark Tool: https://github.com/rakyll/hey

Example Script: hey -c 500 -z 10s -H "Authorization: Basic YWRtaW46SGFyYm9yMTIzNDU=" https://harbor.domain/v2/library/alpine/manifests/sha256:4ff3ca91275773af45cb4b0834e12b7eb47d1c18f770a0b151381cd227f4c253 (the token is the Base64 encoding of the default admin:Harbor12345 credentials)

Response Time

rt

The diagram above compares API response times: the blue bars represent cache-disabled, the green bars cache-enabled. The x-axis is the number of concurrent requests, from one thousand to twenty thousand, and the y-axis is the response time in seconds. For example, at 1,000 concurrent requests the cache-disabled case takes about 9 seconds, while the cache-enabled case takes less than half a second. The effect is very clear: below eight thousand concurrent requests, enabling the cache improves response time by roughly 10-30×; above eight thousand, the improvement is roughly 3-10×.

TPS

tps

TPS (transactions per second) is the number of requests Harbor can handle each second. Across the tested concurrency levels, Harbor handles only about six hundred requests per second with the cache disabled, but over four thousand with it enabled, an improvement of roughly 7×.

Success Rate

cache-rate

The success rate metric reflects Harbor's availability, which makes it especially valuable. The chart shows that with the cache disabled, the success rate keeps decreasing once concurrency exceeds eight thousand; with the cache enabled, Harbor still handles all requests successfully even at twenty thousand concurrency, whereas the cache-disabled case fails more than half of its requests (a success rate of only 38 percent), an improvement of roughly 2-3×. This result also puts the first two diagrams in context: response time and TPS appear better at twenty thousand concurrency in the cache-disabled case than they really are, because the mass of failed requests skews the averages.

Metrics

Benchmark with 5,000 concurrency over a 10-minute duration: the cache was disabled for the first 5 minutes and enabled for the last 5 minutes.

Database Active Connections

image

Database CPU Usage

image

Registry CPU Usage

image

Core CPU Usage

image image