> **Note**
> Apologies for the long issue; it took a lot of investigation work. It is not AI generated: I took the pains of writing it all down myself.
Hey all, we're perhaps an unusual user of fscrypt, but wanted to share our current challenges. We have a non-trivial production fleet on Kubernetes (100s of nodes, ~20k cores, 150 TiB memory, with high CPU and memory usage across GCP, AWS, Azure). The services in question use fscrypt to encrypt customer data on persistent volumes (block storage) on a per-customer basis. fscrypt runs as a separate Pod in all those nodes with low CPU and memory limits, for efficiency and cost reasons.
The core of the issue is that fscrypt does not respect CPU and memory limits set in cgroups. The behavior under such conditions is pathological, especially with large nodes (>= 64 CPUs).
This is in summary what I see:
- Ignoring CPU limits causes massive slowdowns.
- Ignoring memory limits causes OOM crashes, unless the limits are >= 256 MiB.
- There are large CPU consumption spikes on large machines (the bigger the machine, the bigger the spike).
- These spikes in usage occur only during `fscrypt setup`.
I've run some benchmarks to demonstrate the problem on a 64-core machine while varying CPU and memory limits. Some graphs are below.
- Execution time: large under low CPU limits.
- Memory usage: 140 MiB regardless of limits.
- CPU usage: depends on the number of CPUs rather than on the limits; on a 64-core machine it causes a large spike (150s across all cores).
Root causes
`fscrypt setup` runs tests to determine hashing costs. However:
- Instead of using cgroup limits, it uses `runtime.NumCPU`.
- For memory limits, it uses `Sysinfo.Totalram`.
- Argon2 spawns up to 256 threads for the testing, which is quite expensive on large machines, even more so when the limits are much lower than the CPU count.
- It uses actual CPU usage, rather than wall time, to determine how long to run the tests, which is heavily skewed when the limits are much lower than the CPU count.
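The direction I'm experimenting with is to read the cgroup v2 interface files and fall back to the current behavior when they're absent. A preliminary sketch (it assumes cgroup v2 mounted at `/sys/fs/cgroup`; the helper names are mine, not existing fscrypt functions):

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// parseCPUMax interprets the cgroup v2 cpu.max format: "<quota> <period>"
// in microseconds, or "max <period>" when unlimited. It returns the
// effective CPU count implied by the quota.
func parseCPUMax(s string) (float64, bool) {
	fields := strings.Fields(strings.TrimSpace(s))
	if len(fields) != 2 || fields[0] == "max" {
		return 0, false
	}
	quota, err1 := strconv.ParseFloat(fields[0], 64)
	period, err2 := strconv.ParseFloat(fields[1], 64)
	if err1 != nil || err2 != nil || period == 0 {
		return 0, false
	}
	return quota / period, true
}

// parseMemoryMax interprets cgroup v2 memory.max: a byte count, or "max"
// when unlimited.
func parseMemoryMax(s string) (uint64, bool) {
	t := strings.TrimSpace(s)
	if t == "max" {
		return 0, false
	}
	n, err := strconv.ParseUint(t, 10, 64)
	return n, err == nil
}

func main() {
	// Prefer cgroup limits; a real patch would fall back to
	// runtime.NumCPU / Sysinfo.Totalram when these files are missing
	// or report "max".
	if b, err := os.ReadFile("/sys/fs/cgroup/cpu.max"); err == nil {
		if cpus, ok := parseCPUMax(string(b)); ok {
			fmt.Printf("cgroup CPU limit: %.2f CPUs\n", cpus)
		}
	}
	if b, err := os.ReadFile("/sys/fs/cgroup/memory.max"); err == nil {
		if mem, ok := parseMemoryMax(string(b)); ok {
			fmt.Printf("cgroup memory limit: %d bytes\n", mem)
		}
	}
}
```

With these values, the hashing benchmark could cap its thread count and memory target at the cgroup limits instead of the host's totals.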
Possible fixes
I tried fixing this in several ways on our side, but I think this would be better fixed here. What I tried:
- Remove CPU limits and keep requests: This makes performance really unpredictable. On busy nodes, Kubernetes starts enforcing the actual requests, making the problem appear at random times, rather than consistently.
- Increasing CPU limits: We can't really do that as the nodes are already pretty full, and the excess capacity is essentially wasted after startup.
- Use `--time=5ms` in setup: this fixes the CPU usage by artificially setting a very low target time for the hashing test, but it doesn't fix memory usage.
I believe the only proper fix is to make fscrypt cgroup-aware, and my preliminary tests suggest that the problem goes away completely in that scenario.
Please let me know what you think. I'm happy to contribute a fix; I've fiddled with this for long enough at this point.