Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

high CPU usage #133

Closed
dimejo opened this issue Jan 3, 2021 · 12 comments · Fixed by #138
Closed

high CPU usage #133

dimejo opened this issue Jan 3, 2021 · 12 comments · Fixed by #138

Comments

@dimejo
Copy link

dimejo commented Jan 3, 2021

I recently ran some tests to compare restore speed (see restic's PR 3109) and noticed a huge difference between SFTP and rest-server. Restores with the SFTP took roughly 1m 45s while restores with rest-server as backend took roughly 7m 45s.

Output of rest-server --version

rest-server 0.10.0 compiled with go1.15.2 on linux/amd64

How did you run rest-server exactly?

systemd service file:

[Unit]
Description=Rest Server
After=syslog.target
After=network.target

[Service]
Type=simple
User=rest-server
Group=rest-server
ExecStart=/usr/local/bin/rest-server --listen :8000 --append-only --path /srv/rest-server/repos
Restart=always
RestartSec=7

# Optional security enhancements
NoNewPrivileges=yes
PrivateTmp=yes
ProtectSystem=strict
ProtectHome=yes
ReadWritePaths=/srv/rest-server

[Install]
WantedBy=multi-user.target

What backend/server/service did you use to store the repository?

The repository is stored on a NAS (Shuttle XPC slim DL10J) with Intel Celeron J4005, 8GB RAM and 4TB SSD. The client is a desktop machine in the same network.

Actual behavior

Restoring a snapshot via rest-server was a lot slower and used 100% CPU.

Do you have any idea what may have caused this?

I would have expected TLS to consume some CPU ressources but it is disabled on this NAS. Maybe authentication is using so much CPU?

Do you have an idea how to solve the issue?

Unfortunately no.

Did rest-server help you today? Did it make you happy in any way?

Of course. 😃

@dimejo
Copy link
Author

dimejo commented Jan 3, 2021

I just found the culprit - the authentication method is to blame.

I created the .htpasswd file with Ansible which uses python3-bcrypt to create the hashes. python3-bcrypt by default uses the $2b$ prefix. I don't know why, but this prefix uses a lot more CPU than the $2y$ prefix.

While the original issue is now resolved for me I've started wondering why CPU usage was so high for the whole duration of the restore. Does rest-server need to authenticate every request? Would it be possible to somehow cache authentication?

@MichaelEischer
Copy link
Member

I've tried to reproduce this with my raspberry pi 4, but I didn't encounter a performance problem. I've restored about 10k file with nearly 800MB. The restore time using REST with default connections, REST with 8 connections and SFTP was each time approx. 46 seconds. The rest-server was configured to use a htpasswd, but still it didn't even saturate a single CPU core.

I've also run check --read-data-subset 1/256 which took 6 minutes with the REST backend vs. 12 minutes with SFTP (8 minutes with my sftp-speedup PR). The authentication only consumed about 20% in the CPU profile. Reading and transfering files took 60% of the total time. The htpasswd was created using htpasswd from the apache2-utils package.

@MichaelEischer
Copy link
Member

@dimejo According to https://en.wikipedia.org/wiki/Bcrypt the prefix $2y$ is just the bcrypt variant used, the cost factor follows after that. My test key had $2y$05 so the cost factor would be 2^5=32. I've generated the bcrypt key directly on my raspberrypi and not on a faster machine.

@MichaelEischer
Copy link
Member

For the second part of your question: the restic REST protocol follows the philosophy of REST in that it is stateless. Therefore the rest-server has to authenticate each individual request. Directly caching authentication has the big problem that this would keep plaintext passwords in memory, which just feels wrong.

@dimejo
Copy link
Author

dimejo commented Jan 3, 2021

@dimejo According to https://en.wikipedia.org/wiki/Bcrypt the prefix $2y$ is just the bcrypt variant used, the cost factor follows after that. My test key had $2y$05 so the cost factor would be 2^5=32. I've generated the bcrypt key directly on my raspberrypi and not on a faster machine.

This of course explains it. The keys created with Ansible had a cost factor of 2^12=4096. Thanks for the explanation!

For the second part of your question: the restic REST protocol follows the philosophy of REST in that it is stateless. Therefore the rest-server has to authenticate each individual request. Directly caching authentication has the big problem that this would keep plaintext passwords in memory, which just feels wrong.

This sounds reasonable. I was just wondering if this is something that could be improved to be more lightweight for small hardware.

@dimejo dimejo closed this as completed Jan 3, 2021
@MichaelEischer
Copy link
Member

I've just run a short bcrypt benchmark go test -bench . ./... (see https://github.com/MichaelEischer/rest-server/tree/bcrypt-benchmark) on my desktop computer with a AMD Ryzen 5600X:

BenchmarkBcrypt/5-12         802           1494678 ns/op
BenchmarkBcrypt/12-12          6         186168517 ns/op

On my Raspberrypi 4 the results are

BenchmarkBcrypt/5-4          229           4829457 ns/op
BenchmarkBcrypt/12-4                   2         598322435 ns/op

That is even for a low cost factor (5 seems to be the default of the htpasswd utility), the authentication overhead is not completely negligible. One idea to alleviate that problem would be to introduce short-lived tokens which the rest-server could use to skip the bcrypt hash verification.

@MichaelEischer MichaelEischer reopened this Jan 3, 2021
@dimejo
Copy link
Author

dimejo commented Jan 4, 2021

One idea to alleviate that problem would be to introduce short-lived tokens which the rest-server could use to skip the bcrypt hash verification.

That would be great. But I would assume that it requires changes to both restic and rest-server to work, correct?

@wojas
Copy link
Contributor

wojas commented Jan 4, 2021

rest-server could keep a cheaper hash of the previously accepted Authorization header in memory for a number of seconds.

I think it's also fine to keep the plaintext password in memory for a few seconds, as this is already available on the request object for the duration of the HTTP request and not wiped after use anyway. If an attacker has access to rest-server's memory, you probably have bigger problems than that these passwords could be read.

@MichaelEischer
Copy link
Member

MichaelEischer commented Jan 4, 2021

To use short-lived tokens we could introduce a /:repo/token endpoint where a client could exchange basic auth credentials for an auth token. The rest server would only have to store the hash of the token together with an expiry date in memory. Later requests can then present that token instead of the basic auth credentials. To simplify token refreshes a validity of an hour or so would be best, such that the client can reliably renew the token before it expires. (It would also be possible to let the client renew it's auth token as soon as the first request fails and then let it retry the request). Older clients which don't support the token endpoint would simply continue using http basic auth requests.

The benefit of this solution (assuming randomly generated tokens) would be that even a memory dump of the hashed tokens would be completely useless (assuming it's not possible to quickly invert the hash function). And as the tokens expire after a short time, recovery from such a leak would also be automatic. The downside is that this requires changes to the client.

Caching a hash of valid basic auth headers in memory would also solve the performance problem. To make the auth overhead negligible the hash would have to be cached for at least a minute or so (at least for more expensive bcrypt hashes).

The benefit of this solution would be that it wouldn't require modifications to clients. The downside is that this stores a (fast) hash that's directly linked to a clients password in memory. That is recovery from such a leak would require users to change their passwords.

Assuming there's no possibility to leak the rest-server's memory both variants are equivalent. If such a leak is possible for some reason (maybe something like Heartbleed? Although I'm not sure what a golang equivalent would look like) then the tokens are much more secure.

@wojas
Copy link
Contributor

wojas commented Jan 4, 2021

I have a feeling that such a token mechanism would just add complexity and not actually address any realistic threat:

  1. To prevent an attacker from removing existing backups, one really should use --append-only mode. Otherwise an attacker that has access to a server can also wipe the server's backup.
  2. The restic encryption safeguards the actual backup contents, even if an attacker can read the repo.
  3. Other than reading, once --append-only mode is enabled, all the attacker can write are objects for which the hash will be verified by your PR Verify uploaded files #130 in the future. The original data is still safe and the attacker cannot fake actual snapshots or other objects.
  4. A heartbleed style attack seems very unlikely for a pure Go application. This is a type of bug that should only be able to occur in non memory-safe languages.
  5. Normal access to the process memory requires sufficient permissions or access that you can use to do far worse things than reading the process memory.
  6. All of this of course assumes that communication occurs over TLS, otherwise all is lost anyway.

Some practical considerations:

a. Implementing such a token API in rest-server would make it harder to perform the authentication on a proxy in front of rest-server, e.g. Caddy.
b. Such a token requires state, which could make horizontal scaling a bit harder, but this is solvable.

This is not to say that we should not support alternative authentication mechanisms, I just don't think that this should become part of the REST protocol. For example, restic could add support for OAuth to basically do what you suggest in a standardized way. Server-side support for this could live in rest-server, or in a higher layer like Caddy. I also have some ideas for pluggable authentication backends in rest-server, but that's something for later.

Caching the Basic auth token is orthogonal to that discussion. If we are already sending the password on every request anyway, there is little harm of caching this for a few seconds.

@MichaelEischer
Copy link
Member

I've opened PR #138 which caches the Basic auth credentials.

@dimejo
Copy link
Author

dimejo commented Jan 4, 2021

Thanks a lot for the fast PR!

First test shows a huge improvement:

bcrypt cost time
5 1:38
12 1:57

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants