high CPU usage #133
Comments
I just found the culprit: the authentication method is to blame. While the original issue is now resolved for me, I've started wondering why CPU usage was so high for the whole duration of the restore. Does rest-server need to authenticate every request? Would it be possible to somehow cache authentication?
I've tried to reproduce this with my Raspberry Pi 4, but I didn't encounter a performance problem. I've restored about 10k files with nearly 800MB. The restore time using REST with default connections, REST with 8 connections, and SFTP was each time approx. 46 seconds. The rest-server was configured to use an htpasswd file, but still it didn't even saturate a single CPU core. I've also run …
@dimejo According to https://en.wikipedia.org/wiki/Bcrypt the prefix …
For the second part of your question: the restic REST protocol follows the philosophy of REST in that it is stateless. Therefore the rest-server has to authenticate each individual request. Directly caching authentication has the big problem that it would keep plaintext passwords in memory, which just feels wrong.
This of course explains it. The keys created with Ansible had a cost factor of 12 (2^12 = 4096 rounds). Thanks for the explanation!
This sounds reasonable. I was just wondering if this is something that could be improved to be more lightweight on small hardware.
I've just run a short bcrypt benchmark. On my Raspberry Pi 4 the results are:
That is, even for a low cost factor (5 seems to be the default of the htpasswd utility), the authentication overhead is not completely negligible. One idea to alleviate that problem would be to introduce short-lived tokens which the rest-server could use to skip the bcrypt hash verification.
That would be great. But I assume that it would require changes to both restic and rest-server to work, correct?
rest-server could keep a cheaper hash of the previously accepted password. I think it's also fine to keep the plaintext password in memory for a few seconds, as it is already available on the request object for the duration of the HTTP request and is not wiped after use anyway. If an attacker has access to rest-server's memory, you probably have bigger problems than these passwords being readable.
To use short-lived tokens we could introduce a … The benefit of this solution (assuming randomly generated tokens) would be that even a memory dump of the hashed tokens would be completely useless (assuming it's not possible to quickly invert the hash function). And since the tokens expire after a short time, recovery from such a leak would also be automatic. The downside is that this requires changes to the client.

Caching a hash of valid Basic auth headers in memory would also solve the performance problem. To make the auth overhead negligible, the hash would have to be cached for at least a minute or so (at least for more expensive bcrypt hashes). The benefit of this solution is that it wouldn't require modifications to clients. The downside is that it stores a (fast) hash that's directly linked to a client's password in memory; that is, recovery from such a leak would require users to change their passwords.

Assuming there's no way to leak the rest-server's memory, both variants are equivalent. If such a leak is possible for some reason (maybe something like Heartbleed? Although I'm not sure what a Go equivalent would look like), then the tokens are much more secure.
I have a feeling that such a token mechanism would just add complexity and not actually address any realistic threat:
Some practical considerations:

a. Implementing such a token API in rest-server would make it harder to perform the authentication in a proxy in front of rest-server, e.g. Caddy.

This is not to say that we should not support alternative authentication mechanisms; I just don't think this should become part of the REST protocol. For example, restic could add support for OAuth to basically do what you suggest in a standardized way. Server-side support for this could live in rest-server, or in a higher layer like Caddy. I also have some ideas for pluggable authentication backends in rest-server, but that's something for later.

Caching the Basic auth token is orthogonal to that discussion. If we are already sending the password on every request anyway, there is little harm in caching it for a few seconds.
I've opened PR #138 which caches the Basic auth credentials.
Thanks a lot for the fast PR! First test shows a huge improvement:
I recently ran some tests to compare restore speed (see restic's PR 3109) and noticed a huge difference between SFTP and rest-server. Restores via SFTP took roughly 1m 45s, while restores with rest-server as the backend took roughly 7m 45s.
Output of rest-server --version:
rest-server 0.10.0 compiled with go1.15.2 on linux/amd64
How did you run rest-server exactly?
systemd service file:
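The unit file itself did not survive extraction. A hypothetical minimal unit for running rest-server might look like the following; the paths, user, and flags are assumptions, not the reporter's actual configuration:

```ini
# Hypothetical systemd unit; paths, user and listen address are illustrative.
[Unit]
Description=Rest Server for restic
After=network.target

[Service]
Type=simple
User=rest-server
ExecStart=/usr/local/bin/rest-server --path /srv/restic-repo --listen :8000
Restart=always

[Install]
WantedBy=multi-user.target
```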
What backend/server/service did you use to store the repository?
The repository is stored on a NAS (Shuttle XPC slim DL10J) with an Intel Celeron J4005, 8GB RAM and a 4TB SSD. The client is a desktop machine on the same network.
Actual behavior
Restoring a snapshot via rest-server was a lot slower and used 100% CPU.
Do you have any idea what may have caused this?
I would have expected TLS to consume some CPU resources, but it is disabled on this NAS. Maybe authentication is using so much CPU?
Do you have an idea how to solve the issue?
Unfortunately no.
Did rest-server help you today? Did it make you happy in any way?
Of course. 😃