
Inability to configure full request timeout leads to DoS / resource leak #35

Closed
Shnatsel opened this issue Jan 7, 2020 · 19 comments · Fixed by #47

Comments

Shnatsel commented Jan 7, 2020

attohttpc currently does not allow specifying a timeout for the entire request (i.e. until the request body is finished transferring), which means an established connection will hang indefinitely if the remote host stops replying. This leads to a resource leak, which may cause denial of service. This post explains the problem in detail: https://medium.com/@nate510/don-t-use-go-s-default-http-client-4804cb19f779

While it's not necessary to set such a timeout by default (most other HTTP clients don't), it must be possible to configure one for attohttpc to be usable for e.g. building reliable services that query a remote API over HTTP, or for defending against DoS attacks where the remote host sends 1 byte per second, keeping the connection open indefinitely.

Shnatsel changed the title from "Inability to configure read timeout leads to DoS / resource leak" to "Inability to configure full request timeout leads to DoS / resource leak" on Jan 7, 2020

Shnatsel (Author) commented Jan 7, 2020

I have updated the description to clarify the attack vectors and describe a better approach to mitigating them.

adamreichold (Contributor) commented Jan 7, 2020

A read timeout should be easy to add, but a full request timeout is complicated in a blocking design, i.e. one needs to spawn a separate thread that forcibly closes the underlying TCP stream so that all blocking calls return when the timeout expires.

But also note that the DoS problem is mainly one for the server, i.e. a client will probably not have a problem keeping a single TCP connection alive, but a server which does that on behalf of a lot of clients certainly can get into trouble. Honestly though, I don't think attohttpc is a very reasonable choice for such a situation; one would probably prefer an HTTP client that is integrated with the I/O framework of the server itself, which makes it reasonably simple to solve things like full request timeouts using e.g. Futures. Or to quote the documentation:

The intended use is for projects that have HTTP needs where performance is not critical or when HTTP is not the main purpose of the application.

Shnatsel (Author) commented Jan 7, 2020

I think there is a way to fit it into a blocking design: you can simply configure the read timeout to min(read_timeout, time_until_request_deadline)

Shnatsel (Author) commented Jan 7, 2020

The drawback is that this timeout needs to be reconfigured once in a while, but if it's done e.g. once every 250 milliseconds, it's not going to interfere with performance while still providing adequate precision.
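
A minimal sketch of the combined idea, assuming a hypothetical read_with_deadline helper (attohttpc's internals look different):

```rust
use std::io::{self, Read};
use std::net::TcpStream;
use std::time::{Duration, Instant};

// Hypothetical helper: performs one read, bounding the socket's read
// timeout by both the configured read timeout and the time remaining
// until the overall request deadline.
fn read_with_deadline(
    stream: &mut TcpStream,
    buf: &mut [u8],
    read_timeout: Duration,
    deadline: Instant,
) -> io::Result<usize> {
    loop {
        let remaining = match deadline.checked_duration_since(Instant::now()) {
            Some(d) if d > Duration::ZERO => d,
            _ => return Err(io::Error::new(io::ErrorKind::TimedOut, "request deadline exceeded")),
        };
        // Cap at 250 ms so the deadline is re-checked regularly without
        // reconfiguring the socket often enough to hurt performance.
        let timeout = remaining.min(read_timeout).min(Duration::from_millis(250));
        stream.set_read_timeout(Some(timeout))?;
        match stream.read(buf) {
            // A timed-out read surfaces as WouldBlock (Unix) or TimedOut
            // (Windows); loop around and re-check the deadline.
            Err(e) if e.kind() == io::ErrorKind::WouldBlock || e.kind() == io::ErrorKind::TimedOut => continue,
            result => return result,
        }
    }
}
```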

adamreichold (Contributor) commented Jan 7, 2020

I think there is a way to fit it into a blocking design: you can simply configure the read timeout to min(read_timeout, time_until_request_deadline)

While such a scheme can be made to work in principle, it is quite brittle: it needs to be threaded through the layers of blocking reads, which complicates the code significantly. This is why I think spawning a separate thread would be more in line with this crate's goal of simplicity. But actually, I think the problem is probably not relevant to this crate's intended use case.

sbstp (Owner) commented Jan 8, 2020

The Golang bug described in the blog post has since been fixed, but I ran into another variant of it at work, and it's a pretty nasty bug.

Since this library does not do connection pooling in the background, it's not as dangerous: users are likely to figure out that their code is blocked on something. But if you use it in a thread pool in a crawler-type application, it could easily exhaust the whole thread pool quickly.

I think the min trick can be implemented somewhat easily if we do it in the BaseStream struct since every call to read will eventually end up there, though the various layers of buffering might obscure this a bit.

adamreichold (Contributor) commented Jan 8, 2020

I think the min trick can be implemented somewhat easily if we do it in the BaseStream struct since every call to read will eventually end up there, though the various layers of buffering might obscure this a bit.

I opened #39 implementing a request timeout (actually a deadline) by setting read and write timeouts before every blocking call into the base stream. Personally, I think we should do only one thing, either the request timeout or the read timeout: the read timeout is not really necessary if the request timeout is used, and the request timeout is costly enough as it is, since we have to apply it each time we make a blocking call.

sbstp (Owner) commented Jan 8, 2020

We already use buffering to avoid performing too many syscalls, so while this will probably hurt performance, I don't think it will hurt it too much. We could run benchmarks with and without it.

sbstp (Owner) commented Jan 8, 2020

We discussed this a bit in #39. In the end we decided to go for a read timeout and a connect timeout. It's simpler and has fewer gotchas. If there's really a need for a request timeout, we can look into it in the future. In the meantime, it's still possible to use a background thread with a channel to create a request timeout. Now that we have a connect & read timeout, the connection attempt in the background thread cannot be leaked.
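
A sketch of that workaround; the helper below is made up, and it assumes attohttpc's get/send/text calls with simplified error handling:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

// Hypothetical helper: runs the whole request on a background thread
// and gives up waiting after `timeout`. Returns None on timeout.
fn get_text_with_timeout(url: &str, timeout: Duration) -> Option<Result<String, attohttpc::Error>> {
    let (tx, rx) = mpsc::channel();
    let url = url.to_owned();
    thread::spawn(move || {
        // The connect & read timeouts bound how long this thread can
        // linger after the caller has given up waiting on it.
        let _ = tx.send(attohttpc::get(&url).send().and_then(|resp| resp.text()));
    });
    // The worker thread is abandoned on timeout, not killed, but it
    // can no longer hang forever.
    rx.recv_timeout(timeout).ok()
}
```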

Shnatsel (Author) commented Jan 8, 2020

If there's really a need for a request timeout, we can look into it in the future.

Yes, there is indeed a class of DoS attacks that cannot be defended against without it.

Consider an image hosting service like imgur.com: users enter URLs of images they want to host, and the HTTP client fetches the images and puts them into storage. If a user enters a URL to a malicious host that replies more often than the read timeout but keeps replying indefinitely, the connection will stay open indefinitely. If an attacker keeps getting the service to open such connections, this will eventually lead to the exhaustion of the thread pool or network ports on the client.

adamreichold (Contributor) commented Jan 8, 2020

If an attacker keeps getting the service to open such connections, this will eventually lead to the exhaustion of the thread pool or network ports on the client.

There are many HTTP client implementations better suited to implementing such a service, specifically those that tie in with the underlying I/O framework, e.g. Tokio and Reqwest or Actix and AWC.

sbstp (Owner) commented Jan 8, 2020

Alright, I'll leave this issue open because the attack you've described is legit.

I think we could perhaps protect ourselves from it by using a deadline in the BaseStream but without setting the deadline remaining time as the socket read timeout. So when a call to read is made, we just check if the deadline has been reached, knowing that the call to read will eventually time out because of the default read timeout. Just gotta make sure that timeout is not infinite.
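
A minimal sketch of that idea; DeadlineStream is a hypothetical wrapper, and attohttpc's actual BaseStream looks different:

```rust
use std::io::{self, Read};
use std::time::Instant;

// Checks the deadline before delegating each read. The inner stream
// must already have a finite read timeout set, which guarantees this
// check runs periodically even if the peer stops or trickles data.
struct DeadlineStream<S> {
    inner: S,
    deadline: Instant,
}

impl<S: Read> Read for DeadlineStream<S> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        if Instant::now() >= self.deadline {
            return Err(io::Error::new(io::ErrorKind::TimedOut, "request deadline exceeded"));
        }
        self.inner.read(buf)
    }
}
```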

adamreichold (Contributor) commented Jan 8, 2020

I think we could perhaps protect ourselves from it by using a deadline in the BaseStream but without setting the deadline remaining time as the socket read timeout.

I think it would be better to avoid putting too much high-level policy into the crate. In the same vein as the connect and read timeouts, we could just expose the underlying TCP stream in the API so that users can do whatever they wish, e.g. clone the stream and have another thread check the deadline and shut down the stream when it is reached.
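
What that user-side pattern could look like, sketched against a plain TcpStream (attohttpc does not currently expose its stream, and arm_watchdog is a made-up name):

```rust
use std::io;
use std::net::{Shutdown, TcpStream};
use std::thread;
use std::time::Duration;

// Arms a watchdog that forcibly shuts down the connection once the
// deadline passes, so any read or write blocked on it returns an
// error immediately.
fn arm_watchdog(stream: &TcpStream, timeout: Duration) -> io::Result<()> {
    let watchdog = stream.try_clone()?;
    thread::spawn(move || {
        thread::sleep(timeout);
        // Harmless if the request already finished by this point.
        let _ = watchdog.shutdown(Shutdown::Both);
    });
    Ok(())
}
```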

adamreichold (Contributor) commented Jan 8, 2020

But I have to question whether this attack scenario is really applicable here. Why can't we say that attohttpc should not be used as the HTTP client for such a service? If all HTTP client implementations had to implement all possible features, there would be no need for a crate such as this one in the first place. Being able to keep things simple by not trying to handle every use case can be a virtue.

Shnatsel (Author) commented Jan 8, 2020

The problem with reqwest/actix/etc is that they use async I/O and sit on an enormous pile of unsafe code that has never been audited. And async I/O is a reliability nightmare, especially on top of Rust's relatively immature async I/O stack, and is ridiculously hard to debug.

To wit: the latest release of reqwest deadlocks 6% of the time on a basic smoke test, while actix is still unsound and the maintainers refuse to fix it.

You only really need async I/O on the server side, where you have to keep 10,000 connections open at the same time. The client usually has no such requirements. So there is absolutely a market for a feature-rich HTTP client library with synchronous I/O and little unsafe code.

sbstp (Owner) commented Jan 8, 2020

@Shnatsel what's your take on how we should implement the request timeout? Should one exist by default, or should we just allow the user to set one and leave the default as "no timeout"? I'm just wondering what would happen, for instance, if someone tries to download a really large file using this library and runs into the default request timeout.

Shnatsel (Author) commented Jan 8, 2020

I do not think having a default timeout for the entire request is reasonable. If you're avoiding DoS in e.g. an API querying situation, you need it set to something like a minute, maybe even 10 seconds, which would be ridiculously low for all other use cases.

I think it's reasonable to set a connection timeout by default - but the OS already does that (unix, windows), and the lower timeout takes effect, so there's no need to do that in the client.

It is also reasonable to set a read timeout so that if the remote server goes away, the connection is dropped. You don't want a user of something like wget to stare at the console for hours just because the remote host suddenly went away; the sane thing to do is to fail and let them retry. HTTP long-polling is not very common these days, and if you're doing that you'll know to increase the timeout anyway.

adamreichold (Contributor) commented Jan 8, 2020

To wit: the latest release of reqwest deadlocks 6% of the time on a basic smoke test, while actix is still unsound and the maintainers refuse to fix it.

And until the last release this crate panicked on decoding UTF-8 text which might not be memory corruption but is definitely a denial of service issue.

So there is absolutely a market for a feature-rich HTTP client library with synchronous I/O and little unsafe code.

But why does that statement imply that this crate needs to stop being what it is and drop its design rationale on the floor? Why not fork this into exahttpc to do that?

Shnatsel (Author) commented Jan 8, 2020

And until the last release this crate panicked on decoding UTF-8 text which might not be memory corruption but is definitely a denial of service issue.

Fortunately, panics are quite easy to plan for - you can catch and handle them, unlike deadlocks. They are also fairly trivial to debug, unlike deadlocks. And panics are basically the worst thing you can expect out of safe Rust without async I/O, save for a panic-while-panicking, which will abort the process.
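
For instance, a worker can contain a panic from a single task with std::panic::catch_unwind; this is a generic illustration, not attohttpc code:

```rust
use std::panic;

fn main() {
    // Contain the panic instead of letting it kill the worker thread.
    let result = panic::catch_unwind(|| {
        // Panics: 0xff is not valid UTF-8.
        String::from_utf8(vec![0xff]).unwrap()
    });
    if result.is_err() {
        eprintln!("task panicked; log it and move on");
    }
}
```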

But why does that statement imply that this crate needs to stop being what it is and drop its design rationale on the floor?

Not necessarily. I just wanted to demonstrate that there are many common use cases that are currently not served well by any existing Rust HTTP client. And it looked like some of them could be served by attohttpc if it supported request timeouts, so I opened this issue.

If you choose not to support these use cases, that's a valid design decision and I will respect it. But it would be kind of a shame, because those use cases would remain without any library to serve them.
