Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create a AsyncHTTPClient request with infinite timeout #2742

Open
polygon opened this issue Sep 12, 2019 · 4 comments
Open

Create a AsyncHTTPClient request with infinite timeout #2742

polygon opened this issue Sep 12, 2019 · 4 comments

Comments

@polygon
Copy link
Contributor

polygon commented Sep 12, 2019

Is there a defined way to create a request using AsyncHTTPClient with an infinite timeout? I am currently using the simple httpclient implementation (should also not change) and I am requesting a page that streams an infinite amount of data. I am using streaming_callback to process the chunks of data.

With Python 3.6.4 and Tornado 5.0.2 I set request_timeout=0 on the fetch method call which worked. However, I recently upgraded to Python 3.7.4 and newest Tornado 6. When I set request_timeout=0 here, the future also never completes. However, the client will not even send a HTTP request to the destination. An example of this is the following:

from tornado.httpclient import AsyncHTTPClient
import asyncio

URL='http://localhost:8000/test'

async def test():
    ahc = AsyncHTTPClient()
    result = await ahc.fetch(URL, streaming_callback=chunker, request_timeout=0)

def chunker(data):
    print(data)

asyncio.get_event_loop().run_until_complete(test())

The program will not complete, but on the local server, there will not be a request, either. And, subsequently, chunker will also never be called. I can work around this by setting request_timeout=float('inf') but this feels really hacky and I fear that there may be a timeout at some point in the future anyways.

So question is if there is a way to never timeout the fetch request.

@bdarnell
Copy link
Member

There's not currently a supported way to set an infinite timeout. If there were, it would be some special value (maybe None) instead of zero. If the Future doesn't resolve with a timeout of zero, that's a bug - you should get an immediate timeout error.

I'm kind of surprised that float('inf') works, but setting a large finite timeout (a day? a year?) is fine, and that's what I'd do.

@polygon
Copy link
Contributor Author

polygon commented Sep 17, 2019

Is there an interest to add this kind of infinite timeout? I assume changing the meaning of None for the timeout now would break a lot of existing code. Would you consider a pull request that introduces an additional parameter that disables the request timeout completely?

While setting the timeout to a large value would be a workaround, I am afraid of limitations of the underlying operating system in that regard. After all, testing a year-long timeout is not practically possible, so having a way to disable the timeout completely seems like the cleanest solution to me.

@ploxiln
Copy link
Contributor

ploxiln commented Sep 18, 2019

The relevant code in master (and the 6.0 branch) for simple_httpclient:

            timeout = min(self.request.connect_timeout, self.request.request_timeout)
            if timeout:
                self._timeout = self.io_loop.add_timeout(
                    self.start_time + timeout,
                    functools.partial(self._on_timeout, "while connecting"),
                )
                stream = await self.tcp_client.connect(

There is no else: branch to the if timeout:, so that's why timeouts set to 0 cause nothing to happen, it just skips the stuff. I'd call that a bug.

Currently you can't set None, it will raise an exception from min(). So there is currently no meaning for None that we have to worry about, it just causes exceptions. We are free to add a meaning.

There's also another place in the code where it is clear that None is not allowed (but you won't run into this unless you have more than max_clients - default 10 - concurrent requests):

        if not len(self.active) < self.max_clients:
            assert request.connect_timeout is not None
            assert request.request_timeout is not None
            timeout_handle = self.io_loop.add_timeout(
                self.io_loop.time()
                + min(request.connect_timeout, request.request_timeout),
                functools.partial(self._on_timeout, key, "in request queue"),
            )

@bdarnell
Copy link
Member

Currently you can't set None, it will raise an exception from min(). So there is currently no meaning for None that we have to worry about, it just causes exceptions. We are free to add a meaning.

HTTPRequest(url, request_timeout=None) does have a meaning - it means "use the default". The logic around defaults and None is tricky and I don't think we can overload None with a different meaning without causing confusion.

We could create a special singleton object that would mean "no timeout", but I'm not sure it's worth it. Just setting a large but finite timeout seems like a better approach to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants