unexceptionally long timeout #633

Open
tcrasset opened this issue Jul 19, 2024 · 3 comments

Comments

@tcrasset

tcrasset commented Jul 19, 2024

In my app, I'd like to time out after a certain number of seconds if my bucket cannot be reached, and fall back on a local copy of my file.

This is useful on deployments where egress to external sites is heavily firewalled.
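For context, the fallback pattern I'm after looks roughly like this (a hypothetical sketch; read_with_fallback is a made-up helper, not code from my app):

import fsspec

def read_with_fallback(remote_path, local_path, **gcs_kwargs):
    """Read the remote copy; fall back to a local file if the bucket is unreachable."""
    try:
        with fsspec.open(remote_path, "rb", **gcs_kwargs) as f:
            return f.read()
    except OSError:
        # aiohttp connection errors derive from OSError; the whole point is that
        # this should be raised after about a second, not after several minutes
        with open(local_path, "rb") as f:
            return f.read()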

However, passing in all the timeout information I could find in the docs (notably GCSFileSystem.session_kwargs), as well as GCSFileSystem.timeout (even though it is not documented), does not solve the problem.

It hangs for a few minutes before throwing an exception.

I think the timeout is not passed everywhere it should be, especially in GCSFile.
Modifying the code of GCSFile to accept a timeout argument did not work either.

It hangs on this line in particular:

self._details = self.fs.info(self.path, generation=self.generation)

It runs in an asyncio loop, which I know next to nothing about, so I can't dig much further.

The traceback is below. To reproduce, simply add an unroutable IP for storage.googleapis.com to /etc/hosts and run the code snippet.

To reproduce

from aiohttp import ClientTimeout
import fsspec

timeout = 1
gcsfs_kwargs = {
    "session_kwargs": {"timeout": ClientTimeout(total=timeout)},
    "timeout": timeout,
}

filename = "gs://your-gcs-bucket/your-file.json"

filesystem, _ = fsspec.url_to_fs(filename, **gcsfs_kwargs)

with filesystem.open(filename) as f:
    print(f.read())
❯ cat /etc/hosts
127.0.0.1	localhost
10.255.255.1 storage.googleapis.com

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

Traceback

ConnectionAbortedError: SSL handshake is taking longer than 60.0 seconds: aborting the connection
  File "aiohttp/connector.py", line 1025, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)
  File "uvloop/loop.pyx", line 2084, in create_connection
  File "uvloop/loop.pyx", line 2079, in uvloop.loop.Loop.create_connection

ClientConnectorError: Cannot connect to host storage.googleapis.com:443 ssl:default [None]
  File "gcsfs/retry.py", line 126, in retry_request
    return await func(*args, **kwargs)
  File "gcsfs/core.py", line 426, in _request
    async with self.session.request(
  File "aiohttp/client.py", line 1197, in __aenter__
    self._resp = await self._coro
  File "aiohttp/client.py", line 581, in _request
    conn = await self._connector.connect(
  File "aiohttp/connector.py", line 544, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "aiohttp/connector.py", line 944, in _create_connection
    _, proto = await self._create_direct_connection(req, traces, timeout)
  File "aiohttp/connector.py", line 1257, in _create_direct_connection
    raise last_exc
  File "aiohttp/connector.py", line 1226, in _create_direct_connection
    transp, proto = await self._wrap_create_connection(
  File "aiohttp/connector.py", line 1033, in _wrap_create_connection
    raise client_error(req.connection_key, exc) from exc
@martindurant
Member

The retry handler is here: https://github.com/fsspec/gcsfs/blob/main/gcsfs/retry.py#L125
where retries is a class attribute on GCSFileSystem.

The timeout argument is passed to asyn.sync only on a small number of calls, and is limited by what asyncio can do (i.e., running coroutines cannot be interrupted). What you probably actually want is to add keys to the session_kwargs passed to aiohttp (see https://docs.aiohttp.org/en/stable/client_reference.html#aiohttp.ClientSession).
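Untested, but roughly what I have in mind (both the class attribute and the session kwargs at once):

import gcsfs
from aiohttp import ClientTimeout

# fewer retries, via the class attribute mentioned above
gcsfs.GCSFileSystem.retries = 2
# and a short connection timeout passed through to aiohttp.ClientSession
fs = gcsfs.GCSFileSystem(
    session_kwargs={"timeout": ClientTimeout(total=5, connect=5, sock_connect=5)},
)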

Would welcome a PR to add to the documentation.

@tcrasset
Author

tcrasset commented Jul 24, 2024

The retry handler is here: main/gcsfs/retry.py#L125 where retries is a class attribute on GCSFileSystem.

The timeout argument is passed to asyn.sync only on a small number of calls, and is limited by what asyncio can do (i.e., running coroutines cannot be interrupted). What you probably actually want is to add keys to the session_kwargs passed to aiohttp (see docs.aiohttp.org/en/stable/client_reference.html#aiohttp.ClientSession).

Would welcome a PR to add to the documentation.

I might be misunderstanding what you're saying, but I did try passing the timeout key to aiohttp.ClientSession using session_kwargs, and it doesn't take effect. The other parameters, read_timeout and conn_timeout, are deprecated in favor of the aforementioned timeout, which needs an aiohttp.ClientTimeout instance.
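For reference, this is the aiohttp distinction I mean (a minimal aiohttp-only sketch, nothing gcsfs-specific):

import asyncio
import aiohttp

async def main():
    # deprecated: aiohttp.ClientSession(read_timeout=1, conn_timeout=1)
    # current: a single ClientTimeout object
    async with aiohttp.ClientSession(timeout=aiohttp.ClientTimeout(total=1)) as session:
        print(session.timeout)

asyncio.run(main())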

By default, I see that retries=6, so if the timeout parameter were taken into account, it would take roughly 32 seconds of backoff (or 64 seconds if I'm off by one) plus 6 times the 1-second timeout, but I am seeing minutes of wait time.

>>> import random
>>> sum(random.random() + 2 ** (retry - 1) for retry in range(1, 6))
32.684668857397504

@martindurant
Member

32s would be the additional wait time between attempted connections, so the total still depends on how long each attempt takes to time out, which I think might be 300s (aiohttp's default). Six attempts at roughly 300s each, plus the backoff, would add up to about half an hour, which would match the minutes of hanging you see.

Can you please confirm: after using "session_kwargs": {"timeout": ClientTimeout(total=timeout)}, followed by any remote operation, do you have

In [5]: filesystem.session.timeout
Out[5]: ClientTimeout(total=1, connect=None, sock_read=None, sock_connect=None)

Also, I see you are using uvloop; does it perhaps have its own timeout configuration too?
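Something along these lines should show whether the ClientTimeout actually reached the session (assuming the session is only created after the first remote call):

from aiohttp import ClientTimeout
import fsspec

fs, _ = fsspec.url_to_fs(
    "gs://your-gcs-bucket/your-file.json",
    session_kwargs={"timeout": ClientTimeout(total=1)},
)
try:
    fs.ls("your-gcs-bucket")  # any remote operation, so the session gets created
except Exception:
    pass
print(fs.session.timeout)  # expecting ClientTimeout(total=1, ...)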
