Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle timeouts #1952

Merged
merged 4 commits into from
Mar 1, 2021
Merged

Handle timeouts #1952

merged 4 commits into from
Mar 1, 2021

Conversation

lhoestq
Copy link
Member

@lhoestq lhoestq commented Feb 26, 2021

As noticed in #1939, timeouts were not properly handled when loading a dataset.
This caused the connection to hang indefinitely when working in a firewalled environment cc @stas00

I added a default timeout, and included an option to our offline environment for tests to be able to simulate both connection errors and timeout errors (previously it was simulating connection errors only).

Now networks calls don't hang indefinitely.
The default timeout is set to 10sec (we might reduce it).

@stas00
Copy link
Contributor

stas00 commented Feb 26, 2021

I never said the calls were hanging indefinitely, what we need is quite different - in the firewalled env with a network, there should be no network calls or they should fail instantly.

To make this work I suppose on top of this PR we need:

  1. DATASETS_OFFLINE env var to force set timeout to 0 globally (or to 0.0001 if 0 has a special meaning of no timeout)
  2. DATASETS_OFFLINE should guard against failing network calls and not fail the program if it has all the data it needs locally.

Bottom line - if the logic wants to check online if the local file matches online dataset name, let it go wild, but it should fail instantly, recover and use the local file - if one is specified explicitly or cache if there is one. And only if neither was found only then assert.

I hope this makes sense and is doable.

I have started on the same approach for transformers huggingface/transformers#10407

Thank you, @lhoestq

@lhoestq
Copy link
Member Author

lhoestq commented Feb 26, 2021

Yes that was the first step to add DATASETS_OFFLINE :)

With this PR, if a request times out (which couldn't happen before because no time out was set), it falls back on the local files with no error.

As you said, setting the timeout to something like 1e-16 makes the requests fail instantly, which is one step forward. One last thing left is to disable request retries and everything will be instant !

@stas00
Copy link
Contributor

stas00 commented Feb 26, 2021

Ah, fantastic. Thank you for elucidating that this PR is part of a bigger master plan!

@lhoestq
Copy link
Member Author

lhoestq commented Mar 1, 2021

Merging this one, then I'll open a new PR for the DATASETS_OFFLINE env var :)

@lhoestq lhoestq merged commit 7e958e7 into master Mar 1, 2021
@lhoestq lhoestq deleted the handle-timeouts branch March 1, 2021 14:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants