Unnecessary logging noise for non-authenticated GCS access #25464

Closed
rsepassi opened this issue Feb 3, 2019 · 5 comments
Labels: stale, stat:awaiting response, TF 1.12, type:bug

Comments

@rsepassi

rsepassi commented Feb 3, 2019

System information

  • Ubuntu 16.04
  • Python 3.6
  • TF 1.12

Describe the current behavior

Unnecessary access to GCP metadata endpoint, 10 retries, and lots of logging.

Describe the expected behavior

Should not be trying to access GCP metadata endpoint.

Code to reproduce the issue

$ python
>>> import tensorflow as tf
>>> tf.gfile.Exists("gs://tfds-data")  # this is a public bucket, doesn't need any auth
2019-02-03 02:51:58.175696: I tensorflow/core/platform/cloud/retrying_utils.cc:73] The operation failed and will be automatically retried in 0.12376 seconds (attempt 1 out of 10), caused by: Unavailable: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Couldn't resolve host 'metadata'

...(more of the same)

2019-02-03 02:52:05.330143: I tensorflow/core/platform/cloud/retrying_utils.cc:73] The operation failed and will be automatically retried in 1.29527 seconds (attempt 10 out of 10), caused by: Unavailable: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Couldn't resolve host 'metadata'
2019-02-03 02:52:06.626410: W tensorflow/core/platform/cloud/google_auth_provider.cc:157] All attempts to get a Google authentication bearer token failed, returning an empty token. Retrieving token from files failed with "Not found: Could not locate the credentials file.". Retrieving token from GCE failed with "Aborted: All 10 retry attempts failed. The last failure: Unavailable: Error executing an HTTP request: libcurl code 6 meaning 'Couldn't resolve host name', error details: Couldn't resolve host 'metadata'".

(Linking issue tensorflow/datasets#38)

@jvishnuvardhan added the type:bug, TF 1.12, and stat:awaiting tensorflower labels Feb 4, 2019
tensorflow-copybara pushed a commit that referenced this issue Feb 7, 2019
Currently, when attempting to fetch an object from GCS, TF will attempt to
detect the GCE metadata service, and (if available) use the metadata service
for fetching auth tokens.

However, in many environments (such as Colab), all attempts to reach the
metadata service will fail by timing out. TF will valiantly retry, leading to
a situation where calling `tf.io.gfile.exists('gs://some-public-bucket')` will
take almost 700s (!) to finish all retries. Once these retries complete, we
attempt the request with an empty bearer token, and the request succeeds.

Once the request succeeds, TF sets an indefinite expiration time, meaning that
an interactive user can't (say) call `gcloud auth` and try again.

This change addresses this problem by adding a new hook for completely
skipping the GCE credential fetch, in the form of the `$NO_GCE_CHECK`
environment variable. This already exists in other Google auth libraries, e.g.
the Java client:
  https://github.com/googleapis/google-auth-library-java/blob/999de3b11de320354a8ff80a8dc906723d708cf4/oauth2_http/java/com/google/auth/oauth2/DefaultCredentialsProvider.java#L79
When set to any value (even the empty string), the google auth provider
completely skips attempts to talk to the GCE metadata service. In addition, we
don't set an indefinite expiration time in this case, so that future attempts
to fetch credentials aren't skipped.

Fixes #25463. (At least, provides the hook for Colab to use.)
Offers one potential solution to #25464.

PiperOrigin-RevId: 232800897
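
The `NO_GCE_CHECK` hook described in the commit message above could be used from Python roughly as follows. This is a minimal sketch, not the project's recommended usage: the helper names `skip_gce_credential_check` and `gcs_path_exists` are hypothetical, and it assumes a TensorFlow build that includes the commit. Setting the variable before TensorFlow is imported is a precaution; the commit message does not say when the variable is read.

```python
import os


def skip_gce_credential_check(env=None):
    """Set NO_GCE_CHECK in the given mapping (defaults to os.environ).

    Hypothetical helper. Per the commit message, any value (even the
    empty string) makes the auth provider skip the GCE metadata lookup,
    so an existing value is left untouched.
    """
    env = os.environ if env is None else env
    env.setdefault("NO_GCE_CHECK", "1")
    return env


def gcs_path_exists(path):
    """Check a GCS path with the GCE metadata lookup disabled.

    Hypothetical helper; requires TensorFlow to be installed.
    """
    skip_gce_credential_check()
    # Imported lazily so the variable is already set when TensorFlow loads.
    import tensorflow as tf

    return tf.io.gfile.exists(path)
```

With this in place, `gcs_path_exists("gs://tfds-data")` would check the public bucket from the original report without contacting the metadata service. From a shell, exporting `NO_GCE_CHECK` before starting Python should have the same effect.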
@craigcitro
Contributor

My fix in a252fe8 adds a hook to avoid the "unnecessary access" part of this bug; we still get noisier logs than I'd like. I'm going to repurpose this issue for "make logs less chatty for expected retries".
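
Until the logging itself is made less chatty, one possible stopgap (not part of the fix above) is to raise TensorFlow's C++ log threshold with the `TF_CPP_MIN_LOG_LEVEL` environment variable. The sketch below is an assumption-laden workaround: the helper `set_min_cpp_log_level` and the `LOG_LEVELS` table are hypothetical names, and the variable has to be set before TensorFlow is imported. A threshold of `"1"` hides the `I` (INFO) retry lines from `retrying_utils.cc` while keeping the final `W` (WARNING) line from `google_auth_provider.cc`.

```python
import os

# Minimum severity that TensorFlow's C++ logging will print:
# "0" prints everything, "1" hides INFO, "2" also hides WARNING,
# "3" also hides ERROR.
LOG_LEVELS = {"info": "0", "warning": "1", "error": "2", "fatal": "3"}


def set_min_cpp_log_level(minimum="warning", env=None):
    """Set TF_CPP_MIN_LOG_LEVEL so only `minimum` and above are printed.

    Hypothetical helper; call it before `import tensorflow`.
    """
    env = os.environ if env is None else env
    env["TF_CPP_MIN_LOG_LEVEL"] = LOG_LEVELS[minimum]
    return env["TF_CPP_MIN_LOG_LEVEL"]
```

Note that this hides all INFO-level C++ logs, not only the retry messages, so it trades diagnostic detail for quieter output.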

@craigcitro changed the title from "Unnecessary GCS metadata access" to "Unnecessary logging noise for non-authenticated GCS access" Feb 7, 2019
@tensorflowbutler removed the stat:awaiting tensorflower label Feb 7, 2019
@sachinprasadhs self-assigned this May 20, 2021
@sachinprasadhs
Contributor

@rsepassi Can you please confirm whether the issue is resolved by the fix described in the comment above?

@sachinprasadhs added the stat:awaiting response label May 20, 2021
@google-ml-butler

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you.

@google-ml-butler bot added the stale label May 27, 2021
@google-ml-butler

Closing as stale. Please reopen if you'd like to work on this further.

