Skip to content

Commit

Permalink
Add a hook to skip checking for GCE credentials.
Browse files Browse the repository at this point in the history
Currently, when attempting to fetch an object from GCS, TF will attempt to
detect the GCE metadata service, and (if available) use the metadata service
for fetching auth tokens.

However, in many environments (such as Colab), all attempts to reach the
metadata service will fail by timing out. TF will valiantly retry, leading to
a situation where calling `tf.io.gfile.exists('gs://some-public-bucket')` will
take almost 700s (!) to finish all retries. Once these retries complete, we
attempt the request with an empty bearer token, and the request succeeds.

Once the request succeeds, TF sets an indefinite expiration time, meaning that
an interactive user can't (say) call `gcloud auth` and try again.

This change addresses this problem by adding a new hook for completely
skipping the GCE credential fetch, in the form of the `$NO_GCE_CHECK`
environment variable. This already exists in other Google auth libraries, eg
the Java client:
  https://github.com/googleapis/google-auth-library-java/blob/999de3b11de320354a8ff80a8dc906723d708cf4/oauth2_http/java/com/google/auth/oauth2/DefaultCredentialsProvider.java#L79
When set to any value (even the empty string), the google auth provider
completely skips attempts to talk to the GCE metadata service. In addition, we
don't set an indefinite expiration time in this case, so that future attempts
to fetch credentials aren't skipped.

Fixes #25463. (At least, provides the hook for Colab to use.)
Offers one potential solution to #25464.

PiperOrigin-RevId: 232800897
  • Loading branch information
craigcitro authored and tensorflower-gardener committed Feb 7, 2019
1 parent dd1d9f6 commit a252fe8
Show file tree
Hide file tree
Showing 3 changed files with 52 additions and 7 deletions.
32 changes: 27 additions & 5 deletions tensorflow/core/platform/cloud/google_auth_provider.cc
Original file line number Diff line number Diff line change
Expand Up @@ -44,6 +44,10 @@ constexpr char kGoogleAuthTokenForTesting[] = "GOOGLE_AUTH_TOKEN_FOR_TESTING";
// The environment variable which can override '~/.config/gcloud' if set.
constexpr char kCloudSdkConfig[] = "CLOUDSDK_CONFIG";

// The environment variable used to skip attempting to fetch GCE credentials
// if set.
constexpr char kNoGceCheck[] = "NO_GCE_CHECK";

// The default path to the gcloud config folder, relative to the home folder.
constexpr char kGCloudConfigFolder[] = ".config/gcloud/";

Expand Down Expand Up @@ -146,10 +150,23 @@ Status GoogleAuthProvider::GetToken(string* t) {
}

auto token_from_files_status = GetTokenFromFiles();
auto token_from_gce_status =
token_from_files_status.ok() ? Status::OK() : GetTokenFromGce();
if (token_from_files_status.ok()) {
*t = current_token_;
return Status::OK();
}

bool skip_gce_check = std::getenv(kNoGceCheck) != nullptr;
Status token_from_gce_status;
if (skip_gce_check) {
token_from_gce_status =
Status(error::CANCELLED,
strings::StrCat("GCE check skipped due to presence of $",
kNoGceCheck, " environment variable."));
} else {
token_from_gce_status = GetTokenFromGce();
}

if (token_from_files_status.ok() || token_from_gce_status.ok()) {
if (token_from_gce_status.ok()) {
*t = current_token_;
return Status::OK();
}
Expand All @@ -165,8 +182,13 @@ Status GoogleAuthProvider::GetToken(string* t) {
// so return an empty token instead of failing.
*t = "";

// From now on, always return the empty token.
expiration_timestamp_sec_ = UINT64_MAX;
// We only want to keep returning our empty token if we've tried and failed
// the (potentially slow) task of detecting GCE.
if (skip_gce_check) {
expiration_timestamp_sec_ = 0;
} else {
expiration_timestamp_sec_ = UINT64_MAX;
}
current_token_ = "";

return Status::OK();
Expand Down
2 changes: 1 addition & 1 deletion tensorflow/core/platform/cloud/google_auth_provider.h
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,7 @@ class GoogleAuthProvider : public AuthProvider {
/// Gets the bearer token from Google Compute Engine environment.
Status GetTokenFromGce() EXCLUSIVE_LOCKS_REQUIRED(mu_);

/// Gets the bearer token from the systen env variable, for testing purposes.
/// Gets the bearer token from the system env variable, for testing purposes.
Status GetTokenForTesting() EXCLUSIVE_LOCKS_REQUIRED(mu_);

std::unique_ptr<OAuthClient> oauth_client_;
Expand Down
25 changes: 24 additions & 1 deletion tensorflow/core/platform/cloud/google_auth_provider_test.cc
Original file line number Diff line number Diff line change
Expand Up @@ -69,9 +69,10 @@ class GoogleAuthProviderTest : public ::testing::Test {
void TearDown() override { ClearEnvVars(); }

void ClearEnvVars() {
unsetenv("GOOGLE_APPLICATION_CREDENTIALS");
unsetenv("CLOUDSDK_CONFIG");
unsetenv("GOOGLE_APPLICATION_CREDENTIALS");
unsetenv("GOOGLE_AUTH_TOKEN_FOR_TESTING");
unsetenv("NO_GCE_CHECK");
}
};

Expand Down Expand Up @@ -238,4 +239,26 @@ TEST_F(GoogleAuthProviderTest, NothingAvailable) {
EXPECT_EQ("", token);
}

TEST_F(GoogleAuthProviderTest, NoGceCheckEnvironmentVariable) {
setenv("NO_GCE_CHECK", "", 1);
auto oauth_client = new FakeOAuthClient;

FakeEnv env;
// If the env var above isn't respected, attempting to fetch a token
// from GCE will segfault (as the metadata client is null).
GoogleAuthProvider provider(std::unique_ptr<OAuthClient>(oauth_client),
nullptr, &env);

string token;
TF_EXPECT_OK(provider.GetToken(&token));
EXPECT_EQ("", token);

// We also want to confirm that our empty token has a short expiration set: we
// now set a testing token, and confirm that it's returned instead of our
// empty token.
setenv("GOOGLE_AUTH_TOKEN_FOR_TESTING", "newToken", 1);
TF_EXPECT_OK(provider.GetToken(&token));
EXPECT_EQ("newToken", token);
}

} // namespace tensorflow

0 comments on commit a252fe8

Please sign in to comment.