Google Cloud Function - Error: Could not load the default credentials. #798

Open
timhj opened this issue Oct 1, 2019 · 127 comments

@timhj timhj commented Oct 1, 2019

This has only recently started happening for me. Some Cloud Functions that use the Cloud Vision API have started failing with an auth error. The failures seem random: some requests work and others do not. As there is no explicit auth happening (it's the Node.js GCF runtime for an existing project), it's not clear what the issue could be.

Errors look like this:

A 2019-10-01T03:17:00.907Z PROD-PDF-XXXX 718747420330136 
    Unhandled rejection PROD-PDF-XXXX 718747420330136
E 2019-10-01T03:17:00.910Z PROD-PDF-XXXX 718747420330136 
    Error: Could not load the default credentials. Browse to https://cloud.google.com/docs/authentication/getting-started for more information.
    at GoogleAuth.getApplicationDefaultAsync (/srv/functions/node_modules/google-auth-library/build/src/auth/googleauth.js:161:19)
    at process._tickCallback (internal/process/next_tick.js:68:7) PROD-PDF-XXXX 718747420330136 

Triggering code is:

if(statusUpdateResult.affectedRows < 1){
        // to avoid race conditions from concurrent competing threads, lets make sure this thread 
        // was the one that updated the status first
        let concurrentThreadError = Error('Not the first to update status to processing so exiting');
        console.error(concurrentThreadError);
        return concurrentThreadError;
      } else {
        // status updated and ready for processing
        console.log(`Updated pdf id ${pdfRecord.pdf_id} to queuing_vision.`);
      }
      
      // queue the file for processing by vision API
      const gcsSourceUri = `gs://${bucketName}/${fileName}`;
      const gcsDestinationUri = `gs://${ocrJSONBucket}/${fileName}.json`;

      const inputConfig = {
        // Supported mime_types are: 'application/pdf' and 'image/tiff'
        mimeType: 'application/pdf',
        gcsSource: {
          uri: gcsSourceUri,
        },
      };
      const outputConfig = {
        gcsDestination: {
          uri: gcsDestinationUri,
        },
      };
      const features = [{type: 'DOCUMENT_TEXT_DETECTION'}];
      const request = {
        requests: [
          {
            inputConfig: inputConfig,
            features: features,
            outputConfig: outputConfig,
          },
        ],
      };
      console.log(processingOperation);
      var processingOperation;
      try {
        processingOperation = visionClient.asyncBatchAnnotateFiles(request); 
      } catch(processingError) {
        console.error(processingError);
        return processingError;
      }
      console.log(processingOperation);

The unhandled rejection is happening inside the Vision request try/catch block, so there's nowhere further for me to debug. I hope someone can help or is seeing the same issue. This used to work without issue.
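
One thing worth checking in the snippet above: when called without a callback, asyncBatchAnnotateFiles returns a promise, so a try/catch around an un-awaited call only guards the synchronous part, and a later rejection surfaces as an unhandled rejection instead. A minimal sketch of the difference, using a hypothetical fakeVisionClient stand-in (not the real library) that always rejects:

```javascript
// Hypothetical stand-in for the Vision client: always rejects, like a failed auth lookup.
const fakeVisionClient = {
  asyncBatchAnnotateFiles: () =>
    Promise.reject(new Error('Could not load the default credentials.')),
};

// Without await: the try/catch only guards the synchronous call, which returns a
// promise. The rejection never reaches the catch block.
function withoutAwait(client) {
  try {
    const pending = client.asyncBatchAnnotateFiles({});
    pending.catch(() => {}); // only here so this demo does not crash the process
    return 'not caught';
  } catch (err) {
    return 'caught';
  }
}

// With await: the rejection is thrown at the await and lands in the catch block.
async function withAwait(client) {
  try {
    await client.asyncBatchAnnotateFiles({});
    return 'not caught';
  } catch (err) {
    return 'caught';
  }
}
```

Awaiting the call (or attaching a .catch handler) would at least make the failure catchable, so the database status could be rolled back on error. It would not by itself explain why the credentials lookup fails.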

Environment details

  • Google Cloud Functions, Node JS 10 (Beta) Platform
  • NPM is building with "@google-cloud/vision" : ">=1.2.0" so likely latest version

Steps to reproduce

  • Using a GCF function on the Node JS 10 runtime, in a project with all api/auth enabled.
  • Make a request to visionClient.asyncBatchAnnotateFiles.

@timhj timhj commented Oct 1, 2019

This issue is intermittent and because it's not catchable, it's resulting in 'out-of-whack' errors in the database. Without deploying anything differently, the code now 'just works' on the same retries.

@timhj timhj commented Oct 1, 2019

The Vision API request may have been failing behind the scenes and causing this issue, because of a request to output JSON into a non-existent GCS bucket... As the issue is intermittent, I'm not sure, so I will watch and see what happens.

@dusty dusty commented Oct 5, 2019

I'm having the same problem with various libraries inside cloud functions.

This example is listening for a bucket onFinalize event and sending a single http task to a cloud tasks queue with the payload.

It seems that my function works fine when it's first deployed. Then after some time, perhaps after it scales to zero and is re-triggered, it always fails until I redeploy it.

[Screenshot: Screen Shot 2019-10-05 at 8 18 47 AM]

I'm using the nodejs10 runtime with "@google-cloud/tasks": "^1.4.0".

Code

const { v2beta3 } = require('@google-cloud/tasks')
const client = new v2beta3.CloudTasksClient()

const queue = body => {
  return client.createTask({
    parent: process.env.QUEUE_URL,
    task: {
      httpRequest: {
        httpMethod: 'POST',
        url: process.env.TASK_URL,
        headers: { 'Content-Type': 'application/json' },
        body: Buffer.from(JSON.stringify(body)),
        oidcToken: { serviceAccountEmail: process.env.SERVICE_ACCOUNT_EMAIL }
      }
    }
  })
}

exports.default = async file => {
  await queue(file)
  console.info(`DONE: ${file.name} queued`)
}

@merlinnot merlinnot commented Oct 10, 2019

I have also been experiencing this issue for quite some time now. It happens in my project all the time; let me know if I can be of any help debugging it.

Contributor

@bcoe bcoe commented Oct 10, 2019

@merlinnot what type of authentication are you using in your project, and which APIs specifically?

@merlinnot merlinnot commented Oct 10, 2019

Firestore, BigQuery, Debugger, ...

Given the stack traces, the error observed originates here:

if (!isGCE) {
  // We failed to find the default credentials. Bail out with an error.
  throw new Error(
    'Could not load the default credentials. Browse to https://cloud.google.com/docs/authentication/getting-started for more information.'
  );
}

As you can see, it is thrown if and only if the value of the isGCE variable is falsy. The value is the result of a call to the _checkIsGCE function:

async _checkIsGCE() {
  if (this.checkIsGCE === undefined) {
    this.checkIsGCE = await gcpMetadata.isAvailable();
  }
  return this.checkIsGCE;
}

This function in turn calls the isAvailable function from the gcp-metadata library:
https://github.com/googleapis/gcp-metadata/blob/25bc11657001cb6b3807543377d74bafe126ea62/src/index.ts#L121-L142

As you can see, it depends on metadataAccessor function:
https://github.com/googleapis/gcp-metadata/blob/25bc11657001cb6b3807543377d74bafe126ea62/src/index.ts#L49

This function makes an HTTP request to http://169.254.169.254/computeMetadata/v1/ here:
https://github.com/googleapis/gcp-metadata/blob/25bc11657001cb6b3807543377d74bafe126ea62/src/index.ts#L66

I see no way for this error to occur other than requests to this service failing.
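
The caching in _checkIsGCE quoted above may also matter: if the first probe fails, the cached false would make every later call on the same instance fail too, which would be consistent with reports of functions failing until they are redeployed. A simplified model of that behavior follows. It is not the library's actual code; AuthModel, flakyProbe, and the injected probe function are hypothetical names used only for illustration.

```javascript
// Simplified model of the logic quoted above; NOT the library's actual code.
// `probe` is a hypothetical injected function standing in for gcpMetadata.isAvailable().
class AuthModel {
  constructor(probe) {
    this.probe = probe;
    this.checkIsGCE = undefined;
  }

  async _checkIsGCE() {
    if (this.checkIsGCE === undefined) {
      this.checkIsGCE = await this.probe(); // the result is cached, even when false
    }
    return this.checkIsGCE;
  }

  async getApplicationDefault() {
    const isGCE = await this._checkIsGCE();
    if (!isGCE) {
      throw new Error('Could not load the default credentials.');
    }
    return 'compute-credentials';
  }
}

// A probe that fails once (e.g. a timeout on cold start) and would succeed afterwards.
function flakyProbe() {
  let calls = 0;
  return async () => ++calls > 1;
}
```

In this model, an instance built with flakyProbe() keeps throwing on every call, because the probe is never retried after the first false result.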

@merlinnot merlinnot commented Oct 10, 2019

I'm currently redeploying all of the functions with additional logging enabled (DEBUG_AUTH). Will post here as soon as I have a hit.

@merlinnot merlinnot commented Oct 10, 2019

In the last 24 hrs I had 71,092 occurrences of this error, but it was last seen 5 hrs ago... I thought I would be able to provide more information straight away, since this error used to happen all the time.

Contributor

@bcoe bcoe commented Oct 10, 2019

@merlinnot as you noticed, I've deployed a version of gcp-metadata with a debug option. I'd double check that your package-lock.json has gcp-metadata@3.2.0, at which point we should get a better picture of what's happening the next time you run into issues.

@davedc davedc commented Oct 11, 2019

@bcoe Experiencing the same issue on a few functions on our side.

I can confirm @merlinnot's suspicion that requests to the metadata service are failing.

[Screenshot: Screen Shot 2019-10-11 at 6 22 04 pm]

@merlinnot merlinnot commented Oct 11, 2019

It's back :)

{ FetchError: network timeout at: http://metadata.google.internal./computeMetadata/v1/instance
    at Timeout.<anonymous> (/srv/functions/node_modules/node-fetch/lib/index.js:1448:13)
    at ontimeout (timers.js:436:11)
    at tryOnTimeout (timers.js:300:5)
    at listOnTimeout (timers.js:263:5)
    at Timer.processTimers (timers.js:223:10)
  message:
   'network timeout at: http://metadata.google.internal./computeMetadata/v1/instance',
  type: 'request-timeout',
  config:
   { url:
      'http://metadata.google.internal./computeMetadata/v1/instance',
     headers: { 'Metadata-Flavor': 'Google' },
     retryConfig:
      { noResponseRetries: 0,
        currentRetryAttempt: 0,
        retry: 3,
        retryDelay: 100,
        httpMethodsToRetry: [Array],
        statusCodesToRetry: [Array] },
     responseType: 'text',
     timeout: 3000,
     params: [Object: null prototype] {},
     paramsSerializer: [Function: paramsSerializer],
     validateStatus: [Function: validateStatus],
     method: 'GET' } }

Here's a timeline for the last 30 days:
[Screenshot: Screen Shot 2019-10-11 at 11 38 13]

And for the last 7 days:
[Screenshot: Screen Shot 2019-10-11 at 11 38 25]

@BluebambooSRL BluebambooSRL commented Oct 11, 2019

Same problem...

Contributor

@bcoe bcoe commented Oct 11, 2019

@merlinnot @BluebambooSRL thank you, this gives us some valuable forensic information for the engineering team 👍

@smashah smashah commented Oct 14, 2019

Any remedy for this? I'm experiencing it in the same circumstances (Vision API in GCF, Node v8).

Edit: Odd thing is that I didn't change any deps or code. Just ran firebase deploy to update some unrelated code and then it started happening.

@merlinnot merlinnot commented Oct 14, 2019

I think it's not related to dependencies. In my case redeployments also change the behavior of these errors: sometimes I have more, sometimes I have less (see the chart above), where the number of executions per day is rather stable.

My wild guess would be that it just depends on which node in the underlying infrastructure the code lands on. Maybe redeploying once more would help in your case?

@smashah smashah commented Oct 14, 2019

@merlinnot Yes, I just redeployed it with @google-cloud/vision ^1.5.0 (it was 1.4.0 before) and it started working again.

@bcoe bcoe added the external label Oct 14, 2019

@davedc davedc commented Oct 15, 2019

Sadly, in europe-west2 it seems like all underlying nodes have this issue. Redeploying the function a bunch of times has not really alleviated things for us.

@antonioallen antonioallen commented Oct 19, 2019

Unfortunately, I'm running into this issue as well. It's happening pretty consistently for me at the moment. It just started after a recent full deploy of all my cloud functions. I'm receiving Error: Could not load the default credentials. followed by Unhandled error Error: Can't set headers after they are sent. (most likely from "Ignoring exception from a finished function"). I'm a bit out of ideas on this one. I'll keep poking at it.

For me it's happening at:
GoogleAuth.getApplicationDefaultAsync (/srv/node_modules/@google-cloud/logging/node_modules/google-auth-library/build/src/auth/googleauth.js:161:19)

[Screenshot: Screen Shot 2019-10-19 at 9 26 47 AM]

[Screenshot: Screen Shot 2019-10-19 at 9 28 47 AM]

@antonioallen antonioallen commented Oct 19, 2019

Commenting out all logger.debug() calls within the functions seems to fix the issue for me. But... no logs. I wonder why auth is failing for it now.

@edi edi commented Oct 19, 2019

Same here. I have several other apps, and none of them are failing with this error.

But one of them (after I recently upgraded my functions dependencies) started producing the same error as above.

Weird thing is, given there are two functions, only one triggers this error.

I have no outside libraries or network requests, no APIs being used, simple firestore document triggers and updates.

So while I’m using the latest version of everything, only one function out of the two is randomly failing.

Had 11 fails during past 5 days. My client is losing revenue based on those fails though, so it’s a bit worrying.

Contributor

@bcoe bcoe commented Oct 21, 2019

@ollydixon this thread is specifically related to authentication issues with cloud functions, which I think is potentially related to something specifically happening within this environment.

Could I bother you to open a new issue with more specifics about the environment you're running in and the steps you're using to bootstrap your application?

Contributor

@bcoe bcoe commented Oct 21, 2019

@edi, @davedc, @smasha 👋 sorry about your frustration, I've raised an internal issue with the Cloud Functions folks (which is why this is labeled external), and am going to follow up again today.

@claytongulick claytongulick commented Feb 5, 2020

Same issue. I'm ditching all stackdriver debugging and logging and just using system logs and slack integration because of this.

@mohshraim mohshraim commented Feb 7, 2020

I've been using Cloud Functions for a year, and this is the first time I've faced this problem.
All I did was split some functionality out of index.js into a separate help.js file.
If help.js does any work related to DB operations (get, update, etc.), I get the error mentioned above.
So I kept the logic in help.js and moved the update function to main.js, and the error went away.
Hope this helps others.
This is still a problem I don't have the correct answer for.

@jhnnyk jhnnyk commented Feb 7, 2020

I was getting this error until I found this post:
https://groups.google.com/forum/#!topic/firebase-talk/W4nB1Ykw7tM

returning the main chain seems to have resolved this error for me
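
For readers who cannot open the linked post: "returning the main chain" presumably refers to returning the promise chain from the function, so that the runtime waits for the work before treating the function as finished. A minimal sketch under that assumption; saveRecord and runHandler are hypothetical stand-ins, the latter modelling a runtime that considers the function done once its return value settles.

```javascript
// Hypothetical async work, e.g. a database write.
const saveRecord = (log) =>
  new Promise((resolve) => setTimeout(() => { log.push('saved'); resolve(); }, 10));

// The chain is started but not returned: the caller cannot know when it finishes.
const notReturned = (log) => { saveRecord(log); };

// The chain is returned: the caller can wait for completion.
const returned = (log) => saveRecord(log);

// Stand-in for the runtime: treats the function as done when its return value settles.
async function runHandler(handler) {
  const log = [];
  await handler(log);
  return log.slice(); // snapshot of what had completed when the function "finished"
}
```

With notReturned, the snapshot is empty because the write is still pending when the function is considered finished; with returned, the write has completed.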

tflare added a commit to tflare/attendance-management-backend that referenced this issue Feb 9, 2020
…ls. fix

With admin.credential.applicationDefault(), the following error occurs:

Error: Could not load the default credentials. Browse to https://cloud.google.com/docs/authentication/getting-started for more information.
    at GoogleAuth.getApplicationDefaultAsync (/srv/node_modules/google-auth-library/build/src/auth/googleauth.js:160:19)
    at <anonymous>
    at process._tickDomainCallback (internal/process/next_tick.js:229:7)

According to the comment below, the only workaround seems to be specifying a key, so this change specifies the key explicitly.
googleapis/google-auth-library-nodejs#798 (comment)

@discobeta discobeta commented Feb 13, 2020

Try export GOOGLE_APPLICATION_CREDENTIALS=~/.config/path-to-credential.json

@bdaz bdaz commented Feb 18, 2020

+1, I've started seeing this in my Cloud Functions recently as well. It's sporadic.

@elihorne elihorne commented Feb 20, 2020

I've started to see this as well. The error began 10 days ago.

Error: Could not load the default credentials. Browse to https://cloud.google.com/docs/authentication/getting-started for more information.
    at GoogleAuth.getApplicationDefaultAsync (/srv/node_modules/google-auth-library/build/src/auth/googleauth.js:160:19)
    at <anonymous>
    at process._tickDomainCallback (internal/process/next_tick.js:229:7)

@ivoberger ivoberger commented Feb 20, 2020

We had this issue recently in 2 Firebase projects when trying to write to Firestore from a FB cloud function on nodejs 8. Also appeared without an obvious reason. In our case the reason seems to have been a cron job CF we added. After removing that the error didn't appear again.

@elihorne elihorne commented Feb 20, 2020

I have a job that has been behind a cron job for more than a year without issue until these last 10 days where the credentials error began. It requests a remote resource, writes to cloud storage, and then updates a database. This is where the error is occurring. It also uses firebase queue if that helps at all.

I've updated all my packages, refreshed my credentials, nothing seems to work.

Contributor

@bcoe bcoe commented Feb 20, 2020

@elihorne could you share the basic structure of the code? I believe there's a chance we've found the root cause. If you're instantiating a client in the global scope, e.g.,

// The alias keeps the short name Vision; ImageAnnotatorClient is the exported class.
const {ImageAnnotatorClient: Vision} = require('@google-cloud/vision');
const vision = new Vision();
exports.myHandler = async (req, res) => {
  const result = await vision.doSomething(); // placeholder for a real client method
  res.send(result);
}

instead do this:

const {ImageAnnotatorClient: Vision} = require('@google-cloud/vision');
let vision;
exports.myHandler = async (req, res) => {
  // Create the client lazily on the first request, then reuse it.
  if (!vision) {
    vision = new Vision();
  }
  const result = await vision.doSomething(); // placeholder for a real client method
  res.send(result);
}

☝️ when you instantiate (new) a client library, this causes an HTTP request to be made to detect the environment; on cold starts this request might be made while no CPU/memory is available.

@elihorne elihorne commented Feb 20, 2020

@bcoe is the important part:

if (!vision) { vision = new Vision(); }

or that it's wrapped in an async function?

Contributor

@bcoe bcoe commented Feb 20, 2020

@elihorne it's important that the async work has finished before you call res.send. Both parts are important:

  1. it's important no async work happens in the global scope (new Vision() currently creates async work, we're fixing this).
  2. as soon as res.send() is called, you may lose CPU/memory, so hold off on that until critical async work is finished.
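
The second point can be sketched as follows. This is only an illustration of ordering, not Cloud Functions code: makeRes, criticalWork, respondsEarly, and respondsLate are hypothetical names, and the events array simply records what happened first.

```javascript
// Hypothetical response object that records when send() happens.
function makeRes(events) {
  return { send: (body) => events.push(`send:${body}`) };
}

// Hypothetical critical async work (e.g. an API call made through a client library).
const criticalWork = (events) =>
  new Promise((resolve) => setTimeout(() => { events.push('work done'); resolve('ok'); }, 5));

// Risky: responds first, so the remaining work may be throttled before it completes.
async function respondsEarly(req, res, events) {
  const pending = criticalWork(events);
  res.send('hello');
  return pending;
}

// Safer: finish the work, then respond.
async function respondsLate(req, res, events) {
  const result = await criticalWork(events);
  res.send(result);
}
```

In the first handler the response goes out before the work finishes; in the second the work is done by the time res.send() runs.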
Contributor

@bcoe bcoe commented Feb 20, 2020

@elihorne please let me know if this works, because it's good confirmation that we're on the right track for a fix.

@DallasHoff DallasHoff commented Feb 21, 2020

@bcoe I was calling new CloudTasksClient() in the global scope. I have just made the suggested change, and I will return with an update on whether the errors stop happening.

@juloko juloko commented Feb 21, 2020

Same here, with Firestore:

Got an error Error: Could not load the default credentials. Browse to https://cloud.google.com/docs/authentication/getting-started for more information.
at GoogleAuth.getApplicationDefaultAsync (C:\Users\Ocara\Desktop\Traderox\functions\node_modules\google-auth-library\build\src\auth\googleauth.js:161:19)
at processTicksAndRejections (internal/process/task_queues.js:97:5)
at async GoogleAuth.getClient (C:\Users\Ocara\Desktop\Traderox\functions\node_modules\google-auth-library\build\src\auth\googleauth.js:503:17)
at async GrpcClient._getCredentials (C:\Users\Ocara\Desktop\Traderox\functions\node_modules\google-gax\build\src\grpc.js:108:24)
at async GrpcClient.createStub (C:\Users\Ocara\Desktop\Traderox\functions\node_modules\google-gax\build\src\grpc.js:229:23)
127.0.0.1 - - [21/Feb/2020:14:14:50 +0000] "GET /getMarket HTTP/1.1" 504 - "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.116 Safari/537.36"

Contributor

@bcoe bcoe commented Feb 21, 2020

@juloko this looks like it's a program you are running on your local machine, not in Cloud Functions? (I'm gathering this from the C:\Users\Ocara path.)

Have you followed the instructions here for authentication?

@DallasHoff DallasHoff commented Feb 23, 2020

@bcoe Update: Since making the change discussed above, I have not experienced the error. 🙌

@calclavia calclavia commented Feb 24, 2020

I'm running into this issue when using Firebase cloud functions. It seems to happen on some cloud functions that are deployed, but goes away if I re-deploy the functions.

@omersadika omersadika commented Feb 24, 2020

I'm getting the same issue for firebase functions. It's randomly happening sometimes after I deploy. The way to solve it is by redeploying the function / deleting and deploying until it's working. This is destroying productivity for my team, we're spending hours on dealing with it.

@kthaas kthaas commented Feb 25, 2020

We are also having this issue. We have about 65 Firebase / Google cloud functions in deployment. Our services will randomly fail at some points with this failure to load default credentials issue and the only remediation we have found is a redeployment.

@vmal vmal commented Feb 25, 2020

This issue has been there for a long time and it's affecting our production system badly. Important demos are being missed because this issue randomly shows up. Today has been way worse: even after multiple deletions and deploys I got the same error. This is so ******* frustrating!

Contributor

@bcoe bcoe commented Feb 26, 2020

@vmal @kthaas @omersadika @calclavia, we believe we've isolated this problem to creating a client in the global scope, outside of your function; see: function execution lifetime.

If your code looks like this:

const {CloudTasksClient} = require('@google-cloud/tasks');
// CloudTasksClient, could be Storage, or Vision, or Speech.
const task = new CloudTasksClient();
exports.helloWorld = (req, res) => {
  res.send('hello world');
}

Instead write:

const {CloudTasksClient} = require('@google-cloud/tasks');
// CloudTasksClient, could be Storage, or Vision, or Speech.
let task;
exports.helloWorld = async (req, res) => {
  if (!task) {
    task = new CloudTasksClient();
  }
  await task.doSomething(); // or storage.doSomething(), etc.
  res.send('hello world');
}

There is work in motion to shield folks from this issue. But, as described in the execution timeline, the core of the problem is that asynchronous work cannot happen outside of the request cycle of your function.

@vmal vmal commented Feb 26, 2020

@bcoe Thank you, this seems to have fixed our issue for now: we made sure to avoid instantiating Google libraries in the global scope of our functions.

@googleapis googleapis locked as resolved and limited conversation to collaborators Feb 26, 2020
Contributor

@bcoe bcoe commented Feb 26, 2020

👋 I'm locking this issue, because I believe we've isolated the underlying problem, which was instantiating (newing) @google-cloud/ client libraries in the global scope.

This is because the libraries create asynchronous work, which is then immediately throttled, because it happens outside of a function's execution context. I've documented this behavior here and here.

To avoid this behavior, you simply need to modify your function to instantiate client libraries inside of your handler function. If your code looks like this:

const {CloudTasksClient} = require('@google-cloud/tasks');
// CloudTasksClient, could be Storage, or Vision, or Speech.
const task = new CloudTasksClient();
exports.helloWorld = (req, res) => {
  res.send('hello world');
}

Instead write:

const {CloudTasksClient} = require('@google-cloud/tasks');
// CloudTasksClient, could be Storage, or Vision, or Speech.
let task;
exports.helloWorld = async (req, res) => {
  if (!task) {
    task = new CloudTasksClient();
  }
  await task.doSomething(); // or storage.doSomething(), etc.
  res.send('hello world');
}

Note: make sure you are caching the instantiation, as in my example, you don't want to create a new client per request.
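
If several clients need the same treatment, the cached lazy instantiation can be factored into a small helper. A minimal sketch, assuming a hypothetical lazy() helper and a FakeClient class that stands in for CloudTasksClient, Storage, etc. so that constructions can be counted:

```javascript
// Hypothetical client class standing in for CloudTasksClient, Storage, etc.
// It counts constructions so the caching can be observed.
class FakeClient {
  constructor() { FakeClient.instances += 1; }
  async doSomething() { return 'done'; }
}
FakeClient.instances = 0;

// Wrap any factory so it runs at most once, and only on first use.
function lazy(factory) {
  let instance;
  return () => {
    if (instance === undefined) {
      instance = factory();
    }
    return instance;
  };
}

// Nothing is constructed at module load time.
const getClient = lazy(() => new FakeClient());

// Handler in the style of the example above.
async function helloWorld(req, res) {
  const result = await getClient().doSomething();
  res.send(result);
}
```

However many requests the handler serves, the factory runs once per instance of the function, and never during module load.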

We also have work in progress here to shield folks from the bug, which we're treating as a high priority.

If you think your issue is different than this one, please don't hesitate to open a new issue.

@bcoe bcoe pinned this issue Feb 26, 2020
@vishald123 vishald123 unpinned this issue Feb 27, 2020