
Larger Safety Windows around Expiration Timestamp? #133

Closed
stephenplusplus opened this issue Apr 13, 2017 · 21 comments

Labels: priority: p1 (Important issue which blocks shipping the next release. Will be fixed prior to next release.)

@stephenplusplus (Contributor)

From @MarkHerhold on March 27, 2017 00:07

Insert calls to BigQuery randomly result in auth errors.

I have a job that sends data in bulk to BigQuery by calling .insert([ 'lots', 'of', 'items' ]) repeatedly in a synchronous fashion. This process goes on for hours at a time, with each insert starting only after the prior one finishes.

After a few hours, my calls to BigQuery abruptly fail. I'm not sure whether the issue is time-related or whether the call itself fails at random, but it does seem to occur after a few hours, and it happens repeatedly.

Error:

Unhandled promise rejection (rejection id: 1): ApiError: Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.

Environment details

  • OS: OSX El Capitan
  • Node.js version: 7.7.3
  • npm version: 4.1.2
  • @google-cloud/bigquery version: 0.8.0

Steps to reproduce

The above description tells it all, but I'm working on a script. I'll update this if I can find a way to reproduce in a more predictable manner.
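
For illustration, the pattern boils down to something like this (a rough sketch; the dataset and table names and the batches variable are hypothetical):

// Rough sketch of the long-running insert loop described above.
// Dataset/table names and the `batches` variable are hypothetical.
const bigquery = require('@google-cloud/bigquery')();
const table = bigquery.dataset('my_dataset').table('my_table');

async function run(batches) {
  for (const rows of batches) {
    // Each insert starts only after the previous one finishes, so the
    // process can run for hours before a token expires mid-flight.
    await table.insert(rows);
  }
}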

Copied from original issue: googleapis/google-cloud-node#2139

@stephenplusplus (Contributor, Author)

stephenplusplus commented Apr 13, 2017

Before every request, we grab a token, refreshing it if necessary (that work is done by google-auth-library).

With this large a volume of requests, maybe we need a larger safety window around the expiration timestamp for when the auth client makes the grab. @jmdobry does that sound possible? (please re-tag if someone else is closer to the auth library code)
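
For illustration, the idea is roughly the following (a minimal sketch; the real check lives inside google-auth-library, and EXPIRY_WINDOW_MS is a hypothetical constant, not an existing option):

// Hypothetical sketch of an expiry "safety window": treat the token as
// expired some margin before its actual expiration, so a request issued
// just before the cutoff can't race the server-side check.
const EXPIRY_WINDOW_MS = 5 * 60 * 1000; // e.g. refresh 5 minutes early

function tokenNeedsRefresh(credentials) {
  // credentials.expiry_date is the epoch-millisecond expiration that
  // google-auth-library stores alongside the access token.
  return !credentials.expiry_date ||
         credentials.expiry_date - EXPIRY_WINDOW_MS <= Date.now();
}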

@stephenplusplus stephenplusplus changed the title Random Auth Errors with BigQuery Larger Safety Windows around Expiration Timestamp? Apr 28, 2017
@merlinnot

merlinnot commented Jun 14, 2017

I came across the same issue. It's not a big deal for me because of the fault-tolerant design of my app, but I think this should have very high priority.

Error: Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.
 at /user_code/node_modules/@google-cloud/pubsub/node_modules/grpc/src/node/src/client.js:442:17

@simonprickett

I might be seeing the same thing. I'm using Cloud Functions to stream records into BigQuery in chunks of about 2,000, and when I run several instances of my function (which receive their records for BQ via Pub/Sub), I see this sporadically (one run of the function fails like this while others don't):

Error sending record to BigQuery: Error: Could not load the default credentials. Browse to https://developers.google.com/accounts/docs/application-default-credentials for more information.

I was also seeing a DNS quota exhaustion issue, which seems strange, as my code doesn't make any HTTP calls itself, just whatever the Pub/Sub and BigQuery libraries do:

Error: quota exceeded (DNS resolutions : per 100 seconds); check and increase your quota at https://console.cloud.google.com/iam-admin/quotas?project=MYPROJECTIDHERE&service=cloudfunctions.googleapis.com&usage=ALL. Function killed.

I have upped the DNS resolution limit to try to address at least this second issue, but I don't believe my code is the cause of it.

My code uses the async package inside a cloud function to write records to BigQuery, returning from the cloud function when it's done. It looks like this (interval is an object that contains an array of 2,000 or so reading records):

exports.uaProcessInterval = function(event, callback) {
    const message = event.data;
    const async = require('async');

    if (message.data) {
        const messageData = JSON.parse(Buffer.from(message.data, 'base64').toString());
        const interval = messageData.interval;

        async.map(
            interval.readings,
            function(reading, cb) {
                // Process individual reading.
                const bigQuery = require('@google-cloud/bigquery')(),
                      bigQueryDataset = bigQuery.dataset('MYDATASET'),
                      bigQueryTable = bigQueryDataset.table('MYTABLE'),
                      record = {
                          // Various properties of "reading" object that was passed
                      };

                bigQueryTable.insert(record, function(err, apiResponse) {
                    if (err) {
                        console.log(`Error sending record to BigQuery: ${err}`);
                    } else {
                        console.log(`Sent record to BigQuery: ${JSON.stringify(record)}`);
                    }

                    // Done with this reading.
                    cb();
                });
            },
            function(err, results) {
                // Everything is done.
                console.log('All readings processed');
                callback();
            }
        );
    }
};

I'm wondering if my DNS issue had to do with where I require BigQuery, and whether I should move that out of the function.

@simonprickett

Well, moving my require for BigQuery out of the function that async.map calls for each record I want to write into BigQuery made the "could not load the default credentials" error go away, but it has caused lots of these new errors instead:

Error sending record to BigQuery: Error: read ECONNRESET

and the DNS quota issue continues.

Error: quota exceeded (DNS resolutions : per 100 seconds); check and increase your quota at https://console.cloud.google.com/iam-admin/quotas?project=MYPROJECTID&service=cloudfunctions.googleapis.com&usage=ALL. Function killed.

Revised code that is doing this (for some invocations of the cloud function but not all of them) looks like this:

exports.uaProcessInterval = function(event, callback) {
    const message = event.data,
          async = require('async'),
          bigQuery = require('@google-cloud/bigquery')();

    if (message.data) {
        const messageData = JSON.parse(Buffer.from(message.data, 'base64').toString()),
              interval = messageData.interval,
              bigQueryDataset = bigQuery.dataset('MYDATASET'),
              bigQueryTable = bigQueryDataset.table('MYTABLE');

        async.map(
            interval.readings,
            function(reading, cb) {
                // Process individual reading.
                const record = {
                    // Various properties of "reading" object that was passed
                };

                bigQueryTable.insert(record, function(err, apiResponse) {
                    if (err) {
                        console.log(`Error sending record to BigQuery: ${err}`);
                    } else {
                        console.log(`Sent record to BigQuery: ${JSON.stringify(record)}`);
                    }

                    // Done with this reading.
                    cb();
                });
            },
            function(err, results) {
                // Everything is done.
                console.log('All readings processed');
                callback();
            }
        );
    } else {
        callback();
    }
};

@lukesneeringer

Hi all, thanks for reporting.

Question: Are any of you getting this error while using service accounts specifically? (We have a theory we are trying to disprove.)

@simonprickett

@lukesneeringer I am using Cloud Functions, so whatever they run as... I just changed my code to limit the async calls to 5 concurrent calls. I no longer see the default credentials error, but I do see DNS issues reported:

Error: quota exceeded (DNS resolutions : per 100 seconds); check and increase your quota at https://console.cloud.google.com/iam-admin/quotas?project=MY_PROJECT&service=cloudfunctions.googleapis.com&usage=ALL. Function killed.

I changed the code to do at most 5 parallel inserts per cloud function invocation... there are several cloud function instances running concurrently, putting data into the same BQ dataset and table, like this:

exports.uaProcessInterval = function(event, callback) {
    const message = event.data,
          async = require('async'),
          moment = require('moment'),
          bigQuery = require('@google-cloud/bigquery')();

    const messageData = JSON.parse(Buffer.from(message.data, 'base64').toString());
    const interval = messageData.interval;

    const bigQueryDataset = bigQuery.dataset('MYDATASET'),
          bigQueryTable = bigQueryDataset.table('MYTABLE');

    // interval.readings may contain 2500 records, and several instances of this
    // function may be running concurrently against the same BQ table and dataset.
    async.mapLimit(
        interval.readings,
        5,
        function(reading, cb) {
            const record = {
                // Various fields from reading
            };

            // Attempt to persist individual reading.
            bigQueryTable.insert(record, function(err, apiResponse) {
                const recordStr = JSON.stringify(record);

                if (err) {
                    console.log(`Error sending record to BigQuery: ${err} - record was ${recordStr}`);
                    console.log(JSON.stringify(apiResponse));
                } else {
                    console.log(`Sent record to BigQuery: ${recordStr}`);
                }

                // Done with this reading.
                cb();
            });
        },
        function(err, results) {
            // Everything is done.
            console.log('All readings processed for interval');
            callback();
        }
    );
};

@merlinnot

Any update on this, please? I can provide any debugging data you need; it happens at least a few times every five minutes in my app.

@Zechnophobe

Hello, I am not 100% sure this is related, but I have a service that is using Google Pub/Sub. The worker pulling messages was in the middle of pulling a batch of around 600k (1 at a time) when it suddenly failed with the same message as here - Unhandled promise rejection (rejection id: 1): Error: Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential.

Sadly for me, my code did not tolerate that, and the rest of the messages were left waiting until they were stale and we had to remove them. Does this at least match the profile of what is being investigated here?

@ismailbaskin

I'm publishing to Cloud Pub/Sub from Cloud Functions and running into this error frequently. Is there any workaround you can recommend?

@lukesneeringer lukesneeringer added the priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. label Sep 6, 2017
@bantini (Contributor)

bantini commented Sep 8, 2017

@merlinnot @ismailbaskin Can you elaborate on what you were doing, preferably with some code?

@merlinnot

Sure. I've implemented a syncing mechanism on top of GCF and Pub/Sub. It creates a tree structure.

The first function fetches credentials from all of the servers, then it publishes a PubSub message with server hostname and credentials.

The second function checks which applications on each server should be updated. For each application there are multiple users to synchronize, so it publishes a "request" to sync data for a specific user.

And so on, and so forth.

There are seven levels of this structure, since the server's API is kinda tricky to use that way. The 6th function regularly throws this error:

Error: Request had invalid authentication credentials. Expected OAuth 2 access token, login cookie or other valid authentication credential. See https://developers.google.com/identity/sign-in/web/devconsole-project.
    at /user_code/node_modules/@google-cloud/pubsub/node_modules/grpc/src/node/src/client.js:554:15

It's not such a big issue for me; data is synced the next time around and no one gets hurt. But I get this error at least once every five minutes, so it shouldn't be that hard to track down for a Googler with access to my project and Google's internal logs.

@lukesneeringer

it shouldn't be that hard to track down for a Googler with access to my project and Google's internal logs.

None of us have access to user data (Google is really protective of that stuff for obvious reasons). That said, we will do our best to reproduce and solve. :-)

@briangranatir

briangranatir commented Sep 21, 2017

I'll give you access to my user data if it'll help :)

Seeing this issue with Datastore and Pub/Sub too.

@bantini (Contributor)

bantini commented Sep 27, 2017

I am trying to replicate the issue by inserting random data into BigQuery, similar to how @simonprickett implemented his method, varying the limit. However, the error is too sporadic to be diagnosed (blocks of errors that come up randomly every few hours). Is there anyone who is getting the errors more frequently? Any help replicating the issue more reliably would be great. Perhaps someone using Pub/Sub can share some code?

@simonprickett

@bantini I got fed up with this, and for this and some other reasons (lack of support for environment variables being one) I have moved my code out of Cloud Functions and into App Engine Flex instead, where, so far, it's working OK.

@merlinnot

@bantini I have been getting this error every five minutes for more than two months :) Maybe the issue occurs when there are multiple simultaneous invocations of a function which publishes some data to Pub/Sub?

@bantini (Contributor)

bantini commented Sep 28, 2017

@merlinnot What is the authentication mechanism that you are using?

@merlinnot

merlinnot commented Sep 29, 2017

@bantini Standard env variables from GCF:

import * as PubSub from '@google-cloud/pubsub'

const pubSub = PubSub()

@JustinBeckwith (Contributor)

Fixed in #242
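
For anyone arriving here later: later releases of google-auth-library expose the refresh margin as a client option. A minimal sketch, assuming the eagerRefreshThresholdMillis option and a hypothetical key file path (verify the option exists in your installed version):

// Sketch: ask the auth client to refresh tokens 5 minutes before expiry.
// eagerRefreshThresholdMillis is assumed from later google-auth-library
// releases, and the key file path is hypothetical.
const { JWT } = require('google-auth-library');

const client = new JWT({
  keyFile: '/path/to/service-account.json',
  scopes: ['https://www.googleapis.com/auth/bigquery'],
  eagerRefreshThresholdMillis: 5 * 60 * 1000,
});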

@vizsatiz

Hi guys, I am currently facing the same issue with Pub/Sub. What is the fix, and how can I use it in my code? Will just updating my "@google-cloud/pubsub" dependency do?

Any help will be appreciated :)

@Justkant

Just stumbled upon this issue; a DNS cache (axios/axios#1878) could help mitigate issues like that.
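
For what it's worth, a minimal sketch of the idea (illustrative only; libraries such as cacheable-lookup handle TTLs and edge cases far more robustly, and the 60-second TTL here is an arbitrary assumption):

// Minimal DNS cache wired into an HTTPS agent. net.connect honors a
// custom lookup() passed through the agent's options.
const dns = require('dns');
const https = require('https');

const cache = new Map();

const agent = new https.Agent({
  lookup(hostname, options, callback) {
    const hit = cache.get(hostname);
    if (hit && hit.expires > Date.now()) {
      return callback(null, hit.address, hit.family);
    }
    dns.lookup(hostname, options, (err, address, family) => {
      if (!err) {
        cache.set(hostname, { address, family, expires: Date.now() + 60000 });
      }
      callback(err, address, family);
    });
  },
});

Requests made through this agent (e.g. https.request({ agent, ... })) then skip repeated resolutions of the same hostname, which is the quota the Cloud Functions error above appears to count against.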

@JustinBeckwith JustinBeckwith self-assigned this Feb 1, 2021