
Unable to authenticate using 'credentials' option #249

Closed
sercanersoy opened this issue Oct 2, 2020 · 5 comments
@sercanersoy

Hi,

I am writing a Spark (2.4.3) application in Scala (2.11.12) that reads data from a BigQuery table using the spark-bigquery-with-dependencies package (0.17.2). The application does not run on Google Cloud machines, so it has to authenticate explicitly. Below is my code:

spark.read
  .option("parentProject", "xxx")
  .option("credentials", "xxx")
  .bigquery("xxx")
  .limit(10)
  .show()

I cannot use the credentialsFile option, so I converted my JSON credentials file to a Base64 string and pass it via the credentials option. But this is what I get:
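For context, this is roughly how such a Base64 string is produced (a minimal sketch; the inline keyJson literal stands in for the real key file contents, which in practice would be read with java.nio.file.Files.readAllBytes):

```scala
import java.util.Base64
import java.nio.charset.StandardCharsets

// Placeholder for the contents of the service-account key file; in a real
// job the bytes would be read from the JSON key file instead.
val keyJson = """{"type": "service_account", "token_uri": "https://oauth2.googleapis.com/token"}"""

// Base64-encode the raw JSON bytes -- this string is the value passed
// to the connector's "credentials" option.
val base64Credentials = Base64.getEncoder.encodeToString(keyJson.getBytes(StandardCharsets.UTF_8))
```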

java.io.IOException: The Application Default Credentials are not available. They are available if running in Google Compute Engine. Otherwise, the environment variable GOOGLE_APPLICATION_CREDENTIALS must be defined pointing to a file defining the credentials. See https://developers.google.com/accounts/docs/application-default-credentials for more information.

The program is unable to read credentials from the credentials option. In fact, I don't think it even tries to; it goes straight to GOOGLE_APPLICATION_CREDENTIALS and crashes if that variable does not exist. Since I cannot use an external JSON credentials file in my app, I cannot use the environment variable either.

I would appreciate any help. Thanks in advance!

@davidrabinowitz davidrabinowitz self-assigned this Oct 2, 2020
@davidrabinowitz
Member

Can you please try the following:

spark.conf.set("credentials", "<SERVICE_ACCOUNT_JSON_IN_BASE64>")
spark.read
  .option("parentProject", "xxx")
  .bigquery("xxx")
  .limit(10)
  .show()

davidrabinowitz added a commit to davidrabinowitz/spark-bigquery-connector that referenced this issue Oct 5, 2020
davidrabinowitz added a commit that referenced this issue Oct 6, 2020
It appears that the previous code, which relied on Java streams, had issues; reverted to a simpler code structure and added additional tests.
@davidrabinowitz
Member

Fixed by PR #250

@gbougeard

Hi,

Not sure this is the best place to post this, but I'm getting the following error using 0.17.3:

15:08:29.771 DEBUG c.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase - GHFS.configure
15:08:29.771 DEBUG c.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase - GHFS_ID = GHFS/1.6.1-hadoop2
15:08:29.779 DEBUG com.google.cloud.hadoop.util.CredentialConfiguration - Using service account credentials
15:08:29.779 DEBUG com.google.cloud.hadoop.util.CredentialConfiguration - Getting service account credentials from meta data service.
15:08:29.779 DEBUG com.google.cloud.hadoop.util.CredentialFactory - getCredentialFromMetadataServiceAccount()
[info] - should analyse data and persist summary *** FAILED *** (3 minutes, 0 seconds)
[info]   java.io.IOException: Error getting access token from metadata server at: http://metadata/computeMetadata/v1/instance/service-accounts/default/token
[info]   at com.google.cloud.hadoop.util.CredentialFactory.getCredentialFromMetadataServiceAccount(CredentialFactory.java:208)
[info]   at com.google.cloud.hadoop.util.CredentialConfiguration.getCredential(CredentialConfiguration.java:70)
[info]   at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.configure(GoogleHadoopFileSystemBase.java:1825)
[info]   at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:1012)
[info]   at com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystemBase.initialize(GoogleHadoopFileSystemBase.java:975)

I'm setting it up as follows, and my job runs on AWS EMR (it only writes to BQ, no reads):

sparkSession.conf.set("credentials", base64credentials)
sparkSession.conf.set("parentProject", projectId)
sparkSession.conf.set("project", projectId)
sparkSession.conf.set("dataset", dataset)
sparkSession.conf.set("table", s"$dataset:$table")
sparkSession.conf.set("temporaryGcsBucket", bucket)
sparkSession.conf.set("maxParallelism", nbExecutors)
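For the write-only path, the same settings can also be passed as per-write options instead of session-level conf; a minimal sketch, assuming `df`, `dataset`, `table`, and `bucket` are already defined in the job (this fragment needs a running Spark session with the connector on the classpath, so it is illustrative rather than standalone):

```scala
// Hedged sketch: `df` is an existing DataFrame; `dataset`, `table`, and
// `bucket` are defined elsewhere in the job. The connector honors the
// session-level conf set above, but per-write options make the
// dependencies of each write explicit.
df.write
  .format("bigquery")
  .option("table", s"$dataset.$table")
  .option("temporaryGcsBucket", bucket)
  .mode("append")
  .save()
```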

@davidrabinowitz
Member

@gbougeard Can you please validate that the service account JSON fields conform to the format described at https://cloud.google.com/iam/docs/creating-managing-service-account-keys, especially the token_uri field?
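One quick way to sanity-check this (a hedged sketch; the inline base64Credentials stands in for the value actually passed to the connector) is to decode the Base64 string and confirm the token_uri field is present:

```scala
import java.util.Base64
import java.nio.charset.StandardCharsets

// Placeholder for the Base64 credentials string passed to the connector.
val base64Credentials = Base64.getEncoder.encodeToString(
  """{"type": "service_account", "token_uri": "https://oauth2.googleapis.com/token"}"""
    .getBytes(StandardCharsets.UTF_8))

// Decode and check that token_uri points at Google's token endpoint.
val decoded = new String(Base64.getDecoder.decode(base64Credentials), StandardCharsets.UTF_8)
val hasTokenUri = decoded.contains("\"token_uri\": \"https://oauth2.googleapis.com/token\"")
```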

@gbougeard

> @gbougeard Can you please validate that the service account JSON fields conform to the format described at https://cloud.google.com/iam/docs/creating-managing-service-account-keys, especially the token_uri field?

Here is an extract of my service account JSON:

"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://oauth2.googleapis.com/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
