GKE Authentication not possible with user ADC and project ID set in either connection or gcloud config #20426

@jobegrabber

Description

Apache Airflow Provider(s)

google

Versions of Apache Airflow Providers

apache-airflow-providers-google==6.1.0

Apache Airflow version

2.1.4

Operating System

Debian GNU/Linux 11 (bullseye)

Deployment

Docker-Compose

Deployment details

At my company we're developing our Airflow DAGs in local environments based on Docker Compose.
To authenticate against GCP, we don't use service accounts and their keys; instead we use our user credentials and set them up as Application Default Credentials (ADC), i.e. we run

$ gcloud auth login
$ gcloud auth application-default login

We also set the default Project ID in both the gcloud config and the Airflow connection, i.e.

$ gcloud config set project $PROJECT
$ # run the following inside the Airflow Docker container
$ airflow connections delete google_cloud_default
$ airflow connections add google_cloud_default \
      --conn-type=google_cloud_platform \
      --conn-extra="{\"extra__google_cloud_platform__project\":\"$PROJECT\"}"
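A quoting subtlety to watch for with `--conn-extra`: inside single quotes the shell does not expand `$PROJECT`, so escaped double quotes are needed for the real project ID to end up in the stored extras. A minimal sketch of the difference (variable values are illustrative):

```shell
# Illustrative only: how single vs. double quoting affects expansion of
# $PROJECT in the --conn-extra JSON payload.
PROJECT=my-gcp-project

# Single quotes: $PROJECT is passed through literally.
literal='{"extra__google_cloud_platform__project":"$PROJECT"}'

# Double quotes (inner quotes escaped): $PROJECT is expanded by the shell.
expanded="{\"extra__google_cloud_platform__project\":\"$PROJECT\"}"

echo "$literal"
echo "$expanded"
```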

What happened

It seems that, due to this part in base_google.py, gcloud auth (specifically gcloud auth activate-refresh-token) is not executed when the Project ID is set in either the Airflow connection or the gcloud config.

This causes e.g. gcloud container clusters get-credentials in the GKEStartPodOperator to fail with You do not currently have an active account selected:

[2021-12-20 15:21:12,059] {credentials_provider.py:295} INFO - Getting connection using `google.auth.default()` since no key file is defined for hook.
[2021-12-20 15:21:12,073] {logging_mixin.py:109} WARNING - /usr/local/lib/python3.8/site-packages/google/auth/_default.py:70 UserWarning: Your application has authenticated using end user credentials from Google Cloud SDK without a quota project. You might receive a "quota exceeded" or "API not enabled" error. We recommend you rerun `gcloud auth application-default login` and make sure a quota project is added. Or you can use service accounts instead. For more information about service accounts, see https://cloud.google.com/docs/authentication/
[2021-12-20 15:21:13,863] {process_utils.py:135} INFO - Executing cmd: gcloud container clusters get-credentials REDACTED --zone europe-west1-b --project REDACTED
[2021-12-20 15:21:13,875] {process_utils.py:139} INFO - Output:
[2021-12-20 15:21:14,522] {process_utils.py:143} INFO - ERROR: (gcloud.container.clusters.get-credentials) You do not currently have an active account selected.
[2021-12-20 15:21:14,522] {process_utils.py:143} INFO - Please run:
[2021-12-20 15:21:14,523] {process_utils.py:143} INFO - 
[2021-12-20 15:21:14,523] {process_utils.py:143} INFO -   $ gcloud auth login
[2021-12-20 15:21:14,523] {process_utils.py:143} INFO - 
[2021-12-20 15:21:14,523] {process_utils.py:143} INFO - to obtain new credentials.
[2021-12-20 15:21:14,523] {process_utils.py:143} INFO - 
[2021-12-20 15:21:14,523] {process_utils.py:143} INFO - If you have already logged in with a different account:
[2021-12-20 15:21:14,523] {process_utils.py:143} INFO - 
[2021-12-20 15:21:14,523] {process_utils.py:143} INFO -     $ gcloud config set account ACCOUNT
[2021-12-20 15:21:14,523] {process_utils.py:143} INFO - 
[2021-12-20 15:21:14,523] {process_utils.py:143} INFO - to select an already authenticated account to use.
[2021-12-20 15:21:14,618] {taskinstance.py:1463} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1165, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1283, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1313, in _execute_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/google/cloud/operators/kubernetes_engine.py", line 355, in execute
    execute_in_subprocess(cmd)
  File "/usr/local/lib/python3.8/site-packages/airflow/utils/process_utils.py", line 147, in execute_in_subprocess
    raise subprocess.CalledProcessError(exit_code, cmd)
subprocess.CalledProcessError: Command '['gcloud', 'container', 'clusters', 'get-credentials', 'REDACTED', '--zone', 'europe-west1-b', '--project', 'REDACTED']' returned non-zero exit status 1.

If we set the environment variable GOOGLE_APPLICATION_CREDENTIALS, gcloud auth activate-service-account is run, which only works with proper service-account credentials, not user credentials.
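For context, the ADC file written by `gcloud auth application-default login` (by default `~/.config/gcloud/application_default_credentials.json` on Linux) holds end-user credentials, distinguishable from a service-account key by the `"type"` field. A self-contained sketch using a fake sample file (all values are placeholders):

```shell
# Sketch: telling user ADC apart from a service-account key by the "type"
# field. The JSON below mimics the structure gcloud writes for user ADC;
# all values are fake. A service-account key would instead have
# "type": "service_account".
cat > /tmp/adc_sample.json <<'EOF'
{
  "client_id": "FAKE.apps.googleusercontent.com",
  "client_secret": "FAKE",
  "refresh_token": "FAKE",
  "type": "authorized_user"
}
EOF

cred_type=$(python3 -c 'import json; print(json.load(open("/tmp/adc_sample.json"))["type"])')
echo "$cred_type"
```

This is why `gcloud auth activate-service-account` rejects such a file: it expects `"type": "service_account"`.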

What you expected to happen

From my POV, it should work to

  1. have the Project ID set in the gcloud config and/or the Airflow connection and still be able to use user credentials with GCP Operators,
  2. set GOOGLE_APPLICATION_CREDENTIALS to a file containing user credentials and be able to use these credentials with GCP Operators.

Item 1 was definitely possible in Airflow 1.

How to reproduce

See Deployment Details. In essence:

  • Run Airflow within Docker Compose (but it's not only Docker Compose that is affected, as far as I can see).
  • Use user credentials with gcloud: gcloud auth login, gcloud auth application-default login
  • Configure project ID in gcloud config (mounted in the Docker container) and/or Airflow connection
  • Run GKEStartPodOperator

Anything else

Currently, the only workaround (apart from using service accounts) seems to be not to set a default project in either the gcloud config or the google_cloud_platform connection.
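For completeness, that workaround amounts to roughly the following (a sketch only; it just removes configuration, and the connection commands run inside the Airflow container):

```shell
# Drop the default project from the gcloud config.
gcloud config unset project

# Inside the Airflow container: recreate the connection without a
# project in the extras.
airflow connections delete google_cloud_default
airflow connections add google_cloud_default --conn-type=google_cloud_platform
```

The project then has to be passed explicitly to the operators (e.g. via the project_id argument of GKEStartPodOperator).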

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
