Unable to configure Google Secrets Manager in 2.3.4 #25968

Closed
aspain opened this issue Aug 25, 2022 · 6 comments · Fixed by #25970
Labels
area:core, kind:bug (This is clearly a bug)

Comments

@aspain (Contributor) commented Aug 25, 2022

Apache Airflow version

2.3.4

What happened

I am attempting to configure a Google Secrets Manager secrets backend using the gcp_keyfile_dict param, set via the following environment variables in a .env file:

AIRFLOW__SECRETS__BACKEND=airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend
AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "gcp_keyfile_dict": <json-keyfile>}'
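
For illustration only, the <json-keyfile> placeholder stands for the service account's JSON key inlined as a nested object (dummy values shown here; the real keyfile contains actual credentials):

AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "gcp_keyfile_dict": {"type": "service_account", "project_id": "my-project", "client_email": "airflow@my-project.iam.gserviceaccount.com", "private_key": "..."}}'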

In previous versions, including 2.3.3, this worked without issue.

After upgrading to Astro Runtime 5.0.8, I get the following error, taken from the scheduler container logs. The scheduler, webserver, and triggerer restart continually:

Traceback (most recent call last):
  File "/usr/local/bin/airflow", line 5, in <module>
    from airflow.__main__ import main
  File "/usr/local/lib/python3.9/site-packages/airflow/__init__.py", line 35, in <module>
    from airflow import settings
  File "/usr/local/lib/python3.9/site-packages/airflow/settings.py", line 35, in <module>
    from airflow.configuration import AIRFLOW_HOME, WEBSERVER_CONFIG, conf  # NOQA F401
  File "/usr/local/lib/python3.9/site-packages/airflow/configuration.py", line 1618, in <module>
    secrets_backend_list = initialize_secrets_backends()
  File "/usr/local/lib/python3.9/site-packages/airflow/configuration.py", line 1540, in initialize_secrets_backends
    custom_secret_backend = get_custom_secret_backend()
  File "/usr/local/lib/python3.9/site-packages/airflow/configuration.py", line 1523, in get_custom_secret_backend
    return _custom_secrets_backend(secrets_backend_cls, **alternative_secrets_config_dict)
TypeError: unhashable type: 'dict'
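
The failure mode can be reproduced in isolation: functools.lru_cache builds a hash key from every call argument, so any dict-valued kwarg raises this error. A minimal sketch (names are hypothetical, not Airflow's actual code):

from functools import lru_cache

class DummyBackend:
    def __init__(self, **kwargs):
        self.kwargs = kwargs

@lru_cache(maxsize=2)
def get_backend(backend_cls, **params):
    # lru_cache hashes backend_cls and every item in params to build its cache key
    return backend_cls(**params)

get_backend(DummyBackend, connections_prefix="airflow-connections")  # fine: strings hash
get_backend(DummyBackend, gcp_keyfile_dict={"type": "service_account"})
# TypeError: unhashable type: 'dict'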

What you think should happen instead

Containers should remain healthy, and the secrets backend should be added successfully.

How to reproduce

1. astro dev init a fresh project

2. Dockerfile:

   FROM quay.io/astronomer/astro-runtime:5.0.8

3. .env file:

   AIRFLOW__SECRETS__BACKEND=airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend
   AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "gcp_keyfile_dict": <service-acct-json-keyfile>}'

4. astro dev start

Operating System

macOS 11.6.8

Versions of Apache Airflow Providers

apache-airflow-providers-google 8.1.0

Deployment

Astronomer

Deployment details

No response

Anything else

No response

Are you willing to submit a PR?

  • Yes, I am willing to submit a PR!

Code of Conduct

aspain added the area:core and kind:bug labels on Aug 25, 2022
@potiuk (Member) commented Aug 25, 2022

@pdebelak - I think this is caused by the LRU cache introduced in #25556 - is it possible for you to take a look and see whether it can be fixed or worked around?

@potiuk (Member) commented Aug 25, 2022

I believe the problem is that a dict is indeed not hashable, and you can pass a dict as a parameter of the secrets backend configuration.

For now, I don't see an easy workaround other than using gcp_key_path and putting the key at the same path on your workers - would that be a feasible workaround for now, @aspain?
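
For reference, that workaround would look roughly like this (the path is illustrative; the keyfile must exist at the same location on every Airflow component):

AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_prefix": "airflow-connections", "variables_prefix": "airflow-variables", "gcp_key_path": "/usr/local/airflow/include/service-account.json"}'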

@aspain (Contributor, Author) commented Aug 25, 2022

With an Astronomer project I don't have access to the workers (other than locally) and would have to include the keyfile in the repository the project deploys from; ideally, the keyfile would not need to be in the repository.

In my local environment I am using a .env file without pushing it to the repo, and in the Astro UI I am able to add environment variables.

@pdebelak (Contributor) commented:

Yes, this is related to the new lru_cache in 2.3.4; I didn't realize it would break in this way. There isn't an easy workaround. We might need to revert that change and add a test to make sure we don't break it in the same way again.

@pdebelak (Contributor) commented:

I see a fix for this that I will PR, but I don't see a workaround for version 2.3.4 if you have an AIRFLOW__SECRETS__BACKEND_KWARGS containing a nested dictionary.
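
One way to keep the caching while avoiding unhashable arguments is to key the cache on the raw configuration strings and parse the JSON only inside the cached function. A sketch of the idea (an assumption about the approach, not necessarily the exact change in #25970):

import functools
import importlib
import json

@functools.lru_cache(maxsize=2)
def _custom_secrets_backend(backend_path: str, backend_kwargs_json: str):
    # Strings are always hashable, so lru_cache can build its cache key;
    # the nested dict only comes into existence after the cache lookup.
    module_name, _, cls_name = backend_path.rpartition(".")
    backend_cls = getattr(importlib.import_module(module_name), cls_name)
    return backend_cls(**json.loads(backend_kwargs_json))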

@potiuk (Member) commented Aug 25, 2022

Yeah, there is no easy workaround I could see for that one. I will raise it to the release mgmt team (we have one more bug that might make us do 2.3.5 before we release 2.4.0). In the meantime, @pdebelak - looking forward to a fix :D

pdebelak added a commit to pdebelak/airflow that referenced this issue Aug 26, 2022
anja-istenic pushed a commit to anja-istenic/airflow that referenced this issue Aug 29, 2022
Fix unhashable issue with secrets.backend_kwargs and caching (apache#25970)

Resolves apache#25968