Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Environment variable AIRFLOW__DATABASE__SQL_ALCHEMY_CONNECT_ARGS not parsed correctly #36207

Closed
2 tasks done
giantZorg opened this issue Dec 13, 2023 · 5 comments · Fixed by #36526
Closed
2 tasks done

Comments

@giantZorg
Copy link

giantZorg commented Dec 13, 2023

Apache Airflow version

2.7.3

What happened

Upon setting the environment variable AIRFLOW__DATABASE__SQL_ALCHEMY_CONNECT_ARGS to pass arguments as described in https://airflow.apache.org/docs/apache-airflow/stable/howto/set-up-database.html#setting-up-a-postgresql-database, the following error occurs:

File "/home/airflow/.local/bin/airflow", line 5, in <module> from airflow.__main__ import main File "/home/airflow/.local/lib/python3.10/site-packages/airflow/__init__.py", line 68, in <module> settings.initialize() File "/home/airflow/.local/lib/python3.10/site-packages/airflow/settings.py", line 546, in initialize configure_orm() File "/home/airflow/.local/lib/python3.10/site-packages/airflow/settings.py", line 238, in configure_orm connect_args = conf.getimport("database", "sql_alchemy_connect_args") File "/home/airflow/.local/lib/python3.10/site-packages/airflow/configuration.py", line 1199, in getimport raise AirflowConfigException( airflow.exceptions.AirflowConfigException: The object could not be loaded. Please check "sql_alchemy_connect_args" key in "database" section. Current value: "{"timeout": 120}".

I tracked the issue to the following code snippet:

https://github.com/apache/airflow/blob/8fbacb8a5fb4168a335ad080a6b806fee3d85737/airflow/settings.py#L237C1-L243C1

Relevant line 238:

connect_args = conf.getimport("database", "sql_alchemy_connect_args")

After changing line 238 to

connect_args = conf.getjson("database", "sql_alchemy_connect_args")

the error was resolved. It also seems reasonable, as these arguments should not import modules but parameters to be passed on in line 242.

If this is indeed a bug, I'm happy to create a pull request. I checked the current development version, the code is still the same as in version 2.7.3 that I use.

What you think should happen instead

The connection arguments should be parsed correctly, the DB connection should be created.

How to reproduce

Set the environment variable to {"timeout": 30} as described in https://airflow.apache.org/docs/apache-airflow/stable/configurations-ref.html#sql-alchemy-connect-args

Operating System

Debian GNU/Linux 11 (bullseye)

Versions of Apache Airflow Providers

apache-airflow-providers-amazon==8.10.0
apache-airflow-providers-celery==3.4.1
apache-airflow-providers-cncf-kubernetes==7.8.0
apache-airflow-providers-common-sql==1.8.0
apache-airflow-providers-daskexecutor==1.1.0
apache-airflow-providers-docker==3.8.0
apache-airflow-providers-elasticsearch==5.1.0
apache-airflow-providers-ftp==3.6.0
apache-airflow-providers-google==10.11.0
apache-airflow-providers-grpc==3.3.0
apache-airflow-providers-hashicorp==3.5.0
apache-airflow-providers-http==4.6.0
apache-airflow-providers-imap==3.4.0
apache-airflow-providers-microsoft-azure==8.1.0
apache-airflow-providers-mysql==5.4.0
apache-airflow-providers-odbc==4.1.0
apache-airflow-providers-openlineage==1.2.0
apache-airflow-providers-postgres==5.7.1
apache-airflow-providers-redis==3.4.0
apache-airflow-providers-sendgrid==3.3.0
apache-airflow-providers-sftp==4.7.0
apache-airflow-providers-slack==8.3.0
apache-airflow-providers-snowflake==5.1.0
apache-airflow-providers-sqlite==3.5.0
apache-airflow-providers-ssh==3.8.1

Deployment

Other 3rd-party Helm chart

Deployment details

Airflow Helm Chart (User Community) https://github.com/airflow-helm/charts/tree/main/charts/airflow using the Airflow image https://hub.docker.com/layers/apache/airflow/2.7.3-python3.10/images/sha256-af4c1ab9472d754a8247e9a44d922db67422584e4a88b5339012fe145171572a?context=explore

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

@giantZorg giantZorg added area:core kind:bug This is a clearly a bug needs-triage label for new issues that we didn't triage yet labels Dec 13, 2023
Copy link

boring-cyborg bot commented Dec 13, 2023

Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.

@giantZorg
Copy link
Author

giantZorg commented Dec 13, 2023

After reading the docs again, it does state Import path for connect args in SqlAlchemy., so the conf.getimport is consistent with the docs. But the example value of '{"timeout": 30}' is rather misleading, at least to me.

Import path for connect args in SqlAlchemy. Defaults to an empty dict. -> How does a path default to an empty dict?

@potiuk potiuk added kind:documentation good first issue and removed needs-triage label for new issues that we didn't triage yet labels Dec 13, 2023
@potiuk
Copy link
Member

potiuk commented Dec 13, 2023

Feel free to make a PR with documentation fix - then you will join 2700 contributors to airlfow (many of them noticed a problem in doc and fix it). It's super easy - just click "suggest change on this page" button and PR will be opened for you. It's actually easier than opening issue.

Marked it also as good first issue so in case you will not decide to give back to the community by improving docs, someone might fix it if they see the label.

@giantZorg
Copy link
Author

@potiuk I'll gladly make a pull request, can you assign the issue to me?

Just to clarify, the intended behaviour is to import the settings from the module specified by the env var as the code does, not to specify them directly in the env var?

@potiuk
Copy link
Member

potiuk commented Dec 14, 2023

The example is wrong in config section apparentlyy, yes. so it should be fixed in airflow/config_remplates/confg.yml to make the description correctly explaining that it should be module.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants