Apache Airflow version
2.10.4
If "Other Airflow 2 version" selected, which one?
Observed also on 2.10.2
What happened?
One of the tests in our CI creates a DagBag (a minimal sketch of such a test follows). After adding a DAG that is scheduled on a DatasetAlias, the bag can no longer be created. Confusingly, this issue does not occur in other settings/environments (e.g. dag.test, local pytest, a non-CI Airflow run in Docker).
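For context, the failing CI test is essentially a standard DAG-integrity check. A minimal sketch of what such a test looks like (the test name and dag_folder path are illustrative, not our actual code):

from airflow.models import DagBag

def test_dagbag_has_no_import_errors():
    # A single parse pass over the dags folder; parsing the
    # DatasetAlias-scheduled DAG appears to query the dataset_alias table.
    bag = DagBag(dag_folder="dags", include_examples=False)
    assert not bag.import_errors, bag.import_errors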
Relevant stack trace
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: dataset_alias
[SQL: SELECT dataset_alias.id, dataset_alias.name
FROM dataset_alias
WHERE dataset_alias.name = ?
LIMIT ? OFFSET ?]
[parameters: ('bar', 1, 0)]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
[2024-12-08T16:08:22.227+0000] {variable.py:357} ERROR - Unable to retrieve variable from secrets backend (MetastoreBackend). Checking subsequent secrets backend.
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1910, in _execute_context
self.dialect.do_execute(
File "/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
cursor.execute(statement, parameters)
sqlite3.OperationalError: no such table: variable
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/models/variable.py", line 353, in get_variable_from_secrets
var_val = secrets_backend.get_variable(key=key)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/session.py", line 97, in wrapper
return func(*args, session=session, **kwargs)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/secrets/metastore.py", line 66, in get_variable
return MetastoreBackend._fetch_variable(key=key, session=session)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/api_internal/internal_api_call.py", line 139, in wrapper
return func(*args, **kwargs)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/utils/session.py", line 94, in wrapper
return func(*args, **kwargs)
File "/home/airflow/.local/lib/python3.9/site-packages/airflow/secrets/metastore.py", line 84, in _fetch_variable
var_value = session.scalar(select(Variable).where(Variable.key == key).limit(1))
File "/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1747, in scalar
return self.execute(
File "/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/orm/session.py", line 1717, in execute
result = conn._execute_20(statement, params or {}, execution_options)
File "/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1710, in _execute_20
return meth(self, args_10style, kwargs_10style, execution_options)
File "/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 334, in _execute_on_connection
return connection._execute_clauseelement(
File "/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1577, in _execute_clauseelement
ret = self._execute_context(
File "/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1953, in _execute_context
self._handle_dbapi_exception(
File "/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 2134, in _handle_dbapi_exception
util.raise_(
File "/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 211, in raise_
raise exception
File "/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1910, in _execute_context
self.dialect.do_execute(
File "/home/airflow/.local/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 736, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: variable
[SQL: SELECT variable.val, variable.id, variable."key", variable.description, variable.is_encrypted
FROM variable
WHERE variable."key" = ?
LIMIT ? OFFSET ?]
[parameters: ('latest_processed_instrument_file_path', 1, 0)]
(Background on this error at: https://sqlalche.me/e/14/e3q8)
What you think should happen instead?
DagBag creation should not fail on a one-shot parse pass of the dags folder, regardless of the order in which DAGs are parsed.
How to reproduce
- Obtain a recent Docker image, such as apache/airflow:2.10.4-python3.11.
- Spin up a container and open a shell inside it.
- Add the following DAG to the dags folder:
from pendulum import datetime

from airflow import DAG
from airflow.datasets import DatasetAlias

with DAG(
    dag_id="foo",
    start_date=datetime(2000, 1, 1),
    schedule=[
        DatasetAlias("bar"),
    ],
    catchup=False,
):
    pass
- Open a Python shell and run:
from airflow.models import DagBag
DagBag(include_examples=False)
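On a freshly migrated metadata database the same call succeeds, which is a quick way to confirm the missing-table diagnosis. A sketch of that sanity check (the upgradedb call is my addition; it is the programmatic equivalent of running airflow db migrate first):

from airflow.models import DagBag
from airflow.utils.db import upgradedb

upgradedb()  # creates dataset_alias, variable, and the other metadata tables
bag = DagBag(include_examples=False)
print(bag.import_errors)  # expected to be empty once the tables exist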
Operating System
Rocky Linux 9.3
Versions of Apache Airflow Providers
(Irrelevant)
Deployment
Other Docker-based deployment
Deployment details
Dockerfile-ci:
FROM apache/airflow:2.10.2-python3.9
# Install system packages
USER root
RUN apt-get update \
&& apt-get install -y --no-install-recommends build-essential vim strace iproute2 git \
pkg-config libxml2-dev libxmlsec1-dev libxmlsec1-openssl \
&& apt-get autoremove -yqq --purge \
&& apt-get clean \
&& rm -rf /var/lib/apt/lists/*
USER airflow
ci script:
script:
- git config --global --add safe.directory $PWD
- pip install uv pre-commit-uv --upgrade
- uv pip install -e .[dev] --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.10.2/constraints-3.9.txt"
- uv tool install pre-commit --force --with pre-commit-uv --force-reinstall
- export PIP_USER=false && pre-commit install --install-hooks
- pre-commit run --all-files --show-diff-on-failure
Anything else?
Presumably, this issue isn't present in a "running" Airflow instance where at least one DAG outputs a DatasetAlias: that causes the necessary tables to be created, and the second parse of the alias-scheduled DAG then succeeds. For illustration, such a producer DAG is sketched below.
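In this sketch the dag_id and dataset URI are made up, the alias name matches the repro, and the outlet_events pattern follows the 2.10 dataset-alias docs:

from pendulum import datetime

from airflow import DAG
from airflow.datasets import Dataset, DatasetAlias
from airflow.decorators import task

with DAG(
    dag_id="producer",
    start_date=datetime(2000, 1, 1),
    schedule=None,
    catchup=False,
):

    @task(outlets=[DatasetAlias("bar")])
    def produce(*, outlet_events):
        # Attach a concrete dataset event to the alias at runtime.
        outlet_events[DatasetAlias("bar")].add(Dataset("s3://bucket/foo"))

    produce()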
I believe this happens because the Docker image doesn't ship with an initialized SQLite database, an issue that only surfaces once the DagBag is built.
As a workaround, I tried adding an airflow db migrate step to the image build, and while this did fix the manual test explained above, the CI kept failing. Adding airflow db migrate to the CI script itself (sketched below) similarly did not help.
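For reference, a sketch of what that CI-side migrate step could look like as a pytest fixture (my formulation, not the exact change we made; upgradedb is what airflow db migrate calls under the hood, and as noted above this step did not fix the CI failure):

import pytest

from airflow.utils.db import upgradedb

@pytest.fixture(scope="session", autouse=True)
def airflow_metadata_db():
    # Same effect as running `airflow db migrate` before the tests.
    upgradedb()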
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct