Fix async engine missing pool_recycle and pool_pre_ping configuration #65276

Open
kaxil wants to merge 1 commit into apache:main from astronomer:fix/async-engine-pool-config

@kaxil commented Apr 15, 2026

The async SQLAlchemy engine was created with no pool configuration at all, whereas the sync engine takes pool_size, pool_recycle, pool_pre_ping, and max_overflow from the [database] config section.

How this is hit

The async engine serves all Execution API requests that go through async session dependencies. The API server (gunicorn) creates the async engine at startup via _configure_async_session(). Without pool health settings:

  • pool_recycle=-1 (SQLAlchemy default) -- connections are never recycled, so connections that exceed PostgreSQL's idle_in_transaction_session_timeout or pgbouncer's server_idle_timeout sit dead in the pool
  • pool_pre_ping=False (SQLAlchemy default) -- dead connections are never detected before checkout, so the first query on a stale connection fails with a closed-connection error
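
To make the two failure modes concrete, here is a toy model of a pool, not SQLAlchemy's implementation. `FakeConnection`, `ToyPool`, and `query_survives_server_drop` are hypothetical names invented for this illustration; the sketch only mimics the observable effect of the two settings.

```python
import time
from collections import deque


class FakeConnection:
    """Stand-in for a DB connection; `alive` turns False once the server drops it."""

    def __init__(self, now):
        self.created_at = now
        self.alive = True

    def execute(self, sql):
        if not self.alive:
            raise ConnectionError("server closed the connection unexpectedly")
        return "ok"


class ToyPool:
    """Toy pool that mimics the effect of pool_recycle and pool_pre_ping."""

    def __init__(self, recycle=-1, pre_ping=False, clock=time.monotonic):
        self.recycle = recycle
        self.pre_ping = pre_ping
        self.clock = clock
        self._idle = deque()

    def checkout(self):
        now = self.clock()
        if not self._idle:
            return FakeConnection(now)
        conn = self._idle.popleft()
        # recycle=-1 means "never recycle", matching the default described above.
        too_old = self.recycle > -1 and now - conn.created_at > self.recycle
        # Pre-ping tests the connection before handing it out.
        failed_ping = self.pre_ping and not conn.alive
        if too_old or failed_ping:
            return FakeConnection(now)  # replace the stale connection
        return conn

    def checkin(self, conn):
        self._idle.append(conn)


def query_survives_server_drop(pool):
    """Check a connection in, let the server kill it, then query again."""
    conn = pool.checkout()
    pool.checkin(conn)
    conn.alive = False  # simulate a server-side idle timeout
    try:
        pool.checkout().execute("SELECT 1")
        return True
    except ConnectionError:
        return False
```

With the defaults (`ToyPool()`), the second checkout hands back the dead connection and the query fails. With `pre_ping=True`, or with a `recycle` value shorter than the server's timeout, the pool replaces the connection first.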

The sync engine has had these settings since day one. The async engine missed them when _configure_async_session() was extracted from configure_orm() in PR #51920.

Fix

Read the same [database] config section values (pool_size, pool_recycle, pool_pre_ping, max_overflow) for the async engine. Also respect SQL_ALCHEMY_POOL_ENABLED=False by using NullPool, matching the sync engine behavior for pgbouncer setups that require it.
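
A minimal sketch of that mapping, not the code in this PR. The option key names and default values below are assumptions about the [database] section, `build_pool_args` is a hypothetical helper, and `NullPoolStandIn` is a placeholder for `sqlalchemy.pool.NullPool` so the snippet runs without SQLAlchemy installed.

```python
from typing import Any, Dict, Mapping


class NullPoolStandIn:
    """Placeholder for sqlalchemy.pool.NullPool (assumption: swap in the real class)."""


def _as_bool(value: Any) -> bool:
    """Interpret config values such as True, "True", or "false"."""
    return str(value).strip().lower() in {"1", "true", "t", "yes", "y"}


def build_pool_args(db_conf: Mapping[str, Any]) -> Dict[str, Any]:
    """Translate [database]-style options into engine keyword arguments.

    Key names and defaults are assumed for illustration.
    """
    if not _as_bool(db_conf.get("sql_alchemy_pool_enabled", True)):
        # Pooling disabled: no sizing or health options, because NullPool
        # opens a fresh connection per checkout and takes no pool_size.
        return {"poolclass": NullPoolStandIn}
    return {
        "pool_size": int(db_conf.get("sql_alchemy_pool_size", 5)),
        "max_overflow": int(db_conf.get("sql_alchemy_max_overflow", 10)),
        "pool_recycle": int(db_conf.get("sql_alchemy_pool_recycle", 1800)),
        "pool_pre_ping": _as_bool(db_conf.get("sql_alchemy_pool_pre_ping", True)),
    }
```

The disabled branch returns early on purpose, so that sizing arguments are never combined with a pool class that does not accept them.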

The create_async_metadata_engine() signature gains an optional engine_args parameter (defaulting to None for backward compatibility with existing airflow_local_settings.py overrides). Upstream PR should update the cluster-policies.rst docs to show the new signature.
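
Roughly, the optional parameter behaves as sketched below. Only the `engine_args` name and its `None` default come from this description; the other parameters (`url`, `connect_args`) are assumptions, and `_fake_create_async_engine` is a hypothetical stand-in for the real engine factory so the snippet runs on its own.

```python
from typing import Any, Dict, Optional


def _fake_create_async_engine(url: str, **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for the engine factory; it only records its inputs."""
    return {"url": url, "kwargs": kwargs}


def create_async_metadata_engine(
    url: str,
    connect_args: Dict[str, Any],
    engine_args: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Sketch of the widened signature; `url` and `connect_args` are assumed."""
    # None means "no extra pool settings", so calls written against the
    # old two-argument form behave exactly as before.
    extra = dict(engine_args or {})
    return _fake_create_async_engine(url, connect_args=connect_args, **extra)
```

A call that omits `engine_args` forwards only `connect_args`, while a call that passes `{"pool_pre_ping": True, "pool_recycle": 1800}` forwards those keys as well.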

@kaxil requested review from dstandish and uranusjr April 15, 2026 01:52
@kaxil added this to the Airflow 3.2.2 milestone Apr 15, 2026

@pierrejeambrun left a comment

Thanks for the PR!

Just one comment

Comment thread airflow-core/src/airflow/settings.py
@kaxil force-pushed the fix/async-engine-pool-config branch from 822ad26 to 8df1808 Compare April 15, 2026 10:42
Comment thread airflow-core/src/airflow/settings.py Outdated
@kaxil force-pushed the fix/async-engine-pool-config branch from 8df1808 to 0e78bfa Compare April 15, 2026 13:18
The async SQLAlchemy engine was created without any pool health
settings while the sync engine got pool_size, pool_recycle,
pool_pre_ping, and max_overflow from [database] config. This meant
dead connections from PostgreSQL idle timeouts or pgbouncer disconnects
were never detected by the async pool.

Read the same [database] config values for the async engine. Also
respect SQL_ALCHEMY_POOL_ENABLED=False by using NullPool, matching
the sync engine behavior.
@kaxil force-pushed the fix/async-engine-pool-config branch from 0e78bfa to 1c730ac Compare April 15, 2026 16:14

@pierrejeambrun left a comment

Maybe we can have a 'common' function that builds the 'common' part only of the engine args.

And then let the async/sync specific parts like you have now.

But that's not super important, if you want to go with full duplication that's not a big deal.
