AIP-103: Adding ability for per task state key retention from operators#66699

Open
amoghrajesh wants to merge 13 commits into
apache:mainfrom
astronomer:aip-103-4c-per-key-expiry-task-sdk

@amoghrajesh
Contributor

Built on top of #66463, so only the last commit is relevant.

Why?

The global [state_store] default_retention_days config applies one retention window to every task state key. Some keys have meaningfully different lifetimes: for example, a submitted job ID is useful for the life of a run, while a short-lived lock key might only need hours. This PR adds a way to express per-key retention without changing the global default for everything.

Current behaviour

All task state keys written via PUT /state/ti/{id}/{key} receive expires_at = now + default_retention_days, regardless of the individual key's intended lifetime, as implemented in #66463.

Proposed change

Adds an optional retention_days field to TaskStatePutBody. The operator-supplied value always takes precedence over the global config:

  • Positive -- expire this key in N days, overriding the global default.
  • 0 -- never expire this key (expires_at = NULL), regardless of the global default.
  • None / Omitted -- fall back to [state_store] default_retention_days as before.

Changes span the full stack: TaskStatePutBody, BaseStateBackend.set() / aset(), MetastoreStateBackend._set_task_state() / _aset_task_state(), and the execution API route.
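The three cases above can be sketched as a small resolution function. This is only an illustration, not the code in this PR: resolve_expires_at and its parameters are hypothetical names, and rejecting negative values is an assumption, since the description does not say how they are handled.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone


def resolve_expires_at(
    retention_days: int | None,
    default_retention_days: int,
    now: datetime | None = None,
) -> datetime | None:
    """Hypothetical helper: map a retention_days value to an expires_at timestamp."""
    now = now or datetime.now(timezone.utc)
    if retention_days is None:
        # None / omitted: fall back to the global default.
        return now + timedelta(days=default_retention_days)
    if retention_days == 0:
        # 0: never expire, stored as expires_at = NULL.
        return None
    if retention_days < 0:
        # Assumption: negative values are rejected; the PR text does not cover them.
        raise ValueError("retention_days must be >= 0")
    # Positive: expire in N days, overriding the global default.
    return now + timedelta(days=retention_days)
```

Passing now explicitly keeps the function deterministic, which makes each of the three branches easy to check in a unit test.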

User implications / backcompat

None. BaseStateBackend.set() and aset() gain a new retention_days: int | None = None keyword argument.

Usage: backend.set(..., retention_days=7) overrides the retention period for that key.
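To illustrate why a keyword argument defaulting to None leaves existing callers unaffected, here is a rough sketch with an in-memory stand-in. InMemoryStateBackend, its (key, value) positional arguments, and the DEFAULT_RETENTION_DAYS constant are all assumptions for illustration; it does not subclass or reproduce the real BaseStateBackend signature.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone
from typing import Any

# Stand-in for [state_store] default_retention_days (assumed to be 30 here).
DEFAULT_RETENTION_DAYS = 30


class InMemoryStateBackend:
    """Hypothetical stand-in; not the real BaseStateBackend."""

    def __init__(self) -> None:
        # Maps key -> (value, expires_at); expires_at of None means "never expires".
        self._store: dict[str, tuple[Any, datetime | None]] = {}

    def set(self, key: str, value: Any, *, retention_days: int | None = None) -> None:
        now = datetime.now(timezone.utc)
        if retention_days == 0:
            expires_at = None
        else:
            days = DEFAULT_RETENTION_DAYS if retention_days is None else retention_days
            expires_at = now + timedelta(days=days)
        self._store[key] = (value, expires_at)

    def expires_at(self, key: str) -> datetime | None:
        return self._store[key][1]


backend = InMemoryStateBackend()
backend.set("job_id", "abc123")                # existing call site, unchanged
backend.set("lock", "held", retention_days=7)  # new per-key override
```

Because the new argument is keyword-only in this sketch and defaults to None, a call written before the change behaves exactly as it did: the key receives the global default.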

Testing

Tested with this DAG, which sets a few task state keys:

from datetime import datetime

from airflow.sdk import task, DAG

with DAG(dag_id="my_dag_for_task_state_retention_days", schedule=None, start_date=datetime(2022, 3, 4)) as dag:

    @task
    def t1(context):
        task_state = context["task_state"]

        task_state.set("short_lived_key", "short_lived_value", retention_days=1)
        task_state.set("long_lived_key", "long_lived_value", retention_days=25)
        task_state.set("no_lifetime_key", "forever_alive", retention_days=0)
        task_state.set("default_lifetime_key", "default_lifetime_value")


    t1()

Output:

  • short_lived_key expires tomorrow (May 12)
  • long_lived_key expires on June 5 (25 days later)
  • no_lifetime_key has no expiry, i.e. it never expires
  • default_lifetime_key expires in 30 days (2026-06-10 12:33:41), the default
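The expiry dates listed above can be cross-checked with plain date arithmetic. This sketch infers the write time from the default key's timestamp (2026-06-10 12:33:41 minus the 30-day default); the variable names are illustrative, and no Airflow code is involved.

```python
from datetime import datetime, timedelta

# Write time inferred from the default key: its expiry minus the 30-day default.
written_at = datetime(2026, 6, 10, 12, 33, 41) - timedelta(days=30)

expected = {
    "short_lived_key": written_at + timedelta(days=1),        # retention_days=1
    "long_lived_key": written_at + timedelta(days=25),        # retention_days=25
    "no_lifetime_key": None,                                  # retention_days=0
    "default_lifetime_key": written_at + timedelta(days=30),  # omitted
}

for key, expires_at in expected.items():
    print(f"{key}: {expires_at}")
```

The inferred write time falls on May 11, which matches the "tomorrow (May 12)" and "June 5" dates reported for the first two keys.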

Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

@boring-cyborg (bot) added labels on May 11, 2026: area:API, area:CLI, area:ConfigTemplates, area:db-migrations, area:Scheduler, area:task-sdk