Add asset and task state UI #67292

Draft
bbovenzi wants to merge 1 commit into apache:main from astronomer:feat-task-state-ui

Contributor

@bbovenzi bbovenzi commented May 21, 2026

Add CRUD actions for asset and task state.

Asset State:

  • Create a new tab navigation on an asset page to switch between events and asset state

Task State:

  • Move XComs and task state into a "Storage" tab with XCom and task state sub-tabs. The XComs URL is preserved.
[Screenshots: 2026-05-20 at 6:04:12 PM and 2026-05-20 at 6:04:35 PM]
Was generative AI tooling used to co-author this PR?
  • Yes, Claude Sonnet 4.6

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

Contributor

@amoghrajesh amoghrajesh left a comment

Two things related to editing a task state:

  1. When I try to edit a task state, this is what the popup looks like:
[image]

Is it possible to pretty-print here too?

  2. Delete persists the correct value to the DB, but the edit dialog later shows the old value:
[image]
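
As a rough illustration of the pretty-printing request, the edit dialog could reformat the stored string before showing it. This is only a sketch: `prettyPrintStateValue` is a hypothetical helper, not something from the PR, and it assumes stored values are plain strings that may or may not contain JSON.

```typescript
// Hypothetical helper (not part of the PR): prepare a stored state value
// for display in an edit dialog.
function prettyPrintStateValue(raw: string): string {
  try {
    const parsed: unknown = JSON.parse(raw);

    // Only reformat objects and arrays; scalars such as "42" stay as typed.
    if (parsed !== null && typeof parsed === "object") {
      return JSON.stringify(parsed, null, 2);
    }
  } catch {
    // Not JSON: fall through and show the raw string unchanged.
  }

  return raw;
}
```

Non-JSON values such as `complete` pass through untouched, so the same helper could serve every row.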

Comment on lines +46 to +89
const getColumns = ({ assetId, translate }: ColumnsProps): Array<ColumnDef<AssetStateResponse>> => [
  {
    accessorKey: "key",
    cell: ({ row: { original } }) => <Text>{original.key}</Text>,
    header: translate("assetState.columns.key"),
  },
  {
    accessorKey: "value",
    cell: ({ row: { original } }) => {
      let parsed: unknown;

      try {
        parsed = JSON.parse(original.value);
      } catch {
        // not JSON: render as plain text
      }
      const isJsonObject = parsed !== null && parsed !== undefined && typeof parsed === "object";

      return isJsonObject ? (
        <RenderedJsonField collapsed content={parsed as object} enableClipboard={false} />
      ) : (
        <TruncatedText text={original.value} />
      );
    },
    enableSorting: false,
    header: translate("assetState.columns.value"),
  },
  {
    accessorKey: "updated_at",
    cell: ({ row: { original } }) => <Time datetime={original.updated_at} />,
    header: translate("assetState.columns.updatedAt"),
  },
  {
    accessorKey: "actions",
    cell: ({ row: { original } }) => (
      <Flex justifyContent="end">
        <EditAssetStateButton assetId={assetId} stateKey={original.key} />
        <DeleteAssetStateButton assetId={assetId} stateKey={original.key} />
      </Flex>
    ),
    enableSorting: false,
    header: "",
  },
];
Contributor

Asset states do not expire, theoretically, so is it worth adding an "Expiry" column that shows "Never" for every row?

Contributor Author

I don't quite get this. Why would we add a column that always says expires never?

@amoghrajesh
Contributor

One other thing I noticed about adding a task / asset state: the UI allows adding things like incomplete JSON. For example, sending this directly:

curl --location --request PUT 'http://localhost:28080/api/v2/dags/my_dag/dagRuns/manual__2026-05-22T07:59:31.188183+00:00/taskInstances/t1/states/job_id' \
--header 'Content-Type: application/json' \
--header 'Authorization: ••••••' \
--data '{
    "value": "incomplete
}'

The API responds with a 422 error, but the UI sends it as:

curl 'http://localhost:28080/api/v2/dags/my_dag/dagRuns/manual__2026-05-22T07:59:31.188183+00:00/taskInstances/t1/states/abcd?map_index=-1' \
  -X 'PUT' \
  --data-raw '{"value":"{\"abcd\": \"a}"}'

I think the UI should validate the form field before constructing the JSON body to send.
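
A minimal sketch of what such a pre-submit check might look like. `validateStateValue` and `ValidationResult` are hypothetical names, and the rule that input starting with `{`, `[`, or `"` is intended as JSON is an assumption made for illustration; the PR may choose a different policy (for example, an explicit JSON toggle).

```typescript
// Hypothetical pre-submit check (not part of the PR).
type ValidationResult =
  | { ok: true; value: string }
  | { ok: false; error: string };

function validateStateValue(input: string): ValidationResult {
  const trimmed = input.trim();

  if (trimmed === "") {
    return { ok: false, error: "Value is required" };
  }

  // Assumption: input that opens like a JSON object, array, or string
  // is meant to be JSON, so it must parse before it is sent.
  const looksLikeJson = ["{", "[", '"'].some((prefix) => trimmed.startsWith(prefix));

  if (looksLikeJson) {
    try {
      JSON.parse(trimmed);
    } catch (error) {
      const detail = error instanceof Error ? error.message : String(error);

      return { ok: false, error: `Invalid JSON: ${detail}` };
    }
  }

  // Plain text such as "complete" is accepted as-is.
  return { ok: true, value: trimmed };
}
```

With this rule, the `{"abcd": "a}` input from the request above would be rejected in the form instead of being wrapped into a valid-looking body.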


export const TaskStatePage = () => {
  const { dagId = "~", mapIndex = "-1", runId = "~", taskId = "~" } = useParams();
  const parsedMapIndex = mapIndex === "-1" || mapIndex === "~" ? -1 : parseInt(mapIndex, 10);
Contributor

When no mapIndex is in the URL, useParams() returns undefined, not "-1". Non-mapped tasks could get the wrong value.
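
One way to make the fallback explicit, however the parameter arrives, is to move the parsing into a small pure function. This is a sketch only: `parseMapIndex` is a hypothetical helper, and treating every unusable input as -1 (non-mapped) is an assumption. It also covers non-numeric input, where `parseInt` would otherwise return `NaN`.

```typescript
// Hypothetical helper (not part of the PR): normalise the mapIndex route param.
function parseMapIndex(raw: string | undefined): number {
  // Missing segment, empty string, or the "~" wildcard: treat as non-mapped.
  if (raw === undefined || raw === "" || raw === "~") {
    return -1;
  }

  const parsed = Number.parseInt(raw, 10);

  // Non-numeric input yields NaN and values below -1 are not meaningful;
  // fall back to -1 in both cases.
  return Number.isInteger(parsed) && parsed >= -1 ? parsed : -1;
}
```

The component could then call `parseMapIndex(mapIndex)` without relying on a destructuring default.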

Contributor

@amoghrajesh amoghrajesh left a comment

Thanks for the awesome work, @bbovenzi!

Looking really nice. I left some comments on the PR for issues I found while testing the scenarios below. You already have my DAGs, but the last one, for mapped tasks, is here:

from __future__ import annotations

import json
import random
from datetime import datetime, timezone

from airflow.sdk import DAG, task

TABLES = ["orders", "customers", "products"]

with DAG(
    dag_id="example_task_state_mapped",
    schedule=None,
    start_date=datetime(2026, 1, 1),
    catchup=False,
    tags=["example", "aip-103", "task-state", "mapped"],
    doc_md=__doc__,
) as dag:

    @task
    def get_tables() -> list[str]:
        """Return the list of tables to process."""
        return TABLES

    @task
    def process_table(table: str, **context) -> dict:
        """Process one table — each mapped instance gets its own task state."""
        ts = context["task_state"]
        map_index = context["task_instance"].map_index

        row_count = random.randint(100, 10000)
        result = {
            "table": table,
            "map_index": map_index,
            "row_count": row_count,
            "processed_at": datetime.now(tz=timezone.utc).isoformat(timespec="seconds"),
        }

        ts.set("table", table)
        ts.set("status", "complete")
        ts.set("row_count", str(row_count))
        ts.set("result", json.dumps(result))

        print(f"[map_index={map_index}] Processed {table}: {row_count} rows")
        return result

    tables = get_tables()
    process_table.expand(table=tables)

Task State — Spark DAG

  • All keys visible after a completed run (job_id, submitted_at, status, poll_result, completed_at)
  • poll_result JSON is pretty-printed
  • job_id shows Never in Expires At, other keys show a date
  • After retry-reattach: same job_id persists, status updates to complete
  • Delete a single key — row gone, others intact
  • Edit a key — new value shows immediately
  • Clear all — table goes empty

Asset State — Watermark DAG

  • First run: watermark, total_runs=1, last_run_summary appear on asset detail page
  • Subsequent runs: total_runs increments, watermark advances, prev_watermark matches previous run
  • Consumer DAG fires automatically after each producer run
  • Clear asset state then re-trigger: total_runs=1, prev_watermark=null

Mapped Tasks — Mapped DAG (example_task_state_mapped)

  • Trigger DAG — 3 mapped instances run (map_index 0, 1, 2 for orders/customers/products)
  • Each mapped TI shows its own table, row_count, result in Storage tab — no bleed between instances
  • Switching between map_index 0/1/2 in the UI shows different state values
  • Clear single instance (map_index=0) — only that instance's state is gone, others intact
  • Clear all (all_map_indices=true) — state wiped across all 3 instances

@amoghrajesh
Contributor

I found another bug in the core API: editing a task state field overwrote the expiry to NEVER. We also didn't give users a way to set an expiry when creating a new task state. For the task state Storage tab, here's how the UI should call the API:

Adding a new task state:

image

So the modal for adding a task state (above) will need a new field for expiry_date. I imagine a datetime picker along with a control that offers three expiry options:

  1. "default" pre-selected
  2. Maybe a radio button for "never expire"
  3. Datetime picker for selecting a datetime for expiry

Once a value is picked, call PUT /states/{key} with:

  • {"value": "...", "expires_at": "default"} (apply server default retention)
  • {"value": "...", "expires_at": null} (never expire)
  • {"value": "...", "expires_at": "2026-06-01T00:00:00Z"} (specific datetime from the date picker)

Editing an existing task state:
Editing an existing key (value only) can now call PATCH /states/{key} with {"value": "..."}. Expiry is always preserved, so no expiry field is needed.
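
The create and edit payloads described above could be modelled roughly as follows. This is a sketch under assumptions: `ExpiryChoice`, `buildCreateBody`, and `buildEditBody` are hypothetical names, and the payload shapes are taken only from this comment, not from the final API.

```typescript
// Hypothetical types and helpers (not part of the PR).
type ExpiryChoice =
  | { kind: "default" }
  | { kind: "never" }
  | { kind: "at"; when: Date };

type CreateBody = { value: string; expires_at: string | null };

// Body for PUT /states/{key} when adding a new task state.
function buildCreateBody(value: string, expiry: ExpiryChoice): CreateBody {
  switch (expiry.kind) {
    case "default":
      // Let the server apply its default retention.
      return { value, expires_at: "default" };
    case "never":
      return { value, expires_at: null };
    case "at":
      // Drop milliseconds so the timestamp looks like 2026-06-01T00:00:00Z.
      return { value, expires_at: expiry.when.toISOString().replace(/\.\d{3}Z$/, "Z") };
  }
}

// Body for PATCH /states/{key}: value only, so the stored expiry is untouched.
function buildEditBody(value: string): { value: string } {
  return { value };
}
```

Keeping the edit body free of `expires_at` is what lets the server preserve the existing expiry.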

This is being fixed in #67319

@amoghrajesh amoghrajesh moved this from Backlog to In progress in AIP-103: Task State Management May 22, 2026

Labels

area:translations area:UI Related to UI/UX. For Frontend Developers. translation:default

Projects

Status: In progress

2 participants