Skip to content

Validate task identity token claims with a typed schema#63604

Open
henry3260 wants to merge 6 commits intoapache:mainfrom
henry3260:token-schema
Open

Validate task identity token claims with a typed schema#63604
henry3260 wants to merge 6 commits intoapache:mainfrom
henry3260:token-schema

Conversation

@henry3260
Copy link
Contributor

@henry3260 henry3260 commented Mar 14, 2026

What

  • Replaced TIToken.claims from a raw dict[str, Any] to a typed TIClaims model.
  • Added explicit JWT claim fields in TIClaims:
    • required: sub, exp, iat, nbf
    • optional: scope, aud, iss, jti
Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

@boring-cyborg boring-cyborg bot added area:API Airflow's REST/HTTP API area:task-sdk labels Mar 14, 2026
@henry3260 henry3260 marked this pull request as ready for review March 14, 2026 17:15
Copilot AI review requested due to automatic review settings March 14, 2026 17:15
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a typed, validated schema for Task Identity Token (TI) JWT claims in the Execution API, replacing the previous “raw dict” approach to make claim access and validation more explicit and reliable.

Changes:

  • Add TIClaims (typed Pydantic model) and update TIToken.claims to use it.
  • Update JWT bearer authentication to validate decoded JWT claims against TIClaims.
  • Adjust unit tests and Cadwyn version bundle metadata to reflect the schema change.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated no comments.

Show a summary per file
File Description
airflow-core/src/airflow/api_fastapi/execution_api/datamodels/token.py Introduces TIClaims and makes TIToken.claims typed.
airflow-core/src/airflow/api_fastapi/execution_api/security.py Validates decoded JWT claims into TIClaims and uses typed access to scope.
airflow-core/src/airflow/api_fastapi/execution_api/app.py Updates in-process auth override to construct TIClaims.
airflow-core/src/airflow/api_fastapi/execution_api/versions/v2026_03_31.py Adds a version-change note describing the TIToken.claims schema change.
airflow-core/src/airflow/api_fastapi/execution_api/versions/init.py Registers the new version change in the bundle.
airflow-core/tests/unit/api_fastapi/execution_api/conftest.py Updates mocked JWT bearer token to use TIClaims.
airflow-core/tests/unit/api_fastapi/execution_api/test_security.py Adds TIClaims behavior test and updates token construction to typed claims.
airflow-core/tests/unit/api_fastapi/execution_api/versions/head/test_task_instances.py Updates mocked validator claims to include required JWT time claims and aligns expected error text.

You can also share your feedback on Copilot code review. Take the survey.

@Vamsi-klu
Copy link
Contributor

Hey @henry3260, nice improvement — replacing the raw dict[str, Any] claims with a typed Pydantic model makes the token structure self-documenting and catches malformed tokens earlier. A few things I wanted to flag:

The Cadwyn version change may not have any effect. TIToken is an internal security dependency injected via FastAPI's DI system — it's not an API request or response model. Cadwyn's schema() migration instructions only apply to models that appear in route signatures. Since TIToken never shows up in any endpoint's request/response body, ValidateTaskIdentityTokenClaims would be inert. All other version changes in the codebase operate on actual API models (DagRun, TIRunContext, etc.). I'd suggest removing the version change to avoid giving the impression that TIToken is part of the versioned API surface.

TokenScope duplicates TokenType in security.py. The PR introduces TokenScope = Literal["execution", "workload"] in token.py, but security.py:86 already has TokenType = Literal["execution", "workload"] with VALID_TOKEN_TYPES = frozenset(get_args(TokenType)). If someone adds a new token type, they'd need to update both. Worth unifying these into a single source of truth.

The error message for invalid scopes changes. Previously, an invalid scope produced a clean "Invalid token scope" message from require_auth. Now it's caught earlier in JWTBearer.__call__ and produces a Pydantic ValidationError dump, which is less human-readable. Not a blocker since it's still a 403, but worth being aware of for debugging.

A couple of test gaps:

  • No test for the primary validation benefit — a sub that's not a valid UUID being rejected by the Pydantic UUID type. That's arguably the most valuable thing the typed schema adds.
  • No test for TIClaims rejecting incomplete claim sets (e.g., missing exp). PyJWT would catch this first in production, but a unit test for the model itself would document the contract.

The core change is solid and the extra="allow" config is the right call for forward compatibility. Just needs the version change reconsidered and the type duplication cleaned up.

@henry3260
Copy link
Contributor Author

henry3260 commented Mar 15, 2026

Hey @henry3260, nice improvement — replacing the raw dict[str, Any] claims with a typed Pydantic model makes the token structure self-documenting and catches malformed tokens earlier. A few things I wanted to flag:

The Cadwyn version change may not have any effect. TIToken is an internal security dependency injected via FastAPI's DI system — it's not an API request or response model. Cadwyn's schema() migration instructions only apply to models that appear in route signatures. Since TIToken never shows up in any endpoint's request/response body, ValidateTaskIdentityTokenClaims would be inert. All other version changes in the codebase operate on actual API models (DagRun, TIRunContext, etc.). I'd suggest removing the version change to avoid giving the impression that TIToken is part of the versioned API surface.

TokenScope duplicates TokenType in security.py. The PR introduces TokenScope = Literal["execution", "workload"] in token.py, but security.py:86 already has TokenType = Literal["execution", "workload"] with VALID_TOKEN_TYPES = frozenset(get_args(TokenType)). If someone adds a new token type, they'd need to update both. Worth unifying these into a single source of truth.

The error message for invalid scopes changes. Previously, an invalid scope produced a clean "Invalid token scope" message from require_auth. Now it's caught earlier in JWTBearer.__call__ and produces a Pydantic ValidationError dump, which is less human-readable. Not a blocker since it's still a 403, but worth being aware of for debugging.

A couple of test gaps:

  • No test for the primary validation benefit — a sub that's not a valid UUID being rejected by the Pydantic UUID type. That's arguably the most valuable thing the typed schema adds.
  • No test for TIClaims rejecting incomplete claim sets (e.g., missing exp). PyJWT would catch this first in production, but a unit test for the model itself would document the contract.

The core change is solid and the extra="allow" config is the right call for forward compatibility. Just needs the version change reconsidered and the type duplication cleaned up.

I think we still need to add the Cadwyn version changes. If we modify any files in airflow-core/src/airflow/api_fastapi/execution_api/datamodels/, we need to handle the API migration.

@henry3260
Copy link
Contributor Author

I think it's unrelated CI failure :(

@arnoldmr01
Copy link
Contributor

LGTM. With my comprehension, I think the response convert is not necessary in this case cause the TIClaim already contained those required field. Thanks for your work!!

from airflow.api_fastapi.execution_api.datamodels.token import TIToken


class ValidateTaskIdentityTokenClaims(VersionChange):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the name of this class should be changed, because it only handles versioning and describes upgrade/downgrade behavior, rather than validating the TIToken. The name can be ChangeTITokenClaimsToTIClaims or other better names.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:API Airflow's REST/HTTP API area:task-sdk

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants