Fix JWT token not refreshed when token expires between auth check and reissue middleware #67944

Open
GayathriSrividya wants to merge 3 commits into apache:main from GayathriSrividya:fix/jwt-token-refresh-67939

Conversation

Contributor

GayathriSrividya commented Jun 3, 2026

closes: #67939

Problem

Long-running tasks fail with repeated 403 errors when their JWT token expires while a heartbeat request is in flight. The race condition is:

  1. Security middleware validates the token — it is still valid (or within its leeway).
  2. Request processing starts.
  3. The token's exp boundary is crossed during processing.
  4. JWTReissueMiddleware.dispatch calls avalidated_claims(token, {}) — this now raises ExpiredSignatureError.
  5. The exception is caught by the outer except Exception block, logged as a warning, and no Refreshed-API-Token header is set.
  6. The client receives a 403 with no refreshed token, so it cannot update its Bearer token.
  7. After MAX_FAILED_HEARTBEATS consecutive failures the supervisor kills the task.
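
The race in steps 1 to 5 can be reproduced with a deterministic clock. This is only a simplified sketch, not the Airflow code: `TokenExpired`, `FakeClock`, `check_expiry` and `handle_heartbeat` are hypothetical names, the leeway is assumed to be 0, and the success path always attaches a placeholder header (the real middleware only reissues when a token is close to expiry).

```python
from dataclasses import dataclass


class TokenExpired(Exception):
    """Hypothetical stand-in for jwt.ExpiredSignatureError."""


@dataclass
class FakeClock:
    """Manually advanced clock so the race is deterministic."""

    now: float

    def advance(self, seconds: float) -> None:
        self.now += seconds


def check_expiry(exp: float, clock: FakeClock, leeway: float = 0.0) -> None:
    """Raise TokenExpired once the clock reaches exp + leeway."""
    if clock.now >= exp + leeway:
        raise TokenExpired(f"expired {clock.now - exp:.1f}s ago")


def handle_heartbeat(exp: float, clock: FakeClock, processing_seconds: float) -> dict:
    """Mimic the two checks: auth first, reissue after processing."""
    check_expiry(exp, clock)  # step 1: the security check passes
    clock.advance(processing_seconds)  # steps 2-3: exp is crossed meanwhile
    headers: dict = {}
    try:
        check_expiry(exp, clock)  # step 4: strict re-validation
        headers["Refreshed-API-Token"] = "new-token"  # placeholder value
    except TokenExpired:
        pass  # step 5: swallowed, so no header is set
    return headers


# Token expires at t=1000; the request arrives 0.5s earlier and takes 1s.
clock = FakeClock(now=999.5)
print(handle_heartbeat(exp=1000.0, clock=clock, processing_seconds=1.0))
```

With these numbers the first check passes and the second one fails, so the response carries no `Refreshed-API-Token` header, which is the state described in step 6.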

Fix

When avalidated_claims raises ExpiredSignatureError inside JWTReissueMiddleware, retry the validation with a 60-second grace leeway (REISSUE_GRACE_LEEWAY). If the token is within that grace window, its claims are extracted and a fresh replacement token is issued and returned in the Refreshed-API-Token response header.

The signature and all other claims are still fully verified; only the expiry window is relaxed in this specific code path. The new token is generated from the same claims (same sub, scope, ti_id), so there is no privilege escalation.
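
The retry-with-grace idea can be sketched with the standard library alone. This is an illustration under assumptions, not the code in this PR: the token format, `SECRET`, `sign`, `claims_for_reissue`, `reissue` and the two exception classes are hypothetical, the base leeway is taken as 0, and the claim values are made up. Only `REISSUE_GRACE_LEEWAY`, `extra_leeway` and the claim names `sub`, `scope`, `ti_id` come from the description above.

```python
import base64
import hashlib
import hmac
import json

REISSUE_GRACE_LEEWAY = 60  # seconds, the constant named in this PR
SECRET = b"demo-secret"  # hypothetical signing key for the sketch


class TokenExpired(Exception):
    """Hypothetical stand-in for jwt.ExpiredSignatureError."""


class BadSignature(Exception):
    """Hypothetical stand-in for an invalid-signature error."""


def sign(claims: dict) -> str:
    """Encode claims and append an HMAC-SHA256 signature (toy format, not JWT)."""
    body = base64.urlsafe_b64encode(json.dumps(claims, sort_keys=True).encode()).decode()
    mac = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{mac}"


def validated_claims(token: str, now: float, extra_leeway: float = 0) -> dict:
    """Always verify the signature; extra_leeway relaxes only the expiry check."""
    body, _, mac = token.rpartition(".")
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(mac, expected):
        raise BadSignature("signature mismatch")
    claims = json.loads(base64.urlsafe_b64decode(body))
    if now >= claims["exp"] + extra_leeway:
        raise TokenExpired("token expired")
    return claims


def claims_for_reissue(token: str, now: float) -> dict:
    """Strict validation first; on expiry, retry once with the grace leeway."""
    try:
        return validated_claims(token, now)
    except TokenExpired:
        return validated_claims(token, now, extra_leeway=REISSUE_GRACE_LEEWAY)


def reissue(token: str, now: float, lifetime: float = 600) -> str:
    """Issue a replacement carrying the same claims with a new expiry."""
    claims = claims_for_reissue(token, now)
    return sign({**claims, "exp": now + lifetime})


old = sign({"sub": "ti-123", "scope": "execution", "ti_id": "ti-123", "exp": 1000})
fresh = reissue(old, now=1030)  # 30s past exp, inside the 60s grace window
print(validated_claims(fresh, now=1030))
```

In this sketch a token that is more than 60 seconds past `exp` still raises `TokenExpired` on the retry, and a token with an altered payload raises `BadSignature` before the expiry check is ever reached.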

The client's _update_auth hook already updates the Bearer token from Refreshed-API-Token before raising on 4xx/5xx, so the next retry uses the fresh token and succeeds.
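
The ordering that makes this work (update the token first, raise afterwards) might look roughly like the sketch below. Only the `_update_auth` name and the `Refreshed-API-Token` header come from this description; `Response`, `Client`, `handle` and `HTTPStatusError` are hypothetical stand-ins, not the Task SDK client.

```python
from dataclasses import dataclass, field


class HTTPStatusError(Exception):
    """Hypothetical error raised for 4xx/5xx responses."""


@dataclass
class Response:
    """Minimal stand-in for an HTTP response object."""

    status_code: int
    headers: dict = field(default_factory=dict)


@dataclass
class Client:
    """Minimal stand-in for a client that holds a Bearer token."""

    token: str

    def _update_auth(self, response: Response) -> None:
        # Adopt the refreshed token if the server sent one.
        refreshed = response.headers.get("Refreshed-API-Token")
        if refreshed:
            self.token = refreshed

    def handle(self, response: Response) -> Response:
        self._update_auth(response)  # runs first, even on error responses
        if response.status_code >= 400:
            raise HTTPStatusError(f"HTTP {response.status_code}")
        return response


client = Client(token="old-token")
try:
    client.handle(Response(403, {"Refreshed-API-Token": "new-token"}))
except HTTPStatusError:
    pass  # the request failed, but the token was already swapped
print(client.token)
```

Because the hook runs before the status check, a failed heartbeat that carries the header still leaves the client holding the new token for its next retry.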

Changes

  • airflow-core/src/airflow/api_fastapi/auth/tokens.py: add extra_leeway: float = 0 keyword argument to validated_claims and avalidated_claims; pass it on top of self.leeway to jwt.decode.
  • airflow-core/src/airflow/api_fastapi/execution_api/app.py: add REISSUE_GRACE_LEEWAY = 60 class constant; catch ExpiredSignatureError on the inner avalidated_claims call and retry with the grace leeway.
  • airflow-core/tests/unit/api_fastapi/execution_api/versions/head/test_router.py: add regression test test_just_expired_token_is_reissued_within_grace_period.

Member

ashb commented Jun 3, 2026

I'm initially skeptical about this. The current behaviour is to issue a new token when it is 80% through its validity.

Comment thread airflow-core/src/airflow/api_fastapi/execution_api/app.py Outdated
Gayathri Srividya Rajavarapu added 3 commits June 3, 2026 23:19
When a task's JWT token expires while a request is being processed, the
JWTReissueMiddleware was unable to refresh it: the strict avalidated_claims
call raised ExpiredSignatureError, so no Refreshed-API-Token header was
added to the response.

The supervisor retries heartbeats on non-fatal errors; without a refreshed
token in the response the client never updates its Bearer token, leading to
consecutive 403 failures that eventually kill the task.

Fix: when avalidated_claims raises ExpiredSignatureError in the reissue
path, retry with a 60-second grace leeway so tokens that expired during
request processing can still receive a replacement. The signature and all
other claims are still fully verified; only the expiry window is relaxed in
this specific code path.

closes: apache#67939
GayathriSrividya force-pushed the fix/jwt-token-refresh-67939 branch from 5b55472 to 8c2ca2e on June 3, 2026 17:49

Labels

area:API (Airflow's REST/HTTP API), area:task-sdk

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Task fails to refresh the JWT token with LocalExecutor

2 participants