Refuse secrets-backend fallback on Execution-API authz deny #66575
Merged
potiuk merged 3 commits on May 19, 2026
ExecutionAPISecretsBackend.get_connection / get_variable returned None on every ErrorResponse, conflating "not found" with "explicitly denied". The secrets-backend dispatcher then fell through to the next backend (typically EnvironmentVariablesBackend, which performs no authz checks) on a 401/403 from the Execution API -- letting tasks read secrets the Execution API had just denied them. The audit calls this a Type C gap: the authz control fires, but its rejection result is treated as a miss and routed around.

Three-part fix:

1. New `ErrorType.PERMISSION_DENIED` distinct from `API_SERVER_ERROR`.
2. `ConnectionOperations.get` and `VariableOperations.get` map the API server's 401/403 to `ErrorResponse(PERMISSION_DENIED, ...)` instead of re-raising as a generic `ServerResponseError`. 404 still maps to `*_NOT_FOUND`; other statuses still raise so the existing API_SERVER_ERROR translation in `handle_requests` keeps working.
3. `ExecutionAPISecretsBackend` (sync + async, connection + variable variants) now raises `PermissionError` on `PERMISSION_DENIED`. The surrounding `except Exception:` blocks explicitly re-raise `PermissionError` so the secrets-backend dispatcher sees it. NOT_FOUND types continue to return `None` (allow fallthrough); other ErrorResponses also continue to return `None` (preserve existing recovery behaviour for transient errors).

Tests added:

- `client.connections.get` and `client.variables.get` return `ErrorResponse(PERMISSION_DENIED)` on 401 and 403 (parametrised).
- `ExecutionAPISecretsBackend.get_connection` / `get_variable` / `aget_connection` / `aget_variable` raise `PermissionError` when the response is `PERMISSION_DENIED`, with the resource and key in the message.

Reported by the L3 ASVS sweep at apache/tooling-agents#24 (FINDING-017).
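The three parts of this first commit can be sketched in a few lines. This is a minimal, self-contained illustration and not the Airflow implementation: `SketchErrorType`, `SketchErrorResponse`, `classify_status`, and `backend_lookup` are hypothetical names, and only the status-code mapping and the return-`None`-versus-raise split are taken from the commit message.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Union


class SketchErrorType(Enum):
    # Hypothetical stand-in for the SDK's ErrorType enum.
    NOT_FOUND = "NOT_FOUND"
    PERMISSION_DENIED = "PERMISSION_DENIED"
    API_SERVER_ERROR = "API_SERVER_ERROR"


@dataclass
class SketchErrorResponse:
    error: SketchErrorType
    detail: str = ""


def classify_status(status_code: int) -> SketchErrorResponse:
    """Map an HTTP status to an error response, as part 2 describes."""
    if status_code in (401, 403):
        return SketchErrorResponse(SketchErrorType.PERMISSION_DENIED, f"HTTP {status_code}")
    if status_code == 404:
        return SketchErrorResponse(SketchErrorType.NOT_FOUND, "HTTP 404")
    # Other statuses keep raising so a separate layer can translate them.
    raise RuntimeError(f"unexpected status {status_code}")


def backend_lookup(result: Union[str, SketchErrorResponse], key: str) -> Optional[str]:
    """Translate a client result the way part 3 describes."""
    if isinstance(result, SketchErrorResponse):
        if result.error is SketchErrorType.PERMISSION_DENIED:
            # An explicit deny must not look like a miss.
            raise PermissionError(f"access to variable {key!r} denied")
        # Not-found and other error responses stay a miss, so fallthrough still works.
        return None
    return result
```

With this split, `backend_lookup(classify_status(404), "k")` is a miss that lets the next backend run, while `backend_lookup(classify_status(403), "k")` raises and names the key in the message.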
Member
Author
I'd love to get this one merged, and would love it in 3.2.2 if it's not too late. cc @vatsrahul1001 (3.2.2 RM)
Drafted-by: Claude Code (Opus 4.7); reviewed by @potiuk before posting
Contributor
LGTM! Ready for maintainer review.
vatsrahul1001
approved these changes
May 18, 2026
kaxil
reviewed
May 18, 2026
kaxil
reviewed
May 18, 2026
kaxil
reviewed
May 18, 2026
kaxil
reviewed
May 18, 2026
Member
As-is, it isn't ready; check the comments above.
The earlier change made ExecutionAPISecretsBackend raise on 401/403, but the dispatcher loops in airflow.sdk.execution_time.context and the airflow-core get_*_from_secrets paths catch Exception and silently fall through to the next backend, so the deny was still being swallowed.

Introduce AirflowSecretsBackendAccessDenied (subclass of PermissionError) so the dispatchers can special-case the authoritative deny without mis-treating an incidental OSError-family PermissionError from inside an unrelated backend. Patch the three task-SDK dispatcher loops and the two airflow-core dispatcher loops to re-raise it before the generic except.

Add TestDispatcherRefusesFallbackOnDeny with three end-to-end tests that insert a spy backend after ExecutionAPISecretsBackend and assert the spy is never called once the first backend raises the deny, pinning the dispatcher behaviour, not just the backend's.

Also hoist the repeated imports in test_secrets.py to module top per review feedback.
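A rough sketch of why the dedicated subclass matters in a dispatcher loop follows. It assumes backends are plain callables, and `SketchAccessDenied`, `dispatch`, and the three example backends are hypothetical names for illustration; the real loops live in the modules named above and may differ in detail.

```python
from typing import Callable, Optional, Sequence


class SketchAccessDenied(PermissionError):
    """Hypothetical stand-in for AirflowSecretsBackendAccessDenied."""


Backend = Callable[[str], Optional[str]]


def dispatch(backends: Sequence[Backend], key: str) -> Optional[str]:
    """Try each backend in order; only an authoritative deny stops the loop."""
    for backend in backends:
        try:
            value = backend(key)
        except SketchAccessDenied:
            # Authoritative deny: propagate instead of trying the next backend.
            raise
        except Exception:
            # Any other failure, including a plain PermissionError, is a miss.
            continue
        if value is not None:
            return value
    return None


def denying_backend(key: str) -> Optional[str]:
    raise SketchAccessDenied(f"access to {key!r} denied")


def unreadable_file_backend(key: str) -> Optional[str]:
    # Simulates an incidental OSError-family error inside an unrelated backend.
    raise PermissionError("cannot read secrets file")


def env_backend(key: str) -> Optional[str]:
    return {"api_key": "from-env"}.get(key)
```

In this sketch `dispatch([unreadable_file_backend, env_backend], "api_key")` still falls through to the environment backend, whereas `dispatch([denying_backend, env_backend], "api_key")` raises before the environment backend is consulted. Catching plain `PermissionError` instead would have turned the first case into a hard failure as well.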
Contributor
Backport successfully created: v3-2-test
Note: As of Merging PRs targeted for Airflow 3.X In matter of doubt please ask in #release-management Slack channel.
vatsrahul1001
pushed a commit
that referenced
this pull request
May 19, 2026
…ny (#66575) (#67173)
* Refuse secrets-backend fallback on Execution-API authz deny
* Refuse secrets-backend fallback at the dispatcher, not only the backend
* Hoist AirflowSecretsBackendAccessDenied imports to module top
(cherry picked from commit 2b8c805)
Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
vatsrahul1001
pushed a commit
that referenced
this pull request
May 20, 2026
…ny (#66575) (#67173)
vatsrahul1001
pushed a commit
that referenced
this pull request
May 20, 2026
…ny (#66575) (#67173)
vatsrahul1001
pushed a commit
that referenced
this pull request
May 21, 2026
…ny (#66575) (#67173)
Summary

The Execution API's authorization decision was being routed around by the secrets-backend dispatcher: `ExecutionAPISecretsBackend.get_connection` / `get_variable` returned `None` on every `ErrorResponse`, conflating "not found" with "explicitly denied". And even with the backend patched to raise on a 401/403, the dispatcher loops in `airflow/sdk/execution_time/context.py`, `airflow/models/connection.py`, and `airflow/models/variable.py` catch `Exception` and continue to the next backend, so the deny would still have been swallowed silently.
Scope of the gap

In the default worker chain (`[EnvironmentVariablesBackend, ExecutionAPISecretsBackend]`), `ExecutionAPISecretsBackend` is last: there is no later backend for the dispatcher to fall through to, and the gap does not manifest. The fall-through-leak scenario requires a non-default configuration: a reordering, or a custom `[secrets] backend` placed after `ExecutionAPISecretsBackend`. Even there, silently downgrading an authoritative deny to "next backend, please" is wrong on principle: the audit calls this a Type C gap, where the authz control fires but its rejection is treated as a miss and routed around.
Fix (four parts)

1. New `ErrorType.PERMISSION_DENIED`

Distinct from `API_SERVER_ERROR` so callers can dispatch on the cause.

2. Client maps 401/403 to `PERMISSION_DENIED`

`ConnectionOperations.get` and `VariableOperations.get` translate the API server's 401/403 to `ErrorResponse(PERMISSION_DENIED, ...)` instead of re-raising as a generic `ServerResponseError`. 404 still maps to `*_NOT_FOUND`; other statuses still raise so the existing `API_SERVER_ERROR` translation in `handle_requests` keeps working.

3. New `AirflowSecretsBackendAccessDenied(PermissionError)`

Distinct, dispatcher-aware exception class. Subclasses `PermissionError` so any existing `except PermissionError:` handler still matches, but is narrow enough that the dispatcher can re-raise only this signal without accidentally promoting an incidental filesystem `OSError`-family `PermissionError` from inside an unrelated backend. `ExecutionAPISecretsBackend` (sync + async, connection + variable variants) raises this on `PERMISSION_DENIED`. Other `ErrorResponse` types (`*_NOT_FOUND`, transient `API_SERVER_ERROR`, `GENERIC_ERROR`) continue to return `None` so existing recovery paths keep working.

4. Dispatchers honour the deny

The three task-SDK dispatcher loops in `airflow/sdk/execution_time/context.py` (`_get_connection`, `_async_get_connection`, `_get_variable`) and the two airflow-core dispatcher loops in `airflow/models/connection.py` and `airflow/models/variable.py` now catch `AirflowSecretsBackendAccessDenied` and re-raise it before the generic `except Exception:` fall-through.

Tests

- `client.connections.get` / `client.variables.get` return `ErrorResponse(PERMISSION_DENIED)` on 401 and 403 (parametrised).
- `ExecutionAPISecretsBackend.get_connection` / `get_variable` / `aget_connection` / `aget_variable` raise `AirflowSecretsBackendAccessDenied` when the response is `PERMISSION_DENIED`.
- The `TestDispatcherRefusesFallbackOnDeny` tests insert a spy backend AFTER `ExecutionAPISecretsBackend` and assert it is never called once the first backend raises the deny, pinning the dispatcher's re-raise behaviour, not just the backend's. Covers `_get_connection`, `_get_variable`, and `_async_get_connection`.

Reported by

L3 ASVS sweep: apache/tooling-agents#24 (FINDING-017).
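The spy-backend pattern from the tests above can be illustrated for the async path roughly as follows. This is a sketch under stated assumptions, not the code in `TestDispatcherRefusesFallbackOnDeny`: `SketchAccessDenied`, `async_dispatch`, and `run_spy_check` are hypothetical names, and the backends are `unittest.mock` objects rather than real secrets backends.

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock


class SketchAccessDenied(PermissionError):
    """Hypothetical stand-in for AirflowSecretsBackendAccessDenied."""


async def async_dispatch(backends, key):
    """Hypothetical async dispatcher loop with the re-raise in place."""
    for backend in backends:
        try:
            value = await backend.aget_connection(key)
        except SketchAccessDenied:
            # Authoritative deny: stop here, do not consult later backends.
            raise
        except Exception:
            continue
        if value is not None:
            return value
    return None


def run_spy_check():
    """Return (deny_propagated, spy_await_count) for a deny-then-spy chain."""
    first = MagicMock()
    first.aget_connection = AsyncMock(side_effect=SketchAccessDenied("denied"))
    spy = MagicMock()
    spy.aget_connection = AsyncMock(return_value="leaked")

    try:
        asyncio.run(async_dispatch([first, spy], "my_conn"))
    except SketchAccessDenied:
        propagated = True
    else:
        propagated = False
    return propagated, spy.aget_connection.await_count
```

Asserting on the spy's await count, rather than only on the raised exception, is what pins the dispatcher's behaviour: a loop that swallowed the deny would await the spy and return its value.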
Was generative AI tooling used to co-author this PR?
Generated-by: Claude Code (Opus 4.7) following the guidelines