fix: EksPodOperator 401 with cross-account AssumeRole via aws_conn_id by anmolxlight · Pull Request #64749 · apache/airflow

anmolxlight · 2026-04-05T21:48:08Z

fix: EksPodOperator 401 with cross-account AssumeRole via aws_conn_id

Problem

When using EksPodOperator with aws_conn_id pointing to a cross-account IAM role (via AssumeRole), pods fail with 401 Unauthorized:

pods "simple-http-server" is forbidden: User "" cannot create resource "pods" in API group "" in the namespace "default"

The audit log shows an empty user identity: "user":{}.

Root Cause

The kubeconfig exec plugin COMMAND template in EksHook had two critical fragility points:

1. stderr merged into stdout via 2>&1: Python warnings, deprecation notices, or log output from eks_get_token contaminated the stdout that the bash token parsing relies on. The last_line extraction then grabbed the wrong line, producing empty or invalid timestamp and token values.
2. No token validation: if parsing failed, a malformed ExecCredential JSON with an empty token was sent to the EKS API server, resulting in 401 with an empty user identity.

Same-account usage worked by accident because default MWAA execution role credentials were already in the environment, so eks_get_token produced valid output regardless of credential file sourcing.
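
To make the first failure mode concrete, here is a minimal sketch of how merging stderr into stdout can break a "take the last line" parser. It is an illustration only: `CHILD` is a hypothetical stand-in for eks_get_token, and the `expirationTimestamp: ..., token: ...` line format and the warning text are assumptions, not output copied from the real utility.

```python
import subprocess
import sys

# Hypothetical stand-in for eks_get_token: prints one token line to stdout,
# then a warning to stderr. The line format is assumed for illustration.
CHILD = (
    "import sys\n"
    "print('expirationTimestamp: 2026-04-05T22:00:00Z, token: k8s-aws-v1.EXAMPLE', flush=True)\n"
    "print('DeprecationWarning: something noisy', file=sys.stderr, flush=True)\n"
)


def last_line(merge_stderr: bool) -> str:
    """Run the child and return the last non-empty line of captured stdout."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stdout=subprocess.PIPE,
        # STDOUT mimics the shell's 2>&1; DEVNULL mimics 2>/dev/null.
        stderr=subprocess.STDOUT if merge_stderr else subprocess.DEVNULL,
        text=True,
        check=True,
    )
    lines = [line for line in result.stdout.splitlines() if line.strip()]
    return lines[-1]


merged = last_line(merge_stderr=True)
separate = last_line(merge_stderr=False)
print("merged:  ", merged)
print("separate:", separate)
```

With the streams merged, the last line is the warning rather than the token line, so any downstream field extraction comes back empty. With stderr kept out of stdout, the last line is the token line again.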

Changes

`airflow/providers/amazon/aws/hooks/eks.py`

- Redirect stderr to /dev/null (2>/dev/null) instead of merging with stdout (2>&1) to ensure clean token output for bash parsing
- Add token validation: exit with error if token extraction fails
- Add error messages to stderr for debugging credential issues

`tests/unit/amazon/aws/hooks/test_eks.py`

- Add test_command_template_redirects_stderr: verifies stderr is redirected to /dev/null and not merged with stdout
- Add test_command_template_validates_token: verifies the token validation check and error exit

Testing

# Verify the COMMAND template structure
python -c "
import sys
sys.path.insert(0, 'providers/amazon/src')
from airflow.providers.amazon.aws.hooks.eks import COMMAND
assert '2>/dev/null' in COMMAND
assert '2>&1' not in COMMAND
assert 'if [ -z \"\$token\" ]' in COMMAND
assert 'exit 1' in COMMAND
print('All checks passed')
"

Verification

To verify the fix works with cross-account AssumeRole:

1. Set up two AWS accounts: Account A (MWAA) and Account B (EKS)
2. Create an IAM role in Account B that trusts Account A's execution role
3. Create a connection in MWAA using Account B's role ARN
4. Run an EksPodOperator task with aws_conn_id set to the cross-account connection
5. Verify the pod is created successfully without 401 errors

The kubeconfig exec plugin COMMAND template in EksHook had two critical fragility points that caused 401 Unauthorized when using cross-account AssumeRole credentials:

1. stderr was merged into stdout via 2>&1, so any Python warnings, deprecation notices, or log output from eks_get_token contaminated the stdout that bash token parsing relies on. This caused the last_line extraction to grab the wrong line, producing empty/invalid timestamp and token values.
2. No validation that the token was successfully extracted. If parsing failed, a malformed ExecCredential JSON with an empty token was sent to the EKS API server, resulting in 401 with an empty user identity in the audit logs ("user":{}).

Same-account usage worked by accident because default MWAA execution role credentials were already in the environment, so eks_get_token produced valid output regardless of credential file sourcing.

Changes:

- Redirect stderr to /dev/null (2>/dev/null) instead of merging with stdout (2>&1) to ensure clean token output for bash parsing
- Add token validation: exit with error if token extraction fails
- Add error messages to stderr for debugging credential issues
- Add unit tests verifying the COMMAND template structure

Fixes apache#64657

boring-cyborg · 2026-04-05T21:48:13Z

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide (https://github.com/apache/airflow/blob/main/contributing-docs/README.rst).
Here are some useful points:

Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
In case of a new feature, add useful documentation (in docstrings or in the docs/ directory). Adding a new operator? Check this short guide. Consider adding an example DAG that shows how users should use it.
Consider using Breeze environment for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
Be sure to read the Airflow Coding style.
Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://s.apache.org/airflow-slack

o-nikolas · 2026-04-08T01:03:24Z

providers/amazon/src/airflow/providers/amazon/aws/hooks/eks.py


            if [ "$status" -ne 0 ]; then
-                printf '%s' "$output" >&2
+                printf 'eks_get_token failed with exit code %s' "$status" >&2


Should we not pipe stderr output above to a durable location (perhaps something in /tmp) instead of /dev/null and then combine it with the stdout here? The status code alone is not very helpful.

Copilot

Pull request overview

Fixes EksPodOperator authorization failures in cross-account AssumeRole setups by making the EKS kubeconfig exec-plugin token generation/parsing more robust.

Changes:

Adjust the EksHook kubeconfig exec COMMAND to avoid stderr/stdout mixing and add an explicit empty-token failure path.
Add unit tests asserting the COMMAND template no longer merges stderr into stdout and that it contains token validation logic.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| `providers/amazon/src/airflow/providers/amazon/aws/hooks/eks.py` | Updates the kubeconfig exec-plugin shell template to avoid stdout contamination and fail fast on empty token extraction. |
| `providers/amazon/tests/unit/amazon/aws/hooks/test_eks.py` | Adds string-based assertions over the `COMMAND` template to guard against regressions in redirection and validation logic. |

Copilot · 2026-04-10T19:59:24Z

providers/amazon/src/airflow/providers/amazon/aws/hooks/eks.py

+            # Redirect stderr to /dev/null to prevent Python warnings, deprecation
+            # notices, or other log output from contaminating stdout. The token
+            # output must be the ONLY thing on stdout for bash parsing to work.
            output=$({python_executable} -m airflow.providers.amazon.aws.utils.eks_get_token \
-                --cluster-name {eks_cluster_name} --sts-url '{sts_url}' {args} 2>&1)
+                --cluster-name {eks_cluster_name} --sts-url '{sts_url}' {args} 2>/dev/null)



Redirecting eks_get_token stderr to /dev/null discards useful error output (stack traces, botocore messages) and makes kubeconfig exec failures hard to debug. Since the original parsing issue was caused by merging stderr into stdout (2>&1), consider removing the redirection entirely (let stderr pass through) or capture stderr separately and only surface it on non-zero exit, while keeping stdout clean for parsing.
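
The "capture stderr separately and only surface it on non-zero exit" option can be sketched in Python as below. This is not the shell template under review, just the same pattern in a form that is easy to run. `run_token_command`, `ok_child`, and `bad_child` are hypothetical names, and the child processes are stand-ins for eks_get_token.

```python
import subprocess
import sys


def run_token_command(argv: list[str]) -> str:
    """Return the command's stdout; on failure, raise with the captured stderr."""
    # capture_output keeps stdout and stderr in two separate buffers.
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"command failed with exit code {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


# Hypothetical children standing in for eks_get_token.
ok_child = [
    sys.executable,
    "-c",
    "import sys; print('token: EXAMPLE'); print('noise', file=sys.stderr)",
]
bad_child = [
    sys.executable,
    "-c",
    "import sys; print('AccessDenied', file=sys.stderr); sys.exit(3)",
]

clean_stdout = run_token_command(ok_child)
try:
    run_token_command(bad_child)
    failure_message = ""
except RuntimeError as exc:
    failure_message = str(exc)

print(repr(clean_stdout))
print(failure_message)
```

On success the stderr noise never reaches the parsed output, and on failure the message carries both the exit code and the diagnostic text, which covers the debuggability concern raised here.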

Copilot · 2026-04-10T19:59:24Z

providers/amazon/src/airflow/providers/amazon/aws/hooks/eks.py


            if [ "$status" -ne 0 ]; then
-                printf '%s' "$output" >&2
+                printf 'eks_get_token failed with exit code %s' "$status" >&2


On non-zero exit you now only print the exit code, but not the captured stdout from eks_get_token. This is a regression in diagnostics compared to printing the output, and it will make credential/STS issues much harder to troubleshoot. Consider also emitting $output (and/or captured stderr if you keep it) when status != 0.

Suggested change

-                printf 'eks_get_token failed with exit code %s' "$status" >&2
+                printf 'eks_get_token failed with exit code %s' "$status" >&2
+                if [ -n "$output" ]; then
+                    printf '. Output was: %s' "$output" >&2
+                fi

Copilot · 2026-04-10T19:59:24Z

providers/amazon/src/airflow/providers/amazon/aws/hooks/eks.py

+                printf 'Failed to extract token from eks_get_token output. ' >&2
+                printf 'Output was: %s' "$output" >&2


The empty-token branch prints the full eks_get_token output to stderr. That output includes the EKS bearer token (see eks_get_token.py printing token: {access_token}), so this will leak credentials into task logs when parsing fails. Please redact the token before logging (or omit the output entirely) to avoid exposing bearer tokens.

Suggested change

-                printf 'Failed to extract token from eks_get_token output. ' >&2
-                printf 'Output was: %s' "$output" >&2
+                printf 'Failed to extract token from eks_get_token output.' >&2
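
If some of the output is still wanted for debugging, the "redact the token before logging" alternative might look like the sketch below. It assumes the output carries a `token: <value>` field, as this comment describes. `redact_token` and the sample line are hypothetical and shown in Python for illustration, not as a drop-in for the shell template.

```python
import re

# Assumes the token appears as "token: <value>" with no whitespace in the value.
_TOKEN_FIELD = re.compile(r"(token:\s*)\S+")


def redact_token(output: str) -> str:
    """Replace the value of every 'token:' field with a placeholder."""
    return _TOKEN_FIELD.sub(r"\1<redacted>", output)


sample = "expirationTimestamp: 2026-04-05T22:00:00Z, token: k8s-aws-v1.EXAMPLE"
redacted = redact_token(sample)
print(redacted)
```

Omitting the output entirely, as in the suggested change, remains the simpler and safer choice. Redaction only helps if the field format is stable.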

Copilot · 2026-04-10T19:59:25Z

providers/amazon/tests/unit/amazon/aws/hooks/test_eks.py

+        """Verify COMMAND template redirects stderr to /dev/null to prevent
+        Python warnings/log output from contaminating stdout and breaking
+        bash token parsing. This is critical for cross-account AssumeRole
+        scenarios where the kubeconfig exec plugin must produce a clean token."""
+        from airflow.providers.amazon.aws.hooks.eks import COMMAND
+
+        # Verify stderr is redirected to /dev/null, not merged with stdout
+        assert "2>/dev/null" in COMMAND, (
+            "COMMAND must redirect stderr to /dev/null to prevent output contamination"
+        )


This test hard-codes that stderr must be redirected to /dev/null. The actual requirement for correctness is that stderr must not be merged into stdout (i.e., avoid 2>&1) so stdout remains parseable; discarding stderr is an implementation choice and reduces debuggability. Consider relaxing the assertion to only require that 2>&1 is not present, so future changes can keep stderr visible while still fixing the parsing issue.

Suggested change

-        """Verify COMMAND template redirects stderr to /dev/null to prevent
-        Python warnings/log output from contaminating stdout and breaking
-        bash token parsing. This is critical for cross-account AssumeRole
-        scenarios where the kubeconfig exec plugin must produce a clean token."""
-        from airflow.providers.amazon.aws.hooks.eks import COMMAND
-
-        # Verify stderr is redirected to /dev/null, not merged with stdout
-        assert "2>/dev/null" in COMMAND, (
-            "COMMAND must redirect stderr to /dev/null to prevent output contamination"
-        )
+        """Verify COMMAND template keeps stderr separate from stdout so
+        Python warnings/log output cannot contaminate stdout and break
+        bash token parsing. This is critical for cross-account AssumeRole
+        scenarios where the kubeconfig exec plugin must produce a clean token."""
+        from airflow.providers.amazon.aws.hooks.eks import COMMAND
+
+        # Verify stderr is not merged with stdout. Whether stderr is discarded
+        # or left visible is an implementation choice.

Copilot · 2026-04-10T19:59:25Z

providers/amazon/tests/unit/amazon/aws/hooks/test_eks.py

+        # Verify it exits with error on empty token
+        assert "exit 1" in COMMAND or 'exit "$' in COMMAND, (
+            "COMMAND must exit with error when token extraction fails"


assert "exit 1" in COMMAND or 'exit "$' in COMMAND can pass even if the empty-token validation never exits, because 'exit "$' matches the existing exit "$status" earlier in the script. Tighten this assertion to specifically verify that the empty-token block exits (e.g., by checking ordering relative to if [ -z "$token" ] or matching the exit 1 inside that block).

Suggested change

-        # Verify it exits with error on empty token
-        assert "exit 1" in COMMAND or 'exit "$' in COMMAND, (
-            "COMMAND must exit with error when token extraction fails"
+        # Verify the empty-token validation block exits with an error
+        assert 'if [ -z "$token" ]; then\n exit 1\nfi' in COMMAND, (
+            "COMMAND must exit with error in the empty-token validation block when token extraction fails"
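
An exact-substring match like the one above is sensitive to indentation and to any extra lines inside the block. The ordering-based check mentioned in the comment could be sketched as follows. `SAMPLE_COMMAND` is a hypothetical stand-in for the real template (which may differ), and `empty_token_block_exits` is an illustrative helper, not part of the PR.

```python
# Hypothetical stand-in for the COMMAND template; the real one lives in
# airflow.providers.amazon.aws.hooks.eks and may differ.
SAMPLE_COMMAND = '''
if [ "$status" -ne 0 ]; then
    exit "$status"
fi
if [ -z "$token" ]; then
    printf 'Failed to extract token from eks_get_token output.' >&2
    exit 1
fi
'''


def empty_token_block_exits(command: str) -> bool:
    """Return True if 'exit 1' appears between the empty-token guard and its 'fi'."""
    guard = 'if [ -z "$token" ]'
    start = command.find(guard)
    if start == -1:
        return False
    # First 'fi' at the start of a line after the guard closes the block,
    # assuming the block contains no nested if statements.
    end = command.find("\nfi", start)
    if end == -1:
        return False
    return "exit 1" in command[start:end]


print(empty_token_block_exits(SAMPLE_COMMAND))
```

Because the search is scoped to the guard's own block, the earlier `exit "$status"` can no longer satisfy the assertion by accident.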

anmolxlight requested a review from o-nikolas as a code owner April 5, 2026 21:48

boring-cyborg bot added area:providers provider:amazon AWS/Amazon - related issues labels Apr 5, 2026

eladkal requested review from ferruzzi and vincbeck April 7, 2026 05:42

o-nikolas reviewed Apr 8, 2026


kaxil requested a review from Copilot April 10, 2026 19:55

Copilot AI reviewed Apr 10, 2026

anmolxlight wants to merge 1 commit into apache:main from anmolxlight:fix/eks-pod-operator-cross-account-401


3 participants
