FIX: Fixed output encoding in WinRMTrigger for WinRMOperator in deferred mode #64154

Open
dabla wants to merge 6 commits into apache:main from dabla:fix/output-encoding-winrm-trigger

@dabla dabla commented Mar 24, 2026

WinRM deferrable mode breaks with non-UTF8 output encodings
8fc49f4
providers/microsoft/winrm/src/airflow/providers/microsoft/winrm/triggers/winrm.py | providers/microsoft/winrm/src/airflow/providers/microsoft/winrm/operators/winrm.py
The new deferrable WinRM trigger base64-encodes stdout/stderr and then immediately decodes the resulting bytes using the user-supplied output_encoding. Base64 output is pure ASCII, so decoding it with an ASCII-incompatible encoding such as UTF-16 produces non-ASCII characters. When the operator resumes, _decode() calls base64.standard_b64decode on those strings, which raises because a string argument must contain only ASCII characters. As a result, deferrable mode crashes for output encodings that are not ASCII-compatible (e.g., UTF-16LE output from Windows), a regression introduced by this commit.
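
A minimal sketch of the failure mode described above, using only the standard library. It is not the provider's code: the helper names `encode_for_event_buggy`, `encode_for_event`, and `decode_from_event` are hypothetical, and the sample text is made up. The assumed remedy shown here is to treat the base64 payload as ASCII in transit and apply `output_encoding` only after base64-decoding.

```python
import base64


def encode_for_event_buggy(raw: bytes, output_encoding: str) -> str:
    # Hypothetical stand-in for the problematic step: the base64 bytes
    # (which are ASCII) are decoded with the user-supplied encoding.
    return base64.b64encode(raw).decode(output_encoding)


def encode_for_event(raw: bytes) -> str:
    # Assumed remedy: base64 output is ASCII, so decode it as ASCII.
    return base64.b64encode(raw).decode("ascii")


def decode_from_event(payload: str, output_encoding: str) -> str:
    # Undo base64 first, then interpret the original bytes.
    return base64.standard_b64decode(payload).decode(output_encoding)


# Made-up sample: text as a Windows host might emit it in UTF-16LE.
raw = "héllo wörld".encode("utf-16-le")

# Buggy path: the payload becomes a string of non-ASCII characters.
broken = encode_for_event_buggy(raw, "utf-16-le")
try:
    base64.standard_b64decode(broken)
    buggy_error = None
except ValueError as exc:
    # standard_b64decode rejects str input that is not pure ASCII.
    buggy_error = exc

# Remedied path: the payload survives the round trip.
roundtrip = decode_from_event(encode_for_event(raw), "utf-16-le")
```

With an ASCII-compatible encoding such as UTF-8 the buggy path happens to work, which is why the problem only surfaces for encodings like UTF-16.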


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

@eladkal eladkal force-pushed the fix/output-encoding-winrm-trigger branch from 8884b18 to f5879fb Compare March 24, 2026 16:53

eladkal commented Mar 24, 2026

tests are failing :(


dabla commented Mar 24, 2026

> tests are failing :(

They don't seem to be related to the changes in this PR. Weird.

@dabla dabla closed this Mar 25, 2026
@dabla dabla reopened this Mar 25, 2026
@dabla dabla force-pushed the fix/output-encoding-winrm-trigger branch 2 times, most recently from a2ea719 to 8259da3 Compare March 25, 2026 17:11
@dabla dabla force-pushed the fix/output-encoding-winrm-trigger branch from 8259da3 to 021700a Compare March 25, 2026 17:14

@bugraoz93 bugraoz93 left a comment

Looks good to me! I have a nit about how we ignore the 0 collected tests. It's not blocking, as it shouldn't cause any issues immediately.

        output_bytes = output.encode("ascii")
    else:
        output_bytes = output
except UnicodeEncodeError as e:

We had similar problems in the Docker and K8s providers. Can you take a look here? https://github.com/apache/airflow/pull/62632/changes

I suspect it is the same root cause: do you get chars that are not JSON-serializable in the logs? In both Docker and K8s the fix was to replace incorrect surrogates. These can appear if reads are done in bytes and a chunk contains only "half" of a Unicode char (e.g. a 2-byte char where only 1 byte is read at the end of a line).
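
The chunk-boundary effect mentioned here can be illustrated with a small standard-library sketch. It is not the fix from the linked PR: the sample string, the split point, and the use of `surrogateescape` to model how lone surrogates might arise are all assumptions made for illustration.

```python
import codecs

# Made-up sample: "é" is a 2-byte sequence in UTF-8.
data = "café".encode("utf-8")

# Simulate a byte-oriented read that cuts the 2-byte char in half.
chunks = [data[:4], data[4:]]

# Decoding each chunk on its own (here with surrogateescape) turns the
# two orphaned bytes into lone surrogates instead of "é".
naive = "".join(c.decode("utf-8", errors="surrogateescape") for c in chunks)

# Lone surrogates cannot be encoded as strict UTF-8.
try:
    naive.encode("utf-8")
    strict_encode_failed = False
except UnicodeEncodeError:
    strict_encode_failed = True

# One way to "replace incorrect surrogates": re-encode with
# errors="replace", which substitutes "?" for each lone surrogate.
cleaned = naive.encode("utf-8", errors="replace").decode("utf-8")

# Alternative: an incremental decoder buffers the incomplete sequence
# across chunk boundaries, so the character is recovered intact.
decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
incremental = "".join(decoder.decode(c) for c in chunks)
incremental += decoder.decode(b"", final=True)
```

Replacing surrogates makes the text safe to serialize but loses the affected character, whereas incremental decoding preserves it when all chunks of the stream are available.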
