Skip to content

Surface remote-log upload failures via structured warnings#66571

Merged
vatsrahul1001 merged 2 commits into
apache:mainfrom
potiuk:fix/tasksdk-remote-log-load-warning
May 19, 2026
Merged

Surface remote-log upload failures via structured warnings#66571
vatsrahul1001 merged 2 commits into
apache:mainfrom
potiuk:fix/tasksdk-remote-log-load-warning

Conversation

@potiuk
Copy link
Copy Markdown
Member

@potiuk potiuk commented May 7, 2026

Summary

upload_to_remote() silently returned in three failure paths (handler load failure, log-path resolution exception, handler.upload exception) with no signal to the operator that logs were not reaching the remote system. The supervisor's _upload_logs already logs the outermost exception, but the inner paths inside upload_to_remote() were silent — so a misconfigured remote handler or a transient remote-system outage would degrade silently while local-only logs continued.

Fix

Add a dedicated airflow.logging.remote structlog logger and emit a log.warning at each of the three failure paths with the TI id and the underlying error string. No behaviour change otherwise — failures still fall through softly so a bad remote handler doesn't abort the task lifecycle.

Reported by

L3 ASVS sweep — apache/tooling-agents#24 (FINDING-004).


Was generative AI tooling used to co-author this PR?
  • Yes — Claude Code (Opus 4.7)

Generated-by: Claude Code (Opus 4.7) following the guidelines

`upload_to_remote()` silently returned in three failure paths (handler
load failure, log-path resolution exception, handler.upload exception)
with no signal to the operator that logs were not reaching the remote
system. The supervisor's `_upload_logs` already logs the outermost
exception, but the inner paths inside `upload_to_remote()` were silent
— so a misconfigured remote handler or a transient remote-system
outage would degrade silently while local-only logs continued.

Add a dedicated `airflow.logging.remote` structlog logger and emit a
`log.warning` at each of the three failure paths with the TI id and
the underlying error string. No behaviour change otherwise — failures
still fall through softly so a bad remote handler doesn't abort the
task lifecycle.

Reported by the L3 ASVS sweep at apache/tooling-agents#24 (FINDING-004).
@potiuk potiuk force-pushed the fix/tasksdk-remote-log-load-warning branch from c33b0a4 to 313bc16 Compare May 17, 2026 19:39
@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented May 17, 2026

I'd love to get this one merged — and would love it in 3.2.2 if it's not too late. cc @vatsrahul1001 (3.2.2 RM)


Drafted-by: Claude Code (Opus 4.7); reviewed by @potiuk before posting

@potiuk potiuk added this to the Airflow 3.2.2 milestone May 17, 2026
@potiuk potiuk added the backport-to-v3-2-test Mark PR with this label to backport to v3-2-test branch label May 17, 2026
Comment thread task-sdk/src/airflow/sdk/log.py Outdated
Copy link
Copy Markdown
Member

@Lee-W Lee-W left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

non-blocking nit

Comment thread task-sdk/src/airflow/sdk/log.py Outdated
Comment thread task-sdk/src/airflow/sdk/log.py Outdated
Comment thread task-sdk/src/airflow/sdk/log.py Outdated
@vatsrahul1001
Copy link
Copy Markdown
Contributor

@potiuk can you address comments?

Copy link
Copy Markdown
Contributor

@amoghrajesh amoghrajesh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also have a test pls?

Comment thread task-sdk/src/airflow/sdk/log.py Outdated
Comment thread task-sdk/src/airflow/sdk/log.py Outdated
Comment thread task-sdk/src/airflow/sdk/log.py Outdated
- Use ti.id directly (available since 3.0) instead of getattr fallback.
- Pass exc_info=exc to structlog so the traceback is preserved on
  path-resolution and upload failures.
- Trim past-tense from the docstring comment.
- Add tests covering the three failure paths and the silent
  success / no-path cases.
@potiuk
Copy link
Copy Markdown
Member Author

potiuk commented May 19, 2026

Addressed in 9c39993:

  • @Lee-W — switched to str(ti.id) (3 places) and exc_info=exc for both except paths so the traceback is preserved.
  • @amoghrajesh — trimmed the past-tense sentence from the comment; also added task-sdk/tests/task_sdk/test_log.py with 5 tests for upload_to_remote() covering the three failure paths plus silent-no-path and silent-success.

Drafted-by: Claude Code (Opus 4.7); reviewed by @potiuk before posting

@vatsrahul1001 vatsrahul1001 merged commit ef87426 into apache:main May 19, 2026
113 checks passed
@github-actions
Copy link
Copy Markdown
Contributor

Backport successfully created: v3-2-test

Note: As of Merging PRs targeted for Airflow 3.X
the committer who merges the PR is responsible for backporting the PRs that are bug fixes (generally speaking) to the maintenance branches.

In matter of doubt please ask in #release-management Slack channel.

Status Branch Result
v3-2-test PR Link

vatsrahul1001 pushed a commit that referenced this pull request May 19, 2026
#66571) (#67172)

* Surface remote-log upload failures via structured warnings

`upload_to_remote()` silently returned in three failure paths (handler
load failure, log-path resolution exception, handler.upload exception)
with no signal to the operator that logs were not reaching the remote
system. The supervisor's `_upload_logs` already logs the outermost
exception, but the inner paths inside `upload_to_remote()` were silent
— so a misconfigured remote handler or a transient remote-system
outage would degrade silently while local-only logs continued.

Add a dedicated `airflow.logging.remote` structlog logger and emit a
`log.warning` at each of the three failure paths with the TI id and
the underlying error string. No behaviour change otherwise — failures
still fall through softly so a bad remote handler doesn't abort the
task lifecycle.

Reported by the L3 ASVS sweep at apache/tooling-agents#24 (FINDING-004).

* Address review feedback on remote-log upload warnings

- Use ti.id directly (available since 3.0) instead of getattr fallback.
- Pass exc_info=exc to structlog so the traceback is preserved on
  path-resolution and upload failures.
- Trim past-tense from the docstring comment.
- Add tests covering the three failure paths and the silent
  success / no-path cases.
(cherry picked from commit ef87426)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
vatsrahul1001 pushed a commit that referenced this pull request May 20, 2026
#66571) (#67172)

* Surface remote-log upload failures via structured warnings

`upload_to_remote()` silently returned in three failure paths (handler
load failure, log-path resolution exception, handler.upload exception)
with no signal to the operator that logs were not reaching the remote
system. The supervisor's `_upload_logs` already logs the outermost
exception, but the inner paths inside `upload_to_remote()` were silent
— so a misconfigured remote handler or a transient remote-system
outage would degrade silently while local-only logs continued.

Add a dedicated `airflow.logging.remote` structlog logger and emit a
`log.warning` at each of the three failure paths with the TI id and
the underlying error string. No behaviour change otherwise — failures
still fall through softly so a bad remote handler doesn't abort the
task lifecycle.

Reported by the L3 ASVS sweep at apache/tooling-agents#24 (FINDING-004).

* Address review feedback on remote-log upload warnings

- Use ti.id directly (available since 3.0) instead of getattr fallback.
- Pass exc_info=exc to structlog so the traceback is preserved on
  path-resolution and upload failures.
- Trim past-tense from the docstring comment.
- Add tests covering the three failure paths and the silent
  success / no-path cases.
(cherry picked from commit ef87426)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
vatsrahul1001 pushed a commit that referenced this pull request May 20, 2026
#66571) (#67172)

* Surface remote-log upload failures via structured warnings

`upload_to_remote()` silently returned in three failure paths (handler
load failure, log-path resolution exception, handler.upload exception)
with no signal to the operator that logs were not reaching the remote
system. The supervisor's `_upload_logs` already logs the outermost
exception, but the inner paths inside `upload_to_remote()` were silent
— so a misconfigured remote handler or a transient remote-system
outage would degrade silently while local-only logs continued.

Add a dedicated `airflow.logging.remote` structlog logger and emit a
`log.warning` at each of the three failure paths with the TI id and
the underlying error string. No behaviour change otherwise — failures
still fall through softly so a bad remote handler doesn't abort the
task lifecycle.

Reported by the L3 ASVS sweep at apache/tooling-agents#24 (FINDING-004).

* Address review feedback on remote-log upload warnings

- Use ti.id directly (available since 3.0) instead of getattr fallback.
- Pass exc_info=exc to structlog so the traceback is preserved on
  path-resolution and upload failures.
- Trim past-tense from the docstring comment.
- Add tests covering the three failure paths and the silent
  success / no-path cases.
(cherry picked from commit ef87426)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
vatsrahul1001 pushed a commit that referenced this pull request May 21, 2026
#66571) (#67172)

* Surface remote-log upload failures via structured warnings

`upload_to_remote()` silently returned in three failure paths (handler
load failure, log-path resolution exception, handler.upload exception)
with no signal to the operator that logs were not reaching the remote
system. The supervisor's `_upload_logs` already logs the outermost
exception, but the inner paths inside `upload_to_remote()` were silent
— so a misconfigured remote handler or a transient remote-system
outage would degrade silently while local-only logs continued.

Add a dedicated `airflow.logging.remote` structlog logger and emit a
`log.warning` at each of the three failure paths with the TI id and
the underlying error string. No behaviour change otherwise — failures
still fall through softly so a bad remote handler doesn't abort the
task lifecycle.

Reported by the L3 ASVS sweep at apache/tooling-agents#24 (FINDING-004).

* Address review feedback on remote-log upload warnings

- Use ti.id directly (available since 3.0) instead of getattr fallback.
- Pass exc_info=exc to structlog so the traceback is preserved on
  path-resolution and upload failures.
- Trim past-tense from the docstring comment.
- Add tests covering the three failure paths and the silent
  success / no-path cases.
(cherry picked from commit ef87426)

Co-authored-by: Jarek Potiuk <jarek@potiuk.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:task-sdk backport-to-v3-2-test Mark PR with this label to backport to v3-2-test branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants