Skip to content

Fix timeout_with_traceback crashes on Windows and non-main threads#63664

Open
Gautam-Bharadwaj wants to merge 3 commits intoapache:mainfrom
Gautam-Bharadwaj:fix-timeout-with-traceback
Open

Fix timeout_with_traceback crashes on Windows and non-main threads#63664
Gautam-Bharadwaj wants to merge 3 commits intoapache:mainfrom
Gautam-Bharadwaj:fix-timeout-with-traceback

Conversation

@Gautam-Bharadwaj
Copy link
Contributor

@Gautam-Bharadwaj Gautam-Bharadwaj commented Mar 15, 2026

The timeout_with_traceback context manager in airflow.utils.db uses signal.SIGALRM to raise a TimeoutException.

However, signal.SIGALRM does not exist on Windows, causing an AttributeError. Furthermore, even on Unix, signal.signal() can only be called from the main thread, raising a ValueError inside worker threads. This completely breaks logic wrapped in this context in non-standard environments.

I have added safe exception handling around the signal initialization. If signal.SIGALRM fails due to OS constraints or thread context, it logs a warning and proceeds gracefully as a no-op timeout, instead of crashing the process.


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

@potiuk
Copy link
Member

potiuk commented Mar 16, 2026

@Gautam-Bharadwaj This PR has a few issues that need to be addressed before it can be reviewed — please see our Pull Request quality criteria.

Issues found:

  • Testing Requirements: This is a bug-fix PR with no regression tests added. The checklist requires a test that fails without the fix and passes with it. The fix handles two distinct scenarios — signal.SIGALRM missing on Windows (AttributeError) and signal.signal() called from a non-main thread (ValueError) — yet neither scenario has a corresponding test. Without tests, the fix cannot be verified and could silently regress. At minimum, a parametrized test that patches signal.signal to raise AttributeError and ValueError respectively, verifying the context manager does not raise and logs a warning, is needed.

Note: Your branch is 21 commits behind main. Some check failures may be caused by changes in the base branch rather than by your PR. Please rebase your branch and push again to get up-to-date CI results.

What to do next:

  • The comment informs you what you need to do.
  • Fix each issue, then mark the PR as "Ready for review" in the GitHub UI - but only after making sure that all the issues are fixed.
  • There is no rush — take your time and work at your own pace. We appreciate your contribution and are happy to wait for updates.
  • Maintainers will then proceed with a normal review.

There is no rush — take your time and work at your own pace. We appreciate your contribution and are happy to wait for updates. If you have questions, feel free to ask on the Airflow Slack.

@Gautam-Bharadwaj Gautam-Bharadwaj force-pushed the fix-timeout-with-traceback branch from e326537 to 331fe27 Compare March 16, 2026 07:49
@Gautam-Bharadwaj
Copy link
Contributor Author

Gautam-Bharadwaj commented Mar 16, 2026

@potiuk Thanks for the review Jarek! I've rebased the branch and added comprehensive unit tests for timeout_with_traceback in airflow-core/tests/unit/utils/test_timeout_traceback.py. The tests cover:

  • Normal behavior on Unix/Linux systems with SIGALRM.
  • Graceful handling of AttributeError (case when SIGALRM is missing on Windows).
  • Graceful handling of ValueError (case when called from a non-main thread).
  • Verified that a warning is logged when timeout is unsupported.
  • Verified that TimeoutException is correctly raised when the handler is triggered.
    Ready for review! 👋

@Gautam-Bharadwaj Gautam-Bharadwaj force-pushed the fix-timeout-with-traceback branch from f0bb892 to 9bb2812 Compare March 17, 2026 05:13
@Gautam-Bharadwaj Gautam-Bharadwaj force-pushed the fix-timeout-with-traceback branch from 9bb2812 to b2f75c3 Compare March 18, 2026 07:19
@Gautam-Bharadwaj
Copy link
Contributor Author

Hi @potiuk Sir,

Could you please take a look at this PR? If it looks good to you, please feel free to merge it.

Thanks in advance!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants