
Fix: raise soft ulimit for open files in Python runtime startup #3612

Closed
Ankitsinghsisodya wants to merge 4 commits into knative:main from Ankitsinghsisodya:issue-3513-python-ulimit

Conversation

@Ankitsinghsisodya
Contributor

Changes

  • 🐛 Raise the soft open-file limit to match the hard limit at Python middleware startup, preventing failures under load on platforms with a low default soft limit (e.g. 1024)

/kind bug

Fixes #3513

Release Note

Python functions now raise the soft open-file limit to match the hard limit at startup, matching the behaviour of the Go and Java runtimes. This prevents failures under load on platforms where the default soft limit is low (e.g. 1024).

Docs


Platforms with a low default soft limit (e.g. 1024) caused Python
functions to fail under load. This matches the behaviour already
present in the Go and Java runtimes by raising the soft limit to the
hard limit at middleware startup.

Fixes knative#3513
Copilot AI review requested due to automatic review settings April 15, 2026 20:33
@knative-prow knative-prow bot added the kind/bug Bugs label Apr 15, 2026
@knative-prow knative-prow bot requested review from dsimansk and jrangelramos April 15, 2026 20:33
@knative-prow

knative-prow bot commented Apr 15, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: Ankitsinghsisodya
Once this PR has been reviewed and has the lgtm label, please assign matzew for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added the size/S 🤖 PR changes 10-29 lines, ignoring generated files. label Apr 15, 2026
@knative-prow

knative-prow bot commented Apr 15, 2026

Hi @Ankitsinghsisodya. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@knative-prow knative-prow bot added the needs-ok-to-test 🤖 Needs an org member to approve testing label Apr 15, 2026

Copilot AI left a comment


Pull request overview

This PR updates the generated Python runtime startup glue code to raise the soft open-file (RLIMIT_NOFILE) limit up to the hard limit during middleware startup, aligning behavior with other runtimes and preventing “too many open files” failures under load on low-default platforms.

Changes:

  • Add RLIMIT_NOFILE inspection/adjustment at startup for instanced HTTP Python functions.
  • Add the same RLIMIT_NOFILE inspection/adjustment at startup for instanced CloudEvents Python functions.

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| templates/python/scaffolding/instanced-http/service/main.py | Raises soft NOFILE limit to hard limit during startup for HTTP scaffolding entrypoint. |
| templates/python/scaffolding/instanced-cloudevents/service/main.py | Raises soft NOFILE limit to hard limit during startup for CloudEvents scaffolding entrypoint. |


Comment on lines +8 to +23
import resource
from func_python.http import serve

logging.basicConfig(level=logging.INFO)

# Raise the soft limit for open files to match the hard limit.
# Platforms such as some container runtimes default the soft limit to 1024,
# which causes failures under load. Go and Java runtimes do this
# automatically; we replicate the behaviour here.
try:
    _soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if _soft < _hard:
        resource.setrlimit(resource.RLIMIT_NOFILE, (_hard, _hard))
        logging.info("Raised open-file limit from %d to %d", _soft, _hard)
except Exception as e:
    logging.warning("Could not raise open-file limit: %s", e)

Copilot AI Apr 15, 2026


resource is not available on Windows, so importing it at module import time will raise ImportError and prevent local runs (and any Windows-based tooling) from starting. Consider wrapping the import resource in a try/except ImportError (or importing inside the existing try: block) and skipping the ulimit adjustment on unsupported platforms.

Suggested change
- import resource
- from func_python.http import serve
- logging.basicConfig(level=logging.INFO)
- # Raise the soft limit for open files to match the hard limit.
- # Platforms such as some container runtimes default the soft limit to 1024,
- # which causes failures under load. Go and Java runtimes do this
- # automatically; we replicate the behaviour here.
- try:
-     _soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
-     if _soft < _hard:
-         resource.setrlimit(resource.RLIMIT_NOFILE, (_hard, _hard))
-         logging.info("Raised open-file limit from %d to %d", _soft, _hard)
- except Exception as e:
-     logging.warning("Could not raise open-file limit: %s", e)
+ from func_python.http import serve
+ try:
+     import resource
+ except ImportError:
+     resource = None
+ logging.basicConfig(level=logging.INFO)
+ # Raise the soft limit for open files to match the hard limit.
+ # Platforms such as some container runtimes default the soft limit to 1024,
+ # which causes failures under load. Go and Java runtimes do this
+ # automatically; we replicate the behaviour here.
+ if resource is not None:
+     try:
+         _soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
+         if _soft < _hard:
+             resource.setrlimit(resource.RLIMIT_NOFILE, (_hard, _hard))
+             logging.info("Raised open-file limit from %d to %d", _soft, _hard)
+     except Exception as e:
+         logging.warning("Could not raise open-file limit: %s", e)
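
The review comment also mentions a second option: importing resource inside the existing try: block. A minimal sketch of that variant follows. The function name raise_nofile_limit and its return value are hypothetical and not part of this PR; the log messages are borrowed from the snippets in this thread.

```python
import logging

logging.basicConfig(level=logging.INFO)


def raise_nofile_limit():
    """Try to raise the soft open-file limit to the hard limit.

    Hypothetical helper for illustration. Returns the (soft, hard) pair in
    effect afterwards, or None when the limit could not be read or changed.
    """
    try:
        # Imported lazily: the resource module does not exist on Windows,
        # so a failed import is handled like any other "cannot adjust" case.
        import resource

        soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        if soft < hard:
            resource.setrlimit(resource.RLIMIT_NOFILE, (hard, hard))
            logging.info("Raised open-file limit from %d to %d", soft, hard)
        return resource.getrlimit(resource.RLIMIT_NOFILE)
    except ImportError:
        logging.info("Open-file limit adjustment is unavailable on this platform")
    except (ValueError, OSError) as e:
        logging.warning("Could not raise open-file limit: %s", e)
    return None
```

Compared with the suggested change, this keeps the module namespace free of a resource = None sentinel, at the cost of hiding the import inside a function.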

Comment on lines +8 to +23
import resource
from func_python.cloudevent import serve

logging.basicConfig(level=logging.INFO)

# Raise the soft limit for open files to match the hard limit.
# Platforms such as some container runtimes default the soft limit to 1024,
# which causes failures under load. Go and Java runtimes do this
# automatically; we replicate the behaviour here.
try:
    _soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if _soft < _hard:
        resource.setrlimit(resource.RLIMIT_NOFILE, (_hard, _hard))
        logging.info("Raised open-file limit from %d to %d", _soft, _hard)
except Exception as e:
    logging.warning("Could not raise open-file limit: %s", e)

Copilot AI Apr 15, 2026


resource is not available on Windows, so importing it at module import time will raise ImportError and prevent local runs (and any Windows-based tooling) from starting. Consider wrapping the import resource in a try/except ImportError (or importing inside the existing try: block) and skipping the ulimit adjustment on unsupported platforms.

Suggested change
- import resource
- from func_python.cloudevent import serve
- logging.basicConfig(level=logging.INFO)
- # Raise the soft limit for open files to match the hard limit.
- # Platforms such as some container runtimes default the soft limit to 1024,
- # which causes failures under load. Go and Java runtimes do this
- # automatically; we replicate the behaviour here.
- try:
-     _soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
-     if _soft < _hard:
-         resource.setrlimit(resource.RLIMIT_NOFILE, (_hard, _hard))
-         logging.info("Raised open-file limit from %d to %d", _soft, _hard)
- except Exception as e:
-     logging.warning("Could not raise open-file limit: %s", e)
+ from func_python.cloudevent import serve
+ try:
+     import resource
+ except ImportError:
+     resource = None # type: ignore[assignment]
+ logging.basicConfig(level=logging.INFO)
+ # Raise the soft limit for open files to match the hard limit.
+ # Platforms such as some container runtimes default the soft limit to 1024,
+ # which causes failures under load. Go and Java runtimes do this
+ # automatically; we replicate the behaviour here.
+ if resource is not None:
+     try:
+         _soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
+         if _soft < _hard:
+             resource.setrlimit(resource.RLIMIT_NOFILE, (_hard, _hard))
+             logging.info("Raised open-file limit from %d to %d", _soft, _hard)
+     except Exception as e:
+         logging.warning("Could not raise open-file limit: %s", e)
+ else:
+     logging.info("Open-file limit adjustment is unavailable on this platform")

@codecov

codecov bot commented Apr 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 56.28%. Comparing base (07bdeaf) to head (993e56d).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3612      +/-   ##
==========================================
+ Coverage   56.26%   56.28%   +0.02%     
==========================================
  Files         180      180              
  Lines       20522    20543      +21     
==========================================
+ Hits        11546    11563      +17     
- Misses       7774     7778       +4     
  Partials     1202     1202              
| Flag | Coverage Δ | |
| --- | --- | --- |
| e2e | 36.21% <ø> (+0.01%) | ⬆️ |
| e2e go | 32.85% <ø> (+<0.01%) | ⬆️ |
| e2e node | 28.60% <ø> (+0.01%) | ⬆️ |
| e2e python | 33.22% <ø> (+<0.01%) | ⬆️ |
| e2e quarkus | 28.74% <ø> (+0.01%) | ⬆️ |
| e2e rust | 28.15% <ø> (-0.01%) | ⬇️ |
| e2e springboot | 26.63% <ø> (+0.01%) | ⬆️ |
| e2e typescript | 28.71% <ø> (+0.01%) | ⬆️ |
| e2e-config-ci | 18.06% <ø> (-0.02%) | ⬇️ |
| integration | 17.46% <ø> (-0.01%) | ⬇️ |
| unit macos-14 | 43.34% <ø> (+0.03%) | ⬆️ |
| unit macos-latest | 43.34% <ø> (+0.03%) | ⬆️ |
| unit ubuntu-24.04-arm | 43.53% <ø> (+0.03%) | ⬆️ |
| unit ubuntu-latest | 44.21% <ø> (+0.03%) | ⬆️ |
| unit windows-latest | 43.36% <ø> (+0.03%) | ⬆️ |


Platforms with a low default soft limit (e.g. 1024) caused Python
functions to fail under load. This matches the behaviour already
present in the Go and Java runtimes by raising the soft limit to the
hard limit at middleware startup.

The logic lives in a dedicated _ulimit.py helper so it is testable
in isolation and is not duplicated between the HTTP and CloudEvents
scaffolding variants.  Error handling distinguishes ImportError
(non-Unix platforms where the resource module is absent) from
ValueError/OSError (bad values or permission failures), so neither
case silently masks the original limit.

Five unit tests are added for each scaffolding variant and
hack/test-python.sh is updated to run them.

Fixes knative#3513
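
For readers without the diff at hand, the helper described in the commit message above might look roughly like the sketch below. The actual _ulimit.py is not shown in this thread, so everything here is an assumption: the function name configure_ulimit, the resource_module parameter (an injection point added so the logic can be tested without touching real limits), the boolean return value, and the FakeResource test double.

```python
import logging

logger = logging.getLogger(__name__)


def configure_ulimit(resource_module=None):
    """Raise the soft RLIMIT_NOFILE to the hard limit.

    Hypothetical sketch. Returns True if the soft limit was raised and
    False if it was left unchanged for any reason.
    """
    if resource_module is None:
        try:
            import resource as resource_module
        except ImportError:
            # Non-Unix platform: there is no limit to adjust.
            logger.info("resource module unavailable; open-file limit unchanged")
            return False

    try:
        soft, hard = resource_module.getrlimit(resource_module.RLIMIT_NOFILE)
        if soft >= hard:
            return False
        resource_module.setrlimit(resource_module.RLIMIT_NOFILE, (hard, hard))
    except (ValueError, OSError) as exc:
        # Bad values or insufficient permissions: keep the original limit.
        logger.warning("Could not raise open-file limit: %s", exc)
        return False

    logger.info("Raised open-file limit from %d to %d", soft, hard)
    return True


class FakeResource:
    """Minimal stand-in for the resource module, for unit tests only."""

    RLIMIT_NOFILE = 7  # arbitrary constant; the fake ignores its value

    def __init__(self, soft, hard, fail=False):
        self.limits = (soft, hard)
        self.fail = fail

    def getrlimit(self, which):
        return self.limits

    def setrlimit(self, which, limits):
        if self.fail:
            raise OSError("operation not permitted")
        self.limits = limits
```

A test can then call configure_ulimit(FakeResource(1024, 4096)) and check the recorded limits, while production code calls configure_ulimit() with no arguments.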
@knative-prow knative-prow bot added size/L 🤖 PR changes 100-499 lines, ignoring generated files. and removed size/S 🤖 PR changes 10-29 lines, ignoring generated files. labels Apr 15, 2026
@Ankitsinghsisodya Ankitsinghsisodya marked this pull request as draft April 15, 2026 21:02
@knative-prow knative-prow bot added the do-not-merge/work-in-progress 🤖 PR should not merge because it is a work in progress. label Apr 15, 2026
- Cap target at _MAX_NOFILE (65536) when hard == RLIM_INFINITY instead
  of passing RLIM_INFINITY to setrlimit, which raises OSError on most
  kernels
- Move _configure_ulimit() inside the if __name__ == "__main__" guard
  so importing the module does not mutate system limits as a side effect
- Add RLIM_INFINITY test case (the most common production scenario)
- Set mock RLIM_INFINITY to 9223372036854775807 (actual Linux value)
  so the soft < hard guard is not accidentally short-circuited

Addresses review feedback on knative#3513
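
The capping rule from the commit message above can be isolated as a pure function, which makes the RLIM_INFINITY case easy to test. This is only a sketch under assumptions: the name target_nofile and its signature are hypothetical, and only the 65536 cap comes from the commit message. The "unlimited" checks run before any numeric comparison because the value of RLIM_INFINITY differs between platforms.

```python
MAX_NOFILE = 65536  # cap named _MAX_NOFILE in the commit message


def target_nofile(soft, hard, rlim_infinity, cap=MAX_NOFILE):
    """Return the soft limit to request, or None if no change is needed.

    Hypothetical helper: never returns a value at or below the current
    soft limit, so callers cannot lower the limit by accident.
    """
    # An unlimited soft limit cannot be improved.
    if soft == rlim_infinity:
        return None
    # An unlimited hard limit is replaced by a finite cap rather than
    # being passed through to setrlimit.
    target = cap if hard == rlim_infinity else hard
    return target if target > soft else None


if __name__ == "__main__":
    # Only inspect real limits when run as a script, so that importing
    # this module has no side effects.
    try:
        import resource
    except ImportError:
        resource = None
    if resource is not None:
        soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        print(target_nofile(soft, hard, resource.RLIM_INFINITY))
```

The caller would pass the result to setrlimit as (target, hard) only when it is not None.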
pytest left .pytest_cache and __pycache__ directories inside the
templates tree after local test runs.  The embedded FS generator
picked them up, causing check-embedded-fs to fail in CI.

- Add .gitignore to both scaffolding roots to exclude these artifacts
- Update hack/test-python.sh to rm -rf .pytest_cache and __pycache__
  after each scaffolding test run
- Regenerate zz_filesystem_generated.go with clean templates
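
The PR performs this cleanup with rm -rf in hack/test-python.sh. As an illustration only, the same idea expressed in Python might look like the sketch below; the function name remove_test_caches is hypothetical and this is not the content of that script.

```python
import shutil
from pathlib import Path

# Directory names that pytest and the interpreter leave behind.
CACHE_DIR_NAMES = {".pytest_cache", "__pycache__"}


def remove_test_caches(root):
    """Delete cache directories below root and return the removed paths."""
    root = Path(root)
    # Collect all matches first so nothing is deleted while walking the tree.
    targets = [
        path
        for path in root.rglob("*")
        if path.is_dir() and path.name in CACHE_DIR_NAMES
    ]
    removed = []
    for path in sorted(targets):
        # A nested cache may already be gone if its parent was removed.
        if path.exists():
            shutil.rmtree(path)
            removed.append(path)
    return removed
```

Running it against a scaffolding root before regenerating the embedded filesystem would keep these artifacts out of the generated file.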
@Ankitsinghsisodya Ankitsinghsisodya marked this pull request as ready for review April 15, 2026 21:32
@knative-prow knative-prow bot removed the do-not-merge/work-in-progress 🤖 PR should not merge because it is a work in progress. label Apr 15, 2026
@Ankitsinghsisodya
Contributor Author

Closing in favour of the correct upstream fix.

The fix belongs in the middleware library, not in the scaffolding templates. Scaffolding is baked into container images at build time, so this PR would only help functions that are rebuilt after it merges — already-deployed functions would get nothing.

The correct fix is in knative-extensions/func-python#83, where it lives inside serve() and benefits all deployed functions on the next library version bump.

A follow-up PR to knative/func will be raised to bump the pinned func-python version in the scaffolding pyproject.toml once that PR merges and a release is cut.


Labels

kind/bug Bugs needs-ok-to-test 🤖 Needs an org member to approve testing size/L 🤖 PR changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support low-ulimit platforms in Python runtime

2 participants