
fix: resolve socket hang up errors for large file execution #73

Merged
usnavy13 merged 2 commits into main from dev
Mar 11, 2026

@usnavy13
Owner

Description

Fixes socket hang up errors during large-file or slow /exec requests by keeping the HTTP connection alive and removing blocking large-file reads from the asyncio event loop.

This PR includes the dev -> main delta at 2529fe6 over 0dc767a:

  • stream keepalive whitespace from POST /exec while execution is still running
  • stream large MinIO-backed files directly to disk before mounting into the sandbox
  • add a functional regression test for concurrent requests during 50 MB file-backed execution
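
The keepalive behaviour in the first bullet can be sketched roughly as follows. This is a minimal illustration using only `asyncio`, not the code from this PR: `stream_with_keepalive`, `demo_keepalive`, and the `{"stdout": "ok"}` payload are hypothetical names and values chosen for the example, and the 3-second default mirrors the interval mentioned in the commit message.

```python
import asyncio
import json
from typing import Any, AsyncIterator, Awaitable


async def stream_with_keepalive(
    work: Awaitable[dict[str, Any]],
    interval: float = 3.0,
) -> AsyncIterator[bytes]:
    """Yield a space every `interval` seconds while `work` runs, then the JSON body."""
    task = asyncio.ensure_future(work)
    try:
        while True:
            # asyncio.wait returns on timeout without cancelling the task.
            done, _ = await asyncio.wait({task}, timeout=interval)
            if done:
                break
            yield b" "  # keepalive byte; valid leading JSON whitespace
        # task.result() re-raises any exception from `work` mid-stream.
        yield json.dumps(task.result()).encode()
    finally:
        # If the client disconnects early, stop the background work.
        if not task.done():
            task.cancel()


async def demo_keepalive() -> bytes:
    """Hypothetical slow execution, shortened so the demo finishes quickly."""

    async def slow_exec() -> dict[str, Any]:
        await asyncio.sleep(0.2)
        return {"stdout": "ok"}

    chunks = [c async for c in stream_with_keepalive(slow_exec(), interval=0.02)]
    return b"".join(chunks)


if __name__ == "__main__":
    print(asyncio.run(demo_keepalive()))
```

In a FastAPI handler, a generator like this would presumably be wrapped in `StreamingResponse(..., media_type="application/json")`; how the PR actually wires it up is not shown here.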

Fixes # (issue)

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update

How Has This Been Tested?

Verified locally on March 11, 2026.

  • pytest tests/unit/
    Result: 371 passed in 0.73s
  • API_BASE=http://localhost:8080 API_KEY=<repo .env key> pytest tests/functional/test_concurrent_file_exec.py -v
    Result: 1 passed in 5.55s
  • Local stack health check at http://localhost:8080/health

Notes:

  • Functional tests default to http://localhost:8000, but this local Docker Compose stack exposes the API on http://localhost:8080, so the regression test was run against 8080.
  • GitHub PR CI currently runs pytest tests/unit/ only; the new functional regression coverage is included here as local verification and should be treated as follow-up CI work.
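
The base-URL override described in the first note could look something like the sketch below. `resolve_api_base` is a hypothetical helper, not a function from the test suite; only the `API_BASE` variable name and the two URLs come from the notes above.

```python
import os
from typing import Mapping


def resolve_api_base(env: Mapping[str, str] = os.environ) -> str:
    """Return the API base URL, falling back to the documented default."""
    # Assumed behaviour: API_BASE wins when set, otherwise port 8000 is used.
    return env.get("API_BASE", "http://localhost:8000").rstrip("/")


if __name__ == "__main__":
    # With API_BASE=http://localhost:8080 exported, this prints the 8080 URL.
    print(resolve_api_base())
```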

Reviewer Notes

  • POST /exec remains JSON-compatible, but responses may now begin with keepalive whitespace before the final JSON payload.
  • The new stream path re-raises validation and service-unavailable errors for normal FastAPI handling; reviewers should pay attention to how unexpected runtime failures surface from the streaming response.
  • No request or response schema fields changed.
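
To illustrate the first reviewer note from the client side, here is a small sketch showing that a standard JSON parser accepts a body padded with keepalive whitespace. `parse_exec_response` and the `stdout` / `exit_code` fields are placeholders for illustration, not the actual response schema.

```python
import json
from typing import Any


def parse_exec_response(raw: bytes) -> dict[str, Any]:
    """Parse an /exec body that may begin with keepalive whitespace."""
    # json.loads skips leading and trailing JSON whitespace (space, \t, \n, \r).
    return json.loads(raw)


# Simulated body: a few keepalive spaces followed by the final payload.
payload = {"stdout": "hello\n", "exit_code": 0}  # hypothetical fields
padded = b"   " + json.dumps(payload).encode()
result = parse_exec_response(padded)
```

Clients that compare raw bytes, or that parse the stream incrementally, may need to strip the leading whitespace themselves. Because the status line goes out before execution finishes, a failure that occurs after the first keepalive byte cannot change the HTTP status code, which is likely why the second note asks reviewers to check how such failures surface.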

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Joe Licata and others added 2 commits March 5, 2026 22:28
Node.js 20 sets a default 5-second socket timeout on HTTP connections.
When code execution takes longer (cold sandbox starts, large file
mounting, heavy pandas operations), the client destroys the socket
before the server responds, causing "socket hang up" errors.

Three fixes applied:

1. Streaming keepalive on /exec endpoint: sends whitespace every 3s
   to keep the TCP connection alive during long operations. JSON
   parsers ignore leading whitespace so this is fully transparent.

2. Non-blocking file I/O: moved MinIO response.read() into the thread
   pool executor (was blocking the asyncio event loop), and added
   stream_file_to_path() using fget_object for direct disk-to-disk
   transfer without loading files into memory.

3. Increased default sandbox pool size (SANDBOX_POOL_PY=5) to reduce
   cold-start frequency under concurrent load.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
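
Fix 2 in the commit message above relies on moving a blocking read off the event loop. The sketch below shows the general pattern with a stand-in for the storage call; `blocking_read`, `read_off_loop`, and `demo_offload` are hypothetical names, and `time.sleep` only simulates a slow MinIO `response.read()`.

```python
import asyncio
import contextlib
import time


def blocking_read(size: int) -> bytes:
    """Stand-in for a blocking storage read such as response.read()."""
    time.sleep(0.2)  # simulated network / disk latency
    return b"\0" * size


async def read_off_loop(size: int) -> bytes:
    """Run the blocking read in the default thread pool executor."""
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, blocking_read, size)


async def demo_offload() -> tuple[int, int]:
    """Count event-loop ticks that happen while the read is in progress."""
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    data = await read_off_loop(1024)
    hb.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await hb
    return len(data), ticks


if __name__ == "__main__":
    print(asyncio.run(demo_offload()))
```

If `blocking_read` were called directly inside the coroutine, the heartbeat would not advance until the read returned, which is the stall that other concurrent requests would see. The same executor pattern would presumably wrap the `fget_object` call named in the commit message, which writes the object straight to a file path instead of returning bytes.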
fix: Resolve socket hang up errors for large file execution
@usnavy13 merged commit c612471 into main on Mar 11, 2026
3 checks passed
djuillard pushed a commit to On-Behalf-AI/LibreCodeInterpreter that referenced this pull request Apr 21, 2026
fix: resolve socket hang up errors for large file execution
