Stability: File Descriptor hog leading to container resource exhaustion #942

Closed
edmondsiu0 opened this issue Sep 2, 2021 · 1 comment
Labels: bug (Something isn't working)

Comments

edmondsiu0 commented Sep 2, 2021

Description

Even with kernel culling parameters configured, the Voila Python process holds a large number of file descriptors open. This leads to instability: the load average climbs, or the ulimit is reached and the process and container exit.

Reproduce

App bundle is attached: simple-app.zip

1. Launch container

docker-compose -f simple-app.docker-compose.yml up --build

2. Watch Voila thread counts in container

while sleep 0.5; do docker exec -it voila-simple-app ps -eLf | grep voila | wc -l; done

3. Watch voila file descriptor count in container

docker exec -it --privileged --user root voila-simple-app ./simple-app-lsof.sh
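
The attached simple-app-lsof.sh is not reproduced in this thread. As a rough stand-in, and assuming a Linux container where /proc is mounted, a per-process descriptor count could be sampled along these lines. `count_open_fds` is a hypothetical helper name, not part of Voila or the app bundle.

```python
import os


def count_open_fds(pid="self"):
    """Return the number of open file descriptors for a process.

    Assumption: Linux, where each entry in /proc/<pid>/fd is one open
    descriptor. Inspecting another user's process needs matching
    privileges, which is why the docker exec above runs as root.

    With pid="self" the result includes the descriptor that listdir
    itself opens to read the directory, so compare deltas between
    samples rather than absolute values.
    """
    return len(os.listdir(f"/proc/{pid}/fd"))


if __name__ == "__main__":
    # Hypothetical usage: pass the Voila PID instead of "self".
    print(count_open_fds())
```

Sampling this in a loop while traffic runs, and again after it stops, gives the same rise-and-no-recovery picture without depending on lsof being installed in the image.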

4. Simulate traffic with ApacheBench

ab -n 100 -c 1 http://localhost:8866/
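
Where ApacheBench is not available, the same traffic pattern (100 requests, one at a time) can be approximated with the Python standard library. This is a minimal sketch, not what was used for the recording; `send_sequential_requests` is a hypothetical name.

```python
import urllib.error
import urllib.request


def send_sequential_requests(url, total=100, timeout=30):
    """Issue `total` GET requests one at a time, like `ab -n 100 -c 1`.

    Returns a list of HTTP status codes, with None for any request
    that failed before a response was received.
    """
    statuses = []
    for _ in range(total):
        try:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                response.read()  # read the whole body before moving on
                statuses.append(response.status)
        except urllib.error.HTTPError as err:
            # The server answered, but with a 4xx/5xx status.
            statuses.append(err.code)
        except (urllib.error.URLError, OSError):
            # Connection refused, timeout, or similar.
            statuses.append(None)
    return statuses
```

Calling `send_sequential_requests("http://localhost:8866/")` targets the same URL as the `ab` command above. Like `ab`, it only fetches the page and does not execute any JavaScript in it.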

Observed behaviour

As seen in the screen recording, the number of PIDs (shown in the bottom docker stats window) kept increasing despite kernels being culled.

The number of open file descriptors (shown in the middle-right window) climbed rapidly and did not recover after requests stopped.

The load average of the container also increased over time and did not recover after requests stopped.

Depending on where the container runs (e.g., locally vs. AWS ECS Fargate), it can hit the ulimit very quickly and terminate, or become so sluggish to new requests that it requires a restart.

Expected behavior

Voila's CPU and memory usage is expected to increase under load.
The number of PIDs and file descriptors (fds) should also increase initially and then stabilise.
Once no new connections are made to Voila, the PID and fd counts should drop back to their nominal values.
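
That expectation can be written as a simple check over a series of count samples: the peak may be high, but the last sample taken after traffic stops should be close to the baseline. The sketch below is illustrative only; `has_recovered` and the 10% default tolerance are assumptions, not values from this report.

```python
def has_recovered(samples, baseline, tolerance=0.1):
    """Return True if the final sample is back near the baseline.

    samples   -- PID or fd counts taken over time, oldest first
    baseline  -- the nominal count before any traffic was sent
    tolerance -- allowed fractional excess over baseline (assumed 10%)
    """
    if not samples:
        raise ValueError("need at least one sample")
    allowed = baseline * (1 + tolerance)
    return samples[-1] <= allowed
```

With the behaviour reported here, such a check would keep failing after requests stop, because the counts stay near their peak.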

Context

  • voila version 0.2.10
  • Operating System and version: Docker on macOS 11.5.2, container image jupyter/base-notebook:python-3.8.8
  • Browser and version: apachebench
edmondsiu0 added the bug label Sep 2, 2021
trungleduc (Member) commented

Fixed by #969
