Avoid garbage collections on preloaded objects
For details see benoitc/gunicorn#1640 and
https://instagram-engineering.com/copy-on-write-friendly-python-garbage-collection-ad6ed5233ddf

I think this is the most subtle change to test, but I believe it is
working.

I started a gunicorn instance with 4 workers:

```
GALAXY_CONFIG_FILE="config/galaxy.yml" gunicorn 'galaxy.webapps.galaxy.fast_factory:factory()' -k galaxy.webapps.galaxy.workers.Worker --pythonpath lib --bind=localhost:8080 --config lib/galaxy/web_stack/gunicorn_config.py --preload -w 4
```

Then I ran the following script against that instance:

```
import threading
import requests

def req():
    for i in range(10000):
        requests.get('http://localhost:8080/history/current_history_json')

for i in range(10):
    threading.Thread(target=req).start()
```
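
To put numbers on the memory behaviour of a run like this, per-worker memory could be sampled while the script is running. The following is only a sketch, not what was used for the observations below: it assumes Linux (kernel 4.14 or later, for `/proc/<pid>/smaps_rollup`), and the helper names `parse_kib` and `pss_kib` are made up for illustration. It reads Pss rather than Rss because Pss divides pages shared between the master and the workers among them, which is the part copy-on-write sharing affects.

```python
from pathlib import Path


def parse_kib(text, field):
    """Return the KiB value of a `Field:   123 kB` line, or None if absent."""
    prefix = field + ":"
    for line in text.splitlines():
        # "Pss_Anon:" does not start with "Pss:", so only the exact field matches.
        if line.startswith(prefix):
            return int(line[len(prefix):].split()[0])
    return None


def pss_kib(pid):
    """Proportional set size of one process in KiB (Linux only)."""
    return parse_kib(Path(f"/proc/{pid}/smaps_rollup").read_text(), "Pss")
```

Calling `pss_kib(pid)` for each gunicorn worker PID every few seconds, with and without this commit, would give a rough comparison.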

I see that memory consumption increases much more *during* requests
without this commit. It eventually decreases again, but I think not to
the same baseline level (hard to tell without more elaborate testing). I
attribute the higher memory load during requests to the garbage
collector having to inspect more objects, so each collection takes
longer and garbage is not freed as quickly? I'm really not sure. I think
we should just roll this out and see; it should be fairly obvious from
the Grafana dashboards.
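
For reference, the effect of `gc.freeze()` can be seen in isolation with a few lines of standard-library Python (3.7 or later). This is a minimal sketch, not Galaxy code; the `preloaded` list is just a stand-in for whatever a preloaded app allocates at import time.

```python
import gc

# Stand-in for objects created while the app is preloaded.
# Lists are always tracked by the garbage collector.
preloaded = [[i] for i in range(1000)]

gc.collect()  # collect existing garbage first so it is not frozen as well
gc.freeze()   # move all tracked objects to the permanent generation
frozen = gc.get_freeze_count()

# Later collections ignore the permanent generation, so the collector
# has fewer objects to walk and does not touch the frozen ones.
gc.collect()
still_frozen = gc.get_freeze_count()

gc.unfreeze()  # move the frozen objects back to the oldest generation
after_unfreeze = gc.get_freeze_count()

print(frozen, still_frozen, after_unfreeze)
```

In the gunicorn case the freeze happens in the master before forking, so the workers inherit objects that their own collections skip.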
mvdbeek committed Aug 17, 2023
1 parent e31cb6d commit afa3f27
Showing 1 changed file with 14 additions and 1 deletion.
15 changes: 14 additions & 1 deletion lib/galaxy/web_stack/gunicorn_config.py
@@ -1,10 +1,15 @@
 """
 Gunicorn config file based on https://gist.github.com/hynek/ba655c8756924a5febc5285c712a7946
 """
+import gc
 import os
 import sys


+def is_preload_app():
+    return "--preload" in os.environ.get("GUNICORN_CMD_ARGS", "") or "--preload" in sys.argv
+
+
 def on_starting(server):
     """
     Attach a set of IDs that can be temporarily re-used.
@@ -45,6 +50,13 @@ def on_reload(server):
     server._worker_id_overload = set(range(1, server.cfg.workers + 1))


+def when_ready(server):
+    # freeze objects after preloading app
+    if is_preload_app():
+        gc.freeze()
+        print("Objects frozen in perm gen: ", gc.get_freeze_count())
+
+
 def pre_fork(server, worker):
     """
     Attach the next free worker_id before forking off.
@@ -58,7 +70,8 @@ def post_fork(server, worker):
     """
     os.environ["GUNICORN_WORKER_ID"] = str(worker._worker_id)
     os.environ["GUNICORN_LISTENERS"] = ",".join(str(bind) for bind in server.LISTENERS)
-    if "--preload" in os.environ.get("GUNICORN_CMD_ARGS", "") or "--preload" in sys.argv:
+    if is_preload_app():
+        gc.enable()
         from galaxy.web_stack import GunicornApplicationStack

         GunicornApplicationStack.late_postfork_event.set()
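
Stripped of the Galaxy-specific worker-ID bookkeeping, the hooks touched by this commit can be sketched as a stand-alone gunicorn config file. This is a reduced illustration under the assumption that nothing else in the config interacts with the garbage collector; it is not a drop-in replacement for `gunicorn_config.py`. The hook names and signatures (`when_ready(server)`, `post_fork(server, worker)`) are gunicorn's own.

```python
import gc
import os
import sys


def is_preload_app():
    # --preload may be passed through GUNICORN_CMD_ARGS or on the command line.
    return "--preload" in os.environ.get("GUNICORN_CMD_ARGS", "") or "--preload" in sys.argv


def when_ready(server):
    # Called in the master once the server has started; with --preload the
    # app is already imported, so its objects are frozen before workers fork.
    if is_preload_app():
        gc.freeze()
        print("Objects frozen in perm gen: ", gc.get_freeze_count())


def post_fork(server, worker):
    # Called in each worker right after the fork; make sure collection of
    # the worker's own, newly created objects is switched on.
    if is_preload_app():
        gc.enable()
```

Note that `is_preload_app()` only looks at the environment variable and `sys.argv`, so preloading enabled through `preload_app = True` in a config file would not be detected by this check.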
