New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[23.1] Backport tool mem fixes #16601
[23.1] Backport tool mem fixes #16601
Conversation
This is in the docutils source code: ``` if level >= self.halt_level: raise SystemMessage(msg, level) ``` and it's not being caught. Might be new in new docutils?
This needs a whole bunch of refactoring.
Comparing each element of the static options with each incoming element is very inefficient. Since dictionaries are ordered this can be simplified to a simple key value insert. Prior to this change: ``` parse_static_options (/Users/mvandenb/src/galaxy/lib/galaxy/tool_util/parser/xml.py:1207) function called 7183 times 313317 function calls in 11.318 seconds Ordered by: cumulative time, internal time, call count ncalls tottime percall cumtime percall filename:lineno(function) 7183 11.281 0.002 11.318 0.002 xml.py:1207(parse_static_options) 99675 0.029 0.000 0.033 0.000 __init__.py:1004(string_as_bool) 99675 0.004 0.000 0.004 0.000 {method 'lower' of 'str' objects} 99601 0.004 0.000 0.004 0.000 {method 'append' of 'list' objects} 7183 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 0 0.000 0.000 profile:0(profiler) ``` after: ``` *** PROFILER RESULTS *** parse_static_options (/Users/mvandenb/src/galaxy/lib/galaxy/tool_util/parser/xml.py:1207) function called 7183 times 220899 function calls in 0.260 seconds Ordered by: cumulative time, internal time, call count ncalls tottime percall cumtime percall filename:lineno(function) 7183 0.230 0.000 0.259 0.000 xml.py:1207(parse_static_options) 99675 0.026 0.000 0.029 0.000 __init__.py:1004(string_as_bool) 99675 0.004 0.000 0.004 0.000 {method 'lower' of 'str' objects} 7183 0.000 0.000 0.000 0.000 {method 'values' of 'dict' objects} 7183 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects} 0 0.000 0.000 profile:0(profiler) ``` This is most due to openms_idfilter, which has 2928 static options.
functools.cache is new in python 3.9
For details see benoitc/gunicorn#1640 and https://instagram-engineering.com/copy-on-write-friendly-python-garbage-collection-ad6ed5233ddf I think this is the most subtle to test change. I believe this is working. I started a gunicorn instance with 4 workers: ``` GALAXY_CONFIG_FILE="config/galaxy.yml" gunicorn 'galaxy.webapps.galaxy.fast_factory:factory()' -k galaxy.webapps.galaxy.workers.Worker --pythonpath lib --bind=localhost:8080 --config lib/galaxy/web_stack/gunicorn_config.py --preload -w 4 ``` Then i use the following script against that instance ``` import threading import requests def req(): for i in range(10000): requests.get('http://localhost:8080/history/current_history_json') for i in range(10): threading.Thread(target=req).start() ``` I see that the memory consumption increases much more *during* requests without this commit. It eventually decreases again, but I think not to the same baseline level (hard to tell without more elaborate testing). I attribute the higher memory load during requests to the fact that the garbage collection requiring to inspect more objects, taking more time to run and therefor not running as fast? I'm really not sure, I think we should just roll this out and see, it should be fairly obvious from the grafana dashboards.
e74727e
to
e0ba54e
Compare
# freeze objects after preloading app | ||
if is_preload_app(): | ||
gc.freeze() | ||
print("Objects frozen in perm gen: ", gc.get_freeze_count()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I think that's good to have
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can deploy it on Sunday on EU as well and get some testing before the week begins.
Backport of #16536
It's not entirely a bugfix, but there are commits in there that are definitely fixing bugs, and I plan to test this out on usegalaxy.org
How to test the changes?
(Select all options that apply)
License