
[23.1] Backport tool mem fixes #16601

Merged
merged 27 commits into from Aug 29, 2023

Conversation

@mvdbeek (Member) commented Aug 26, 2023

Backport of #16536

It's not entirely a bugfix, but several of the commits in it definitely fix bugs, and I plan to test this out on usegalaxy.org.

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.
  • Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

This is in the docutils source code:
```
        if level >= self.halt_level:
            raise SystemMessage(msg, level)
```
and it's not being caught. This might be new behavior in a recent docutils release?
This needs a whole bunch of refactoring.
Comparing each element of the static options against each incoming element
is very inefficient. Since dictionaries preserve insertion order, this can be
simplified to a plain key/value insert.

Prior to this change:
```
parse_static_options (/Users/mvandenb/src/galaxy/lib/galaxy/tool_util/parser/xml.py:1207)
function called 7183 times

         313317 function calls in 11.318 seconds

   Ordered by: cumulative time, internal time, call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     7183   11.281    0.002   11.318    0.002 xml.py:1207(parse_static_options)
    99675    0.029    0.000    0.033    0.000 __init__.py:1004(string_as_bool)
    99675    0.004    0.000    0.004    0.000 {method 'lower' of 'str' objects}
    99601    0.004    0.000    0.004    0.000 {method 'append' of 'list' objects}
     7183    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        0    0.000             0.000          profile:0(profiler)
```
after:
```
*** PROFILER RESULTS ***
parse_static_options (/Users/mvandenb/src/galaxy/lib/galaxy/tool_util/parser/xml.py:1207)
function called 7183 times

         220899 function calls in 0.260 seconds

   Ordered by: cumulative time, internal time, call count

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
     7183    0.230    0.000    0.259    0.000 xml.py:1207(parse_static_options)
    99675    0.026    0.000    0.029    0.000 __init__.py:1004(string_as_bool)
    99675    0.004    0.000    0.004    0.000 {method 'lower' of 'str' objects}
     7183    0.000    0.000    0.000    0.000 {method 'values' of 'dict' objects}
     7183    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        0    0.000             0.000          profile:0(profiler)
```

This is mostly due to openms_idfilter, which has 2928 static options.
functools.cache is new in Python 3.9.
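For context, `functools.cache` is simply an unbounded `functools.lru_cache`, so a version-guarded alias (sketched below; the decorated function is a placeholder) keeps code working on interpreters older than 3.9:

```
import functools
import sys

if sys.version_info >= (3, 9):
    cache = functools.cache
else:
    # functools.cache is equivalent to lru_cache with no size limit
    cache = functools.lru_cache(maxsize=None)

@cache
def parse_expensive(key):
    # placeholder for an expensive, deterministic computation
    return key.upper()
```

Repeated calls with the same argument are served from the cache, which is visible via the `cache_info()` counters that both decorators provide.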
For details see benoitc/gunicorn#1640 and
https://instagram-engineering.com/copy-on-write-friendly-python-garbage-collection-ad6ed5233ddf

I think this is the most subtle change to test. I believe it is
working.

I started a gunicorn instance with 4 workers:

```
GALAXY_CONFIG_FILE="config/galaxy.yml" gunicorn 'galaxy.webapps.galaxy.fast_factory:factory()' -k galaxy.webapps.galaxy.workers.Worker --pythonpath lib --bind=localhost:8080 --config lib/galaxy/web_stack/gunicorn_config.py --preload -w 4
```

Then I used the following script against that instance:

```
import threading
import requests

def req():
    for i in range(10000):
        requests.get('http://localhost:8080/history/current_history_json')

for i in range(10):
    threading.Thread(target=req).start()
```

I see that memory consumption increases much more *during* requests
without this commit. It eventually decreases again, but I think not back to
the same baseline (hard to tell without more elaborate testing). I
attribute the higher memory load during requests to garbage collection
having to inspect more objects, so it takes longer to run and reclaims
memory more slowly? I'm really not sure; I think we should just roll this
out and see, it should be fairly obvious from the Grafana dashboards.
```
# freeze objects after preloading app
if is_preload_app():
    gc.freeze()
    print("Objects frozen in perm gen: ", gc.get_freeze_count())
```
Member

this should be removed?

Member Author

No, I think that's good to have

@bgruening (Member) left a comment

I can deploy it on Sunday on EU as well and get some testing before the week begins.

@mvdbeek mvdbeek marked this pull request as ready for review August 29, 2023 07:30
@github-actions github-actions bot added this to the 23.2 milestone Aug 29, 2023
@dannon dannon merged commit 4dab5e1 into galaxyproject:release_23.1 Aug 29, 2023
79 checks passed
@galaxyproject galaxyproject deleted a comment from github-actions bot Aug 29, 2023
@nsoranzo nsoranzo deleted the backport_tool_mem_fixes branch August 29, 2023 21:23
@dannon dannon modified the milestones: 23.2, 23.1 Sep 22, 2023