Support for gc.freeze() for apps that use preloading #1640

Open
gcbirzan opened this issue Nov 10, 2017 · 14 comments
@gcbirzan

In Python 3.7 (still in alpha), a new API was added: gc.freeze(). The rationale, the advantages, and the caveats are explained in the ticket.

Basically, the garbage collector touches a lot of objects, which removes much of the benefit of preloading; this API lets you stop that and genuinely share memory between processes.
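For readers unfamiliar with the API, here is a minimal standalone sketch of what gc.freeze() does (the Cache class and the object count are illustrative only, not from the ticket):

```
import gc

class Cache:
    # stands in for objects created while preloading the app
    pass

preloaded = [Cache() for _ in range(1000)]

gc.collect()   # first collect garbage accumulated during startup
gc.freeze()    # move all surviving tracked objects to the permanent generation
print(gc.get_freeze_count() > 0)   # True: frozen objects are skipped by the GC

gc.unfreeze()  # undo: objects return to the oldest generation
print(gc.get_freeze_count())       # 0
```

The point for preloading is that frozen objects are never examined (and therefore never dirtied) by the collector, so their memory pages stay copy-on-write-shared after fork.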

@hingston

hingston commented Dec 2, 2017

#1566


@benoitc benoitc self-assigned this Jan 7, 2018
@Somewater

I don't understand how to use it. Is the fix already in the 3.7 release?
I wrote a test script:
https://gist.github.com/Somewater/40d7a808d1efd7b2c77f22f8bcb73553
I expected the script to allocate 1-2 GB of memory at the beginning and not require more memory while running. But it does, right after the workers start iterating:
http://pix.toile-libre.org/?img=1536607288.png
What did I do wrong?

@tilgovi
Collaborator

tilgovi commented Sep 10, 2018

@Somewater this is not implemented yet in Gunicorn.

@tilgovi
Collaborator

tilgovi commented Sep 10, 2018

Oh, my mistake. I thought you were asking about Gunicorn. Please keep general questions about gc.freeze() in Python to other forums; this ticket is about using it in Gunicorn.

@PetrochukM

What does Gunicorn need to do to support gc.freeze()? Can we just call gc.freeze() after loading large objects globally in the app?

@tilgovi
Collaborator

tilgovi commented May 20, 2020

@PetrochukM for the benefits to be large, the application probably needs to be using --preload. Then you could test it out by calling gc.freeze() in a Gunicorn hook, such as on_starting or when_ready.

For maximum benefit, we might want to pause garbage collection in the master process, but I suspect Gunicorn will need some changes to make that safe or else the master process will continue to grow in memory as workers restart.

Please post any results if you do experiment!
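A minimal sketch of that suggestion as a Gunicorn config file (when_ready is one of Gunicorn's documented server hooks; whether it is the best place to freeze is exactly what needs testing):

```
# gunicorn.conf.py -- sketch; assumes the app is started with --preload
import gc

def when_ready(server):
    # Runs in the master once the server is up. Freezing here moves the
    # preloaded app's objects to the permanent generation, so the GC never
    # dirties their pages and copy-on-write sharing with workers survives.
    gc.freeze()
```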

@aisk

aisk commented May 20, 2020

@tilgovi Hi, I think users can call gc.freeze() by hand, but most people don't know about it because it's hidden so deep.

I think it would be good to have an option to enable it, so users can discover this feature. Furthermore, maybe we could enable it automatically when --preload is specified?

@tilgovi
Collaborator

tilgovi commented May 20, 2020

@aisk yes, that's why I'm encouraging experimentation by anyone who feels comfortable doing this by hand. It would definitely be a better experience for users if it were automatic, but for that to be possible we have to be convinced that there are no bad effects. For that we need people to test, and that doesn't require any code changes, just people who are willing to add some calls to their hooks.

@PetrochukM

@tilgovi Thank you!

I was wondering if anything special was needed, and it sounds like nothing is required apart from running with --preload.

I am testing a change with gc.freeze(). Do you have any recommendations for measuring shared memory, to see whether it actually works?

@tilgovi
Collaborator

tilgovi commented May 21, 2020

@PetrochukM I do not, but I'm very interested to hear what you find! Thanks for trying it out.
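One Linux-specific way to measure (not from this thread, just a sketch): /proc/&lt;pid&gt;/smaps_rollup sums each mapping's fields, including Shared_Clean and Shared_Dirty, so comparing those totals for a worker with and without the freeze gives a rough signal. The parser below is illustrative; the sample text stands in for real /proc contents:

```
import re

def shared_kb(smaps_rollup_text):
    """Sum the Shared_Clean and Shared_Dirty fields (in kB)."""
    total = 0
    for line in smaps_rollup_text.splitlines():
        m = re.match(r"Shared_(?:Clean|Dirty):\s+(\d+) kB", line)
        if m:
            total += int(m.group(1))
    return total

# On a live system: shared_kb(open(f"/proc/{worker_pid}/smaps_rollup").read())
sample = (
    "Rss:            5000 kB\n"
    "Shared_Clean:   3000 kB\n"
    "Shared_Dirty:    200 kB\n"
    "Private_Dirty:  1800 kB\n"
)
print(shared_kb(sample))  # 3200
```

Sampling this for each worker before and after sustained traffic should show whether the frozen pages stay shared.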

@jab

jab commented May 21, 2020

@aisk

aisk commented May 22, 2020

FYI @jab: I have some projects at my company that need a big read-only shared dict on startup; after gc.freeze(), memory usage is reduced as expected.

@joekohlsdorf

joekohlsdorf commented Jun 20, 2020

It can be done with the following --config:

```
import gc

preload_app = True
workers = 1
worker_class = "sync"

# disable GC in the master as early as possible
gc.disable()

def when_ready(server):
    # freeze objects after preloading the app
    gc.freeze()
    print("Objects frozen in perm gen:", gc.get_freeze_count())

def post_fork(server, worker):
    # re-enable GC in the worker
    gc.enable()
```
When testing this you'll see the effect over time. The memory usage of worker processes will initially be the same as without this configuration, but it should grow more slowly once you hit the workers with traffic, because the GC won't touch the shared memory areas.

Note that Instagram did this because they had low-memory servers (the original post shows 32 CPUs and 32 GB of memory); today you won't find this type of offering from any cloud provider. You save some memory and some CPU cycles because the GC doesn't touch these memory areas, but don't expect much from it. Leaving the GC disabled is probably not an option; in my case worker RSS skyrockets, eating up all the benefits, just as Instagram found when they had to give up on it.
I'm not sure how they managed to share 250 MB of memory; I'm working with a huge app and at most I see 50 MB. My guess is that they do some magic in preloading, since Django is ultra lazy and preloading by itself won't do much.

mvdbeek added a commit to mvdbeek/galaxy that referenced this issue Aug 17, 2023
For details see benoitc/gunicorn#1640 and
https://instagram-engineering.com/copy-on-write-friendly-python-garbage-collection-ad6ed5233ddf

I think this is the most subtle change to test. I believe it is working.

I started a gunicorn instance with 4 workers:

```
GALAXY_CONFIG_FILE="config/galaxy.yml" gunicorn 'galaxy.webapps.galaxy.fast_factory:factory()' -k galaxy.webapps.galaxy.workers.Worker --pythonpath lib --bind=localhost:8080 --config lib/galaxy/web_stack/gunicorn_config.py --preload -w 4
```

Then I used the following script against that instance:

```
import threading
import requests

def req():
    for i in range(10000):
        requests.get('http://localhost:8080/history/current_history_json')

for i in range(10):
    threading.Thread(target=req).start()
```

I see that memory consumption increases much more *during* requests without this commit. It eventually decreases again, but I think not to the same baseline level (hard to tell without more elaborate testing). I attribute the higher memory load during requests to the garbage collector having to inspect more objects, so it takes longer to run and can't keep up? I'm really not sure. I think we should just roll this out and see; it should be fairly obvious from the Grafana dashboards.