New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[#1847] Add middleware that cleans up the pylons response string. #2262
Conversation
Once the response string has been served, this middleware replaces the response string with a dummy one, so the original response can be garbage collected. This is lifted from https://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode This middleware is can be switched on by setting 'ckan.use_pylons_response_cleanup_middleware = True' in your development.ini
Just test that the homepage runs fine with the middleware actiavted
b4c9676
to
68d384c
Compare
and deployment.ini_tmpl
efb0d51
to
a7d5599
Compare
def __init__(self, iterable, callback, environ): | ||
self.__iterable = iterable | ||
self.__callback = callback | ||
self.__environ = environ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ew. private members with name munging
@aliceh75, might be nice if you could test this, as it's basically your fix without the timer, you might also want to look into using gunicorn + gevent that I suggested to utilize the process restart after serving a number of requests. |
Hi, I can confirm this stabilizes the memory. The test I was just running (continuous loop fetching 10,000 rows with 4 workers) saw the memory go from 800M to 1.8G and back quite rapidly. With the middleware enabled it is far more stable (It's been oscillating between 793M and 823M for a while now, most of the time staying around 810M). This is a great improvement :-) Thank you for doing this. |
mod_wsgi can also restart after a number of requests btw. |
🎉 woo, nice. Great work @joetsoi |
@joetsoi I've made a couple small changes: https://github.com/wardi/ckan/tree/1847-pylons-clean-middleware would you take a look and cherry-pick or merge into your branch? |
@joetsoi this link makes it easier to see the two commits https://github.com/wardi/ckan/commits/1847-pylons-clean-middleware Would you like a PR against your PR? :-) |
@wardi, we should just all give each other edit permissions to each other's ckan forks, that'd be distributed version control in action :p I've fixed up the tests/changelog and made it switch on by default, no point having it if it's not going to be used. |
@joetsoi sounds good. is this good to go? |
@wardi yeah it's all ready, it's just coveralls moaning. |
(fixes #1847)
Once the response string has been served, this middleware replaces the response string with a dummy one, so the original response can be garbage collected. This is lifted from https://code.google.com/p/modwsgi/wiki/RegisteringCleanupCode This middleware is can be switched on by setting 'ckan.use_pylons_response_cleanup_middleware = true' in your development.ini
So I did a comparison of with and without this enabled. Everyone likes graphs right?
Notably, the memory usage still grows with this enabled, but is a bit more stable. I also did some examining of the memory and this memory has been returned to python's free lists but has not been released by the python process. see (http://effbot.org/pyfaq/why-doesnt-python-release-the-memory-when-i-delete-a-large-object.htm). So we'll still use up as much memory as there are threads
So the other way around it is to do something like use gunicorn + gevent, although this isn't a fair direct comparison as this is only using one worker process, but I mainly suggest this as you can set worker processes to restart after serving a maximum number of requests (see http://docs.gunicorn.org/en/develop/configure.html#max-requests), which would release any memory that the process has been allocated but is not using.
Anyway, this isn't critical or anything as it does just fix an edge case for a very particular workload that most people will not come across. (still needs tests)