How to cache minified pages? #53

Closed
mkoistinen opened this issue May 27, 2013 · 20 comments

@mkoistinen

Hi, I'd love to use HtmlMinifyMiddleware, but I can't figure out where to put it in settings.MIDDLEWARE_CLASSES so that its output is cached and the page isn't minified again until it changes or the cache expires. As it is now, when I disable HtmlMinifyMiddleware, my server can handle more than 60X as many requests per second, which suggests that, when it is enabled, HtmlMinifyMiddleware is doing its job on every request. I'd like it to do its job only when the page is written to memcache via UpdateCacheMiddleware/FetchFromCacheMiddleware.

My settings look a bit like this:

MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',
    'htmlmin.middleware.HtmlMinifyMiddleware',
    ...  # Other middleware
    'django.middleware.cache.FetchFromCacheMiddleware',
    'django.contrib.redirects.middleware.RedirectFallbackMiddleware',
)

My reasoning (which could easily be misguided) is that minification should be the last thing to happen to a response body before it gets stored in the cache. Hence it should be the last thing before UpdateCacheMiddleware on the way "out" as a response, and therefore the first thing after UpdateCacheMiddleware in the MIDDLEWARE_CLASSES tuple.
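
To make that ordering argument concrete, here is a minimal sketch that only reverses the settings tuple; it does not import Django. CommonMiddleware is a stand-in I picked for the elided "other middleware", not something taken from my real settings.

```python
# Stand-in list; CommonMiddleware substitutes for the elided "other middleware".
MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',
    'htmlmin.middleware.HtmlMinifyMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
)

# process_request hooks run top to bottom ...
request_order = list(MIDDLEWARE_CLASSES)
# ... and process_response hooks run bottom to top.
response_order = list(reversed(MIDDLEWARE_CLASSES))


def short(path):
    """Return just the class name from a dotted path."""
    return path.rsplit('.', 1)[-1]


print(' -> '.join(short(p) for p in response_order))
```

On the way out, HtmlMinifyMiddleware is the second-to-last entry and UpdateCacheMiddleware is the last, which is the order I was aiming for.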

I'm using Python 2.7 and Django 1.4, if that is relevant.

What am I doing wrong? Also, whatever the answer, it would make a good addition to the installation guide.

@andrewsmedina
Member

Hi @mkoistinen,
The htmlmin output should be cached by Django's cache middleware. I will try to reproduce this and debug why the cache is not working.

@mkoistinen
Author

Any luck on this?

@andrewsmedina
Member

@mkoistinen I did a test and the cache worked.

The order of the middleware:

MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',
    ... #other middlewares
    'htmlmin.middleware.HtmlMinifyMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
)

@mkoistinen
Author

Hmmm, I wish I could say the same for my tests.

With the cache primed and HTML_MINIFY = False

> ab -n 500 -c 50 http://blah.tld
...
              min  mean[+/-sd] median   max
Total:          3    5   1.7      4      15
...

With the cache 'primed' and HTML_MINIFY = True

> ab -n 500 -c 50 http://blah.tld
...
              min  mean[+/-sd] median   max
Total:        160  215  33.5    213     327
...

And since these tests are run with ab, nothing should vary the request headers between calls. This strongly suggests that django-htmlmin is still doing its work on every request, even though we should simply be serving the pre-minified HTML from the cache.

Both tests were run from the same server, which is different from the webserver. The results are repeatable, and it appears to make no difference whether the HtmlMin middleware is just after 'django.middleware.cache.UpdateCacheMiddleware' or just before 'django.middleware.cache.FetchFromCacheMiddleware'.

If you're not seeing the same sort of results, then this is something unique to my setup. Here's my full list of middleware, in case you spot something interesting:

MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',

    # This kills performance when enabled
    'htmlmin.middleware.HtmlMinifyMiddleware',

    'django.middleware.locale.LocaleMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'cms.middleware.user.CurrentUserMiddleware',
    'cms.middleware.page.CurrentPageMiddleware',
    'cms.middleware.toolbar.ToolbarMiddleware',
    'cms.middleware.language.LanguageCookieMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
    'django.contrib.redirects.middleware.RedirectFallbackMiddleware',
)

@mkoistinen
Author

I just tried the same tests on another project which doesn't use any of the cms.* middleware but otherwise uses a similar set of MW. The results were the same: with the cache primed, HTML_MINIFY = True gives lacklustre performance and HTML_MINIFY = False is super fast. I'd expect performance with minification to be slightly better, not 43X worse.
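
For the record, the 43X figure comes straight from the mean "Total" times in the two ab runs I posted earlier (5 ms without minification, 215 ms with it). A quick check, with variable names of my own choosing:

```python
# Mean total request times (ms) from the two ab runs quoted earlier in the thread.
mean_ms_minify_off = 5.0
mean_ms_minify_on = 215.0

# Ratio of the two means.
slowdown = mean_ms_minify_on / mean_ms_minify_off
print("%.0fX slower with HTML_MINIFY = True" % slowdown)
```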

Can you share more details of your setup?

@andrewsmedina
Member

My settings:

MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'htmlmin.middleware.HtmlMinifyMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
)

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'LOCATION': 'unique-snowflake'
    }
}

@idavydov

I can confirm that using these settings significantly slows down server response when caching is enabled.

@idavydov

Ok. I think I know the source of the problem. From the Django documentation:

Unlike the process_request() and process_view() methods, the process_response() method is always called, even if the process_request() and process_view() methods of the same middleware class were skipped (because an earlier middleware method returned an HttpResponse). In particular, this means that your process_response() method cannot rely on setup done in process_request().

I'm not sure how the response is saved by Django's caching middleware. Do you think it would be possible to mark content as minified somehow, and have the htmlmin middleware check whether it was already minified?
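
Here is a rough, Django-free sketch of what I mean. Everything in it is a stand-in: the `Response` class, the `minified` marker, and the three middleware classes are hypothetical and only mimic the documented behaviour quoted above (the request phase can short-circuit, the response phase always visits every middleware). Whether a marker like this would survive Django's real cache backends is an open question.

```python
class Response(object):
    def __init__(self, content):
        self.content = content
        self.minified = False  # hypothetical marker, not a Django attribute


class UpdateCache(object):
    """Stand-in for UpdateCacheMiddleware: stores responses on the way out."""
    def __init__(self, store):
        self.store = store

    def process_response(self, request, response):
        self.store[request["path"]] = response
        return response


class FetchFromCache(object):
    """Stand-in for FetchFromCacheMiddleware: answers from the store on a hit."""
    def __init__(self, store):
        self.store = store

    def process_request(self, request):
        return self.store.get(request["path"])  # None means cache miss


class Minify(object):
    """Stand-in minifier that counts how often it does real work."""
    def __init__(self, skip_marked):
        self.skip_marked = skip_marked
        self.runs = 0

    def process_response(self, request, response):
        if self.skip_marked and response.minified:
            return response  # already minified before it was cached
        self.runs += 1
        response.content = " ".join(response.content.split())
        response.minified = True
        return response


def handle(request, middleware):
    """Request phase may short-circuit; response phase visits everything."""
    response = None
    for mw in middleware:
        hook = getattr(mw, "process_request", None)
        if hook is not None:
            response = hook(request)
            if response is not None:
                break
    if response is None:
        response = Response("<p>  hello   world </p>")  # the "view"
    for mw in reversed(middleware):
        hook = getattr(mw, "process_response", None)
        if hook is not None:
            response = hook(request, response)
    return response


def count_minifier_runs(skip_marked, hits=3):
    store = {}
    minify = Minify(skip_marked)
    chain = [UpdateCache(store), minify, FetchFromCache(store)]
    for _ in range(hits):
        handle({"path": "/"}, chain)
    return minify.runs
```

With `skip_marked=False` the minifier does its work on every request, cache hit or not; with `skip_marked=True` it only works on the first (uncached) request.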

@moeffju

moeffju commented Nov 28, 2013

This is fixed by #61. Make sure the order of middlewares is correct, i.e.

MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',
    'htmlmin.middleware.HtmlMinifyMiddleware',
    # ... other middlewares
    'django.middleware.cache.FetchFromCacheMiddleware',
)

@mkoistinen
Author

Hmmm, I wish I could agree with this assessment. Here's what I'm finding:

# WITHOUT HtmlMinify
ab -n 1000 -c 25 http://www.website.com/
Requests per second:    345.10 [#/sec] (mean)

Note: This is on a single-core server (2GHz). Without caching at all, this number is about 40 Req. per sec.

# WITH HtmlMinify in the MW order as specified above.
ab -n 1000 -c 25 http://www.website.com/
Requests per second:    15.18 [#/sec] (mean)

My Middleware:

MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',
    'htmlmin.middleware.HtmlMinifyMiddleware',
    'django.middleware.locale.LocaleMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'cms.middleware.user.CurrentUserMiddleware',
    'cms.middleware.page.CurrentPageMiddleware',
    'cms.middleware.toolbar.ToolbarMiddleware',
    'cms.middleware.language.LanguageCookieMiddleware',
    'django.middleware.cache.FetchFromCacheMiddleware',
    'django.contrib.redirects.middleware.RedirectFallbackMiddleware',
)
> pip freeze

Django==1.4.5
MySQL-python==1.2.4
Pillow==2.2.1
PyHyphen==2.0.4
South==0.8.1
argparse==1.2.1
beautifulsoup4==4.3.2
distribute==0.7.3
django-admin-honeypot==0.2.5
django-admin-sortable==1.5.5
-e git+https://github.com/mkoistinen/django-bitly.git@7ab5cb6241a51531302d2ca3b6ea8b634e4c36b7#egg=django_bitly-dev
django-classy-tags==0.4
django-cms==2.4.2
django-debug-toolbar==0.9.4
django-filer==0.9.5
-e git+https://github.com/cobrateam/django-htmlmin.git@84f3bd51feaa227541354991bf549e4eebe06ba6#egg=django_htmlmin-dev
django-hvad==0.3
django-memcache-status==1.1
django-memcached==0.1.2
django-mptt==0.5.2
django-polymorphic==0.5.1
django-sekizai==0.7
django-sortedm2m==0.6.0
easy-thumbnails==1.3
html5lib==1.0b3
mailsnake==1.6.2
oauthlib==0.5.1
python-memcached==1.53
requests==1.2.3
requests-oauthlib==0.3.2
simplejson==3.3.0
six==1.3.0
twython==3.0.0
wsgiref==0.1.2

How is it that my results are so different?

@moeffju

moeffju commented Nov 28, 2013

What cache backend are you using?

@moeffju

moeffju commented Nov 28, 2013

Sorry, my mistake: You need to put HtmlMinifyMiddleware after FetchFromCacheMiddleware.
Since this is a sub-optimal solution, I will split HtmlMinifyMiddleware into two halves, just like CacheMiddleware.

This is an annoying Django wart.
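
For reference, the interim ordering would look roughly like this. The reasoning in the comments is an assumption: it presumes the fix in #61 depends on HtmlMinifyMiddleware's request-phase hook having run, which on a cache hit is skipped for anything listed after FetchFromCacheMiddleware.

```python
MIDDLEWARE_CLASSES = (
    'django.middleware.cache.UpdateCacheMiddleware',
    # ... other middlewares
    'django.middleware.cache.FetchFromCacheMiddleware',
    # Listed last: on a cache hit FetchFromCacheMiddleware returns early,
    # so the request-phase hook of this middleware is never reached.
    'htmlmin.middleware.HtmlMinifyMiddleware',
)
```

The downside is that, on the way out, minification now runs before the other middlewares get to touch the response, hence "sub-optimal".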

@mkoistinen
Author

After moving the middleware:

ab -n 1024 -c 16 http://www.website.com/
Requests per second: 387.85 #/sec # WITHOUT HtmlMinify
Requests per second: 413.01 #/sec # WITH HtmlMinify

AWESOME ! Thanks so much!

Oh, and just for completeness, here it is with my Varnish cache in front of everything: =)
Requests per second: 3318.28 #/sec

@mkoistinen
Author

I've closed this ticket, but I'm still anxious to get the proper, "split into two halves" solution =)

Thanks again.

@moeffju

moeffju commented Nov 28, 2013

With the middleware split, I save about 40ms per request on my test app, and about 4 MB over 1000 requests over the wire (yes, it’s a complex app :D)
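
A Django-free sketch of the idea behind the split, under stated assumptions: the class names (`MarkRequest`, `Minify`) and the `needs_minify` flag are illustrative and not necessarily what the actual patch uses, and responses are plain strings to keep it short. One half sits right after the cache update and minifies on the way out; the other half sits after the cache fetch and only flags requests that were not answered from the cache.

```python
class UpdateCache(object):
    """Stand-in for UpdateCacheMiddleware."""
    def __init__(self, store):
        self.store = store

    def process_response(self, request, response):
        self.store[request["path"]] = response
        return response


class FetchFromCache(object):
    """Stand-in for FetchFromCacheMiddleware."""
    def __init__(self, store):
        self.store = store

    def process_request(self, request):
        return self.store.get(request["path"])  # None means cache miss


class MarkRequest(object):
    """Request half: only reached when nothing earlier short-circuited."""
    def process_request(self, request):
        request["needs_minify"] = True
        return None


class Minify(object):
    """Response half: minifies only requests flagged by MarkRequest."""
    def __init__(self):
        self.runs = 0

    def process_response(self, request, response):
        if not request.get("needs_minify"):
            return response  # served from cache, already minified
        self.runs += 1
        return " ".join(response.split())


def handle(request, chain, view):
    """Request phase may short-circuit; response phase visits everything."""
    response = None
    for mw in chain:
        hook = getattr(mw, "process_request", None)
        if hook is not None:
            response = hook(request)
            if response is not None:
                break
    if response is None:
        response = view(request)
    for mw in reversed(chain):
        hook = getattr(mw, "process_response", None)
        if hook is not None:
            response = hook(request, response)
    return response


def serve(hits):
    store = {}
    minify = Minify()
    chain = [UpdateCache(store), minify, FetchFromCache(store), MarkRequest()]
    body = None
    for _ in range(hits):
        body = handle({"path": "/"}, chain, lambda request: "<p>  hi  </p>")
    return minify.runs, body
```

In this sketch the minifier does real work once, on the uncached request, and still runs as the last step before the response is stored.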

@moeffju

moeffju commented Nov 28, 2013

Cf. #64.

@mkoistinen
Author

Can't make it work. Installed the repo, installed both MW as directed and I get nothing but 500s. Sadly, I have logging disabled on this project, so I'll have to test it on another to get a log trace. I'll come back shortly.

@mkoistinen
Author

OK, I had some issue with pip, it seems. I cleared out the old, then reinstalled from your clone. Seems to work! I couldn't see any significant difference in performance, but, then again, my app is already pretty lean.

Thanks again!

@mkoistinen
Author

This doesn't appear to be in the master branch yet. Are there plans to merge it?
