with try_files we use absurd amounts of cache #213

Closed
jeffkaufman opened this Issue Apr 3, 2013 · 4 comments

Projects

None yet

1 participant

@jeffkaufman
Google PageSpeed member

Add to conf:

server {
  listen 8051;
  server_name tryfiles.example.com;
  pagespeed FileCachePath "/var/ngx_pagespeed_cache";

  location ~ "\.pagespeed\.([a-z]\.)?[a-z]{2}\.[^.]{10}\.[^.]+" { }
  location ~ "^/ngx_pagespeed_static/" { }
  location ~ "^/ngx_pagespeed_beacon$" { }

  try_files $uri $uri/ /mod_pagespeed_example/index.html;
}

Then run:

$ cd /var/ngx_pagespeed_cache
$ du -hs *
1.9M    http,3A
1.6M    rname
# There's already some stuff in these caches because I ran the system tests.
# I think that's unrelated.
$ while true ; do du -hs * ; echo ; sleep 1 ; done

Note that the caches are constant-size.

Now in another window run:

$ curl http://localhost:8051/this-does-not-exist -H Host:tryfiles.example.com

Observe cache growth:

2.1M        http,3A
2.0M        rname

2.7M        http,3A
2.9M        rname

3.5M        http,3A
3.9M        rname

3.8M        http,3A
5.3M        rname

5.8M        http,3A
6.4M        rname

5.9M        http,3A
8.1M        rname

8.6M        http,3A
9.6M        rname

11M http,3A
12M rname

13M http,3A
14M rname

18M http,3A
17M rname

20M http,3A
20M rname

26M http,3A
23M rname

[killall nginx]

26M http,3A
23M rname

The contents of the http cache subdirectory for tryfiles, which is the big one:

$ find http,3A/,2Ftryfiles.example.com,3A8051/
http,3A/,2Ftryfiles.example.com,3A8051/
http,3A/,2Ftryfiles.example.com,3A8051/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/styles/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/styles/styles/index_style.css,
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/styles/index_style.css,
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/styles/images
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/styles/images/pagespeed_logo.png,
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/index_style.css,
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/images
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/images/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/images/styles/index_style.css,
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/images/pagespeed_logo.png,
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/images/images
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/styles/images/images/pagespeed_logo.png,
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/index_style.css,
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/images
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/images/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/images/styles/styles
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/images/styles/styles/index_style.css,
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/images/styles/index_style.css,
http,3A/,2Ftryfiles.example.com,3A8051/styles/styles/styles/styles/styles/styles/styles/styles/images/styles/images
...

Still working on debugging this.

@jeffkaufman
Google PageSpeed member

Looking at the access log, we make a huge number of loopback requests:

127.0.0.1 - - [03/Apr/2013:14:42:29 -0400] "GET /this-does-not-exist HTTP/1.1" 200 23627 "-" "curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3"
127.0.0.1 - - [03/Apr/2013:14:42:29 -0400] "GET /styles/index_style.css HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:29 -0400] "GET /images/pagespeed_logo.png HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:29 -0400] "GET /styles/styles/index_style.css HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:29 -0400] "GET /styles/images/pagespeed_logo.png HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:29 -0400] "GET /images/styles/index_style.css HTTP/1.1" 200 23626 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:29 -0400] "GET /images/images/pagespeed_logo.png HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:30 -0400] "GET /styles/styles/styles/index_style.css HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:30 -0400] "GET /styles/styles/images/pagespeed_logo.png HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:30 -0400] "GET /styles/images/styles/index_style.css HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:30 -0400] "GET /styles/images/images/pagespeed_logo.png HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
...
127.0.0.1 - - [03/Apr/2013:14:42:32 -0400] "GET /styles/styles/images/images/styles/images/images/pagespeed_logo.png HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:32 -0400] "GET /styles/styles/images/images/images/styles/styles/index_style.css HTTP/1.1" 200 23626 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:32 -0400] "GET /styles/styles/images/images/images/styles/images/pagespeed_logo.png HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:32 -0400] "GET /styles/styles/images/images/images/images/styles/index_style.css HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
127.0.0.1 - - [03/Apr/2013:14:42:32 -0400] "GET /styles/styles/images/images/images/images/images/pagespeed_logo.png HTTP/1.1" 200 23627 "-" "Serf/1.1.0 mod_pagespeed/1.4.0.0-2729"
[nginx killed]
@jeffkaufman
Google PageSpeed member

I think what's happening is:

  • request comes in for /this-does-not-exist
  • because it doesn't exist, try_files does in internal redirect to /mod_pagespeed_example/index.html
  • index.html has relative links to styles/index_style.css and images/pagespeed_logo.png
  • pagespeed tries to fetch those resources in the background, but relative to a base url of /this-does-not-exist instead of /mod_pagespeed_example/index.html, which means they also don't exist
  • because of try_files we repeat this all again, but with /images/... and /styles/... and then /images/images/styles..., images/images/images/..., /styles/images/images/... etc

There are two problems here:

  • simple: pagespeed is supposed to ignore loopback fetches and not handle them, not sure why that's broken here
  • harder: ngx_pagespeed needs to understand internal redirects when determining the "url" of a page.
@jeffkaufman
Google PageSpeed member

You can see this configuration on 08891eb

@jeffkaufman
Google PageSpeed member

we need to port and call instaweb_handler.cc:is_pagespeed_subrequest()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment