Add wildcard support to ModPagespeedLoadFromFile #377

Closed
GoogleCodeExporter opened this Issue Apr 6, 2015 · 26 comments

GoogleCodeExporter commented Apr 6, 2015

From: Andrew Mattie amattie@gmail.com
Feb 2 (5 days ago)

Since the domain mapping directives allow for wildcards, it'd be nice if 
ModPagespeedLoadFromFile also allowed for wildcards. My testing revealed that 
it doesn't currently (v0.10.19.5-1253). I intend for my setup to have multiple 
subdomains off a main domain (all sharing the same files), but I unfortunately 
can't easily accommodate that using ModPagespeedLoadFromFile without wildcard 
support.

Original issue reported on code.google.com by jmara...@google.com on 8 Feb 2012 at 2:11

GoogleCodeExporter commented Apr 6, 2015

So to be precise, you'd like to allow wildcards for the source-domain?  E.g.

 ModPagespeedLoadFromFile "http://*.example.com/static/" "/var/www/static/"

Original comment by jmara...@google.com on 8 Feb 2012 at 2:13

GoogleCodeExporter commented Apr 6, 2015

Yep, exactly. The wildcard support in the MapRewriteDomain directive allows me 
to have those source domains rewritten, but I'd like the fetch for that content 
to come from the local filesystem.

Original comment by amat...@gmail.com on 8 Feb 2012 at 5:54

GoogleCodeExporter commented Apr 6, 2015

And it will work for you to map all the wildcard domains to the same directory? 
If so, that's doable.

Original comment by sligocki@google.com on 8 Feb 2012 at 5:59

GoogleCodeExporter commented Apr 6, 2015

Yep. All of the subdomains' files are in the same directory.

Original comment by amat...@gmail.com on 8 Feb 2012 at 6:08

GoogleCodeExporter commented Apr 6, 2015

Please cross-reference issue 388. It'd be nice to support at least the 
HTTP_HOST token like a few other Apache directives do (such as mod_rewrite). I 
could then also specify something like this:

  ModPagespeedLoadFromFile "http://%{HTTP_HOST}/static/" "/var/www/static/"

Original comment by amat...@gmail.com on 20 Feb 2012 at 6:19

GoogleCodeExporter commented Apr 6, 2015

all in favor of this idea :-)
sounds just like what I was asking for!

Original comment by ovi...@pacura.ru on 3 Apr 2012 at 11:40

GoogleCodeExporter commented Apr 6, 2015

Original comment by jmara...@google.com on 24 May 2012 at 7:51

  • Changed state: Accepted

GoogleCodeExporter commented Apr 6, 2015

If you decide to do the wildcard, it'd be incredibly nice to make it work for 
directories as well. The impetus behind that request is our switch from 
*.example.com/static to www.example.com/*/static where * is the name of a given 
user. I wish we could just serve static content from www.example.com/static, 
but we unfortunately can't due to restrictions in the platform we're working 
with (WordPress).

Original comment by amat...@gmail.com on 24 May 2012 at 8:39

GoogleCodeExporter commented Apr 6, 2015

[deleted comment]

GoogleCodeExporter commented Apr 6, 2015

[deleted comment]

GoogleCodeExporter commented Apr 6, 2015

In case anyone is keeping track, this now has as many votes for it as memcached 
support :)

The implementation of wildcard support in the host _and_ URL subdir would be of 
great benefit to the WordPress CMS as all of the paths used for assets are 
relative to the user, thus requiring mod_rewrite to work, instead of absolute 
to the app. For our large WP-based app platform, it'd be the single biggest 
improvement in MPS that could be made.

Original comment by amat...@gmail.com on 2 Aug 2012 at 7:09

GoogleCodeExporter commented Apr 6, 2015

Bumping priority to reflect the popularity of this idea.

Also moving the task to Jeff as I think he's got more spare cycles now than 
Shawn.

Original comment by jmara...@google.com on 11 Aug 2012 at 2:54

  • Added labels: Priority-High
  • Removed labels: Priority-Medium

GoogleCodeExporter commented Apr 6, 2015

With "www.example.com/*/static" we'd have something like:

  ModPagespeedLoadFromFile "http://www.example.com/*/static/" "/var/www/static/"

That is, if we do a wildcard match on the directory, would 
"http://www.example.com/jon/static/example.jpg" and 
"http://www.example.com/mary/static/example.jpg" both load 
"/var/www/static/example.jpg"?

Original comment by jefftk@google.com on 13 Aug 2012 at 1:06
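
The wildcard behaviour asked about above can be sketched in Python. This is only an illustration of the proposed semantics, not mod_pagespeed code: `wildcard_to_regex` and `load_from_file` are hypothetical names, and the sketch assumes each `*` matches any run of characters other than `/`.

```python
import re


def wildcard_to_regex(url_pattern):
    """Turn a URL prefix containing '*' into an anchored regex.

    Assumption for this sketch: '*' matches any run of non-'/' characters,
    so it can stand for a subdomain or for a single path segment.
    """
    literal_parts = [re.escape(part) for part in url_pattern.split("*")]
    return re.compile("^" + "[^/]*".join(literal_parts))


def load_from_file(url, url_pattern, directory):
    """Return the file path for url, or None if the pattern does not match."""
    match = wildcard_to_regex(url_pattern).match(url)
    if match is None:
        return None
    # Everything after the matched prefix is appended to the directory.
    return directory + url[match.end():]


pattern = "http://www.example.com/*/static/"
for user in ("jon", "mary"):
    url = "http://www.example.com/%s/static/example.jpg" % user
    print(load_from_file(url, pattern, "/var/www/static/"))
```

Under these assumptions both URLs resolve to `/var/www/static/example.jpg`, and the same helper handles a host wildcard such as `http://*.example.com/static/`.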

GoogleCodeExporter commented Apr 6, 2015

Yep, while I can't speak for the other voters for this task, that's exactly the 
way I believe this should be approached. That is, in the example above, the * 
simply allows the user to trim away a directory. I'm not sure it'd be necessary 
to allow for more than one wildcard; anyone else care to speak up?

Please also take a look at comment #1 and comment #5. #1 is very similar where 
the wildcard is in the domain instead of the directory. While not a true 
wildcard, #5 is incredibly useful and in the same vein as the overall idea, and 
thus may be prudent to work on at the same time as the rest of this.

Original comment by amat...@gmail.com on 13 Aug 2012 at 4:46

GoogleCodeExporter commented Apr 6, 2015

This issue was closed by revision r1827.

Original comment by jefftk@google.com on 27 Aug 2012 at 5:41

  • Changed state: Fixed

GoogleCodeExporter commented Apr 6, 2015

We decided to implement this with a new directive, 
ModPagespeedLoadFromFileMatch.  Documentation will come, but usage is like:

    ModPagespeedLoadFromFileMatch "^https?://example\.com/~([^/]*)/static/" "/var/www/static/\1"

Which would apply to both http://example.com and https://example.com, and would 
map ~foo/static/... to /var/www/static/foo/...

For amattie's request in #8 above, the directive should be:

    ModPagespeedLoadFromFileMatch "^http://www.example.com/[^/]*/static/" "/var/www/static/"

Original comment by jefftk@google.com on 27 Aug 2012 at 5:48
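
The mapping described above can be approximated with Python's `re` module. This is a sketch, not the mod_pagespeed implementation: `map_url_to_file` is a hypothetical helper, and it assumes the matched URL prefix is replaced by the expanded replacement string with the rest of the URL appended unchanged. The thread does not say whether a path separator is added after `\1`, so the first example writes the trailing slash explicitly.

```python
import re


def map_url_to_file(url, pattern, replacement):
    """Replace the matched URL prefix with the replacement, keep the remainder.

    Assumption: the remainder of the URL is appended verbatim after the
    expanded replacement (backreferences such as \\1 are filled in).
    """
    match = re.match(pattern, url)
    if match is None:
        return None
    return match.expand(replacement) + url[match.end():]


# Per-user directories; the trailing '/' after \1 is added for this sketch.
print(map_url_to_file(
    "https://example.com/~foo/static/img/logo.png",
    r"^https?://example\.com/~([^/]*)/static/",
    r"/var/www/static/\1/",
))

# One shared directory for every user, as in the second directive above.
print(map_url_to_file(
    "http://www.example.com/jon/static/example.jpg",
    r"^http://www.example.com/[^/]*/static/",
    "/var/www/static/",
))
```

With these inputs the sketch yields `/var/www/static/foo/img/logo.png` and `/var/www/static/example.jpg`.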

GoogleCodeExporter commented Apr 6, 2015

Original comment by sligocki@google.com on 27 Aug 2012 at 6:02

  • Added labels: release-note

GoogleCodeExporter commented Apr 6, 2015

I just built from trunk and tried this. It doesn't appear to work. My URLs look 
like this:

https://www.example.com/foo-user/wp-admin/css/colors-fresh.css

Here's the directive I tried:

ModPagespeedLoadFromFileMatch "^https?://www.example.com/[^/]+/([^/]+)/" "/var/www/public/\1/"

From my logs:

[error] [mod_pagespeed 0.10.0.0-2014 @4813] /var/www/public/css/colors-fresh.css:0: open input file (code=2 No such file or directory)

I also tried [ModPagespeedLoadFromFileMatch "^https?://www.example.com/[^/]+/" 
"/var/www/public/"] as the directive and saw the exact same thing in my logs.

I then tried explicitly specifying the directory using 
[ModPagespeedLoadFromFileMatch "^https?://www.example.com/[^/]+/wp-admin/" 
"/var/www/public/wp-admin/"]. It looks like the directive was completely 
ignored in that case: "[info] [mod_pagespeed 0.10.0.0-2014 @5089] Cannot fetch 
url 'https://www.example.com/wp-admin/css/colors-fresh.css?ver=3.4.2': as https 
is not supported"

Original comment by amat...@gmail.com on 1 Oct 2012 at 5:27

GoogleCodeExporter commented Apr 6, 2015

FYI, looks like mod_pagespeed is seeing this URL as 
https://www.example.com/wp-admin/css/colors-fresh.css (without the /foo-user/). 
Do you know how your system is rewriting URLs?

Original comment by sligocki@google.com on 1 Oct 2012 at 5:37

GoogleCodeExporter commented Apr 6, 2015

On further investigation, it appears this may instead be an important 
documentation issue. I am using mod_rewrite to internally map 
"https://www.example.com/foo-user/wp-admin/css/colors-fresh.css" to 
"https://www.example.com/wp-admin/css/colors-fresh.css". 
ModPagespeedLoadFromFileMatch appears to interpret the rewritten and final URL, 
not the URL requested by the browser. After adjusting my directive to reflect 
the rewritten URL, it appears to work correctly.

Original comment by amat...@gmail.com on 1 Oct 2012 at 5:38
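
A small Python sketch shows why the first directive tried above produced the path seen in the error log. It assumes (this is not confirmed mod_pagespeed behaviour) that the matched prefix is replaced and the rest of the URL appended; `map_url_to_file` is a hypothetical helper, not a mod_pagespeed API.

```python
import re

# Directive from the report: pattern and replacement as written there.
PATTERN = r"^https?://www.example.com/[^/]+/([^/]+)/"
REPLACEMENT = r"/var/www/public/\1/"


def map_url_to_file(url):
    """Apply the directive to a URL: swap the matched prefix, keep the rest."""
    match = re.match(PATTERN, url)
    if match is None:
        return None
    return match.expand(REPLACEMENT) + url[match.end():]


browser_url = "https://www.example.com/foo-user/wp-admin/css/colors-fresh.css"
rewritten_url = "https://www.example.com/wp-admin/css/colors-fresh.css"

print(map_url_to_file(browser_url))    # path the directive was written for
print(map_url_to_file(rewritten_url))  # path that appears in the error log
```

Applied to the browser URL, the pattern captures `wp-admin` and gives `/var/www/public/wp-admin/css/colors-fresh.css`. Applied to the internally rewritten URL, `[^/]+` consumes `wp-admin` and the capture becomes `css`, giving `/var/www/public/css/colors-fresh.css`, the path in the log.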

GoogleCodeExporter commented Apr 6, 2015

We should be scanning the URL and saving it before mod_rewrite runs, but I 
agree that looks like what's happening. Can you explain a little more about 
your setup? What modules are you using with Apache, and what directives do you 
use to rewrite? That would help us try to reproduce this.

For example, you are not rewriting the URLs on one server and then requesting 
them from a second server, right? That would explain the different URLs.

Original comment by sligocki@google.com on 1 Oct 2012 at 5:46

GoogleCodeExporter commented Apr 6, 2015

Using these modules:

LoadModule asis_module        modules/mod_asis.so
LoadModule auth_basic_module  modules/mod_auth_basic.so
LoadModule authn_file_module  modules/mod_authn_file.so
LoadModule authz_host_module  modules/mod_authz_host.so
LoadModule authz_user_module  modules/mod_authz_user.so
LoadModule deflate_module     modules/mod_deflate.so
LoadModule dir_module         modules/mod_dir.so
LoadModule env_module         modules/mod_env.so
LoadModule expires_module     modules/mod_expires.so
LoadModule headers_module     modules/mod_headers.so
LoadModule mime_module        modules/mod_mime.so
LoadModule pagespeed_module   modules/mod_pagespeed.so
LoadModule php5_module        modules/libphp5.so
LoadModule rewrite_module     modules/mod_rewrite.so
LoadModule setenvif_module    modules/mod_setenvif.so
LoadModule ssl_module         modules/mod_ssl.so

Rewrite directives are complicated and long, but I picked out the important 
bits below:

  RewriteEngine  On
  RewriteBase    /

  # files, directories, and anything requested on our static host header should stop here
  RewriteCond    %{REQUEST_FILENAME} -f [OR]
  RewriteCond    %{REQUEST_FILENAME} -d
  RewriteRule    ^ - [L]

  # handle wordpress's subdir network functionality requirements
  RewriteRule    ^[_~\-\d\w]+/(wp-(content|admin|includes)/.*) $1 [L]

  # continuing from the https -> http handling above, we want to redirect non-admin https requests to http
  RewriteCond    %{ENV:HTTPS} on
  RewriteCond    %{REQUEST_URI} !/wp-admin(/.*)?$
  RewriteRule    .* http://%{HTTP_HOST}/$0 [R=301,E=CACHE_301,L]

With the understanding that the above rules are in place, it also seems that 
mod_rewrite is rewriting URLs before mod_pagespeed can process and respond to 
the request. That didn't happen before, but then again, I was never able to 
have pagespeed process HTTPS requests prior to this feature. I'm seeing 
requests like this 
<https://www.example.com/wp-content/plugins/akismet/akismet.js,qver=2.5.4.6.pagespeed.jm.UhiTLymbXH.js> 
get 301'd to 
<http://www.example.com//wp-content/plugins/akismet/akismet.js,qver=2.5.4.6.pagespeed.jm.UhiTLymbXH.js>. 
Notice the loss of the https? If I remove the HTTPS -> HTTP rule, the redirect 
doesn't happen.

Original comment by amat...@gmail.com on 1 Oct 2012 at 6:09
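
The effect of the WordPress subdirectory rule above can be imitated in Python to see which URL later processing receives. This is a rough model, not Apache's implementation: `strip_user_dir` is a hypothetical name, the path is assumed to arrive without its leading slash (as in a per-directory rewrite context), and the earlier "existing file or directory" rule is not modelled.

```python
import re

# Same pattern as the RewriteRule above; the substitution is "$1".
SUBDIR_RULE = re.compile(r"^[_~\-\d\w]+/(wp-(content|admin|includes)/.*)")


def strip_user_dir(path):
    """Drop a leading user directory in front of wp-content, wp-admin or wp-includes."""
    match = SUBDIR_RULE.match(path)
    return match.group(1) if match else path


for path in (
    "foo-user/wp-admin/css/colors-fresh.css",
    "wp-admin/css/colors-fresh.css",
    "foo-user/index.php",
):
    print(path, "->", strip_user_dir(path))
```

Only the first path changes: it becomes `wp-admin/css/colors-fresh.css`, which is the URL without `/foo-user/` discussed in the preceding comments.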

GoogleCodeExporter commented Apr 6, 2015

To extend comment #22, if I add [RewriteCond %{REQUEST_URI} !pagespeed] as a 
condition to my HTTPS -> HTTP redirect rule, I can prevent the redirect from 
happening.

In addition to the apparent filter processing order issue, I also incidentally 
noticed that the cache isn't immediately shared between the URLs the regex 
matches. For example, when I load https://www.example.com/foo-user/wp-admin/, 
reload, and then view source, I see rewritten assets. When I then load 
http://www.example.com/bar-user/wp-admin/ and view source (no reload), none of 
the assets appear to be rewritten. If the URLs resolve to the same file path, 
I'd expect they'd share the same cache and thus have the source for "bar-user" 
immediately show the rewritten assets on first load.

Original comment by amat...@gmail.com on 1 Oct 2012 at 8:27
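
The redirect workaround above can be modelled as a plain decision function. This is a simplified sketch of the three conditions on the HTTPS -> HTTP rule, not Apache's evaluation logic: `redirects_to_http` and its `skip_pagespeed` flag are hypothetical, and `%{ENV:HTTPS} on` is reduced to a boolean.

```python
import re


def redirects_to_http(https_on, request_uri, skip_pagespeed=False):
    """Return True if the HTTPS -> HTTP redirect rule would fire."""
    if not https_on:
        return False  # RewriteCond %{ENV:HTTPS} on
    if re.search(r"/wp-admin(/.*)?$", request_uri):
        return False  # RewriteCond %{REQUEST_URI} !/wp-admin(/.*)?$
    if skip_pagespeed and re.search(r"pagespeed", request_uri):
        return False  # RewriteCond %{REQUEST_URI} !pagespeed
    return True


resource = ("/wp-content/plugins/akismet/"
            "akismet.js,qver=2.5.4.6.pagespeed.jm.UhiTLymbXH.js")

print(redirects_to_http(True, resource))                       # original rules
print(redirects_to_http(True, resource, skip_pagespeed=True))  # with the extra condition
print(redirects_to_http(True, "/wp-admin/css/colors-fresh.css"))
```

In this model the rewritten resource is redirected under the original rules and left alone once the `!pagespeed` condition is added; requests under `/wp-admin` are never redirected.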

GoogleCodeExporter commented Apr 6, 2015

Would it be preferred if I open another bug for these two issues (filter order 
issue, cache seemingly not shared issue)?

Original comment by amat...@gmail.com on 10 Oct 2012 at 12:07

GoogleCodeExporter commented Apr 6, 2015

Yes, please open new feature requests for those.

Original comment by sligocki@google.com on 10 Oct 2012 at 2:57

GoogleCodeExporter commented Apr 6, 2015

Original comment by j...@google.com on 26 Oct 2012 at 5:49

  • Added labels: Milestone-v23