Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Nginx Documentation #343

Closed
mayamcdougall opened this issue Apr 16, 2016 · 49 comments
Closed

Improve Nginx Documentation #343

mayamcdougall opened this issue Apr 16, 2016 · 49 comments

Comments

@mayamcdougall
Copy link
Collaborator

Pico's Nginx documentation is practically non-existent. It should be improved and provide some detailed examples equivalent to what is provided for Apache in .htaccess.

A discussion has already begun on d8f9166. It seems there is a lot to consider when using a Regex Location Block for Pico's rewriting, as Nginx will only use the first matching rule it finds. Since a typical Nginx setup also uses Regex to detect when to send a page to PHP, a misconfiguration can easily break PHP detection altogether.

The solution seems to be placing the PHP Location Block above any Pico rewrite blocks, that way any requests to index.php get properly matched.

In the current example, location ~ ^/pico(.*) matches both the Page ID you're trying to rewrite (eg /pico/page) and also the index itself (at /pico/index.php, resolved from the URI /pico/?page).

Since /pico/index.php matches location ~ ^/pico(.*), Nginx has succeeded in finding a match and will not move on to match it to location ~ \.php$. The PHP page gets served as a download file instead of interpreted by PHP.

@Robby-
Copy link

Robby- commented Apr 16, 2016

Hi, I'm the same Robby from IRC. While your proposed solution works, the solution I'm using is actually to include the PHP location block in each location block where I need it. To avoid having to duplicate this in too many places I put the PHP location block in a snippet file and include it in each location block as needed.

The reason I'm currently doing it like this is because of the following.

Not sure if this is useful for the situation described in this issue, but here is a problem I had:
The other day I was looking to password protect some specific directories of another PHP application with nginx and http auth and I found that PHP was still getting executed instead of requesting http auth.

The following links has relevant reading material on nginx and (nested) location blocks:
https://serverfault.com/questions/375961/nginx-basic-http-authentication-on-php-script
https://serverfault.com/questions/119076/how-to-use-fastcgi-globally-and-basic-auth-in-sublocations-in-nginx

Back to pico.
Note: I'm not using a subdirectory for pico anymore, I've put it in the root directory instead.
So, I noticed pico has an .htaccess file and I wanted to do some nginx equivalent of protecting for example pico's config directory, by doing this:

location /config/ {
  return 403;
}

But, the above does not block access to /config/config.php (a blank page is shown instead of a 403).

So, taking into account the information in the 2 links I mentioned above, the following location block does block access to /config/config.php:

location /config/ {
  location ~ [^/]\.php(/|$) {
    return 403;
  }
}

Other than that, my current location block for pico is like this:

location ~ ^/(.*) {
  include snippets/fastcgi-php-custom.conf;
  index index.php;
  try_files $uri $uri/ /?$1&$args;
}

My snippets/fastcgi-php-custom.conf contents, based on the information in this link:

location ~ [^/]\.php(/|$) {
  fastcgi_split_path_info ^(.+?\.php)(/.*)$;
  if (!-f $document_root$fastcgi_script_name) {
    return 404;
  }
  fastcgi_pass unix:/var/run/php5-fpm.sock;
  fastcgi_index index.php;
  fastcgi_param SCRIPT_FILENAME $document_root$fastcgi_script_name;
  fastcgi_param PATH_INFO $fastcgi_path_info;
  include fastcgi_params;
}

Note that I'm not knowledgeable about regex, so most likely there's a better way to do some of the things I mentioned. I just wanted to share my findings.

@mayamcdougall
Copy link
Collaborator Author

In order to block access to config or other folders, I think you'd need to match it with regex similar to this:

location ~ /(config|content|content-sample|lib|vendor)/.* or location ~ /config/.*

The important part being that you'd need to tell it to match anything in the config folder, not just the folder itself.

If you haven't checked it out yet, there's some more examples at d8f9166. The discussion started there, but should really be continued here.

It's a really funny coincidence that you brought up the Nginx issues when you did because I just decided to check it out. Together it seems we've generated a bit of interest on the topic.

Not sure if I'd agree with the idea of including the php block in every location block... but I didn't even know you could nest location blocks. I've got a lot of similar sections across several different site configs, so I plan to explore using include to simplify that process a bit.

So far it sounds like we have three different approaches to the subject that each have some benefits and drawbacks.

Ultimately, there may not be a "right" answer, but we'll have to decide what would be the best configuration for new users. An experienced Nginx user will likely be able to take whatever example we provide and fit it to their own needs/preferences anyway.

@mayamcdougall
Copy link
Collaborator Author

So, there's a few criteria that I think should be focused on for the example config:

  • Ease of Use - The easier to implement, the better. If it can be configured in a way that requires almost no user interaction (like being as simple as adding .htaccess), that should be the goal. If possible, require no editing by the user. Perhaps it's just a config file that they add to their site config using include.
  • Performance - One of the biggest benefits of Nginx is the performance it brings. From the brief research I've done, it seems that different solutions can have vastly different performance. They even have a section of their documentation titled Things to Avoid at all Costs. I'm not sure, for example, if using Location to rewrite the url vs using rewrite has any sort of performance difference. They both seem like valid options, but if I had to guess I'd say rewrite is more taxing. While this difference is probably not enough to matter, we should still treat each situation as if it will be a production level deployment with tons of simultaneous users.
  • Simplicity - Why write in 50 lines what can be done in 10. The example should be simple and easy to read. Unfortunately, I think it'll be hard to draw the line between what counts as "configuring Pico" and "general Nginx configuration". Do we hold the user's hand to make sure they get PHP configured properly or should this be up to Nginx tutorials? As I've described, I had a lot of difficulty with PHP due to the way Nginx processes Locations. The information on this issue was surprisingly hard to find and not something that was immediately obvious to me when I encountered difficulty. We could write probably write pages and pages about different configuration solutions, so I'm really not sure where the line would be drawn. Perhaps provide a simple example config, but an entire page of additional reading if the user is interested.

Just some food for thought to hopefully spark more discussion.

@iwedler
Copy link

iwedler commented Apr 17, 2016

@smcdougall To block access to folders and files this rule is enough, no need for more regex.

location ~* /(config|content|content-sample|lib|vendor)

The essential part is WHERE you place this. It MUST be before the php processing rule in order to block also php files. Processing order is the word. I testet it and posted a config here: d8f9166

@mayamcdougall
Copy link
Collaborator Author

Ah, that makes sense. Good to know.

Do you have any other input on the criteria I thought up? You seem like you're probably the most experienced of the four of us right now. 😃

@mayamcdougall
Copy link
Collaborator Author

So what has worked well for me was a separate pico-common config that I include in all my Pico sites. It contains all the access rules and the php rule. The main benefit being that it gives me one place to edit the common settings for all my sites. I also have a second pico-root config that contains the rewrite rule currently in Pico's docs. Most of my sites have Pico in the Document Root, so I just include pico-root for quick access to that configuration. The one site I have setup with dual Pico subfolders has it's own configuration instead of including pico-root. Here are the two examples:

  • pico-common
        index index.html index.htm index.php;

        location ~* /\.ht {
                deny all;
        }

        location ~* /(config|content|content-sample|lib|vendor) {
        return 404;
        }

        location ~ \.php$ {
                fastcgi_split_path_info ^(.+\.php)(/.+)$;
                fastcgi_pass unix:/var/run/php5-fpm.sock;
                fastcgi_index index.php;
                include fastcgi_params;
        }
  • pico-root
       location ~ ^/(.*) {
                try_files $uri $uri/ /?$1&$args;
        }

Pico-root (or a manual rewrite configuration) gets included after pico-common in the server block.

What I'm aiming for here is simplicity. I like writing something once and using it everywhere.

Just wanted to share what seems to be working well for me.

I plan to draft up a Docs page about Nginx configuration sometime later today. My plan is to extend the existing section slightly, then link to another page entirely about Nginx. The page will be able to go in depth about the how's and why's of different configuration tips. It'll touch on important points, like Nginx's processing order, the fact that Nginx uses a centralized configuration (and how that makes configuring Pico a little harder), etc.

Obviously I'm by no means an expert on the topic, but I think I've gathered enough information to provide a good write-up on the subject. After I get the initial write-up done, it'd of course be open to further contribution. @iwedler, I'd especially like it if you could weigh in a little on your config, such as why you prefer your @pico fallback location over the regex location method (Pros, cons, etc).

I'll link the page and the PR here once I've written it.

@PhrozenByte
Copy link
Collaborator

Great to hear! Looking forward to your PR 👍 😃

Just a minor suggestion: You don't need a regex when installing Pico to the root directory (the regex's objective is to remove the subdirectory part of the URL). The following works just fine in this particular case:

location / {
    try_files $uri $uri/ /?$uri&$args
}

@mayamcdougall
Copy link
Collaborator Author

That's true I suppose. It got lost somewhere in my quest for a universal config. 😆 Would you rather I restore that example to the Docs and use the location ~ ^/pico(.*) as an optional "How to configure Pico in a subfolder" example?

@PhrozenByte
Copy link
Collaborator

I'm not sure yet, depends on how the docs page will look like.

@mayamcdougall
Copy link
Collaborator Author

Random discovery:

SetEnv PICO_URL_REWRITING 1 in Apache is
fastcgi_param PICO_URL_REWRITING 1; in Nginx (in the PHP location block)

No more $config['rewrite_url'] = true; 😄

That'll definitely be going in the Nginx page. I'm partway done with it so far, but it's taking longer than I though. Probably won't have it up today, but it's definitely in progress.

@mayamcdougall
Copy link
Collaborator Author

I finished a rough draft of the page. Take a look at #345. Any input is appreciated. 😃

@Robby-
Copy link

Robby- commented Apr 21, 2016

Looks great so far! The only issue is the php example location block, it poses a security risk. It is best to use the example from the nginx wiki due to security reasons mentioned here. Using the example mentioned in the first link doesn't require changing the cgi.fix_pathinfo setting either (it says so in the Notes section on the bottom) as it can break things that rely on PATH_INFO/PHP_SELF. For some reason the second link doesn't mention that.

@PhrozenByte
Copy link
Collaborator

PhrozenByte commented Apr 21, 2016

It should be sufficient to let the PHP location block match index.php only. Pico doesn't require PATH_INFO either.

@mayamcdougall
Copy link
Collaborator Author

I had forgotten about cgi.fix_pathinfo. Since PHP configuration is a little out-of-scope for Pico's docs, it sounds like the best solution would be to match only index.php. I'll probably add a brief explanation and/or a link with more details about why we've chosen this.

Also, I'd like to make the PHP block as simple as possible. While the other blocks are meant to be copied, the PHP block is really just an example. I'll try to emphasize this a little more. Since PHP configuration is somewhat dependent on the user's system, I don't want them to just copy and paste it. It should only be used as a reference.

@mayamcdougall
Copy link
Collaborator Author

😕 I've actually flip-flopped on this and I think I'd rather just mention cgi.fix_pathinfo as a warning.

I've had time to reread my page and remember that I planned to link to a php-fpm tutorial anyway.

I don't like the Nginx Wiki's example because it uses an if statement and I'm trying to keep the configuration as lean as possible.

Although I like the elegance of specifically matching index.php, I feel like it would be over-complicated to explain. I think it would require extra effort as well, such as:

  • If you used location /index.php$ You'd have to either have Pico in your Document Root or configure it to include your subfolder.
  • If you used location ~ /index.php$, you wouldn't have to configure anything, but I feel you'd still be vulnerable to the security risk described above.

On top of that, I think I'd have to break PHP Configuration into "Document Root" and "Subfolder" sections, which would make it more confusing.

@Robby-
Copy link

Robby- commented Apr 21, 2016

I know its meant as an example, but people are people and will copy paste, so its better to have a secure example in there in the first place. Also, do keep in mind people who may also wish to run other .php files on a Pico site, I'm sure there's people out there that would or do.

What's the problem with the if? It's perfectly fine to use it there. For Pico's case, and since the consensus here seems to have become to lock it down to index.php only, I guess the if statement is not needed, but otherwise it should be there.

@Robby-
Copy link

Robby- commented Apr 21, 2016

Reading the above, I feel more than one example configuration seems to be needed. For people that use Pico and also have other .php files, so it can't be locked down to just index.php (and then the if statement becomes a must-have for security).

@mayamcdougall
Copy link
Collaborator Author

Nah, now that I've thought about it, I don't want to lock it down to index.php either.

And Nginx even states that If is Evil. 😆

The thing is though, if they haven't followed another guide to properly setup php-fpm, then it's not going to be listening on that socket anyway. The copied Nginx rule won't do anything as far as I know.

I think a good, "Warning:" should be sufficient to inform a user about the security issue. Ultimately though, nothing we're providing in this example is at fault for the security issue. 😕

@Robby-
Copy link

Robby- commented Apr 21, 2016

Yeah, I've seen that If is Evil page, but note that they also say that it is perfectly fine to use it there in the Notes section at the bottom of the PHP example page. Then there also becomes no need to even touch or mention cgi.fix_pathinfo at all.

@mayamcdougall
Copy link
Collaborator Author

It's a tough call. I really don't want to make the example code more complicated than it has to be. Also, to me, it feels a lot like a workaround to what is clearly an insecure default setting for php-fpm. I don't like workarounds, lol, I'd rather a proper fix.

I will give it credit that it's the official recommendation of the Nginx Wiki though. I can see merits to doing it either way.

@PhrozenByte, @iwedler, what do you think? I'll go whichever way the crowd goes on this issue.

@PhrozenByte
Copy link
Collaborator

I don't feel like having the appropriate background knowledge to tell what solution is better...

The only thing I can say is that Pico neither uses PATH_INFO nor needs to execute any other file than index.php. Security always comes first, simplicity second. If both solutions have the same security level, choose the simpler one.

@mayamcdougall
Copy link
Collaborator Author

I'll likely include the if statement. The only extra security it brings is security from the user's own misconfiguration, but that is still extra security.

Ultimately, if the user wants to remove that line because they feel they know better, they can!

A user who isn't as experienced though would benefit from this extra included protection.

I suppose it's on the same level as blocking .htaccess or denying Indexes in Apache. These are safety measures that should be set by an experienced user, but they're provided nonetheless.

@mayamcdougall
Copy link
Collaborator Author

On the subject of .htaccess, I've been wondering what would be more secure: Denying access to the file, or returning 404.

The general example floating around the web is to use deny all to prevent access. But technically this does still inform a malicious party that there is a file there, it's just being denied.

Returning 404 on the other hand, would be the same response they'd get if the file really didn't exist.

I'm sure there's a good reason that deny all is used in every example I've come across, so I'm not about to change it. It's just something I've been wondering about.

@PhrozenByte
Copy link
Collaborator

Just a short question about the PATH_INFO topic: What happens if you remove both the fastcgi_split_path_info directive and if statement? Wouldn't be a try_files $uri =404; sufficient then?

@Robby-
Copy link

Robby- commented Apr 24, 2016

@smcdougall

On the subject of .htaccess, I've been wondering what would be more secure: Denying access to the file, or returning 404.

The general example floating around the web is to use deny all to prevent access. But technically this does still inform a malicious party that there is a file there, it's just being denied.

Returning 404 on the other hand, would be the same response they'd get if the file really didn't exist.

I'm sure there's a good reason that deny all is used in every example I've come across, so I'm not about to change it. It's just something I've been wondering about.

I've come up with an elegant way to fix this:

location ~* /(config|content|content-sample|lib|plugins|vendor|CHANGELOG.md|composer.json|composer.lock|CONTRIBUTING.md|LICENSE.md|README.md) {
  error_page 404 /?$uri&$args;
  return 404;
}

This makes Pico return its own 404 instead of nginx's own errors, which is even better I think.
Note that I'm using Pico in the root directory, so for Pico installations in subdirectories, the path should be adjusted.

@mayamcdougall
Copy link
Collaborator Author

If you mean the path for config, content, etc, I don't think you'd have to adjust it for a subdomain. Since your regex doesn't begin with ^, it'll match it anywhere in the url, not just at the beginning. You'd still have to change the path on the error_page line, so perhaps that's what you meant.

I hadn't thought of adding those other files to the 404 rule. That's a good idea. I'll test it out sometime and add something similar to the example. @PhrozenByte Those files should probably be added to Pico's .htaccess as well.

This doesn't actually answer the question you quoted from me though. I was asking specifically about the .htaccess file itself. Is it better to deny access to it or to return 404? All the examples online suggest using deny, but my own logic tells me that using return 404 would be "safer".

@PhrozenByte
Copy link
Collaborator

PhrozenByte commented Apr 24, 2016

@smcdougall, your thoughts about the reasons for using 404 Not Found instead of 403 Forbidden are completely right (plus some minor other "technical" reasons about how Apache works). Some other projects (like ownCloud) do it the same way.

Unfortunately I don't like extending the config|content|... rule to these fully static files in the base directory. I don't see a reason why we should block access to them by default. @Robby-, can you please explain your thoughts?

However, using Pico's 404.md for blocked contents makes absolute sense. I'll see if and how we can achieve this with Apache. #edit: I've implemented this with 49cb6c1 and 31e55ca. Pico now also always uses the on404Content... execution path when serving a 404.md (i.e. Pico sends a 404 Not Found header even when requesting http://example.com/pico/?404 directly; 404.md has always been treated specially, thus this actually isn't something new; see 6234be8). However, passing $0 ($uri with nginx) and giving Pico the chance to find a appropriate content file instead of redirecting everything to 404.md seems to be a better idea (see 31e55ca for a explanation).

@mayamcdougall
Copy link
Collaborator Author

My assumption for his reasoning is that these are files that are included by default in Pico, but not intended to be accessed on the webserver. Although they pose no security risk, they are files that a determined user could navigate to.

I don't know if they should be blocked or not, but I had never thought about the fact that they can be accessed. When I read his comment, I had a moment where I thought "you can't do that... can you?". Sure enough, if I navigate to /README.md on one of my Pico sites, it does indeed fetch me Pico's readme as a download. (Well, it would, if it wasn't NotePaper's readme I have stored there...)

It just feels a little uncomfortable. I just went through the "Pico Showcase" on the Wiki... all but three entries willingly offered me their README.md file. And one of those three appeared to be down altogether, so I can't really count that. Two of the readme's weren't even for Pico. 😆

I don't know what else to say about it (plus, I'm tired and his is going to be my last comment for the night). I just hadn't even thought about the fact that obviously these files would be accessible. I'd assume the same probably holds true for the owners of all the Pico sites I just probed. 😒

Anyway, I still don't know if they should be blocked by default, but was my reaction after reading @Robby-'s comment.

@mayamcdougall
Copy link
Collaborator Author

I should never say something is my last comment for the night 😒

Just caught up on some of the commits from the last few hours. I'll add .git to the Nginx 404 rule like in 49cb6c1 and see about rewriting it to Pico's internal 404 document as well.

@PhrozenByte
Copy link
Collaborator

PhrozenByte commented Apr 24, 2016

Starting with Pico 1.1 you can use try_files $uri $uri/ index.php; (sort of - see the updated inline user docs for a real example) no matter Pico is installed in a subdirectory or not (i.e. no regex necessary), see 6465c2b for details.

@mayamcdougall
Copy link
Collaborator Author

I'll have a chance to finish this up tomorrow (4/27). Do you want me to focus on 1.1 for it or focus on getting instructions finished for Pico's current version?

I should be able to easily get the existing examples finalized, merged, and then work on 1.1 afterward.

@PhrozenByte
Copy link
Collaborator

I should be able to easily get the existing examples finalized, merged, and then work on 1.1 afterward.

That's probably the best way to deal with it 😃

@mayamcdougall
Copy link
Collaborator Author

I thought so too. 👍

I'm going to finish it up with the examples we had finalized a few days ago. Things like using Pico's internal error page can be an enhancement for 1.1. It's a simple enough idea, but it will take a little more debugging before I'm sure we have a bullet-proof example. With that in mind, it might as well be done with the new infrastructure in 1.1.

@Robby-
Copy link

Robby- commented Apr 27, 2016

Sorry for the late response.

@smcdougall I also think 404 is the better response, instead of a 403.

@PhrozenByte My reason for adding these files to the 404 block is just personal preference I guess.
I do not wish these files to be served at all to someone who probes the webserver for seemingly random files, I guess I could just remove them manually on each Pico installation, but this method seems much more practical. Also changelog files and the like can be used to determine or find out if any vulnerable software may be installed on the webserver.

@mayamcdougall
Copy link
Collaborator Author

Indeed. I think it still deserves more discussion going forward. For now I'm working to finalize what we had, and we can address the rest afterward. In 1.1, it'll be much easier to use Pico's internal error page for things as well.

I could also potentially add a tip near the bottom about blocking access to those other files. I definitely understand where you're coming from.

@PhrozenByte
Copy link
Collaborator

Hmm... Yeah, hiding the version number isn't a bad idea (i.e. blocking access to composer.json, composer.lock and CHANGELOG.md - even though I don't know other composer projects doing this), but what's about README.md, LICENSE.md and CONTRIBUTING.md? Sure, they actually don't provide information what isn't accessible in another way (and the MIT license doesn't require a user to let a website visitor know the license of the software used), but they still may help a visitor of a website to determine which software is being used - maybe he wants to use it, too. Furthermore a user can still remove the files or block access to them manually. What do you think about adding composer.json, composer.lock and CHANGELOG.md as compromise? Or can you think of reasons why blocking access to README.md, LICENSE.md and CONTRIBUTING.md has more advantages than disadvantages?

@mayamcdougall
Copy link
Collaborator Author

I'm really not sure where I stand on the issue. I'd rather keep the examples simple, but if it could be a security issue then maybe it needs to change.

Right now, I'm looking at keeping feature parity with .htaccess. If we decide to block these files, I feel it should be in both places.

By the way, I noticed something that might need to be fixed in .htaccess because it needed to be fixed for Nginx. When you added .git to the rewrite rule in .htaccess, you forgot to account for .'s meaning in regex. A period would match any character, so while it would technically work, it should probably be changed to \. to escape it.

I've done that in my current copy of Nginx.md, and I've also combined the .htaccess rule with the rest of the Pico folders, since they're all return 404 now anyway. 😃

@mayamcdougall
Copy link
Collaborator Author

I think I've addressed most of the concerns we've discussed here, minus the ones postponed for 1.1. I'd call this page ready for review now. There's probably still some last minute fixes to make, so tell me what you think. Go ahead and tear it apart. 😉

http://smcdougall.github.io/Pico/nginx/
http://smcdougall.github.io/Pico/docs/#nginx-configuration

@mayamcdougall
Copy link
Collaborator Author

Great, now I need to update it again. 😒
(:laughing:)

@PhrozenByte
Copy link
Collaborator

@smcdougall's nginx configuration page finally got merged and he IMHO did a great job! 😃 You can find the final document at http://picocms.org/in-depth/nginx/. Any further thoughts about this @ALL?

@smcdougall, @Robby-, @iwedler

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants