Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

/status?json&full is not suppressed when access.suppress_path used #10116

Open
theophileds opened this issue Dec 16, 2022 · 13 comments
Open

/status?json&full is not suppressed when access.suppress_path used #10116

theophileds opened this issue Dec 16, 2022 · 13 comments

Comments

@theophileds
Copy link

theophileds commented Dec 16, 2022

Description

The following php-fpm configuration is supposed to not log requests made on /status:

[www]
pm = "static"
pm.max_children = 5
pm.process_idle_timeout = 1m;
pm.max_requests = 500
pm.status_path = /status
access.suppress_path[] = /status

However, the Docker container running from the image php:8.2-fpm, is still logging it:

127.0.0.1 -  16/Dec/2022:10:39:41 +0000 "GET /status" 200

Is this the expected behavior?

PHP Version

8.2.0

Operating System

No response

@theophileds theophileds changed the title /status is not effective in access.suppress_path /status is not suppressed effectively when using access.suppress_path Dec 16, 2022
@theophileds
Copy link
Author

Any idea?

@bukka
Copy link
Member

bukka commented Jan 15, 2023

@theophileds I just tested it with the docker container directly with my fastcgi client as well as with nginx but don't see the issue (it's correctly suppressing /status for me). My config is following:

[global]
error_log = /dev/stderr
[unconfined]
listen = 0.0.0.0:9000
pm = "static"
pm.max_children = 5
pm.process_idle_timeout = 1m;
pm.max_requests = 500
pm.status_path = /status
access.log = /dev/stderr
access.suppress_path[] = /status
catch_workers_output = yes

The whole setup is here: https://github.com/bukka/php-util/tree/19f627bb491c17675c893d7161d3bf07dd00e730/tests/fpm/access-suppress-path-status - if you want to run it locally, you will need to modify nginx pid and root file or just create nginx in the container (I was lazy to do that as I have got already local nginx setup...).

Would you be able to create a similar setup (feel free to use docker compose if that's easier) where the issue is visible or at least describe in detail steps how to reproduce it.

@theophileds
Copy link
Author

theophileds commented Jan 17, 2023

Thank you @bukka for your answer.

I've got some exciting news to share! I was able to successfully hide /status requests using the command cgi-fcgi -bind -connect 127.0.0.1:9000 and the following parameters:

SCRIPT_NAME="/status" \ SCRIPT_FILENAME="/status" \ QUERY_STRING="" \ REQUEST_METHOD="GET"

However, I noticed that some /status requests are still being logged from the php-fpm_exporter. I used tcpdump to inspect the network frames, and after further investigation, I discovered that the following parameters were used:

SCRIPT_FILENAME="/status" \ SCRIPT_NAME="/status" \ REQUEST_METHOD="GET" \ QUERY_STRING="json&full" \ CONTENT_LENGTH="0" \ SERVER_SOFTWARE="go/php-fpm_exporter" \ REMOTE_ADDR="127.0.0.1"

I've finally managed to understand that the faulty parameter was QUERY_STRING as when I set it back to an empty string, the requests were no longer being logged.

@bukka
Copy link
Member

bukka commented Jan 19, 2023

Ah I see. This looks to me like this was done on purpose:

// Never suppress when query string is passed
if (proc->query_string[0] != '\0') {
return 0;
}
so we can't consider this as a bug. But I agree that there's definitely use case for suppressing queries especially for status which is often used with queries and currently there is no way to suppress it. I'm thinking maybe some sort of wildcard (/status*) and / or explicit suppression of queries (/status?json&full) could be introduced.

@maaarghk is the original author of this feature so might be also good to hear is thoughts on this.

@bukka bukka changed the title /status is not suppressed effectively when using access.suppress_path /status?json&full is not suppressed effectively when using access.suppress_path Jan 19, 2023
@bukka bukka changed the title /status?json&full is not suppressed effectively when using access.suppress_path /status?json&full is not suppressed when access.suppress_path used Jan 19, 2023
@maaarghk
Copy link
Contributor

Ah yeah, the intention here was to ensure things like bots making attempts at SQL injection or other vulns couldn't sneak them in behind the suppression.

I do think that it's worthwhile to keep that in since some big frameworks with spotty security records use things like a health_check.php which loads up a large surface area to test database connections can be made. Explicit suppression of specific query strings feels like a good middle ground to me.

@maaarghk
Copy link
Contributor

I'm realising from the initial post that %Q%q must not be in the default access.format which I suppose is sort of ironic.

@theophileds
Copy link
Author

Thanks @bukka and @maaarghk for your prompt responses!

Are there currently any plans for implementing a solution for this issue?

@bukka
Copy link
Member

bukka commented Feb 3, 2023

I put it on my list but I have got bunch of other priorities so it might take me some time. Of course I would be happy to review a PR if someone else implements it sooner.

@maaarghk
Copy link
Contributor

maaarghk commented Feb 7, 2023

@bukka I took a look at this because it seems simple enough on the surface but left with far more questions than I entered :)

There is a solution which feels sort of tidy to me, which is to update fpm access log so that %R = FCGI_GETENV(request, "REQUEST_URI") || SCRIPT_NAME + optionally("?" + QUERY_STRING) (making %Q redundant although %q for "QUERY_STRING" is probably still useful). IMO this is intuitive to sysadmins and PHP programmers, since it matches what you would see in $_SERVER['REQUEST_URI'], it matches what nginx admin would expect in $request_uri (which is what the default config sets fastcgi_param REQUEST_URI to), and this comment appears to expect apache to pass the entire uri with query too (although seemingly nothing is intuitive about this in apache). This would then be what the suppression code would compare against.

REQUEST_URI is optional because the cgi spec itself doesn't define it, and there are definitely widely used status check implementations out there which don't send it (manual and php-fpm-healthcheck) hence the fallback.

Currently %R = SG(request_info).request_uri, an internal variable which feels confusingly named. The code comments are sort of contradictory and talk about PHP_SELF being set to SCRIPT_NAME, which is implemented as the code but PHP_SELF is set to SG(request_info).request_uri (also if fixpath is not set) - which it is in all cases. Having it known internally that "request_uri" actually contains "script_name" presumably for compatibility with older code or other sapis, is obviously fine, but the docs say that %R is defined as "the request URI (without the query string, see %q and %Q)", but imo that is not what any PHP user would understand the request URI to be (the way to output the actual request URI as it is commonly understood is %{REQUEST_URI}e).

Anyway, I feel like you probably understand the context and reasons behind this, so I suppose the questions summarise to:

  • (a) can you envisage a world in which SG(request_info).request_uri actually gets set to REQUEST_URI (or SCRIPT_NAME if it is not set) without breaking everything? Is request_uri really the wrong name for that variable or have I missed something fundamental?
  • (b) Does the "simple" solution presented of updating the meaning of "%R" seem like it would be confusing / helpful / wrong?

@bukka
Copy link
Member

bukka commented Feb 9, 2023

@maaarghk I have done some investigation and I think we should not change SG(request_info).request_uri as this could break things as queries or path info are not expected there. Do you see any issue related to using SCRIPT_NAME for SG(request_info).request_uri?

Currently %R = SG(request_info).request_uri, an internal variable which feels confusingly named.

Access log %R is for remote IP. Did you actually mean %r? I would prefer not to change it as it could be obviously an issues if anyone uses %r%Q already. I think it could be useful to have a special new flag (e.g. %U or whatever you think makes most sense and it's free) which would use REQUEST_URI and fall back to %r%Q if not set. Might be better to create a new GitHub issue for this as it's not the same thing that is requested here.

In terms of this issue I think we just need to do filtering on SG(request_info).query_string if ? is part of the path. I think it would be also useful to add support for suffix wildcard (*) which would easily allow all queries...

@maaarghk
Copy link
Contributor

maaarghk commented Feb 9, 2023

yes I meant %r oops; my issue isn't really what's in SG(request_info).request_uri itself, but just that exposing that to users under the name request_uri is confusing and unexpected. I think if someone reads the config and uses %r right now they're expecting REQUEST_URI and it is surprising to receive SCRIPT_NAME instead, but I agree it'd be a minor version change warranting an upgrade note to change the meaning of %r even if it is currently - imo - not accurate. Someone who relies on the current behaviour could be informed to replace it with %{SCRIPT_NAME}e.

Back on topic: The issue with filtering on SG(request_info).query_string, and something I originally wrote but then removed for brevity last time :-) is that we'd need to manually allocate a buffer, copy (the equivalent of) %r%Q%q into it, then compare the ini file entry (containing both the path and the query string) against that buffer within the suppress function, which feels inefficient; but I can see that since REQUEST_URI is not a required request parameter per the CGI spec we can't really just compare against FCGI_GETENV(request, "REQUEST_URI") either. But I do think it'd be important to ensure that at least these cases are possible to filter:

  • /status inbuilt request bare only (REQUEST_URI is not set, SCRIPT_NAME is /status)
  • /status?<format> inbuilt request explicitly (REQUEST_URI is not set, SCRIPT_NAME is /status, QUERY_STRING is e.g. json)
  • php script optionally with query e.g. /health_check.php?mode=quick (REQUEST_URI is /health_check.php?mode=quick, SCRIPT_NAME is /health_check.php, QUERY_STRING is mode=quick)
  • health check behind front-controller / router pattern e.g. /healthcheck?mode=full (REQUEST_URI is /healthcheck?mode=full, SCRIPT_NAME is /index.php, QUERY_STRING is mode=full)

I suppose you can see the related issue that if someone who uses a front-controller + router wants the actual request URI in their fpm access log right now, %{REQUEST_URI}e is the best option but means that /status does not appear in the access log output. (What I'm getting at is if %r were updated to mean (pseudocode) REQUEST_URI ?? (SCRIPT_NAME + (QUERY_STRING ? "?" + QUERY_STRING : ""), comparing against that would support these use cases but also - in my opinion - be what a user expects to see.)

On wildcards I do think it's a bit of a foot gun for the php script / front-controller use cases where potentially a wide surface area is loaded in and might be exploitable from $_GET or something. I continue to prefer explicit-only suppression. Just my opinion of course :)

@bukka
Copy link
Member

bukka commented Feb 9, 2023

I think we could get around a need for extra buffer by comparing parts of the filtered path with SCRIPT_NAME, PATH_INFO and QUERY_STRING. Something like this pseudocode condition:

strncmp(path, script_name, strlen(script_name) == 0 && 
strncmp(path + strlen(script_name), path_info, strlen(path_info) &&
path[strlen(script_name) + strlen(path_info)] == '?' &&
strcmp(path + strlen(script_name) + strlen(path_info) , query_string)

This is very simplified as some further checks would need to be done (e.g. if path info or query set) but hopefully you get the idea.

health check behind front-controller / router pattern e.g. /healthcheck?mode=full (REQUEST_URI is /healthcheck?mode=full, SCRIPT_NAME is /index.php, QUERY_STRING is mode=full)

Are you sure about this? I thought that the REQUEST_URI would be usually /index.php/healthcheck?mode=full in this case. I think the best way to address front controller use case would be just to consider PATH_INFO. The filter should still specify the /index.php as that's what is actually executed by PHP.

@TPXP
Copy link

TPXP commented Jan 11, 2024

Hello, in case someone else is having this issue, a quick workaround is to use the separate socket for status endpoints offered by PHP-FPM with pm.status_listen:

[www]
listen = /dev/shm/php-fpm.sock
pm.status_listen = /dev/shm/php-fpm-status.sock
pm.status_path = /__local__/status

You don't even need to add the pm.status_path value in access.suppress_path[], as it seems PHP-FPM does not maintain an access log for this special socket.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants