Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FPM logger refactoring #2458

Merged
merged 1 commit into from Jul 7, 2018
Merged

FPM logger refactoring #2458

merged 1 commit into from Jul 7, 2018

Conversation

@bukka
Copy link
Member

bukka commented Apr 7, 2017

This is just an initial work on the log limit for FPM (just a zlog part). It introduces a new global conf log_limit that defaults to previous value 1024. Currently this is meant mainly for error_log function and removing truncating of child pipes from stdio when catch_workers_output is enabled (the ...). However that part is still wrapped and I think we should have an another config for that as it's not log related exactly. I actually created an empty API in zlog to address that and it's called zlog_stream which will allow push based approach. The thing is that zlog just adds a prefix and then pass it to the zlog_fd (unless it is set for syslog which still requires buffering). If zlog_fd is a normal stream, then we should be able to just pass it without extra copying (it means doing multiple writes). The situation gets slightly more complicated with children (workers) which is for example used when you do error_log. They do also external logging using fcgi_write in sapi_cgi_log_fastcgi but it could be also optimised to not do extra copying.

So from the above it is clear that there is a huge space for improvements and optimization and the patch will be probably considerably changed before I remove the WIP from title and make it ready for review (btw. it's still probably buggy so don't use it anywhere yet :) ).

I'm opening this PR to mainly see what others think about log_limit setting and also about another option called worker_output_limit that I'm preparing for the piped variant (the one used when catch_workers_output set). That would basically extend the size of the wrapping. I was also thinking about another option that would allow to remove prefix [pool %s] child %d said into %s: \"%s\, ugly double quoting and maybe even the warning prefix but not sure atm. I will be glad to hear your thoughts about that! :)

Also this is an initial answer to #1076 . Atm. it's based on master 7.1 but the final branch will depend how much refactoring I will do, how much confident I will be with the changes and of course on the RM. If it gets too big I will probably rebase it to master...

@krakjoe krakjoe added the Enhancement label Apr 8, 2017
@bukka bukka changed the title WIP: FPM log limit WIP: FPM logger refactoring Apr 18, 2017
@bukka bukka changed the base branch from PHP-7.1 to master Apr 18, 2017
@bukka bukka force-pushed the bukka:fpm_log_limit branch from 45773aa to 081d644 Apr 20, 2017
@bukka bukka force-pushed the bukka:fpm_log_limit branch from d2bc38d to e148420 Jun 4, 2017
@arielkung
Copy link

arielkung commented Jun 9, 2017

Hi @bukka @krakjoe when do you think this will be released? Some minor version of 7.1 or we will have to wait for 7.2?
Thanks!

@nahpeps
Copy link

nahpeps commented Jun 9, 2017

We are heavily facing this issue and would be more than happy if this could make it into a PHP release quite fast! :-)

@arielkung
Copy link

arielkung commented Jun 9, 2017

Totally the same here

@bukka
Copy link
Member Author

bukka commented Jun 11, 2017

I'm afraid it will have to wait officially till 7.3 as there is still quite a lot of work to do and my time quite limited considering that I also need to couple of things in json and openssl ext before the 7.2 deadline that is very close.

However I'm thinking about creating a separate SAPI that I would maintain separately and it would work with 7.0+. As I said it still needs quite a lot of work as there quite a few issues and needs some extended testing. I will prioritise it after the first beta for 7.2.

@blodan
Copy link

blodan commented Jun 22, 2017

This addition would be awseome!

I'm currently patching the hardcoded limit value everytime a new release is built :)

@bukka bukka force-pushed the bukka:fpm_log_limit branch from e148420 to a26eb25 Aug 3, 2017
@bukka bukka force-pushed the bukka:fpm_log_limit branch from 0304eec to 1f1d231 Aug 28, 2017
@bukka
Copy link
Member Author

bukka commented Nov 12, 2017

Just to give an update. I'm slowly working on it as part of FPMi next branch which is atm just current FPM + logging changes. Once I have it more or less stable and tested, I will update this PR.

@tomfotherby
Copy link

tomfotherby commented Nov 13, 2017

I was also thinking about another option that would allow to remove prefix [pool %s] child %d said into %s: "%s, ugly double quoting and maybe even the warning prefix but not sure atm. I will be glad to hear your thoughts about that!

This would be a huge advantage for the community running php in docker where there is a convention to log everything to SDTOUT so that log aggregation can have a common solution for all containers, not on a app-by-app basis. It's a big problem at the moment that app logs have the above mentioned prefix and warning. It would be a huge WIN to be able to remove it (or customise it).

@ingluisjimenez
Copy link

ingluisjimenez commented Feb 5, 2018

@bukka this is definitely hero's work, appreciate your take on this. Is there any way we can help you on this?

@bukka
Copy link
Member Author

bukka commented Feb 22, 2018

@ingluisjimenez thanks for offering the help. I will send an update once I'm happy with the patch and then it would be really great if you and possibly others could test it!

In the meantime the progress (currently I'm a bit slow but hope to have more time soon) can be seen here: bukka/fpmi@0.1.0...next

@dzuelke
Copy link
Contributor

dzuelke commented Apr 7, 2018

Thanks for your recent work on this, @bukka. Can I help test anything, discuss anything, verify anything, give feedback on current state?

@bukka
Copy link
Member Author

bukka commented Apr 8, 2018

@dzuelke Thanks! I think I have got most of the functionality in there (currently just in fpmi as linked above) but it needs some optimization for buffered logging - mainly reusing buffer. Also need to do a little bit more testing on my side but hopefully won't take too much time. Once it's done it would be great if you and others could test it. I will send a bit more detailed description of what has been done.

@dzuelke
Copy link
Contributor

dzuelke commented Apr 8, 2018

No problem @bukka, I can set aside work time to assist with testing etc. Just ping me when!

@bukka bukka force-pushed the bukka:fpm_log_limit branch 2 times, most recently from 53b0e28 to 30f5ff0 Jun 14, 2018
@bukka bukka changed the title WIP: FPM logger refactoring FPM logger refactoring Jun 15, 2018
@bukka
Copy link
Member Author

bukka commented Jun 15, 2018

@dzuelke , @ingluisjimenez and others: I think it's ready for testing. There are 3 new options:

  • log_limit: (global) max number of characters in a single line
  • log_buffering: (global) whether to buffer line data and write them in a single write - it's more experimental and the default is on which should do for most cases.
  • decorate_worker_output: (pool) whether to use the prefix - basically as requested in comment #2458 (comment) from @tomfotherby

A little more info about the options can be seen in the updated conf files.

; logging to a file descriptor. It means the new line character is not present
; when logging to syslog.
; Default Value: 1024
;log_level = 4096

This comment has been minimized.

Copy link
@jimbobhickville

jimbobhickville Jun 21, 2018

Shouldn't this say log_limit instead of log_level?

This comment has been minimized.

Copy link
@bukka

bukka Jun 22, 2018

Author Member

Ah good catch! Will fix it! Thanks!

This comment has been minimized.

Copy link
@bukka

bukka Jun 22, 2018

Author Member

Should be fixed now!

@bukka bukka force-pushed the bukka:fpm_log_limit branch from 30f5ff0 to 31b6c7e Jun 22, 2018
@dzuelke
Copy link
Contributor

dzuelke commented Jun 22, 2018

Is there any reason log_limit couldn't also be 0?

@dzuelke
Copy link
Contributor

dzuelke commented Jun 22, 2018

Could you also describe the idea behind log_buffering? With that on, does a single line get buffered, or multiple?

@bukka
Copy link
Member Author

bukka commented Jun 22, 2018

@dzuelke

Is there any reason log_limit couldn't also be 0?

I have been thinking about it but if buffering is used (which is default and primary mode), buffer is reused so it is kept in the memory and it could result in a really big buffer. It's kind of like zones in nginx (with an exception that it's not a shared memory [just master allocated memory] and it's the maximum that it can grow to) - you don't create an unlimited zone. If you want to allow really long lines (note that this is a limit for a single line), then you can set it high but there should be always some reasonable limit IMHO.

Could you also describe the idea behind log_buffering? With that on, does a single line get buffered, or multiple?

If it's on (default), then it's a buffering for a single line. Basically the line is then written in a single write call.

@dzuelke
Copy link
Contributor

dzuelke commented Jun 22, 2018

@bukka

If it's on (default), then it's a buffering for a single line. Basically the line is then written in a single write call.

I wonder if disabling that is useful in any way. Presumably a program could e.g. file_put_contents("php://stderr", "ohai, no newline");, but what use is that? and what happens if another log call arrives before the next portion of data, then the two lines get mixed up?

IMO writes should purely line based, and atomic (in one call). For logging to stdout/stderr that are not CLI apps (think curl download progress status), there is pretty much zero use in having unterminated lines, no?

@bukka bukka force-pushed the bukka:fpm_log_limit branch from 31b6c7e to 3e5afbf Jul 7, 2018
@php-pulls php-pulls merged commit 3e5afbf into php:master Jul 7, 2018
0 of 2 checks passed
0 of 2 checks passed
continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
@nikic
Copy link
Member

nikic commented Jul 7, 2018

@bukka Just to check, does the merged version fix the "pipe is closed" issue from above? Or is that supposed to work that way?

@bukka
Copy link
Member Author

bukka commented Jul 7, 2018

@dzuelke Ok I merged it as it's already in much better state than the current logging but will take a look on that issue with a high log limit during the beta.

About the default change, I'm not really sure as it can technically break some old syslog loggers - not sure however how likely it is. I will keep it for now...

That leak looks strange as it is in filter extension so not really sure if it's even related as these are changes mainly for master process... If you can take a look, that would be great!

@bukka
Copy link
Member Author

bukka commented Jul 7, 2018

@nikic What "pipe is closed" issue? :) If you mean the line split, that was already present and never worked properly. The only issue that I know about is the one with very large log limit that I plan to take a look during the beta. I don't think it will require complex changes - the hardest part will be to debug it...

@dzuelke
Copy link
Contributor

dzuelke commented Jul 9, 2018

Not the line split, @bukka, but note how every single file_put_contents() call from my examples above produced a ", pipe is closed" suffix. That doesn't happen with 7.2.

@bukka
Copy link
Member Author

bukka commented Jul 15, 2018

@dzuelke @nikic Ah I see! I think I got the logic wrong. That , pipe is closed should be there only if the read fails with and the res < 0 so that's a bug. Will need to fix it.

@dzuelke
Copy link
Contributor

dzuelke commented Jul 20, 2018

I took a brief look at the code to see if I could fix it in one minute or two.

Is this suffix even necessary? What actionable information does ", pipe is closed" get anyone? Can't this just be dropped entirely, @bukka and @nikic?

@bukka
Copy link
Member Author

bukka commented Jul 22, 2018

@dzuelke I think it says that either the read failed for some unusual reason (which is also logged as SYSERROR) or the pipe was closed during the processing so it means that the logged message is incomplete. I guess it might be useful to know if the logged message is just partial but not 100% sure.

@bukka
Copy link
Member Author

bukka commented Jul 29, 2018

I have got an initial fix in fpmi: bukka/fpmi@912c9f0 .

I plan to push it to fpm but would like to actually cover the closed pipe branch in test. Not really sure how to recreate it atm. I can get the pipe fd (at least I think that's the fd) like:

$pid = getmypid();
exec("lsof -p $pid | grep pipe", $output);
$fd = preg_split('/\s+/', $output[count($output) - 1])[7];

but not really sure how to close it without re-opening. Something like fclose(fopen("php://fd/$fd", "w")) can't work though. Need to think about it a bit more I guess or if you have any idea, that would be great!

@dzuelke
Copy link
Contributor

dzuelke commented Jul 30, 2018

But what's the closed pipe branch even for? When does that happen? If a message arrives, but between receiving it and emitting it, the pipe has closed? Doesn't the closing only happen if that child shuts down?

The question for me remains: in what way is the ", pipe is closed" information even useful for someone who looks at the logs? If the child crashes or gets restarted, that's already logged by FPM.

Why not just remove that whole bit?

@bukka
Copy link
Member Author

bukka commented Aug 5, 2018

@dzuelke I guess it's more an info that the log message is not complete so you know that not everything has been logged. It could also happen if pipe fd is closed which I'm not sure is even possible from PHP code but should be possible in C extension. I don't really think it's a likely scenario but as an identifier that the message is not complete, it might still be useful when using decorated messages.

I have fixed it in 2a67d62 as well as did a bit of reordering so the unlikely sys error is after the message. I think that should do.

@bukka
Copy link
Member Author

bukka commented Aug 12, 2018

@dzuelke I just tried to recreate your issue with long limit but no success.

This is my config and steps that I used:

[jakub@bukka loglimit]$ cat fpmi-log-long.conf 
[global]
error_log = /dev/stderr
log_limit = 2000000
log_buffering = yes
[unconfined]
listen = 127.0.0.1:9012
pm = dynamic
pm.max_children = 5
pm.start_servers = 1
pm.min_spare_servers = 1
pm.max_spare_servers = 3
catch_workers_output = yes
[jakub@bukka loglimit]$ php-fpm -F -y fpmi-log-long.conf 2> out.txt 
# ... now executing the file_put_contents as you provided
^C[jakub@bukka loglimit]$ 
[jakub@bukka loglimit]$ cat out.txt | awk '{ print length($0); }'
55
58
1999999
705163
46
48

Could you provide your config and steps that show the issue so I can recreate it?

@bukka
Copy link
Member Author

bukka commented Aug 19, 2018

@dzuelke I just closed https://bugs.php.net/bug.php?id=69031 . If you manage to recreate the issue above, please open a new bug and ping me in here...

@dzuelke
Copy link
Contributor

dzuelke commented Aug 22, 2018

Still happening with beta2, @bukka, yup. Even occasionally with 16384:

16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
13
16383
16383
16383
16383
16383
16383
8198
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
8205
16383
16383
16383
16383
16383
16383
16383
16383
16383
9
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
12
16383
16383
16383
16383
16383
16383
16383
7
16383
16383
16383
16383
16383
16383
16383
7
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
8203
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
22
16383
16383
16383
16383
16383
16383
16383
16383
8200
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
8203
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
16383
9875

Config:

log_level = warning
daemonize = no
log_limit = 16384
log_buffering = yes
error_log = /dev/stderr
[www]
user = nobody
listen = …
pm = static
pm.max_children = 1
catch_workers_output = yes
decorate_workers_output = no
clear_env = no
@bukka
Copy link
Member Author

bukka commented Aug 27, 2018

@dzuelke I'm not really able to recreate it but I think it might be because the message was split to multiple events in your environment. Also as discussed before, it's not ideal that the message is ended at the end of the event instead of the request. So I implemented storing the log stream in the child and it's now merged:

014b1b3

Please let me know if you can still see the issue. You can either build the latest 7.3 branch or it should be available in the next beta.

@bukka
Copy link
Member Author

bukka commented Aug 27, 2018

@dzuelke Btw. the change mentioned in the previous comment is a relatively bigger one so if you manage to test more cases like before, that would be really appreciated! Thanks!

@dzuelke
Copy link
Contributor

dzuelke commented Aug 28, 2018

@bukka by environment you mean my shell? that's not the cause; the redirect happened directly with the php-fpm call just like you also had it, and when I let FPM log straight to a file instead of stderr, I get the same result.

Testing 014b1b3 now.

@dzuelke
Copy link
Contributor

dzuelke commented Aug 28, 2018

@bukka 014b1b3 fixes it for all previously failing cases! Awesome! Want me to test anything specific around buffering on/off?

@bukka
Copy link
Member Author

bukka commented Sep 23, 2018

@dzuelke Sorry for the late replay. Nothing specific but was thinking more about some kind of regression testing that you did initially. Ideally testing mainly the buffering on as it will be the default - (off is more experimental) with checking if decoration on/off and different limits work correctly. It would be also great to keep an eye on memory usage (possibly also run it under valgrind) and see if there are any changes in perf compare to 7.2 (there shouldn't be but just in case there is some regression).

If anyone else is also interested doing some fpm log testing with latest 7.3 rc, that would be highly appreciated!

Thanks!

lcobucci added a commit to lcobucci/php that referenced this pull request Oct 10, 2018
As of PHP 7.3 we finally use STDOUT and STDERR properly in our
containers, this disables the '[pool %s] child %d said into %s: \"%s\'
format.

More info:

- php/php-src#2458
- https://bugs.php.net/bug.php?id=71880
- docker-library#207
lcobucci added a commit to lcobucci/php that referenced this pull request Oct 12, 2018
As of PHP 7.3 we finally use STDOUT and STDERR properly in our
containers, this disables the '[pool %s] child %d said into %s: \"%s\'
format.

More info:

- php/php-src#2458
- https://bugs.php.net/bug.php?id=71880
- docker-library#207
tianon added a commit to lcobucci/php that referenced this pull request Oct 16, 2018
As of PHP 7.3 we finally use STDOUT and STDERR properly in our
containers, this disables the '[pool %s] child %d said into %s: \"%s\'
format.

More info:

- php/php-src#2458
- https://bugs.php.net/bug.php?id=71880
- docker-library#207
Soulou added a commit to Scalingo/php-buildpack that referenced this pull request Feb 27, 2019
…derr)

It has been fixed in PHP 7.3: php/php-src#2458

With a custom php-fpm parameter: `decorate_workers_output`
@orlandothoeny
Copy link

orlandothoeny commented May 20, 2020

Not sure if this is related, because long log messages work, just not all the time: https://serverfault.com/q/1017932/403177
Bug or did I configure something wrong?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

You can’t perform that action at this time.