Parallel logging interference (I assume) #379
Comments
it looks like the end of a message has been completely discarded (which includes the final newline)
Indeed it does. Which makes it even stranger.
This is very weird assuming you use the StreamHandler or a derivative, because it writes to disk with a single fwrite() call to a file that's been opened in append mode, so the OS IO should take care of doing this atomically and not mix things in like this. Is it something that occurs regularly?
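For reference, a minimal sketch of the write path described above, assuming a plain append-mode stream rather than Monolog's actual internals; the file name and helper name are placeholders:

```php
<?php
// Sketch (not Monolog's actual code) of the append-mode write path:
// the file is opened once in append mode and each record goes out in
// a single fwrite() call, relying on the OS to append it atomically.
$stream = fopen('app.log', 'a'); // opened once, in append mode

function writeRecord($stream, $record)
{
    // one fwrite() per formatted log record, newline included
    fwrite($stream, $record . "\n");
}

writeRecord($stream, '[2014-07-01 12:00:00] app.INFO: example message');
fclose($stream);
```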
@Seldaek it is not mixed. It is only that one of the log messages is incomplete. The end is missing (which includes the newline char)
Yup true, but it still sounds like an OS fail, not something we can help with.
I'm on Debian 7.0 (with a vanilla Linux 3.15.1 kernel, ext4), by the way. I also find it strange that (seemingly) a part of the message is discarded, extremely curious. Such a bug in the fs driver would have been noticed, I bet.
One would hope so, yes, but then again bugs that occur rarely under highly concurrent load can be hard to spot. Anyway, I don't really want to blame anyone here, but I just don't see what I can do about it unfortunately, given the information at hand and the fact that the code is (AFAIK) correct.
So yeah, locking is needed for concurrent access. |
@bobrik the example you posted above looks shorter than the 4096 bytes which should then be written as an atomic operation? Maybe the buffer in your server is shorter though.
Yup, also locking is kinda impossible to do IMO. You don't want to throw an exception OR block a request until it can acquire a lock on a log file. And if you only release it at the end, you end up with a concurrency limit of pretty much 1 request at a time. Quite the performance killer.
@Seldaek the concurrency limit is 1 fwrite at a time per log file, and a write is usually pretty fast. If it's slow, the problem is somewhere else.
Yeah, if you flock/fwrite/unlock for every write, but I assume that's a ton more IO to solve a problem that isn't exactly proven to exist with short line lengths (which is mostly what gets logged). I mean, in the 3 years that monolog has been used in production on many sites, this is the first time someone has spotted this problem. I don't know if it's because people weren't looking or because it didn't happen, but I don't want to penalize performance and "overreact" here if there is no issue. Doing optional locking would for sure be an option (and a good thing to have) though.
Just measured:

<?php
$fd = fopen('test.log', 'a');
$s = microtime(true);
if (flock($fd, LOCK_EX)) {
    fwrite($fd, str_repeat('wooohooo', 500) . "\n");
    flock($fd, LOCK_UN);
    echo sprintf("%.10f\n", (microtime(true) - $s) * 1000);
}

Results in microseconds (not milliseconds even!):
Doesn't look like a performance killer to me. Light travels 200 meters in optical fiber in 1 microsecond, by the way.
OK, looks acceptable indeed, though I'd like to test some more. Thanks for the quick test!
It's possible that this has something to do with newer kernel versions (3.14+) and their fs drivers using more concurrency, as I had never seen this before.
@staabm I think the performance difference gets bigger with more concurrency and when the system is under heavy (IO) load.
@staabm I'd love to see evidence of that. Why should Linux make it slow when it could be fast? I don't see any reason. I did some more testing just to convince you.
Write 1000000 messages to the file with 1 process, with locking:
Write 100000 messages from each of 10 concurrent processes, with locking:
Write 1000000 messages to the file with 1 process, without locking:
Write 100000 messages from each of 10 concurrent processes, without locking:
In all 4 cases the disk was utilized at 100%. As you can imagine, with 10 concurrent writers
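For anyone who wants to reproduce this, a rough sketch of the kind of benchmark being described; it is not the original script, and the file name, message size, iteration count and the --lock flag are made up:

```php
<?php
// Hypothetical benchmark: append N messages, optionally wrapping each
// fwrite() in flock()/LOCK_UN, and report the elapsed wall-clock time.
$useLocking = in_array('--lock', $argv, true);
$fd = fopen('bench.log', 'a');
$line = str_repeat('wooohooo', 500) . "\n";

$start = microtime(true);
for ($i = 0; $i < 100000; $i++) {
    if ($useLocking) {
        flock($fd, LOCK_EX);
    }
    fwrite($fd, $line);
    if ($useLocking) {
        flock($fd, LOCK_UN);
    }
}
fclose($fd);
printf("%.3f seconds\n", microtime(true) - $start);
```

Running ten copies of this concurrently (e.g. from a shell loop) approximates the 10-writer case above.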
@bobrik thanks! Would you or @aktau want to work on a PR for the StreamHandler's write method? The one question left is what to do in case the locking fails. I would tend towards ignoring failures and just calling fwrite anyway. It sounds a bit dirty, but considering all the platforms and potential use cases, and since flock() blocks until a lock can be acquired or it fails outright, I think ignoring is the right thing to do to keep this working everywhere.
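Something along these lines, a sketch of the proposed behaviour rather than the actual patch, with a made-up function name:

```php
<?php
// Try to take an exclusive lock, but write even if locking fails, so
// logging keeps working on filesystems where flock() is unsupported.
function lockedWrite($stream, $record)
{
    $locked = @flock($stream, LOCK_EX); // blocks until acquired, or fails
    fwrite($stream, $record);           // write either way
    if ($locked) {
        flock($stream, LOCK_UN);
    }
}
```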
Good research, guys. I have a bit too little time to go through the PR waves at the moment; a bit too occupied with work and neovim.
@aktau You can now create a StreamHandler with the last constructor arg (useLocking) set to true to make it lock before every write. |
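Usage would look roughly like this, assuming the Monolog 1.x StreamHandler signature ($stream, $level, $bubble, $filePermission, $useLocking); double-check the signature for the version you are on:

```php
<?php
require 'vendor/autoload.php';

use Monolog\Logger;
use Monolog\Handler\StreamHandler;

// Fifth constructor argument enables flock() around every write.
$handler = new StreamHandler('/var/log/app.log', Logger::DEBUG, true, null, true);

$logger = new Logger('app');
$logger->pushHandler($handler);
$logger->info('written under an exclusive lock');
```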
Hey! I've been noticing something lately:
As you can see, the log lines interfere, which of course messes with my automated log parsing as well.
This is all from exactly 1 webapp that's firing requests at the server, some of those requests may indeed be spaced very closely in time. I reckon the problem might be worse when you have a lot of concurrent web users. I have 2 PHP-FPM processes running.
What information could I provide to help alleviate the issue?