Handle newlines correctly when error_log set to "syslog" #2674

pprindeville · 2017-08-07T02:49:00Z

This fixes bz#78460 in compliance with RFC-3164.

pprindeville · 2017-08-07T03:05:14Z

Requesting reviews by @weltling, @dstogov.

nikic · 2017-08-07T08:20:18Z

main/main.c

@@ -659,7 +659,40 @@ PHPAPI ZEND_COLD void php_log_err_with_severity(char *log_message, int syslog_ty
 	if (PG(error_log) != NULL) {
 #ifdef HAVE_SYSLOG_H
 		if (!strcmp(PG(error_log), "syslog")) {
-			php_syslog(syslog_type_int, "%s", log_message);
+			static char nvt[] = " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~";


Not allowing characters >= 0x80 is going to break non-English systems.

Not allowing characters >= 0x80 is going to break non-English systems.

Unfortunately, that's a basic limitation of the syslog protocol:

4.1.3 MSG Part of a syslog Packet The MSG part will fill the remainder of the syslog packet. This will usually contain some additional information of the process that generated the message, and then the text of the message. There is no ending delimiter to this part. The MSG part of the syslog packet MUST contain visible (printing) characters. The code set traditionally and most often used has also been seven-bit ASCII in an eight-bit field like that used in the PRI and HEADER parts. In this code set, the only allowable characters are the ABNF VCHAR values (%d33-126) and spaces (SP value %d32). However, no indication of the code set used within the MSG is required, nor is it expected. Other code sets MAY be used as long as the characters used in the MSG are exclusively visible characters and spaces similar to those described above. The selection of a code set used in the MSG part SHOULD be made with thoughts of the intended receiver. A message containing characters in a code set that cannot be viewed or understood by a recipient will yield no information of value to an operator or administrator looking at it.

This sounds to me like "implementation-defined", which is different from not supported. Some (most?) syslog implementations can clearly deal with UTF-8 and I don't think we should unnecessarily interfere with that (unless there is a specific problem resolved by mangling this?)

Given that syslog is a 1-way protocol, how would syslog guarantee it's using a charset that the destination understands? Particularly in the case of remote logging.

When in doubt, I fall back on a mentor's advice, in this case, Postel's Law:

Be conservative in what you do, be liberal in what you accept from others

Of course, Jon wrote that well before zero-day exploits were a common phenomenon, if I had to recoin his advice, I'd write it as:

Be conservative in what you do, and extremely paranoid in what you accept from others

And understood or not, in UTF-8 I imagine that it's not that difficult to craft a string of executable code, or a valid Xterm escape sequence to exploit a vulnerability, etc.

As for whether it's necessary or not, I'm ambivalent. There are probably tens of thousands of syslog clients.

If it's the syslog server's responsibility to sanitize what goes into log files (and it is, hence Postel's Law), then it should be fixed there (and not all those other places). After all, it only takes one malicious or exploitable client...

But I made the mistake of opening bug 74860 that has precipitated some alarmist and generally unhelpful comments.

I don't want to be accused of throwing caution to the wind and ignoring them, hence the 2nd (and now 3rd) commits.

Frankly I would be happy with just the first one.

If the 2nd and 3rd are deemed necessary, I'll squash those two together before merge.

Why do you quote an obsolete RFC 3164?

@bukka It's not obvious to me that all loggers out there implement RFC-5424. Even syslog-ng which is fairly rich doesn't default to RFC-5424 behavior, but instead to classic BSD syslog behavior (which predates even RFC-3164).

Well I really think that the default should respect the current RFC. If you want to be more strict for some non conforming loggers, then it should be optional. I think that ini setting makes most sense in this case.

Also note that you are changing the wrong place as it is not going to work for fpm. Better solution would be to change php_syslog to function and do it in there.

Also note that you are changing the wrong place as it is not going to work for fpm. Better solution would be to change php_syslog to function and do it in there.

@bukka Not sure I follow:

$ grep -r php_syslog ext/standard/php_ext_syslog.h:#include "php_syslog.h" ext/standard/syslog.c: php_syslog(priority, "%s", message); ext/standard/mail.c:#include "php_syslog.h" ext/standard/mail.c: php_syslog(LOG_NOTICE, "%s", message); sapi/fpm/fpm/fpm_conf.c:#include "php_syslog.h" sapi/fpm/fpm/fpm_stdio.c:#include "php_syslog.h" sapi/fpm/fpm/zlog.c:#include "php_syslog.h" sapi/fpm/fpm/zlog.c: php_syslog(syslog_priorities[zlog_level], "%s", buf); main/php_syslog.h:#define php_syslog std_syslog main/php_syslog.h:#ifndef php_syslog main/php_syslog.h:#define php_syslog syslog main/php.h:#include "php_syslog.h" main/main.c:#include "php_syslog.h" main/main.c: php_syslog(syslog_type_int, "%.*s", (int)valid_len, msg); main/main.c: php_syslog(syslog_type_int, "%.*s", (int)buf.len, buf.c); $

php_syslog() is a macro pointing at either syslog() or else std_syslog(), and both are externals.

I know. That's why I said "better solution would be to change it to the function". :) That function would then call syslog or std_syslog for now. Going forward we could add support for remote logging which means reimplementing that completely in a simillar way as nginx does for example. But we could still do it inside that function so such separation would be useful.

Combining both facts, bytes above 0x7f are fine, and bytes below 0x20 should be filtered. Especially now that we know UTF-8 is explicitly supported by the newer RFC :)

Done, as a complete rewrite.

Better solution would be to change php_syslog to function and do it in there.

Done.

Also added "syslog.highbit" (bool) to control whether characters above 0x7f are allowed or not.

If someone wants to add (subsequently) utf-3 compressed parsing, they can, but it's beyond me.

nikic · 2017-08-07T08:23:37Z

main/main.c

+					char *buf, *ptr;
+					size_t buflen;
+
+					FILE *fp = open_memstream(&buf, &buflen);


This is probably non-portable. I'd suggest using smart_string instead.

By the way, I have to back-port this patch to 7.1.8, which doesn't have zend_printf_to_smart_string()... what's the workaround?

nikic · 2017-08-07T08:24:25Z

Also ping @bukka, as iirc you were working on changes regarding syslog.

pprindeville · 2017-08-07T21:03:42Z

main/main.c

@@ -659,7 +663,47 @@ PHPAPI ZEND_COLD void php_log_err_with_severity(char *log_message, int syslog_ty
 	if (PG(error_log) != NULL) {
 #ifdef HAVE_SYSLOG_H
 		if (!strcmp(PG(error_log), "syslog")) {
-			php_syslog(syslog_type_int, "%s", log_message);
+			static char nvt[] = " !\"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`abcdefghijklmnopqrstuvwxyz{|}~";


This is probably non-portable. I'd suggest using smart_string instead.

@nikic Fixed to use smart_string... though I wonder why nothing else in main/main.c was previously using smart_string's.

KalleZ · 2017-08-10T03:08:27Z

configure.ac

@@ -1443,7 +1443,7 @@ PHP_ADD_SOURCES(main, main.c snprintf.c spprintf.c php_sprintf.c \
       php_ini.c SAPI.c rfc1867.c php_content_types.c strlcpy.c \
       strlcat.c explicit_bzero.c mergesort.c reentrancy.c php_variables.c php_ticks.c \
       network.c php_open_temporary_file.c \
-       output.c getopt.c, -DZEND_ENABLE_STATIC_TSRMLS_CACHE=1)
+       output.c getopt.c php_syslog.c, -DZEND_ENABLE_STATIC_TSRMLS_CACHE=1)


php_syslog.c should also be added to the Windows build system in win32/build/config.w32

pprindeville · 2017-08-10T16:36:36Z

@weltling Done as a rewrite. What do you think?

weltling

IMO the further patch looks closer to what was discussed till now. Still it would be good to hear more opinions, especially about the way to handle the prohibited chars.

Thanks.

weltling · 2017-08-11T08:27:12Z

main/php_syslog.h

-/*
- * The SCO OpenServer 5 Development System (not the UDK)
- * defines syslog to std_syslog.
- */


This comment should be kept somewhere.

weltling · 2017-08-11T08:35:50Z

main/php_syslog.h

-
-#undef syslog
+BEGIN_EXTERN_C()
+PHPAPI void php_syslog(int, const char *message);


The signature of php_syslog should stay compatible to syslog() as per man page. The simplified variant can wrap a va function, please see how zend_error() is constructed.

Done. Although at this point, the string is being traversed 4 times by my count... so not very efficient.

Thanks. It's really about the API compatibility, as the even being a macro the symbol was available through the header. A simple variant could still be kept by reorganizing a bit.

weltling · 2017-08-11T11:37:04Z

main/php_syslog.c

+
+		if (((' ' <= c) && (c <= '~')) || (PG(syslog_highbit) && (c >= 0x80)))
+			smart_string_appendc(&buf, c);
+		else if (c == '\n') {


Some other new line variants possibly make sense, fe if CRLF appears in a message?

At some point one has to draw a line and say "the way this code is being called is broken" and change the calling code, rather than try to accommodate every possible broken caller.

I wrote the original fix to address the issue of the default (built-in) exception handler calling error_log() with a multiline message (i.e. a stack trace). Ideally, it should have had multiple calls to error_log(), one per line.

Maybe that was my mistake: I should have just fixed the exception handler to break up its messages...

@pprindeville yeah, so you've decided the way to solve it a good opensource way, so then everyone could profit from your work. Thanks you for that. Then it turned out there's RFC defining the exact thing, so what is the issue? A discussion usually helps to figure out the best solution and avoid common bugs.

Well, at least it doesn't affect the output this way till now. So identifiers can be output, which wouldn't be true for arbitrary content with this condition.

weltling · 2017-08-11T11:40:28Z

main/main.c

@@ -611,6 +611,7 @@ PHP_INI_BEGIN()
 #ifdef PHP_WIN32
 	STD_PHP_INI_BOOLEAN("windows.show_crt_warning",		"0",		PHP_INI_ALL,		OnUpdateBool,			windows_show_crt_warning,			php_core_globals,	core_globals)
 #endif
+	STD_PHP_INI_ENTRY("syslog.highbit",			"0",		 PHP_INI_SYSTEM,	OnUpdateBool,		syslog_highbit,		php_core_globals, 		core_globals)


I'd suggest it should not be 7-bit by default, it is the current behavior. Maybe also name it a bit clearer, like "syslog_ascii_only" or alike.

Well, 7-bit is about the only thing we know for certain isn't exploitable.

What's the point of making the code more secure, only to disable the extra checking?

Anyway, the name has been changed.

Because it's a purely theoretical concern with a very, very tenuous connection to security? If you want to go into pointless hypotheticals, why not consider EBCDIC systems where LF is not even mapped to the same code point as ASCII? If the spec does not specify an encoding, we have to make reasonable assumptions. The reasonable assumptions are a) forbid ASCII control chars and b) handle ASCII LF.

IMHO even having an ini option is completely unnecessary, but it we do add one, it should certainly allow non-ASCII codepoints by default.

@pprindeville so what is the other security aspect besides escape sequences? On the other hand, not allowing UTF-8 to pass nowadays is weird.

@nikic I'm neither a PHP proponent nor opponent. I'll use whatever I'm paid to use: C, C++, Perl, Python, Bash, PHP, Go, whatever...

I don't have a dog in this fight.

What I do know, though, is reading through HTTP server logs everything, over 60% of the attempted exploits are PHP. That's probably something that most of the people reading this message don't like to be reminded of.

I don't know if that's because the environment is inherently less secure or because, for whatever reason, people who code in PHP tend to be less aware of system exploits.

Either way, if I leave my fingerprints on PHP, it's not going to be in the form of an added exploit.

I've put instructions into the default php.ini that explain this control and how to open up non-ascii logging if one is willing to accept any consequences which might come from that.

As for EBCDIC, if you're running in an EBCDIC environment, you have to compile in an EBCDIC environment so that strings and character literals are encoded correctly, and that the correct table for <ctype.h> is used.

IETF standards are woefully unaware of non-Latin languages, I used to title at that windmill in the 90's when HTML was being finalized. If you get bored, see the postings of philipp@res.enst.fr on www-html and www-talk mailing lists.

I'm not an English language bigot: I just won't compromise security for convenience, even if it's a major inconvenience to use (say) Cyrillic or Biao Gao with IETF protocols.

If the standards are inadequate, fix them. Don't willingly write broken code that violates them.

In addition to the previous logger experiment, i now went a bit further and checked same with syslog-ng and dsyslog. Previously it was rsyslog, as it turned out. Anyway not sure there are more syslog daemons to get on Jessie by default and i've no other VMs to break ATM. At least what turned out is, that the behavior of all of them is same as shown before - an UTF-8 string lands in the syslog. Furthermore, seems it doesn't care whether BOM is present or not.

Regarding BOM - it might be more clear why the new RFC requires it for UTF_8 and why it requires UTF-8 to be the shortest form http://www.unicode.org/reports/tr36/#UTF-8_Exploit . To answer the earlier question - there is absolutely nothing in the core that deals with BOM. And i wouldn't even try to understand, what the given syslog daemon does in this case. Fe doing a write of the ill formed sequence 41 C2 3E 42 as from the UNICODE-TR36, it lands in the log file without any complain whether BOM is used or not. Apparently there is a security aspect. As per the RFC, the implementation does not have to interpret UTF-8 without BOM, but also it wouldn't care about an arbitrary string contents.

@pprindeville The topic is complex, we see. In PHP security and backward compatibility have both high priority. In this case, it seems that the backward compatibility breach were a far overreaction and thus not justified. It should really be thought through carefully, perhaps involving mbstring and default_charset or whatsoever measures. It is a fun topic and it is good that we discuss it so it reveals how complex it is and there indeed can be issues. At least we see now, that the best possible solution is not to achieve with just extending the intended fix a bit.

@pprindeville please let's not be hasty. I would suggest for now to reduce to an unavoidable and backward compatible minimum, which is

split by newline

block c < 32

syslog.ascii_only = 0 by default

This solves the initial issue and puts in some security bits. Furthermore, it seems to be the solution with the most consent so far. Anyway, having this the PR can be completed and anything beyond can be left for further consideration.

Thanks.

This solves the initial issue and puts in some security bits.

@weltling Well, it puts them in but largely disables them. Anyway, I don't care. All I ever wanted was splitting stack traces in the exception handler so they wouldn't look like:

Aug 11 15:19:01 ubuntu16 php: PHP Fatal error: Uncaught Error: Class 'NoSuchClass' not found in /home/philipp/bertram/php-src/test.php:4#012Stack trace:#012#0 /home/philipp/bertram/php-src/test.php(15): foo()#012#1 {main}#012 thrown in /home/philipp/bertram/php-src/test.php on line 4

which is what they currently look like. I've spent 6x more time on this issue than initially planned.

We've still not decided if DEL should ever get written in the clear or not. Nothing I've seen suggests that 0xf7 should be allowed.

Also, if "backward compatibility" was the vulnerability, don't you want to deliberately break that?

syslog.ascii_only = 0 by default

Okay, changed the default and updated the text in the default php.ini.

I've spent 6x more time on this issue than initially planned.

Also any participants here spend time as well :) Some cases are like this, where a big thing turns out from some relatively simple. It's the normal process we follow for reviews anyway.

Also, if "backward compatibility" was the vulnerability, don't you want to deliberately break that?

Say it plain, yes. There is this document https://wiki.php.net/security for some time already, that we use to evaluate the issue impacts. In the security regard - the known XTERM vulnerability is fixed with your patch. The other case with the UTF-8 exploit - i might be wrong, however when looking at the classifications, it's probably a low one as for me. Quite a few assumptions need to come true on the PHP side and for a vulnerable log viewer software the data source probably doesn't matter.

With 0x7f - from the RFC it is invalid and also doesn't seem a part of any code point in UTF-8. Lets see for more comments yet, otherwise the state seems quite close to finish. I'm also going for some tests next days.

Thanks.

@weltling Looks like Big5 and SJIS also skip over 0x7f, so no problem there either.

weltling · 2017-08-11T12:08:00Z

main/php_syslog.c

+			break;
+		}
+
+		if (((' ' <= c) && (c <= '~')) || (PG(syslog_highbit) && (c >= 0x80)))


I'm not sure about the >= 0x80 condition. Some encodings might still be broken by this. Fe BIG5 or other CJK. I'd suggest implementing conditions/actions like this

c < 32 and c is not a newline => filter out or replace

c has to be 7-bit only and if c > 126 then filter out or replace

otherwise pass as is

Anything more complicated than this, and we need to implement a utf-8 compressed format parser. Maybe that's the way to go, but can we do it as a follow-on enhancement?

I'm trying to fix a vulnerability, and keep the scope of the change limited.

The bigger the change, the more likelihood of injecting another vulnerability (or multiple ones) by too big of an untested change.

@weltling >= 0x80 is how PHP handles this elsewhere, e.g. in identifiers. (IIRC this was changed in 7.1, previously it was >= 0x7f).

@pprindeville the vulnerability linked was about escape sequences, so c< 32 is the only condition for that. Or do i miss yet more vulnerabilities? Also, how it makes the handling more complicated, c < 32 needs to be replaced anyway and otherwise there is no reason to parse anything except the key points as per RFC.

@nikic But we still supported other encodings than single byte and UTF-8? Fe check https://en.wikipedia.org/wiki/Big5#Organization where the second byte can be in the range of 0xa1-0xfe, so it'd result in a broken output. At least in some CJK encodnigs.

@weltling Not sure I get what you mean. 0xa1 - 0xfe will still pass the >= 0x80 check. As far as I'm aware all encodings in practical use (discounting encodings with multi-byte code units and EBCDIC, something we're assuming here anyway) are either ASCII-compatible or quasi-compatible (e.g. SJIS differs in the mapping of some characters in the ASCII-plane, but not in any way that is relevant in this context) and only do anything special for code-units >= 0x80, so allowing that range is sufficient.

@nikic ups, my c/p mistake. There are two ranges for the second byte there - 0x40-0x7e and 0xa1-0xfe, at least as per doc linked before.

@weltling Ah, I see. Would still be allowed by these checks. The ' '..'~' check is 0x20..0x7e, so that includes the range. (It would probably make sense to explicitly use the hex values here.)

@nikic indeed, thanks!

the vulnerability linked was about escape sequences, so c< 32 is the only condition for that. Or do i miss yet more vulnerabilities?

@weltling There are syslog implementations out there which don't implement RFC-5424, and others still which implement it but disable it by default, falling back on RFC-3164, which states:

4.1.3 MSG Part of a syslog Packet The MSG part will fill the remainder of the syslog packet. This will usually contain some additional information of the process that generated the message, and then the text of the message. There is no ending delimiter to this part. The MSG part of the syslog packet MUST contain visible (printing) characters. The code set traditionally and most often used has also been seven-bit ASCII in an eight-bit field like that used in the PRI and HEADER parts. In this code set, the only allowable characters are the ABNF VCHAR values (%d33-126) and spaces (SP value %d32). However, no indication of the code set used within the MSG is required, nor is it expected. Other code sets MAY be used as long as the characters used in the MSG are exclusively visible characters and spaces similar to those described above. The selection of a code set used in the MSG part SHOULD be made with thoughts of the intended receiver. A message containing characters in a code set that cannot be viewed or understood by a recipient will yield no information of value to an operator or administrator looking at it.

The only actionable part of that paragraph is, "In this code set, the only allowable characters are the ABNF VCHAR values (%d33-126) and spaces (SP value %d32)."

The rest is unactionable. For example, "However, no indication of the code set used within the MSG is required, nor is it expected. Other code sets MAY be used as long as the characters used in the MSG are exclusively visible characters and spaces similar to those described above."

If you don't know what the code set is because it's not explicitly called out, then how do you know if a character is "printing" or not in that code set? You don't. You can't.

pprindeville · 2017-08-15T17:12:42Z

Any chance of this going in this week?

bukka · 2017-08-15T19:49:12Z

php.ini-development

@@ -517,6 +517,15 @@ report_memleaks = On
 ; This setting is on by default.
 ;report_zend_debug = 0

+; Set this to disable non-ASCII logging (more secure).


The more secure comment is very misleading and it just shows on broken loggers that can't handle UTF-8 for example which is a minority IMHO. It should be really clear that it is a work around for broken loggers.

Btw. there is no security issue in PHP. This is just a problem of the logger as RFC does not prohibit any characters.

@bukka The loggers aren't the issue, so much as terminal emulators that allow themselves to be subverted.

bukka · 2017-08-15T19:54:41Z

main/main.c

@@ -611,6 +611,7 @@ PHP_INI_BEGIN()
 #ifdef PHP_WIN32
 	STD_PHP_INI_BOOLEAN("windows.show_crt_warning",		"0",		PHP_INI_ALL,		OnUpdateBool,			windows_show_crt_warning,			php_core_globals,	core_globals)
 #endif
+	STD_PHP_INI_ENTRY("syslog.ascii_only",			"0",		 PHP_INI_SYSTEM,	OnUpdateBool,		syslog_ascii_only,		php_core_globals, 		core_globals)


I would prefer to see something more generic like syslog.encoding to allow us also properly support UTF-8 - add BOM if selected. We also have functions valildation for UTF-8 and some other encoding in html.c in standard ext so it might be also good idea to use. This is just an initial thought so needs probably more thinking and it's open for discussion ofc. :)

This is turning into a bike-shedding exercise.

The original question was trivial: how to handle strings with embedded newlines?

Splitting into two issues and two PR's.

See bug #75077 for the issue related to non-NVT ASCII.

Agreed, let's not drag more changes into this PR.

Well my point is that we introduce syslog.ascii_only and then to check utf-8, we will need to add syslog.utf8_only or something like that. That's not really flexible IMHO. I would prefer a single ini for that where you can specify required encoding.

There was a discussion some days ago in this thread #2674 (comment) where it turned out it the encoding is not really respected, specifically after this experiment is what was to see here #2674 (comment). If you mean just about checking the encoding - default_charset might suffice for that already, so no reason to introduce a complication. For encoding conversion it's probably even more complicated to start here. In general, the patch as is seemed to be OK, both for fixing the initial newline splitting issue and securing some bits, the further handling should be a matter of a presumably more complicated implementation. Since it targets master, it were OK to have this patch at least for the start, then looking whether a complicated implementation could ever make it.

Thanks.

bukka · 2017-08-15T19:55:52Z

main/php_syslog.c

+	va_end(args);
+
+	for (ptr = fbuf.c; ; ++ptr) {
+		c = *ptr;


it could be inside the for I guess :)

pprindeville · 2017-08-15T21:05:38Z

Okay, now can we proceed with a merge?

pprindeville · 2017-08-15T21:09:37Z

Btw. there is no security issue in PHP. This is just a problem of the loggers if there are some security issues as RFC does not prohibit anything.

@bukka "Defense in depth" strategy entails trying to defeat an attack in more than one place. That's what we're doing here.

No one said that the vulnerability is in PHP: it's not. The vulnerability is in the Xterm (and other terminal emulators) which don't handle device control strings (DCS's) in a sane way.

But we can try to prevent PHP from being the vector by which the vulnerability is injected.

weltling · 2017-08-15T21:14:06Z

Actually i was going to do some tests by the EOW. The last point was to exclude also 0x7f, but now the patch is completely different and there's yet another PR? What is the plan now, is this PR superseeded?

Thanks.

pprindeville · 2017-08-15T21:21:59Z

What is the plan now, is this PR superseeded?

Not at all. PR #2689 is layered on this one. You can do your testing there, and I'll rebase the PR to drop the first commit as soon as it hits master.

weltling · 2017-08-15T21:33:58Z

@pprindeville hmm, not sure to follow. #2689 is what this PR was before. Why it is split out? Or if i missed something in the correspondence, please point me there.

Thanks.

pprindeville · 2017-08-15T21:42:14Z

Why it is split out? Or if i missed something in the correspondence, please point me there.

Because there are two problems: (1) how to handle strings with newlines in them, (2) how to handle non-NVT ASCII.

(1) is a trivial issue easily solved by the original first commit, with hardly any discussion required.
(2) is a more complex issue, which seems to have still more discussion required to arrive at a consensus.

This PR reverts to the original fix (and the original commit) which handles just issue (1).

The other PR is based on this PR plus the two additional commits which handle (a) strict NVT-ASCII when strict NVT-ASCII is all that is explicitly permitted/configured, and (b) either escaping non NVT-ASCII characters into safe octal notation--which I'm figuring won't be the ultimate solution-- or (b.2) translating the string into utf-8 compressed format with a BOM prefix... which is the only explicitly and concretely defined of handling anything other than NVT-ASCII.

I'll revise PR #2689 to (b.2) once it's agreed that this is the way to proceed and further, someone shows me what the libraries are for translating to utf8 and for validating utf8 correctness (in case the app is logging utf8 natively).

pprindeville · 2017-08-15T21:46:53Z

That's not really flexible IMHO. I would prefer a single ini for that where you can specify required encoding.

@bukka That's fine. Let's discuss that on PR #2689 since that's where this fix will end up.

weltling · 2017-08-15T21:48:29Z

@pprindeville but the patch didn't touch any encoding, did it? It fixed the initial issue and the terminal issue as a bonus. Anything else was passed through for now, no BOM, etc. :) The only point came from @bukka just now about the ini naming, either it could be fixed as well or later in master, since that's what the patch was targeting. Anyway, lets see it this way now.

Thanks.

pprindeville · 2017-08-15T21:53:46Z

but the patch didn't touch any encoding, did it? It fixed the initial issue and the terminal issue as a bonus. Anything else was passed through for now, no BOM, etc. :) The only point came from @bukka just now about the ini naming, either it could be fixed as well or later in master, since that's what the patch was targeting. Anyway, lets see it this way now.

@welting If you mean "was it codeset aware", then the answer was "no". It assumed everything was (or should be) NVT-ASCII.

We could, in PR #2689, look at the default_charset as you say, and convert the string from that to utf-8 and append a BOM. Then we'd be honoring Section 6.4 in the only concrete terms it has.

weltling · 2017-08-15T22:06:29Z

@pprindeville converting everything to UTF-8 might be not that doable - it'll likely require some heavy API and fe ext/mbstring is not an obligatory dependency. There are yet more questions than answers, as for me. I can only applaud you still strive a wider solution on this, anyway. I'm going for tests closer to the EOW, so if everything ok at least the newline part can be handled already.

Thanks.

pprindeville · 2017-08-18T00:43:59Z

@weltling Hi. Any news?

weltling · 2017-08-19T14:01:40Z

Merged with master, adapted for the Windows side.

Thanks.

pprindeville · 2017-08-19T17:08:59Z

Merged with master, adapted for the Windows side.

Were additional changes required?

See: php/php-src#2674 Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com>

pprindeville · 2017-08-19T17:40:48Z

Anyway, thanks!

weltling · 2017-08-19T17:48:55Z

@pprindeville yeah, splitting doesn't seem to be suitable for the event log. It's record based, not line based. The parts about sanitizing might still make sense, but the MSDN docs don't mention much on that.

Thanks.

See: php/php-src#2674 Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com>

pprindeville force-pushed the bug/74860 branch from 842060b to 531b03f Compare August 7, 2017 02:54

laruence requested a review from weltling August 7, 2017 05:42

krakjoe added Bug Waiting on Review labels Aug 7, 2017

nikic reviewed Aug 7, 2017

View reviewed changes

pprindeville commented Aug 7, 2017

View reviewed changes

pprindeville force-pushed the bug/74860 branch 2 times, most recently from ef8b017 to 14fef0d Compare August 10, 2017 03:00

KalleZ reviewed Aug 10, 2017

View reviewed changes

pprindeville force-pushed the bug/74860 branch 6 times, most recently from d462649 to dd552c2 Compare August 10, 2017 04:57

weltling reviewed Aug 11, 2017

View reviewed changes

pprindeville force-pushed the bug/74860 branch 3 times, most recently from a594f73 to 715a831 Compare August 11, 2017 16:06

Turn php_syslog() into wrapper for syslog and split lines

f11d40e

pprindeville force-pushed the bug/74860 branch 2 times, most recently from f2ec1ba to 3dd32e4 Compare August 11, 2017 21:32

bukka reviewed Aug 15, 2017

View reviewed changes

main/php_syslog.c

va_end(args);

for (ptr = fbuf.c; ; ++ptr) {

c = *ptr;

Copy link

Member

bukka Aug 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it could be inside the for I guess :)

pprindeville force-pushed the bug/74860 branch from 3dd32e4 to f11d40e Compare August 15, 2017 20:52

pprindeville changed the title ~~Handle newlines and illegal characters safely when error_log set to "syslog"~~ Handle newlines correctly when error_log set to "syslog" Aug 15, 2017

pprindeville mentioned this pull request Aug 15, 2017

handle non-NVT ASCII messages per the RFC's #2689

Closed

nikic assigned weltling Aug 15, 2017

weltling closed this Aug 19, 2017

pprindeville mentioned this pull request Aug 19, 2017

php7: error_log shouldn't squirt multi-line messages into syslog() openwrt/packages#4550

Merged

pprindeville added a commit to pprindeville/packages that referenced this pull request Aug 19, 2017

php7: backport upstream fix for multiline logging

f2641e9

See: php/php-src#2674 Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com>

pprindeville added a commit to pprindeville/packages that referenced this pull request Aug 19, 2017

php7: backport upstream fix for multiline logging

8110e2f

See: php/php-src#2674 Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com>

pprindeville added a commit to pprindeville/packages that referenced this pull request Aug 19, 2017

php7: backport upstream fix for multiline logging

3012a82

See: php/php-src#2674 Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com>

mkrkn pushed a commit to mkrkn/packages that referenced this pull request Aug 23, 2017

php7: backport upstream fix for multiline logging

7fc0d7d

See: php/php-src#2674 Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com>

salzmdan pushed a commit to salzmdan/packages that referenced this pull request Jan 8, 2018

php7: backport upstream fix for multiline logging

e4f99e2

See: php/php-src#2674 Signed-off-by: Philip Prindeville <philipp@redfish-solutions.com>

Handle newlines correctly when error_log set to "syslog" #2674

Handle newlines correctly when error_log set to "syslog" #2674

Uh oh!

Conversation

pprindeville commented Aug 7, 2017

Uh oh!

pprindeville commented Aug 7, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pprindeville Aug 7, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pprindeville Aug 7, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nikic commented Aug 7, 2017

Uh oh!

pprindeville Aug 7, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pprindeville commented Aug 10, 2017

Uh oh!

weltling left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

weltling Aug 11, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

pprindeville Aug 11, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nikic Aug 11, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

pprindeville Aug 7, 2017 •

edited

Loading

pprindeville Aug 7, 2017 •

edited

Loading

pprindeville Aug 7, 2017 •

edited

Loading

weltling left a comment •

edited

Loading

weltling Aug 11, 2017 •

edited

Loading

pprindeville Aug 11, 2017 •

edited

Loading

nikic Aug 11, 2017 •

edited

Loading

pprindeville Aug 11, 2017 •

edited

Loading

weltling Aug 11, 2017 •

edited

Loading

nikic Aug 11, 2017 •

edited

Loading