@lkarsten commented on Tue Aug 01 2017
Hi.
This was originally reported in #59, but after a bit of digging I think it belongs here.
Issue: The E segment of serial logging contains binary data, possibly of uninitialized memory.
Expected: Contents of the response body, mostly HTML and other human readable responses.
Setup: libmodsecurity from v3/master (currently on 02426466) with modsecurity-nginx on abbf2c4.
While developing rules I'm using the Serial audit log format, since it is easy to tail -f and truncate. The output of the E block of an audit entry looks suspicious:
---9vOuhfZZ---A--
[28/Jul/2017:09:34:02 +0200] 150122724251.842785 127.0.0.1 52804 127.0.0.1 8085
---9vOuhfZZ---B--
GET /foo?file=/../../etc/passwd HTTP/1.1
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: HTTPie/0.9.8
X-Forwarded-For: x.x.x.x
host: example.com
---9vOuhfZZ---D--
---9vOuhfZZ---E--
<B3><C9>(<C9>ͱ<E3><E5><B2><C9>HML<B1><B3>)<C9>,<C9>I<B5>310Vp<CB>/J<CA>LIIͳ<DA>胕^@<95>&<E5><A7>T*$<A5>'<E7><E7><E4>^W<D9>*<95>gd<96><A4>*<81>
<8C>HN<CD>+I-<B2><B3><C9>0D7^A(b<A3>^O<95>^F<D9>^ET^D<E5><E5><A5>g<E6>U<E8>ESC<EA>^Y^Z<EB>^Y!+<D1>^GY^B2T^_<EA>@^@a^Qs<8F><A9>^@^@^@<FF><FF>
<FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF>
<FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF>
<FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF><FF>
[cut]
(The large number of 0xFF bytes here is because my nginx had just been restarted; later requests have more entropy.)
According to https://github.com/SpiderLabs/ModSecurity/wiki/ModSecurity-2-Data-Formats#intended-response-body-e the E block is the intended response body.
The response body for the /foo URI is the stock nginx 403 Forbidden page. I don't think the binary representation above is of that HTML page.
To me it looks like uninitialized memory is being logged. If that is the case, it can be both confusing and downright misleading to read, depending on what the allocation heap was used for last time.
@zimmerle commented on Tue Aug 01 2017
Hi @lkarsten,
There is also the possibility that this data is gzip-encoded. Can you disable gzip compression on the end server to test it?
@lkarsten commented on Wed Aug 02 2017
Hi. Thanks for replying.
It may well be gzip. Having downloaded the Forbidden page body, gzipped it, and looked at the result, the binary sequence looks familiar:
$ hexdump -C foo.gz
00000000 1f 8b 08 08 61 a2 81 59 00 03 66 6f 6f 00 b3 c9 |....a..Y..foo...|
00000010 28 c9 cd b1 e3 e5 b2 c9 48 4d 4c b1 b3 29 c9 2c |(.......HML..).,|
00000020 c9 49 b5 33 31 30 51 f0 cb 2f 51 70 cb 2f cd 4b |.I.310Q../Qp./.K|
Note the b3 c9 28 c9 sequence that starts at the end of the first line and continues on the second. It matches the first bytes of the logged E block.
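The comparison above can also be done programmatically. The following is a minimal sketch, assuming the logged E block holds the raw deflate stream without the gzip header (which is what the hexdump suggests). The helper names `deflate_payload_offset` and `inflate_raw` are hypothetical, and the header layout follows RFC 1952.

```python
import zlib

def deflate_payload_offset(data: bytes) -> int:
    """Return the offset of the deflate stream inside a gzip member (RFC 1952)."""
    if data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    flags = data[3]
    pos = 10  # fixed header: magic(2) method(1) flags(1) mtime(4) xfl(1) os(1)
    if flags & 0x04:  # FEXTRA: 2-byte little-endian length, then that many bytes
        xlen = int.from_bytes(data[pos:pos + 2], "little")
        pos += 2 + xlen
    if flags & 0x08:  # FNAME: zero-terminated original file name
        pos = data.index(b"\x00", pos) + 1
    if flags & 0x10:  # FCOMMENT: zero-terminated comment
        pos = data.index(b"\x00", pos) + 1
    if flags & 0x02:  # FHCRC: 2-byte header checksum
        pos += 2
    return pos

def inflate_raw(payload: bytes) -> bytes:
    """Decompress a raw deflate stream; bytes after the stream end are ignored."""
    return zlib.decompressobj(-15).decompress(payload)
```

For the `foo.gz` header shown in the hexdump, the FNAME flag is set and the file name is `foo`, so the payload starts at offset 14, which is where `b3 c9` appears. Feeding the logged E block bytes to `inflate_raw` would show whether they decode to the 403 page.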
I notice that the E block is not logged to the audit log now that gzip is disabled on the server and the response is human readable. SecAuditLogParts ABIJDEFHZ is set. Is this to be expected / as documented?
I attempted to request a binary file (favicon.ico, 76 bytes long) to see if that tickled it into logging the E block again, and my original issue reappeared. This is with the CRS3.0 rule set loaded, that triggers a forbidden response when "<script" is in the GET arguments.
Here is from the log:
---iplJSGEO---A--
[02/Aug/2017:12:06:15 +0200] 150166837544.813536 127.0.0.1 39146 127.0.0.1 8085
---iplJSGEO---B--
GET /favicon.ico?%3Cscript HTTP/1.1
Accept-Encoding: gzip, deflate
Accept: */*
User-Agent: HTTPie/0.9.8
host: example.com
---iplJSGEO---D--
---iplJSGEO---E--
<B3><C9>(<C9>ͱ<E3><E5><B2><C9>HML<B1><B3>)<C9>,<C9>I<B5>310Vp<CB>/J<CA>LIIͳ<DA>胕^@<95>&<E5><A7>T*$<A5>'<E7><E7><E4>^W<D9>*<95>gd<96><A4>*<81><8C>HN<CD>+I-<B2><B3><C9>0D7^A(b<A3>^O<95>^F<D9>^ET^D<E5><E5><A5>g<E6>U<E8>ESC<EA>^Y^Z<EB>^Y!+<D1>^GY^B2T^_<EA>@^@a^Qs<8F><A9>^@^@^@w<A4><C7>U^@^@' is not interesting to audit logs, relevant code(s): `^(?:5|4(?!04))'.^@`^@^@
^@^@^@^@^@^@^@^@^@^@^@^@^@g: TX.OUTBOUND_ANOMALY_SCORE to: 0^@ param "4" Wa!^@^@^@^@^@^@^@X;'z,^?^@^@X;'z,^?^@^@<A0>^A^@^@^@^@^@^@@^@^@^@^@^@^@^@Ўw<A4><C7>U^@^@<E8><C6>v<A4><C7>U^@^@
^C^@^@^@^@^@^@^@msg^@ALY_SCORE.^@:P<B0>v<A4><C7>U^@^@G^A^@^@^@^@^@^@<E1>^C^@^@^@^@^@^@0<93>v<A4><C7>U^@^@<80><A6>v<A4><C7>U^@^@ ^@^@^@^@^@^@^@0^@^@^@^@^@^@^@<90><B1>v<A4><C7>U^@^@^M^@^@^@^@^@^@^@RULE:severity^@^@^@ "Eq" wi<91>^B^@^@^@^@^@^@<80><A6>v<A4><C7>U^@^@ <CB>v<A4><C7>U^@^@980110^@g operator "^@^@^@^@^@a^B^@^@^@^@^@^@<A8>='z,^?^@^@<A8>='z,^?^@^@_score_threshold^@ator "G!^@^@^@^@^@^@^@^@<C7>v<A4><C7>U^@^@^@<AE>v<A4><C7>U^@^@P^@^@^@^@^@^@^@0^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@und_anomaly_score_threshold^@UTBO1^@^@^@^@^@^@^@<A0><C3>v<A4><C7>U^@
^@X;'z,^?^@^@4^@^@^@^@^@^@^@ ^A^@^@^@^@^@^@<B0>^@^@^@^@^@^@^@P^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ if this request is relevant to be part of the audit logs.^@ly_sca^A^@^@^@^@^@^@X;'z,^?^@^@P<A1>v<A4><C7>U^@^@TX:OUTBOUND_ANOMALY_SCORE.
Notice the TX:OUTBOUND_ANOMALY_SCORE and "RULE:severity" strings. These are not a part of the ~150 byte "403 Forbidden" response body I expected. The body is correctly received on the client side.
The logged E block length is a whole lot longer, roughly 4100 bytes in the example above.
@zimmerle commented on Mon Sep 25 2017
Hi @lkarsten,
When downloading a binary file, what do you expect to have in the response body if not the file itself?
@lkarsten commented on Mon Sep 25 2017
Hi @zimmerle ,
The main issue here, if I recall correctly, is that there seems to be extra, unrelated data logged as part of the body. My favicon.ico file does not contain "TX.OUTBOUND_ANOMALY_SCORE to: 0".
This is in my opinion misleading and an information leak.
If your question was serious, I think the least surprising result would be that I got the binary file. I'm not sure about the encoding, as the log is ASCII elsewhere. How is one supposed to write a parser for this file? A line ending may be part of the binary body. Should the parser switch from line mode to searching for a symbol, depending on which section it sees? Won't that be (isn't that already) overly complicated?
Gzipped bodies are always a pain. I think I'd expect the logged version to be unzipped, with a flag set somewhere to say that it was done. In the end this log is for humans to consume, and it should be possible to understand it without external tools (like gunzip).
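To illustrate the parsing problem, here is a rough sketch of a splitter that works on bytes rather than on text lines. The boundary shape (`---<id>---<letter>--`) is inferred only from the log samples in this thread, and `split_sections` is a hypothetical name, not part of ModSecurity.

```python
import re

# Boundary shape inferred from the samples in this thread: ---<id>---<letter>--
BOUNDARY = re.compile(rb"^---([A-Za-z0-9]+)---([A-Z])--\r?\n", re.MULTILINE)

def split_sections(entry: bytes) -> dict:
    """Split one serial audit log entry into {section letter: raw bytes}."""
    first = BOUNDARY.search(entry)
    if first is None:
        return {}
    entry_id = first.group(1)
    # Only accept boundaries that carry the same id as the first one, so a
    # body containing some other boundary-like line does not split the entry.
    marks = [m for m in BOUNDARY.finditer(entry) if m.group(1) == entry_id]
    sections = {}
    for cur, nxt in zip(marks, marks[1:] + [None]):
        end = nxt.start() if nxt else len(entry)
        sections[cur.group(2).decode("ascii")] = entry[cur.end():end]
    return sections
```

Even this is fragile: a binary body that happens to contain a line with the entry's own id would still be split in the wrong place, which is the complication described above.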
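As a sketch of the behaviour I have in mind (not how ModSecurity works today): decompress the body when it carries the gzip magic bytes, and return a flag saying whether that happened. `readable_body` is a hypothetical name, and this assumes the logger sees a complete gzip member, header included.

```python
import gzip
import zlib

def readable_body(body: bytes):
    """Return (body, was_gzipped); gunzip only when the gzip magic is present."""
    if body[:2] == b"\x1f\x8b":
        try:
            return gzip.decompress(body), True
        except (OSError, EOFError, zlib.error):
            # Looks like gzip but does not decode: keep the original bytes.
            return body, False
    return body, False
```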
@zimmerle commented on Thu Sep 28 2017
Hi @lkarsten,
I get your point. At first sight, I didn't notice the collection messages in the middle of the binary data.
I am trying to reproduce it here, without success. Would you mind sharing a bit more information about your environment? The nginx version, the ModSecurity compilation flags, and whatever else you think is relevant to replicating the problem...