[approve ci] |
|
We had a fairly long conversation about this in the weekly Monday meeting. We talked about whether there might be a more general solution, but couldn't immediately think of one. Does the escaping assume the message has no deviations from the IETF standards? |
|
for (int i = 0; i < len; i++) {
  char c = buf[i];
  switch (c) {
Here are two alternatives to a switch statement: https://godbolt.org/z/G3fz3469v
Thanks for your suggestion.
I tried modifying escape_json to use a bitset, but found that I also need a mapping such as \b to b.
I made a modified version at
https://github.com/hnakamur/json-escape-cpp-experiment/blob/main/test.cpp
This version properly escapes the DEL 0x7f character too.
Can I replace escape_json with this one?
Standard headers with a .h extension are deprecated and replaced in C++, so it's best to use <cstring> rather than <string.h>, and std::strlen rather than strlen. https://stackoverflow.com/questions/8380805/difference-between-string-h-and-cstring .
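As a minimal sketch of that suggestion (the helper name c_string_length is hypothetical and not part of the PR), the C++ header and the std-qualified call look like this:

```cpp
#include <cstddef>
#include <cstring> // C++ counterpart of <string.h>; declares std::strlen

// Hypothetical helper for illustration only: length of a NUL-terminated string.
std::size_t
c_string_length(const char *s)
{
  return std::strlen(s);
}
```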
How about this?
namespace
{
class EscLookup
{
public:
  static const char NO_ESCAPE{'\0'};
  static const char LONG_ESCAPE{'\x01'};
  static char result(char c)
  {
    return _lu.table[static_cast<unsigned char>(c)];
  }
private:
  struct _LUT
  {
    _LUT();
    char table[1 << 8];
  };
  inline static _LUT const _lu;
};
EscLookup::_LUT::_LUT()
{
  for (unsigned i = 0; i < ' '; ++i) {
    table[i] = LONG_ESCAPE;
  }
  for (unsigned i = '\x7f'; i < sizeof(table); ++i) {
    table[i] = LONG_ESCAPE;
  }
  // Short escapes.
  //
  table['\b'] = 'b';
  table['\t'] = 't';
  table['\n'] = 'n';
  table['\f'] = 'f';
  table['\r'] = 'r';
  table['\\'] = '\\';
  table['\"'] = '\"';
}
char nibble(int nib)
{
  return nib >= 0xa ? 'a' + (nib - 0xa) : '0' + nib;
}
} // end anonymous namespace
static int
escape_json(char *dest, const char *buf, int len)
{
  int escaped_len = 0;
  for (int i = 0; i < len; i++) {
    char c = buf[i];
    char ec = EscLookup::result(c);
    if (__builtin_expect(EscLookup::NO_ESCAPE == ec, 1)) {
      if (dest) {
        *dest++ = c;
      }
      escaped_len++;
    } else if (EscLookup::LONG_ESCAPE == ec) {
      if (dest) {
        *dest++ = '\\';
        *dest++ = 'u';
        *dest++ = '0';
        *dest++ = '0';
        *dest++ = nibble(static_cast<unsigned char>(c) >> 4);
        *dest++ = nibble(c & 0x0f);
      }
      escaped_len += 6;
    } else { // Short escape.
      if (dest) {
        *dest++ = '\\';
        *dest++ = ec;
      }
      escaped_len += 2;
    }
  } // end for
  return escaped_len;
}
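The escape_json above only writes when dest is non-null but always counts the output, which suggests a two-pass use: one call to size the buffer, a second to fill it. The following is a sketch of that calling pattern under that assumption. To stay self-contained it uses a simplified stand-in, escape_sketch, which escapes only the double quote and the backslash; both escape_sketch and escape_to_string are hypothetical names, not code from the PR.

```cpp
#include <cstddef>
#include <string>

// Simplified stand-in with the same contract as the escape_json above:
// when dest is null nothing is written and only the escaped length is returned.
static int
escape_sketch(char *dest, const char *buf, int len)
{
  int escaped_len = 0;
  for (int i = 0; i < len; i++) {
    char c = buf[i];
    if (c == '"' || c == '\\') {
      if (dest) {
        *dest++ = '\\';
        *dest++ = c;
      }
      escaped_len += 2;
    } else {
      if (dest) {
        *dest++ = c;
      }
      escaped_len++;
    }
  }
  return escaped_len;
}

// Hypothetical wrapper: the first pass sizes the output, the second fills it.
std::string
escape_to_string(const std::string &in)
{
  const int len = static_cast<int>(in.size());
  const int needed = escape_sketch(nullptr, in.data(), len);
  std::string out(static_cast<std::size_t>(needed), '\0');
  escape_sketch(&out[0], in.data(), len);
  return out;
}
```

Sizing first avoids guessing a worst-case buffer of six output bytes per input byte, at the cost of scanning the input twice.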
Thanks! I added commit ab97729 before noticing your code.
I'll update it later.
I also noticed that slicing is now done after escaping, which produces incorrect strings such as \u0.
I'll look into this later too.
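To illustrate the slicing problem, here is a small sketch, not the fix that went into LogAccess.cc. The escaper escape_ctrl is a minimal hypothetical stand-in that handles only control bytes below 0x20; the two wrappers contrast the order of operations.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Minimal stand-in escaper: control bytes below 0x20 become \u00XX.
std::string
escape_ctrl(const std::string &in)
{
  std::string out;
  for (unsigned char c : in) {
    if (c < 0x20) {
      char tmp[7]; // six characters plus the terminating NUL
      std::snprintf(tmp, sizeof(tmp), "\\u%04x", static_cast<unsigned>(c));
      out += tmp;
    } else {
      out += static_cast<char>(c);
    }
  }
  return out;
}

// Escape first, then slice: the cut can land inside an escape sequence.
std::string
escape_then_slice(const std::string &in, std::size_t n)
{
  return escape_ctrl(in).substr(0, n);
}

// Slice the raw bytes first, then escape: every escape sequence stays whole.
std::string
slice_then_escape(const std::string &in, std::size_t n)
{
  return escape_ctrl(in.substr(0, n));
}
```

Note that in the second variant n counts raw input bytes, so the escaped result can be longer than n characters.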
Copied the code from #8886 (comment) into a65b8b4, crediting ywkaras as the author. Is that OK with you?
> Copied the code from #8886 (comment) into a65b8b4, crediting ywkaras as the author. Is that OK with you?
Yes, OK.
|
Thanks for taking time to discuss log escape in the meeting.
Could you tell me what you mean by "the IETF standards"? |
apache/trafficserver#8886 (comment) with table['\"'] value '\"' instead of '\\'
|
That page seems to say / should also be escaped. It doesn't say that " should be escaped as \ . It seems like that would confuse parsers of the log, if \ and " are both mapped to \. I looked through the IETF standards a little; they are not very clear. I think the function should handle bytes in the range 0x80 to 0xff. traffic_dump is only used once in a while. It would do no harm to make the function more efficient, since logging is often done for each and every transaction. |
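For reference, ECMA-404 lists eight two-character escapes: \" \\ \/ \b \f \n \r \t. The sketch below (short_escape is a hypothetical name, not the PR's function) maps each input character to a distinct second character, so a parser can always tell an escaped quote from an escaped backslash.

```cpp
// Returns the second character of the two-character JSON escape for c,
// or '\0' when c has no short escape.
char
short_escape(char c)
{
  switch (c) {
  case '"':
    return '"';
  case '\\':
    return '\\';
  case '/':
    return '/';
  case '\b':
    return 'b';
  case '\f':
    return 'f';
  case '\n':
    return 'n';
  case '\r':
    return 'r';
  case '\t':
    return 't';
  default:
    return '\0';
  }
}
```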
ECMA-404 The JSON Data Interchange Standard (linked from https://www.json.org/json-en.html) p.11
|
char *p = dest;
// int res1 = unmarshal_http_method (buf, p, len);
As well as fixing the compiler warnings, I think we need some Au test coverage of this new feature before it can be merged. |
Sorry I read the code wrong.
The goal is that someone could write a script to parse a log without having to read the source code to find the details of how the escaping works. If the escaping standard has different versions and options, we should say in the ATS documentation which version and options ATS implements. |
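As a rough sketch of what such a log-parsing script would have to do, the function below reverses the scheme discussed in this thread: the two-character escapes plus \u00XX for single bytes. It is an assumption about the final output format, not documented ATS behavior, and unescape_field and hex_value are hypothetical names.

```cpp
#include <cstddef>
#include <string>

// Value of one hex digit, or -1 if c is not a hex digit.
static int
hex_value(char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

// Decodes one escaped field into out. Returns false on malformed input,
// including \uXXXX sequences outside the \u00XX form assumed here.
bool
unescape_field(const std::string &in, std::string &out)
{
  out.clear();
  for (std::size_t i = 0; i < in.size(); ++i) {
    if (in[i] != '\\') {
      out += in[i];
      continue;
    }
    if (++i >= in.size()) return false; // dangling backslash
    switch (in[i]) {
    case 'b': out += '\b'; break;
    case 't': out += '\t'; break;
    case 'n': out += '\n'; break;
    case 'f': out += '\f'; break;
    case 'r': out += '\r'; break;
    case '\\':
    case '"':
    case '/':
      out += in[i];
      break;
    case 'u': {
      if (i + 4 >= in.size()) return false; // truncated, e.g. a sliced "\u0"
      if (in[i + 1] != '0' || in[i + 2] != '0') return false;
      const int hi = hex_value(in[i + 3]);
      const int lo = hex_value(in[i + 4]);
      if (hi < 0 || lo < 0) return false;
      out += static_cast<char>(hi * 16 + lo);
      i += 4;
      break;
    }
    default:
      return false; // unknown escape
    }
  }
  return true;
}
```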
(Copied from apache#8886 (comment)) Co-authored-by: Walt Karas <wkaras@yahooinc.com>
|
Changed to escape forward slash in 110ce54. The tasks left:
|
// Short escapes.
//
table['\b'] = 'b';
Our PR checks are now tricky to use. After you click on details, you must then again find the URL for the check in the list of URLs, and click on it. Then you can click Console Output.
With the change above, you can change these to be like:
table('\\b') = 'b';
to get rid of the warning that is treated as an error.
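GCC's -Wchar-subscripts fires when an array is indexed with a plain char, because char may be signed and a negative index would read out of bounds. One way to avoid it, sketched here with a hypothetical EscTable type rather than the PR's actual class, is to route every access through a helper that converts the index to unsigned char first:

```cpp
// Hypothetical lookup table for illustration; all entries start as '\0'.
struct EscTable {
  char table[256] = {};

  // The cast keeps the index in 0..255 even where plain char is signed.
  char &
  at(char c)
  {
    return table[static_cast<unsigned char>(c)];
  }
};

// Fills in two of the short escapes through the accessor.
EscTable
make_table()
{
  EscTable t;
  t.at('\b') = 'b';
  t.at('\n') = 'n';
  return t;
}
```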
|
Thanks for your review! |
* Add escape json for logging
* Modify escape_json to escape DEL (0x7f) correctly
* Modify escape_json for logging
  (Copied from #8886 (comment))
  Co-authored-by: Walt Karas <wkaras@yahooinc.com>
* Fix escape of doublequote in escape_json
* Escape forward slash in escape_json
* Removed a comment-outed line in LogAccess::unmarshal_http_text_json
* Fix slicing for escape_json in LogAccess.cc
* Cast char index to int to avoid char-subscripts warnings
* Add AuTest for json log escape

Co-authored-by: Walt Karas <wkaras@verizonmedia.com>
Co-authored-by: Walt Karas <wkaras@yahooinc.com>
(cherry picked from commit d187218)
|
Cherry-picked to v9.2.x |
* asf/9.2.x:
  Updated ChangeLog
  Modifying array outside bounds (apache#8806)
  Add escape json for logging (apache#8886)
  Removes remaining vestiges to the backdoor port (apache#8793)
  Add Au test for prefetch plugin in simple mode. (apache#8810)
  uri_signing plugin: Fix missing payload validation for the iss field. (apache#8901)
  Cleanup SNIConfig (apache#8892)
  Fix autest uses of File exists parameter (apache#8906)
  Do not modify Transfer-Encoding header on retry (apache#8899)
  Fix plugin stats_over_http OK reason phrase (apache#8902)
  Add AuTest for stats-over-http plugin (apache#8422)
This pull request adds an optional escape attribute in format entries in logging.yaml.
The possible values are json and none.
An example logging.yaml:
An example request:
An example log output: