New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
logmsg: sanitize sdata keys #1650
Conversation
Build SUCCESS, the tests were executed on test branch: master and test suite: functions |
ccf8776
to
4146ac9
Compare
Build SUCCESS, the tests were executed on test branch: master and test suite: functions |
lib/logmsg/logmsg.c
Outdated
* The goal is to create syntactically valid structured data fields. */ | ||
const guchar *ustr = (const guchar *) sstr; | ||
|
||
for (gint i = 0; i < len; i++) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please consider using gsize
, gssize
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Any JSON object can be parsed to SDATA, so the key could contain any character, while the specification does not declare any way to encode the the keys, just the values. The goal is to have a syntactically valid syslog message. Fixes syslog-ng#1583 Signed-off-by: Máté Farkas <mate.farkas@balabit.com>
4146ac9
to
eee3bb4
Compare
Build SUCCESS, the tests were executed on test branch: master and test suite: functions |
|
||
for (gssize i = 0; i < len; i++) | ||
{ | ||
if (!isascii(ustr[i]) || ustr[i] == '=' || ustr[i] == ' ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a suggestion:
Not sure how much performance penalty is included due to this, but maybe it would worth build a global constant table: ascii->bool which can tell in one step if escape is needed, instead of the 4 comparison above.
|
||
msg = log_msg_new_empty(); | ||
log_msg_set_value_by_name(msg, ".SDATA.foo.bácsi", "bácsi", -1); | ||
assert_sdata_value_equals(msg, "[foo b%C3%A1csi=\"bácsi\"]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little bit strange to see here that the key is urlencoded, but the value isn't. Why do not we have the same problem with the values?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The RFC 5424 says (on page 7):
STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
SD-ELEMENT = "[" SD-ID *(SP SD-PARAM) "]"
SD-PARAM = PARAM-NAME "=" %d34 PARAM-VALUE %d34
SD-ID = SD-NAME
PARAM-NAME = SD-NAME
PARAM-VALUE = UTF-8-STRING ; characters '"', '\' and
; ']' MUST be escaped.
SD-NAME = 1*32PRINTUSASCII
; except '=', SP, ']', %d34 (")
We do the escaping in value, and any UTF-8 character is accepted there.
In the param name (key) there is no escaping mechanism, and only ASCII characters are allowed, so we have to replace the non-valid characters with valid ones. We did not want to loss data, so sanitizing with a constant character wouldn't be a good idea (for example with _
, both "ár" and "űr" would be "_r").
Any JSON object can be parsed to SDATA, so the key could contain any character, while the specification does not declare any way to encode the the keys, just the values.
The goal is to have a syntactically valid syslog message.
Fixes #1583
SDATA keys in the RFC 5424