Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logmsg: sanitize sdata keys #1650

Merged
merged 1 commit into from Aug 25, 2017
Merged

Conversation

presidento
Copy link
Contributor

@presidento presidento commented Aug 22, 2017

Any JSON object can be parsed to SDATA, so the key could contain any character, while the specification does not declare any way to encode the the keys, just the values.

The goal is to have a syntactically valid syslog message.

Fixes #1583

SDATA keys in the RFC 5424

...
      STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
      SD-ELEMENT      = "[" SD-ID *(SP SD-PARAM) "]"
      SD-PARAM        = PARAM-NAME "=" %d34 PARAM-VALUE %d34
      SD-ID           = SD-NAME
      PARAM-NAME      = SD-NAME
...
      SD-NAME         = 1*32PRINTUSASCII
                        ; except '=', SP, ']', %d34 (")
...
      SP              = %d32
      PRINTUSASCII    = %d33-126
...

@kira-syslogng
Copy link
Contributor

Build SUCCESS, the tests were executed on test branch: master and test suite: functions

@kira-syslogng
Copy link
Contributor

Build SUCCESS, the tests were executed on test branch: master and test suite: functions

* The goal is to create syntactically valid structured data fields. */
const guchar *ustr = (const guchar *) sstr;

for (gint i = 0; i < len; i++)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please consider using gsize, gssize here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Any JSON object can be parsed to SDATA, so the key could contain any
character, while the specification does not declare any way to encode
the the keys, just the values.

The goal is to have a syntactically valid syslog message.

Fixes syslog-ng#1583

Signed-off-by: Máté Farkas <mate.farkas@balabit.com>
@kira-syslogng
Copy link
Contributor

Build SUCCESS, the tests were executed on test branch: master and test suite: functions


for (gssize i = 0; i < len; i++)
{
if (!isascii(ustr[i]) || ustr[i] == '=' || ustr[i] == ' '
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a suggestion:
Not sure how much performance penalty is included due to this, but maybe it would worth build a global constant table: ascii->bool which can tell in one step if escape is needed, instead of the 4 comparison above.


msg = log_msg_new_empty();
log_msg_set_value_by_name(msg, ".SDATA.foo.bácsi", "bácsi", -1);
assert_sdata_value_equals(msg, "[foo b%C3%A1csi=\"bácsi\"]");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little bit strange to see here that the key is urlencoded, but the value isn't. Why do not we have the same problem with the values?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The RFC 5424 says (on page 7):

      STRUCTURED-DATA = NILVALUE / 1*SD-ELEMENT
      SD-ELEMENT      = "[" SD-ID *(SP SD-PARAM) "]"
      SD-PARAM        = PARAM-NAME "=" %d34 PARAM-VALUE %d34
      SD-ID           = SD-NAME
      PARAM-NAME      = SD-NAME
      PARAM-VALUE     = UTF-8-STRING ; characters '"', '\' and
                                     ; ']' MUST be escaped.
      SD-NAME         = 1*32PRINTUSASCII
                        ; except '=', SP, ']', %d34 (")

We do the escaping in value, and any UTF-8 character is accepted there.
In the param name (key) there is no escaping mechanism, and only ASCII characters are allowed, so we have to replace the non-valid characters with valid ones. We did not want to loss data, so sanitizing with a constant character wouldn't be a good idea (for example with _, both "ár" and "űr" would be "_r").

@furiel furiel merged commit c167662 into syslog-ng:master Aug 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants