Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add_contextual_data() - database.csv parsing issues #1179

Closed
mitzkia opened this issue Aug 23, 2016 · 6 comments
Closed

add_contextual_data() - database.csv parsing issues #1179

mitzkia opened this issue Aug 23, 2016 · 6 comments
Assignees

Comments

@mitzkia
Copy link
Contributor

mitzkia commented Aug 23, 2016

I have found some issues while tested database.csv file for add_contextual_data() parser with possible rfc4180 inputs.
RFC 4180 reference: [https://tools.ietf.org/html/rfc4180#section-2]

I have used the following general syslog-ng configuration files:

[syslog-ng.conf]

@version: 3.8
options { keep_hostname(yes); };
source s_file { file("/tmp/input.log"); };
parser p_acd { add_contextual_data(database(/tmp/database.csv) selector("${HOST}"));};
destination d_file { file("/tmp/output.log" template("$(format-json --scope dot-nv-pairs)\n")); };
destination d_file2 { file("/tmp/output2.log" template(".bbb: ${.bbb}, .eee: ${.eee}\n") ); };
log { source(s_file); parser(p_acd); destination(d_file); destination(d_file2); flags(flow-control); };

[input.log]

<38>Feb 25 14:09:07 10.10.10.10 testapp: test message

Found issue 1

  • CR not stripped from line endings

[database.csv] - with CRLF line endings on both lines

10.10.10.10,.bbb,ccc
10.10.10.10,.eee,fff

[syslog-ng console log] - CR (\r) not stripped, only LF (\n)

Outgoing message; message='.bbb: ccc\x0d, .eee: fff\x0d\x0a'
Outgoing message; message='{"_eee":"fff\r","_bbb":"ccc\r"}\x0a'

[output2.log]

.bbb: ccc
, .eee: fff

expected output would be:

Outgoing message; message='{"_eee":"fff","_bbb":"ccc"}\x0a'

Found issue 2

  • Fields containing line breaks (CRLF) not allowed
  • syslog-ng did not start with the following database.csv file
  • expected behavior: syslog-ng can start

[database.csv]

"10.10.10.10",".b
bb","cc c"

[syslog-ng console error]

[2016-08-23T16:30:52.128575] Error while parsing add_contextual_data database;
[2016-08-23T16:30:52.128607] Failed to load the database file.;

Found issue 3

  • Escaped double-quotes added other double-quote at the end of the field

[database.csv]

"10.10.10.10",".bbb","c""cc"

[syslog-ng.conf]

@version: 3.8
options { keep_hostname(yes); };
source s_file { file("/tmp/input.log"); };
parser p_acd { add_contextual_data(database(/tmp/database.csv) selector("${HOST}"));};
destination d_file { file("/tmp/output.log" template("$(format-json --scope dot-nv-pairs)\n")); };
destination d_file2 { file("/tmp/output2.log" template(".bbb: ${.bbb}\n") ); };
log { source(s_file); parser(p_acd); destination(d_file); destination(d_file2); flags(flow-control); };

[syslog-ng console output]

[2016-08-23T16:40:02.275031] Outgoing message; message='.bbb: c"cc"\x0d\x0a'
[2016-08-23T16:40:02.275180] Outgoing message; message='{"_bbb":"c\"cc\"\r"}\x0a'

[output2.log]

.bbb: c"cc"

expected output:

.bbb: c"cc

syslog-ng version:
syslog-ng 3.8.1
Installer-Version: 3.8.1
Revision: 3.8.1-1

@MrAnno MrAnno assigned MrAnno and unassigned MrAnno Aug 25, 2016
@MrAnno
Copy link
Collaborator

MrAnno commented Aug 25, 2016

Thanks, @mitzkia. 😄

Issue 1: I'll send a quick fix tomorrow.


Issue 2: ContextInfoDB operates on lines (calls contextual_data_record_scanner_get_next() line by line). To solve this issue, ContextualDataRecordScanner should receive a 'stream' instead of these EOL-separated 'data chunks', or at least it should be able to ask for more data (but as I see, CSVScanner is not really prepared for this idea).


Issue 3: This is a CSVScanner "bug" around _parse_value_with_whitespace_and_delimiter(), I'll also try to fix this tomorrow.
I was wrong, we can just set the dialect to CSV_SCANNER_ESCAPE_DOUBLE_CHAR.

@bazsi
Copy link
Collaborator

bazsi commented Aug 26, 2016

Hmmm do we really need to support the embedded new line stuff? I mean I
understand that csv as a format does support it this way, but I dont think
we need that as an actual use-case, so we could just document it.
On Aug 25, 2016 23:54, "László Várady" notifications@github.com wrote:

Thanks, @mitzkia https://github.com/mitzkia. 😄

Issue 1: I'll send a quick fix tomorrow.

Issue 2: ContextInfoDB operates on lines (calls contextual_data_record_
scanner_get_next() line by line). To solve this issue,
ContextualDataRecordScanner should receive a 'stream' instead of these
EOL-separated 'data chunks', or at least it should be able to ask for more
data (but as I see, CSVScanner is not really prepared for this idea).

Issue 3: This is a CSVScanner "bug" around parse_value_with_whitespace
and_delimiter(), I'll also try to fix this tomorrow.


You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
#1179 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AArldj7cU3hgLcpchpvI3eo12ZkvAFMoks5qjg8OgaJpZM4JrBTF
.

@lbudai
Copy link
Collaborator

lbudai commented Aug 26, 2016

@bazsi : agree with you :-)

MrAnno added a commit to MrAnno/syslog-ng that referenced this issue Aug 26, 2016
…e-char

Fixes syslog-ng#1179

Signed-off-by: László Várady <laszlo.varady@balabit.com>
@kvch
Copy link
Contributor

kvch commented Aug 26, 2016

@bazsi @lbudai Initially, I agreed with you. However, I can imagine a situation when the CSV was generated on a Windows and then moved to a Linux machine where syslog-ng operates. So those few lines in the PR could make syslog-ng a bit more user-friendly. Because of this, I would say that it should be fixed. Of course, I understand that it is not the main problem in this issue.

@bazsi
Copy link
Collaborator

bazsi commented Aug 26, 2016

I wasn't referring to the CR LF sequence of newlines, but the fact whether
we support multiple-line entries at all. CR LF sequences should be
supported as record terminators, I agree on that front, but newlines within
a specific CSV field is not really needed.

Bazsi

On Fri, Aug 26, 2016 at 3:00 PM, Noémi Ványi notifications@github.com
wrote:

@bazsi https://github.com/bazsi @lbudai https://github.com/lbudai
Initially, I agreed with you. However, I can imagine a situation when the
CSV was generated on a Windows and then moved to a Linux machine where
syslog-ng operates. So those few lines in the PR could make syslog-ng a bit
more user-friendly. Because of this, I would say that it should be fixed.
Of course, I understand that it is not the main problem in this issue.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#1179 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AArldqIvQBgcfN5AI1TgF6YgJIXCjSbxks5qjuNggaJpZM4JrBTF
.

@pzoleex
Copy link
Collaborator

pzoleex commented Sep 2, 2016

I tried it, the fix works

MrAnno added a commit to MrAnno/syslog-ng that referenced this issue Sep 2, 2016
…e-char

Fixes syslog-ng#1179

Signed-off-by: László Várady <laszlo.varady@balabit.com>
@lbudai lbudai removed the in progress label Sep 5, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants