-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[IMPROVEMENT] add separator character option to 'name-value-list' parser #335
[IMPROVEMENT] add separator character option to 'name-value-list' parser #335
Conversation
Thank you for this work, I was fighting this just the other day and thinking
that I needed to dig into it again.
rather than using extradata, I would suggest using the new json support in the
v2 syntax
so instead of %key:name-value-list:|% do %key:name-value-list:{"separator":"|"}%
that way we can add 'assignment' (instead of it always being '=' allow for lists
like foo:bar,baz:abc)
also, can this support a quoted string as the value? This would possibly be a
third parameter "valuetype"
I can also see us adding 'ignore-whitespace' as an optional parameter.
these things don't need to be added now, but rather than using 'extradata', I'd
like us to use a named field to support adding the others later without the
separator being such an oddball.
if this only accepts a single character separator, it should complain if you
feed it more. what would it take to support longer separators?
David Lang
…On Tue, 17 Mar 2020, frikilax wrote:
Date: Tue, 17 Mar 2020 03:00:59 -0700
From: frikilax ***@***.***>
Reply-To: rsyslog/liblognorm
***@***.***>
To: rsyslog/liblognorm ***@***.***>
Cc: Subscribed ***@***.***>
Subject: [rsyslog/liblognorm] [IMPROVEMENT] add separator character option to
'name-value-list' parser (#335)
# Description
An **extradata** parameter is now available for the experimental perser **name-value-list**, this parameter allows to set a different character as separator between key/value couples.
# What's new
Before, this parser could only handle logs like
`a=1 b=2 c=text_without_spaces`
But with this parameter, a character can be specified to act as the matching separator between key/value couples:
rulesbase configuration:
```
rule=:%keyval:name-value-list:|%
```
or
```
rule=:%{"name": "keyval", "type": "name-value-list", "extradata": "|"}%
```
the sample:
`a=1|b=2|c=text_without_spaces|d=text with spaces`
# Details
This change only takes the first character of your extradata field, so
```
rule=:%keyval:name-value-list:test%
```
would not throw an error, but would split only on "t"
# Credits
thanks to @KGuillemot for the help
You can view, comment on, or merge this pull request online at:
#335
-- Commit Summary --
* name-value-list type:: add optional separator to override default space separation between key/value instances
* FIX:: compilation errors
-- File Changes --
M src/parser.c (46)
M src/parser.h (2)
M src/pdag.c (2)
-- Patch Links --
https://github.com/rsyslog/liblognorm/pull/335.patch
https://github.com/rsyslog/liblognorm/pull/335.diff
|
Hey @davidelang! We were going to give this as a simple improvement as we didn't need more yet, but after coming across some other problems, I think we'll still add support for a separator and an assignement character. I think I will also add errors when those fields are more than 1 char, to give some feedback to the users (as it's easy/fast to add). We also encountered problems with escaped characters, so I will also add support to ignored matched characters that are escaped. However, quoted strings and multibytes separators might need a bit more work ,which we don't have at the moment sorry... |
quotes support has been added |
Hi, I did a PR (#234) a long time ago on the same topic but I can no longer update it. Benoit |
This PR tries to achieve more than just quoting detection (the main goal wasn't quoting to begin with) so I'm not sure it's relevant to rebase this work on your PR. |
My PR is probably dead because I can't update it. Benoit |
Frankly, what is done in your PR has been added more or less in this PR, you can check this commit |
My goal is not credit but having such feature merged into main branch and I wish it in the next release. You treat the '\' and allow changing separator, it's better than mine. To make it accepted, I recommand send only clean commits (merge errors & repush with --force). There is some coding-style errors. I think you sould replace "ass" variable. In the last commit, you can group all specific "quoting" code into one "if" terminated with "goto". It make a more lisible commit. |
Hi Rainer, As you tell us in #234 ("if you don't hear back pls ping me"), do you have some new about this PR & state of your centos7 buildbot ? Regards Benoit |
Hi @frikilax , any updates about this PR ? |
6755059
to
8113a7e
Compare
Hi, any updates ? |
That PR needs a fresh push to restart the current CI. If this will not happen, I'll branch it off to a new branch, submit that as PR and see what happens. Feel free to remind me next Tuesday if nothing happend until then. |
…TING - new parameter 'separator': defines a custom character to look for between key/value pairs (replaces whitepsace when defined) - new parameter 'assignator': defines a custom character to look for between key and value (replaces '=' when defined, but DISABLES key name validation) - add support for (simple and double) quoting in values - add support for escaped characters in values
8113a7e
to
a55b814
Compare
Sorry, didn't have time to come back to this PR and completely forgot about it, I will try to get back to it before next Monday |
…lue is not valid - to allow parsing rest of line
Description
An extradata parameter is now available for the experimental parser name-value-list, this parameter allows to set a different character as separator between key/value couples.
What's new
Before, this parser could only handle logs like
a=1 b=2 c=text_without_spaces
But with this update, a parameter 'separator' can be specified to act as the matching separator between key/value couples, and a parameter 'assignator' as a separator between key and value.
for example:
or
are equivalent (by default, legacy 'extradata' is assigned to the 'separator' parameter in the first case)
the matching samples will then be:
a=1|b=2|c=text_without_spaces|d=text with spaces
key=42|other_key=hello there|yet-another.key=i'm a value|
When 'assignator' is left by default, it will match '=' and will check for valid naming of keys (allowing only alphanumeric characters, '.', '-' and '_').
In the case someone wants to match another character by setting 'assignator', the parser will loosen the restriction and only cut keys and values by separating between 'assignator' and 'separator'. However, key name validation will NOT occur !
Rulebase with 'assignator' defined:
examples of matching sample:
a=1|key_two=hello world|thirdkey:=a valid value containing an \| escaped pipe||fourth#key=42
and result:
One can note character escaping will still be respected.
Credits
thanks to @KGuillemot for the help
thanks to @bdolez for the original work on quoting (see #234)