Parsing error - value including a space followed by a token with a dot in it #54
Comments
The parser is pretty naive, doing a whole lot of superfluous splitting and replacing and then more work to put things back how it found them when it hits escapes and stuff, all of which could be done more simply with a scanning parser that has simple lookaheads that account for escaping. This is quite similar to recent work that I did in the KF Filter Plugin, so I'm assigning this ticket to myself.
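For readers following along, here is a minimal sketch of what a scanning approach with escape-aware lookahead could look like in Ruby. It is not the codec's actual code: `parse_extensions`, `KEY`, and the set of characters allowed in a key are assumptions made for illustration, and it only unescapes `\\` and `\=`.

```ruby
require 'strscan'

# Assumption: keys consist of word characters, dots, and square brackets.
KEY = /[\w.\[\]]+/

# Hypothetical helper: scans "key=value key2=value with spaces" pairs.
def parse_extensions(message)
  scanner = StringScanner.new(message)
  fields = {}

  until scanner.eos?
    scanner.skip(/\s+/)

    # A key only counts as a key when it is directly followed by "=".
    key = scanner.scan(/#{KEY}(?==)/)
    break if key.nil?
    scanner.skip(/=/)

    # Consume the value lazily. A backslash always takes the next character
    # with it, so an escaped "=" can never be mistaken for a key boundary.
    # Stop when the lookahead sees whitespace followed by "key=", or the end.
    value = scanner.scan(/(?:\\.|[^\\])*?(?=\s+#{KEY}=|\s*\z)/m) || ''

    # Only "\\" and "\=" are unescaped here; other sequences are left as-is.
    fields[key] = value.gsub(/\\([\\=])/, '\1')
  end

  fields
end

fields = parse_extensions('msg=hello foo.bar dst=1.2.3.4')
puts fields['msg']
puts fields['dst']
```

Because a token is only treated as a key when an unescaped `=` follows it, a dotted token such as `foo.bar` inside a value stays part of that value. A value that genuinely contains `word=` after a space remains ambiguous under this sketch, as it is in the format itself.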
You may find this helpful. I haven't tested it extensively, but I replaced everything from line 167 to line 188 with one message.split():
As far as I can tell, it works for everything except one corner case where the key has an escaped = sign in it.
@yaauie I thought a little bit about your above comment as well as the comment on PR 55, and I have to admit that I feel offended. I think that those comments are not in accordance with the Elastic Community Code of Conduct. It is ok to propose improvements, and we are all happy if the code becomes better (bug-free, maintainable, more performant). But there is no added value in (nor need for) judging the existing code in such a negative way (e.g. "naive" and "superfluous"). CC: @jordansissel, @suyograo
@breml I did not mean to insult you or the other contributors to this project, and appreciate that you were willing to call out that the effect of my words was offensive. I am sorry. You are absolutely right that this codec has been useful to many people, and that the extensive tests that you and others put in the effort to maintain are what allowed me to confidently build a scanning parser that would be non-breaking. In retrospect, my tone was not respectful of the other contributors and their efforts; in future I will be more intentional with how I communicate.
@yaauie thanks for the apology, I really appreciate it.
This line:
Should result in this field:
But instead it results in these two fields:
The issue is this regex on line 174 of cef.rb:
It breaks for the case where a value has a space in it followed by a token with a dot in it. That line of code seems specific to one ArcSight mode. Perhaps add a configuration flag for people who aren't using ArcSight to skip it?