Grok pattern for NCSA log #60

Open
JulienPalard opened this issue Jun 22, 2015 · 0 comments

While reading your grok patterns, I noticed that the pattern for parsing combined Apache logs (the NCSA log format) does not accept spaces in the authuser field:

USERNAME [a-zA-Z0-9._-]+
USER %{USERNAME}
[...]
COMMONAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] ...

I read all the documentation I could find on the subject, and even the NCSA HTTPd source code, and could not find any mention of escaping, forbidding, or replacing spaces in the authuser field.
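
To make the failure concrete, here is a minimal sketch that translates the relevant part of the pattern into a Python regular expression. The USERNAME character class is copied from the pattern quoted above; the `\S+` stand-in for IPORHOST, the `[^\]]+` stand-in for HTTPDATE, and both sample log lines are assumptions made for illustration, not the real grok definitions or real log data.

```python
import re

# Character class copied from the grok USERNAME pattern quoted above.
USERNAME = r"[a-zA-Z0-9._-]+"

# Start of COMMONAPACHELOG, up to and including the timestamp.
# Assumption: \S+ and [^\]]+ are simplified stand-ins for IPORHOST and HTTPDATE.
PREFIX = re.compile(
    r"(?P<clientip>\S+) "
    r"(?P<ident>" + USERNAME + r") "
    r"(?P<auth>" + USERNAME + r") "
    r"\[(?P<timestamp>[^\]]+)\]"
)

# Hypothetical log lines: identical except for the space in the authuser field.
plain = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326'
spaced = '127.0.0.1 - john doe [10/Oct/2000:13:55:36 -0700] "GET / HTTP/1.0" 200 2326'

plain_match = PREFIX.match(plain)    # matches, auth is "frank"
spaced_match = PREFIX.match(spaced)  # None: "john" is followed by " doe", not " ["

print(plain_match.group("auth") if plain_match else None)
print(spaced_match)
```

With these stand-ins, the line whose authuser contains a space does not match at all, so it would be dropped or flagged as unparsed rather than parsed with a wrong user name.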

Yet a lot of people get this wrong too, starting with me: I habitually split my NCSA log lines on spaces, or run awk, cut, etc. on them.
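
The same problem shows up with plain space splitting. The sketch below mimics what a `cut -d' '` or awk one-liner would do; the `naive_fields` helper and both log lines are hypothetical, written only to show how the fields shift.

```python
# Hypothetical log lines: identical except for the space in the authuser field.
plain = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
spaced = '127.0.0.1 - john doe [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'


def naive_fields(line):
    """Split on single spaces and pick fields by position, as cut or awk would."""
    parts = line.split(" ")
    # Position 2 is assumed to be authuser, position 3 the start of the timestamp.
    return {"authuser": parts[2], "timestamp_start": parts[3]}


plain_fields = naive_fields(plain)
spaced_fields = naive_fields(spaced)

print(plain_fields)
print(spaced_fields)
```

For the second line, the user name is truncated to its first word and every later field moves one position to the right, so the value read as the timestamp is actually the rest of the user name.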

At first I thought I was right to split on spaces, use awk, and so on, so I even opened a ticket on the Varnish bug tracker.

I think we should open a conversation here on this subject. On one hand, a lot of people are doing the same thing wrong; on the other hand, we will have a hard time finding a clean way to encode authuser without dropping information, and convincing Apache, Microsoft, Varnish, nginx, etc. to change their log-handling code accordingly.
