Unknown operating system #1573

Open
szazo opened this issue Oct 18, 2019 · 2 comments

@szazo szazo commented Oct 18, 2019

First of all, thank you for your hard work.

I parsed an IIS log file (in W3C format), and in the results the operating system was reported as "Unknown" for most entries.

In the log file, user agents are stored in the following format, with '+' in place of spaces:
Mozilla/5.0+(Windows;+U;+Windows+NT+6.0;en-US;+rv:1.9.2)+Gecko/20100115+Firefox/3.6)

I checked the source and found that, although a comment in the code (parser.c, line 1226) says '+' symbols should be replaced with spaces, decode_url() does not actually do this:

      /* Make sure the user agent is decoded (i.e.: CloudFront)
       * and replace all '+' with ' ' (i.e.: w3c) */
      logitem->agent = decode_url (tkn);

If I add the following line right after it, the user agents are parsed successfully:

      logitem->agent = char_replace (logitem->agent, '+', ' ');
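
For illustration, here is a minimal standalone sketch of the replacement step described above. The function name `plus_to_space` is hypothetical and is not part of GoAccess; it only assumes the user agent is a writable, NUL-terminated string, and it is not meant to reproduce what `char_replace` does internally.

```c
#include <stddef.h>

/* Hypothetical helper (not a GoAccess function): replace every '+'
 * in a writable, NUL-terminated string with ' ', in place.
 * Returns the same pointer, or NULL if the input is NULL. */
char *
plus_to_space (char *s)
{
  char *p;

  if (s == NULL)
    return NULL;

  for (p = s; *p != '\0'; p++) {
    if (*p == '+')
      *p = ' ';
  }

  return s;
}
```

Applied to the sample above, this would turn `Mozilla/5.0+(Windows;+U;...` into `Mozilla/5.0 (Windows; U;...`. One thing to keep in mind: if this runs after percent-decoding, a `%2B` that was decoded to a literal '+' would also become a space. Whether that matters for IIS logs is not something I have verified.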

@allinurl allinurl commented Oct 21, 2019

Thanks for reporting this. Interesting, are you on the latest version?


@szazo szazo commented Oct 21, 2019

I am using version 1.3 from the release page (GoAccess 1.3 - Friday, November 23, 2018).

I have now checked parser.c on master, and it also contains no code that converts '+' into a space for the user agent.
