Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IIS_31547: The iis parsing for IPv6 logs #3315

Merged
merged 5 commits into from
May 17, 2022
Merged

Conversation

ishleenk17
Copy link
Contributor

@ishleenk17 ishleenk17 commented May 10, 2022

What does this PR do?

This PR addresses the issue of parsing access logs in IIS with IPv6 addresses.

The below log is one of the failure scenario.

2022-03-13 02:04:11 fe81::63ae:94c0:196e:8adf%3 GET /pbserver/..Áœ..Áœ..Áœ..Áœ..Áœ../winnt/system32/cmd.exe /c+dir+c:+/OG 8080 - fe81::63ae:94c0:196e:8adf%3 Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+5.1;+Trident/4.0) - 200 0 0 12 81.2.69.143,81.2.69.144

Pipeline Test Results:
Run pipeline tests for the package
--- Test results for package: iis - START ---
FAILURE DETAILS:
iis/access test-iis-access-72.log:
[0] unexpected pipeline error: For input string: "81.2"

This new IPV6 pattern changes the zone_id to match non-space characters

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

How to test this PR locally

To test this PR, test the ingest pipeline by running elastic-package test pipeline -v -g

Related issues

@ishleenk17 ishleenk17 requested a review from a team as a code owner May 10, 2022 14:10
@ishleenk17 ishleenk17 requested review from andrewkroh, a team and jsoriano May 10, 2022 14:11
@ishleenk17 ishleenk17 requested review from mtojek and sayden May 10, 2022 14:18
@elasticmachine
Copy link

elasticmachine commented May 10, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-05-16T15:18:31.103+0000

  • Duration: 13 min 33 sec

Test stats 🧪

Test Results
Failed 0
Passed 82
Skipped 0
Total 82

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@elasticmachine
Copy link

elasticmachine commented May 10, 2022

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (2/2) 💚
Files 100.0% (2/2) 💚 3.695
Classes 100.0% (2/2) 💚 3.695
Methods 85.714% (30/35) 👎 -2.527
Lines 91.398% (255/279) 👍 2.466
Conditionals 100.0% (0/0) 💚

Copy link
Contributor

@mtojek mtojek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ishleenk17, could you please add the error message that can be observed without the fix?

Hey @marc-gr, what is the usual way you deal with IPv6+Zone addresses?

@@ -12,6 +12,9 @@ processors:
- grok:
field: event.original
ignore_missing: true
pattern_definitions:
#This IPV6 pattern changes the zone_id to match non-space characters
IPV6: ((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%[^ ]+)?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's one approach. I found in Beats also:

I don't see that you're referring to IPV6. Is it a predefined pattern or you're overwriting it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a predefined pattern which I am overwriting here.
The error message is mentioned in the bug id associated with this (elastic/beats#31547).
Do I need to add it elsewhere ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably a personal preference, but sometimes it's easier to read the PR when you've all details in the PR description and don't need to jump to the issue discussion. Definitely not a bug!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, got it. Added it to the PR description.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would prefer to define a new pattern instead of overwriting a existing one. If there are fixes in the original pattern we would be missing them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would prefer to define a new pattern instead of overwriting a existing one.

If we do that, then it needs to overwrite that patterns that incorporate this like IPORHOST and IP.

If there are fixes in the original pattern we would be missing them.

If there are fixes in IPV6 we won't get them if we define a new pattern either. By redefining the pattern at least we can still get fixes for the patterns that build upon IPV6 (like IPORHOST).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, then as you prefer :)

Copy link
Member

@andrewkroh andrewkroh May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say I have a strong preference 😄 , but I did want to make a case for the other side.

I wish there was as built-in pattern for IPV6 without the zone so we would not need to redefine such a long pattern (like IPV6 = BUILTIN_IPV6_NOZONE + CUSTOM_IPV6_ZONE_NOTSPACE).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree, if we don't change in IPV6, the it would be like overwriting IPOHOST type of patterns.
Hence it would be better to overwrite IPV6.

@andrewkroh @jsoriano What does it take to add a new pattern to the builtin patterns ? If we want to add the zone specific pattern there.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it take to add a new pattern to the builtin patterns ? If we want to add the zone specific pattern there.

We could open an issue (or pull request) in Elasticsearch for this change, but we would need a solution in any case till the change arrives to a released version.

But looking through issues in Elasticsearch it seems that this format is not going to be supported: elastic/elasticsearch#22400 (comment)

Copy link
Contributor

@marc-gr marc-gr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure how likely could be that this catches a value that is not an ip, do you think might be worth adding a convert processor to force a cast to an ip type to handle these gracefully?

other than that and the nit, lgtm

@@ -12,6 +12,9 @@ processors:
- grok:
field: event.original
ignore_missing: true
pattern_definitions:
#This IPV6 pattern changes the zone_id to match non-space characters
IPV6: ((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%[^ ]+)?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would prefer to define a new pattern instead of overwriting a existing one.

If we do that, then it needs to overwrite that patterns that incorporate this like IPORHOST and IP.

If there are fixes in the original pattern we would be missing them.

If there are fixes in IPV6 we won't get them if we define a new pattern either. By redefining the pattern at least we can still get fixes for the patterns that build upon IPV6 (like IPORHOST).

ishleenk17 and others added 2 commits May 12, 2022 13:22
Usually no space between the `#` and the comment is reserved for commented out code/data.
Copy link
Collaborator

@lalit-satapathy lalit-satapathy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes looks good as jointly reviewed with @ishleenk17

@ishleenk17 ishleenk17 merged commit 3f553fe into elastic:main May 17, 2022
@ishleenk17 ishleenk17 deleted the iis_31547 branch May 26, 2022 08:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The iis exchange parsing for IPv6 fails to work correctly
7 participants