Ascii/Wide matching bug on some regexes #1933

Closed
vthib opened this issue Jul 19, 2023 · 4 comments
Contributor

vthib commented Jul 19, 2023

Describe the bug

With some regexes marked as "ascii wide", no matches are reported on inputs that should match.

To Reproduce

Here are some test cases showing the issue:

// Works
assert_true_rule_blob(
    "rule test { strings: $a = /a.{1}c/ ascii wide condition: $a }",
    "a\0b\0c\0");

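// Fails
// Note: if compiled as C, "\01" is read as one octal escape (byte 0x01),
// so the literal is split to keep the NUL and the '1' as separate bytes.
assert_true_rule_blob(
    "rule test { strings: $a = /a.{1}1/ ascii wide condition: $a }",
    "a\0b\0" "1\0");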

Expected behavior

The second test case should match.
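
As a rough cross-check of the expected behavior, the sketch below models the `wide` modifier as "every byte followed by a NUL byte" and uses Python's `re` module in place of YARA's engine. The helpers `widen` and `wide_regex` are hypothetical names introduced only for this illustration; they are not part of YARA.

```python
import re


def widen(data: bytes) -> bytes:
    """Interleave a NUL byte after every byte (simplified model of `wide`)."""
    return b"".join(bytes([b, 0]) for b in data)


def wide_regex(first: bytes, last: bytes) -> "re.Pattern[bytes]":
    """Hand-built wide form of /<first>.{1}<last>/: each unit is followed by NUL."""
    return re.compile(
        re.escape(first) + b"\x00" + b".\x00" + re.escape(last) + b"\x00"
    )


blob_c = widen(b"abc")  # same bytes as the first test blob
blob_1 = widen(b"ab1")  # same bytes as the intended second test blob

# Under this model, both blobs match the wide form of their regex.
matches_c = wide_regex(b"a", b"c").search(blob_c) is not None
matches_1 = wide_regex(b"a", b"1").search(blob_1) is not None

# Neither blob matches the plain ascii form, so only the wide path can succeed.
ascii_c = re.search(rb"a.c", blob_c) is not None
ascii_1 = re.search(rb"a.1", blob_1) is not None
```

Under these assumptions both test cases are expected to match through the wide form only, which is consistent with the expectation stated above.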

Additional context

The regex might seem a bit far-fetched; I actually stumbled upon the issue with the regex a.{5}i and reduced it for reproducibility.

The bug does not occur when using "a.i". From what I can understand from some debugging:

  • For the "a.{1}c" regex, the "a" literal is extracted. On a match, we do:
    • an ascii forward match: this fails
    • a wide forward match: this succeeds.
    • the backward validator does nothing
    • there is a match
  • For the "a.{1}1" regex, the "1" literal is extracted. On a match, we do:
    • an ascii forward match: this succeeds as it matches "1"
    • an ascii backward match: this fails
    • there is no match
  • The "a.i" regex gets expanded into literals, hence no bug, since the bug is tied to the ascii/wide detection on the matched AC literal.

So from what I can tell, the issue can happen when the ascii forward matcher is a prefix of the wide forward matcher. It should thus be possible to trigger it with raw strings or hex strings, or even with some clever xor transformation; however, I haven't tried to come up with more "realistic" strings that could trigger this bug.
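
The debugging notes above can be turned into a small toy model. This is only a sketch of the behavior as described in this report, not YARA's actual code: the function names (`verify_as_described`, `verify_both_variants`), the split of each regex into "forward" and "backward" units around the extracted literal, and the alternative strategy are all assumptions made for illustration. In particular, `verify_both_variants` is not claimed to be what the fix commit does.

```python
import re


def compile_units(units, wide):
    """Join single-character regex units, adding a NUL after each one if wide."""
    sep = b"\x00" if wide else b""
    return re.compile(b"".join(u + sep for u in units))


def forward_ok(units, data, pos, wide):
    """Does the forward part match starting at the literal's offset?"""
    return compile_units(units, wide).match(data, pos) is not None


def backward_ok(units, data, pos, wide):
    """Does the backward part match the bytes ending right before the literal?"""
    length = len(units) * (2 if wide else 1)
    start = pos - length
    if start < 0:
        return False
    return compile_units(units, wide).fullmatch(data, start, pos) is not None


def verify_as_described(fwd, bwd, data, pos):
    # Behavior described in the report: once the ascii forward check
    # succeeds, only the ascii backward check is tried.
    if forward_ok(fwd, data, pos, wide=False):
        return backward_ok(bwd, data, pos, wide=False)
    return (forward_ok(fwd, data, pos, wide=True)
            and backward_ok(bwd, data, pos, wide=True))


def verify_both_variants(fwd, bwd, data, pos):
    # Hypothetical alternative: check the ascii and wide variants independently.
    return any(
        forward_ok(fwd, data, pos, wide=w) and backward_ok(bwd, data, pos, wide=w)
        for w in (False, True)
    )


DATA_C = b"a\x00b\x00c\x00"
DATA_1 = b"a\x00b\x001\x00"

# /a.{1}c/: literal "a" at offset 0, the whole regex is the forward part.
case_c = verify_as_described([b"a", b".", b"c"], [], DATA_C, 0)

# /a.{1}1/: literal "1" at offset 4, "a." is the backward part.
case_1_described = verify_as_described([b"1"], [b"a", b"."], DATA_1, 4)
case_1_both = verify_both_variants([b"1"], [b"a", b"."], DATA_1, 4)
```

In this model the first case matches through the wide path. The second case is rejected by `verify_as_described` because the ascii forward check on "1" succeeds and hides the wide variant, while checking both variants independently accepts it.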

vthib added the bug label Jul 19, 2023
plusvic added a commit that referenced this issue Jul 21, 2023
Member

plusvic commented Jul 21, 2023

Fixed in 95dafbd

plusvic closed this as completed Jul 21, 2023
Contributor Author

vthib commented Jul 23, 2023

Thanks for the fix! Do you have a release planned in the near future? I would love to have one that includes the fix for this issue and #1928.

Contributor Author

vthib commented Aug 20, 2023

ping @plusvic, would it be possible to make a new release given the numerous fixes since the last one? That would be great :)

Member

plusvic commented Aug 22, 2023

Sorry for the delay @vthib, I'm starting to work on a 4.4.0 release that I hope to publish soon.
