Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

integer overflow on a particular regex #995

Closed
finnbear opened this issue May 22, 2023 · 7 comments · Fixed by #996
Closed

integer overflow on a particular regex #995

finnbear opened this issue May 22, 2023 · 7 comments · Fixed by #996
Labels

Comments

@finnbear
Copy link

What version of regex are you using?

1.8.1 (latest)

Describe the bug at a high level.

The following code causes an integer-overflow panic in debug mode:

fn main() {
    regex::Regex::new("  {2147483516}{2147483416}{5}").unwrap();
}

What are the steps to reproduce the behavior?

https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0f0b53f5c300a71ce6098e767d13cbc2

What is the actual behavior?

  • integer overflow
    • panic in debug mode
    • seemingly returns the right answer (Err(CompiledTooBig(10485760))) anyway in release mode

What is the expected behavior?

Return the right answer (Err(CompiledTooBig(10485760))) without integer overflow.

Acknowledgement

The fuzzing infrastructure that found this bug was provided to and integrated into surrealdb by Google OSS Fuzz and @silvergasp, respectively.

The consensus from reporting this as a security issue was that it shouldn't be treated as one as users can avoid panics using catch_unwind. The security policy and/or documentation may be updated to reflect this.

BurntSushi added a commit that referenced this issue May 22, 2023
This fixes a bug where the calculation for the min/max length of a regex
could overflow if the counted repetitions in the pattern are big enough.
The panic only happens when debug assertions are enabled, which means
there is no panic by default in release mode.

One may wonder whether other bad things happen in release mode though,
since in that case, the arithmetic will wrap around instead. Since this
is in new code and since the regex crate doesn't yet utilize the min/max
attributes of an Hir, the wrap around in this case is completely
innocuous.

Fixes #995
@BurntSushi BurntSushi added the bug label May 22, 2023
@nathaniel-brough
Copy link
Contributor

Hmm it might be worth adding --debug-assertions to the regex fuzzer in OS's fuzz here https://github.com/google/oss-fuzz/blob/master/projects/rust-regex/build.sh#L18

I would guess that's why it shows up in the surrealdb fuzzer but not the regex fuzzer.

BurntSushi added a commit that referenced this issue May 22, 2023
This fixes a bug where the calculation for the min/max length of a regex
could overflow if the counted repetitions in the pattern are big enough.
The panic only happens when debug assertions are enabled, which means
there is no panic by default in release mode.

One may wonder whether other bad things happen in release mode though,
since in that case, the arithmetic will wrap around instead. Since this
is in new code and since the regex crate doesn't yet utilize the min/max
attributes of an Hir, the wrap around in this case is completely
innocuous.

Fixes #995
BurntSushi added a commit that referenced this issue May 22, 2023
This fixes a bug where the calculation for the min/max length of a regex
could overflow if the counted repetitions in the pattern are big enough.
The panic only happens when debug assertions are enabled, which means
there is no panic by default in release mode.

One may wonder whether other bad things happen in release mode though,
since in that case, the arithmetic will wrap around instead. Since this
is in new code and since the regex crate doesn't yet utilize the min/max
attributes of an Hir, the wrap around in this case is completely
innocuous.

Fixes #995
@BurntSushi
Copy link
Member

This is fixed in regex 1.8.2 (and regex-syntax 0.7.2) on crates.io.

@silvergasp For some reason I thought the fuzzer infrastructure enabled debug assertions automatically somehow. I'll investigate that.

@nathaniel-brough
Copy link
Contributor

nathaniel-brough commented May 22, 2023

Yeah the -O flag is equivalent to compiling in release mode (which is what you want for fuzzing, faster is better). To add debug assertions you'll need cargo fuzz build -O --debug-assertions. There is a note about it in the oss-fuzz documentation

@BurntSushi
Copy link
Member

Yeah, sorry, what I meant is that in the course of fuzzing I had thought I saw panics that were only produced by debug_assertions being enabled. But my memory is fallible.

Would it be better to add debug-assertions = true to the Cargo.toml instead of changing the OSS-fuzz invocation? The former is under my control. The latter isn't.

@nathaniel-brough
Copy link
Contributor

Up to you really, another alternative would be to open a PR with oss-fuzz. For projects that need to be able to easily edit the build.sh file on a regular basis I've actually moved that file from oss-fuzz->upstream_project. You just need to copy it in from the git repo in the Dockerfile. Happy to open a couple of PR's (1 here and 1 on oss-fuzz) to do that if you like?

@nathaniel-brough
Copy link
Contributor

^^ this would also make it pretty trivial to add new fuzzers to oss-fuzz on a regular basis.

@BurntSushi
Copy link
Member

Yeah I need to add new fuzzers to OSS-fuzz for the regex 1.9 release. PRs would be great, thank you!

crapStone added a commit to Calciumdibromid/CaBr2 that referenced this issue Jun 12, 2023
This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
| [regex](https://github.com/rust-lang/regex) | dependencies | patch | `1.8.1` -> `1.8.4` |

---

### Release Notes

<details>
<summary>rust-lang/regex</summary>

### [`v1.8.4`](https://github.com/rust-lang/regex/blob/HEAD/CHANGELOG.md#&#8203;184-2023-06-05)

[Compare Source](rust-lang/regex@1.8.3...1.8.4)

\==================
This is a patch release that fixes a bug where `(?-u:\B)` was allowed in
Unicode regexes, despite the fact that the current matching engines can report
match offsets between the code units of a single UTF-8 encoded codepoint. That
in turn means that match offsets that split a codepoint could be reported,
which in turn results in panicking when one uses them to slice a `&str`.

This bug occurred in the transition to `regex 1.8` because the underlying
syntactical error that prevented this regex from compiling was intentionally
removed. That's because `(?-u:\B)` will be permitted in Unicode regexes in
`regex 1.9`, but the matching engines will guarantee to never report match
offsets that split a codepoint. When the underlying syntactical error was
removed, no code was added to ensure that `(?-u:\B)` didn't compile in the
`regex 1.8` transition release. This release, `regex 1.8.4`, adds that code
such that `Regex::new(r"(?-u:\B)")` returns to the `regex <1.8` behavior of
not compiling. (A `bytes::Regex` can still of course compile it.)

Bug fixes:

-   [BUG #&#8203;1006](rust-lang/regex#1006):
    Fix a bug where `(?-u:\B)` was allowed in Unicode regexes, and in turn could
    lead to match offsets that split a codepoint in `&str`.

### [`v1.8.3`](https://github.com/rust-lang/regex/blob/HEAD/CHANGELOG.md#&#8203;183-2023-05-25)

[Compare Source](rust-lang/regex@1.8.2...1.8.3)

\==================
This is a patch release that fixes a bug where the regex would report a
match at every position even when it shouldn't. This could occur in a very
small subset of regexes, usually an alternation of simple literals that
have particular properties. (See the issue linked below for a more precise
description.)

Bug fixes:

-   [BUG #&#8203;999](rust-lang/regex#999):
    Fix a bug where a match at every position is erroneously reported.

### [`v1.8.2`](https://github.com/rust-lang/regex/blob/HEAD/CHANGELOG.md#&#8203;182-2023-05-22)

[Compare Source](rust-lang/regex@1.8.1...1.8.2)

\==================
This is a patch release that fixes a bug where regex compilation could panic
in debug mode for regexes with large counted repetitions. For example,
`a{2147483516}{2147483416}{5}` resulted in an integer overflow that wrapped
in release mode but panicking in debug mode. Despite the unintended wrapping
arithmetic in release mode, it didn't cause any other logical bugs since the
errant code was for new analysis that wasn't used yet.

Bug fixes:

-   [BUG #&#8203;995](rust-lang/regex#995):
    Fix a bug where regex compilation with large counted repetitions could panic.

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check this box

---

This PR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNS4xMTEuMCIsInVwZGF0ZWRJblZlciI6IjM1LjExMS4wIiwidGFyZ2V0QnJhbmNoIjoiZGV2ZWxvcCJ9-->

Co-authored-by: cabr2-bot <cabr2.help@gmail.com>
Co-authored-by: crapStone <crapstone01@gmail.com>
Reviewed-on: https://codeberg.org/Calciumdibromid/CaBr2/pulls/1912
Reviewed-by: crapStone <crapstone@noreply.codeberg.org>
Co-authored-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
Co-committed-by: Calciumdibromid Bot <cabr2_bot@noreply.codeberg.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants