Add tighter bound to range check for matching Regex char classes #67133

Merged (1 commit) on Mar 25, 2022

stephentoub
Member

When we emit a bitmap lookup for character classes containing only ASCII characters, we currently bound the check by 128, e.g.

```C#
if (ch < 128 && lookupTable[...])
```

but we can easily lower that 128 to the actual exclusive upper bound of the char set. Doing so lets the range check alone reject a larger set of characters, so we don't need to hit the lookup table for them.

(We could also shrink the lookup table itself, but doing so would only save a few bytes, and it didn't seem worth the complexity right now. We could also add a lower range check, but that would be an additional check to execute, whereas this change just tightens an existing check that's already required for correctness.)
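
To make the tightened bound concrete, here is a minimal sketch of the idea. It is not the code the regex compiler or source generator actually emits: `AsciiSetSketch`, its method names, and the `bool[]` table are hypothetical stand-ins for illustration, and the real lookup table has a different representation.

```C#
using System;

// Hypothetical helper for illustration only; not part of System.Text.RegularExpressions.
public static class AsciiSetSketch
{
    // Builds a 128-entry table with true at each character in the ASCII-only set.
    public static bool[] BuildTable(string asciiChars)
    {
        var table = new bool[128];
        foreach (char c in asciiChars)
        {
            if (c >= 128)
            {
                throw new ArgumentException("Set must contain only ASCII characters.", nameof(asciiChars));
            }
            table[c] = true;
        }
        return table;
    }

    // Exclusive upper bound of the set: one past its largest character (0 for an empty set).
    public static int ExclusiveUpperBound(string asciiChars)
    {
        int bound = 0;
        foreach (char c in asciiChars)
        {
            bound = Math.Max(bound, c + 1);
        }
        return bound;
    }

    // Old shape of the check: every ASCII character goes on to the table lookup.
    public static bool MatchesWithBound128(char ch, bool[] table) =>
        ch < 128 && table[ch];

    // New shape of the check: characters at or above the set's own bound
    // are rejected by the comparison, before any table access.
    public static bool MatchesWithTightBound(char ch, bool[] table, int exclusiveUpperBound) =>
        ch < exclusiveUpperBound && table[ch];
}
```

For a set such as `[0-9a-f]`, the largest member is `'f'` (102), so the exclusive upper bound is 103. A character like `'z'` (122) still passes `ch < 128` and has to consult the table, but it fails `ch < 103` immediately. Both forms return the same result for every input; only the number of table lookups changes.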

Contributes to #67056

ghost commented Mar 25, 2022

Tagging subscribers to this area: @dotnet/area-system-text-regularexpressions
See info in area-owners.md if you want to be subscribed.

Issue Details


Author: stephentoub
Assignees: -
Labels: area-System.Text.RegularExpressions, tenet-performance

Milestone: 7.0.0

stephentoub merged commit 55e012e into dotnet:main Mar 25, 2022
stephentoub deleted the regexboundsimprove branch March 25, 2022 12:40
radekdoulik pushed a commit to radekdoulik/runtime that referenced this pull request Mar 30, 2022
…net#67133)

ghost locked as resolved and limited conversation to collaborators Apr 24, 2022

AndyAyersMS
Member

Similar regression report here (ubuntu x64, 3/25); it looks like whatever caused it was subsequently fixed.

dotnet/perf-autofiling-issues#4280
[Image: newplot - 2022-04-27T161827 065]
