[Analyzer]: Flag non-productive uses of RegexOptions.NonBacktracking. #114831

@teo-tsirpanis

Description

I have seen code like this in an online course:

partial class Regexes
{
    [GeneratedRegex("[^a-z]", RegexOptions.NonBacktracking)]
    public static partial Regex MyRegex { get; }
}

The regex uses the non-backtracking engine, and the course justified this as a way to guarantee linear match times. However, the pattern is so trivial that the default backtracking engine also matches it in linear time. Worse, specifying NonBacktracking causes the source generator to emit just a cached Regex singleton, which negates most of the benefits of source generation.

I propose an analyzer that tells the user when RegexOptions.NonBacktracking is applied to a regex that is not susceptible to catastrophic backtracking. In that case the option brings no benefit and might actually decrease performance.
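To make the proposal concrete, here is a minimal sketch of the declaration the analyzer would flag next to the one it would presumably steer the user toward. The property names `Flagged` and `Suggested` are made up for illustration, and the exact fix the analyzer would offer is an assumption on my part.

```csharp
using System.Text.RegularExpressions;

partial class Regexes
{
    // Would be flagged: "[^a-z]" has no loops or alternations, so the default
    // engine cannot backtrack catastrophically on it, and NonBacktracking
    // only costs us the generated matching code.
    [GeneratedRegex("[^a-z]", RegexOptions.NonBacktracking)]
    public static partial Regex Flagged { get; }

    // Assumed fix: drop the option and let the source generator emit
    // a full implementation for the pattern.
    [GeneratedRegex("[^a-z]")]
    public static partial Regex Suggested { get; }
}
```

Both properties give the same answers for `IsMatch`; only the emitted implementation differs.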

Questions

  • How accurately can we detect patterns that are guaranteed to not exhibit catastrophic backtracking with the regular engine?
  • Should we do this only in the source generator, or also in the Regex constructor with a constant pattern and options?
  • If the user later changes their regex so that it is no longer guaranteed to avoid catastrophic backtracking, they might not realize it and could become exposed to vulnerabilities. This might also necessitate another analyzer that suggests adding NonBacktracking or a timeout for such regexes, which I'm going to propose depending on the answer to the first question.
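For the first question, a deliberately conservative check might be a reasonable starting point. The sketch below is only an illustration: `BacktrackingHeuristic` and `IsTriviallySafe` are hypothetical names, not .NET APIs, and a real analyzer would more likely inspect the parsed regex tree than scan the pattern text. It answers "safe" only when the pattern has no quantifier or alternation outside a character class, so it misses many safe patterns but should not accept a risky one.

```csharp
using System;

static class BacktrackingHeuristic
{
    // Hypothetical helper. Returns true only when the pattern contains no
    // quantifier ('*', '+', '?', '{') and no alternation ('|') outside a
    // character class. Without loops or alternations the backtracking engine
    // has nothing to retry, so each start position costs bounded work.
    public static bool IsTriviallySafe(string pattern)
    {
        bool inCharClass = false;
        for (int i = 0; i < pattern.Length; i++)
        {
            char c = pattern[i];
            if (c == '\\')
            {
                i++; // Skip the escaped character, e.g. "\+" or "\]".
                continue;
            }
            if (inCharClass)
            {
                // Inside [...] every character is a literal until ']'.
                if (c == ']') inCharClass = false;
                continue;
            }
            switch (c)
            {
                case '[':
                    inCharClass = true;
                    break;
                case '*':
                case '+':
                case '?': // Also rejects (?=...), (?<name>...), lazy quantifiers.
                case '{':
                case '|':
                    return false;
            }
        }
        return true;
    }
}
```

With this sketch, `IsTriviallySafe("[^a-z]")` returns true, so the pattern from the course would be reported, while `IsTriviallySafe("(a+)+$")` returns false and would be left alone. Anything involving a loop, even a harmless one like `a*b`, is treated as "unknown", which is where the real accuracy question starts.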
