re: avoid per-call mark-array allocation for patterns with no capturing groups #150717

@gaborbernat

Feature or enhancement

Every match, search, or fullmatch call on a pattern with no capturing groups allocates capture-group bookkeeping (the mark array), then frees it without ever reading it. Group-less patterns are common in validation and scanning code, so this wasted allocation runs often.

Examples:

  • Checking the format of millions of records during an import, e.g. re.match(r"\d{4}-\d{2}-\d{2}", value).
  • Scanning each log line with re.search(r"ERROR|WARN", line).
  • Routers and frameworks testing many small patterns per request.

Proposed change: skip that allocation when a pattern has no capturing groups. Patterns with groups stay untouched, and results are identical.
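To make "no capturing groups" concrete, here is a minimal sketch using only the public `re` API. It does not show the proposed change itself, which would live in the C matching engine. The helper name `has_capturing_groups` and the sample inputs are hypothetical, chosen for illustration.

```python
import re


def has_capturing_groups(pattern: str) -> bool:
    """Hypothetical helper: True if the compiled pattern defines any capture group."""
    # Pattern.groups is the number of capturing groups in the pattern.
    return re.compile(pattern).groups > 0


date_re = re.compile(r"\d{4}-\d{2}-\d{2}")           # no groups
level_re = re.compile(r"ERROR|WARN")                 # no groups
grouped_re = re.compile(r"(\d{4})-(\d{2})-(\d{2})")  # three groups

# Group-less patterns still return an ordinary match object;
# only the whole match (group 0) is available, and groups() is empty.
m = date_re.match("2024-05-17")
print(date_re.groups, m.groups())  # 0 ()
print(grouped_re.groups)           # 3
```

Non-capturing groups such as `(?:...)` do not count toward `Pattern.groups`, so a pattern that uses only those would presumably also fall under the proposed fast path.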

On a local optimized build, re.match and re.search run 11 to 13 percent faster for group-less patterns on short inputs, with no change for patterns that use groups.
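One rough way to compare a group-less pattern against a grouped one on short inputs is a `timeit` micro-benchmark like the sketch below. This is not the benchmark behind the numbers above. The helper `time_match`, the patterns, the input string, and the iteration counts are all assumptions for illustration, and absolute timings will vary by machine and build.

```python
import re
import timeit


def time_match(pattern: str, text: str, number: int = 100_000, repeat: int = 3) -> float:
    """Hypothetical helper: best-of-`repeat` seconds per Pattern.match call."""
    match = re.compile(pattern).match  # compile once, time only the match call
    best = min(timeit.repeat(lambda: match(text), number=number, repeat=repeat))
    return best / number


if __name__ == "__main__":
    text = "2024-05-17"
    groupless = time_match(r"\d{4}-\d{2}-\d{2}", text)
    grouped = time_match(r"(\d{4})-(\d{2})-(\d{2})", text)
    print(f"group-less: {groupless * 1e9:.0f} ns/call")
    print(f"grouped:    {grouped * 1e9:.0f} ns/call")
```

Running the same script on builds with and without the change, and comparing the group-less figure between them, would be the relevant measurement. The grouped figure serves as a control that should stay flat.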
