investigate optimizations for patterns like (?m)^\w+$ #1301

@BurntSushi

In particular, PCRE2's JIT seems to optimize this particular pattern quite well, even when its UCP mode is enabled. See BurntSushi/ripgrep#3167 for an example.

Note that if you disable Unicode mode for Rust regex and have the perf-dfa-full crate feature enabled, then performance for (?m)^\w+$ improves quite a bit. This is because the pattern becomes so small that a full DFA is built. This in turn permits state acceleration, which lets it skip around using memchr on \n.
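To make the "skip around on \n" idea concrete, here is a rough, hand-written sketch of a line-oriented scan that finds the same matches as the pattern with Unicode mode off (written `(?m-u)^\w+$` using the inline flag). It is not code from the regex crate, PCRE2, or any DFA; the names `is_word_byte` and `word_only_lines` are made up for illustration, and a plain `position` call stands in for memchr so the example needs only the standard library.

```rust
/// ASCII-only `\w`, which is what `\w` means once Unicode mode is off.
/// (Hypothetical helper, not part of any library.)
fn is_word_byte(b: u8) -> bool {
    b.is_ascii_alphanumeric() || b == b'_'
}

/// Returns the (start, end) byte offsets of every line that consists
/// entirely of one or more ASCII word bytes. This is a hand-rolled
/// stand-in for `(?m-u)^\w+$`, assuming `\n` is the only line terminator.
fn word_only_lines(haystack: &[u8]) -> Vec<(usize, usize)> {
    let mut matches = Vec::new();
    let mut start = 0;
    while start < haystack.len() {
        // Consume word bytes from the start of the current line.
        let mut end = start;
        while end < haystack.len() && is_word_byte(haystack[end]) {
            end += 1;
        }
        // `$` holds at the end of the haystack or just before a `\n`.
        let at_line_end = end == haystack.len() || haystack[end] == b'\n';
        if end > start && at_line_end {
            matches.push((start, end));
        }
        // Nothing else on this line can match, so jump straight to the
        // byte after the next `\n`. A real implementation would likely use
        // memchr here; `position` is a std-only substitute.
        match haystack[end..].iter().position(|&b| b == b'\n') {
            Some(i) => start = end + i + 1,
            None => break,
        }
    }
    matches
}
```

For instance, calling `word_only_lines(b"foo\nbar baz\nqux_1\n\n!")` reports the lines `foo` and `qux_1` and skips the rest. The point of the sketch is only that once a line is known not to match, the search can jump directly to the next `\n` instead of stepping through every byte.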
