PCRE.NET - Perl Compatible Regular Expressions for .NET
C# C++ PowerShell Other
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
build Update NUnit and rewrite build script using Cake Feb 18, 2018
src Update version to 0.7.0 Feb 18, 2018
.gitattributes .gitattributes for Linguist Apr 15, 2015
.gitignore Update NUnit and rewrite build script using Cake Feb 18, 2018
LICENCE Update version to 0.7.0 Feb 18, 2018
README.md Add AppVeyor Mar 25, 2018
appveyor.yml

README.md

PCRE.NET

Perl Compatible Regular Expressions for .NET

NuGet package Build status GitHub release PCRE License

PCRE.NET is a .NET wrapper for the PCRE library. The goal of this project is to bring most of PCRE's features for use from .NET applications with as little overhead as possible.

The current version on the master branch is 0.7.0, based on PCRE 10.31.

Dependencies

Features

The following regex operations are supported:

  • NFA matching and substring extraction:
    • PcreRegex.Matches
    • PcreRegex.Match
    • PcreRegex.IsMatch
  • Matched string replacement: PcreRegex.Replace
    • Callbacks: Func<PcreMatch, string>
    • Replacement strings with placeholders: $n ${name} $& $_ $` $' $+
  • String splitting on matches: PcreRegex.Split
    • Captured groups are included in the result
  • DFA matching:
    • regexInstance.Dfa.Matches
    • regexInstance.Dfa.Match
  • Partial matching (when the subject is too short to match the pattern)

Library highlights:

  • No marshaling - uses a mixed mode assembly to maximize performance
  • Support for compiled patterns (x86/x64 JIT)
  • Lazy evaluation whenever possible (for instance PcreRegex.Matches returns IEnumerable<PcreMatch>)
  • The API is similar to .NET's System.Text.RegularExpressions
  • Callout support (numbered and string-based)
  • Mark retrieval support
  • Conversion from POSIX BRE, POSIX ERE and glob patterns (PcreConvert class)

Example usage

  • Extract all words except those within parentheses:
var matches = PcreRegex.Matches("(foo) bar (baz) 42", @"\(\w+\)(*SKIP)(*FAIL)|\w+")
                       .Select(m => m.Value)
                       .ToList();
// result: "bar", "42"
  • Enclose a series of punctuation characters within angle brackets:
var result = PcreRegex.Replace("hello, world!!!", @"\p{P}+", "<$&>");
// result: "hello<,> world<!!!>"
  • Partial matching:
var regex = new PcreRegex(@"(?<=abc)123");
var match = regex.Match("xyzabc12", PcreMatchOptions.PartialSoft);
// result: match.IsPartialMatch == true
  • Validate a JSON string:
const string jsonPattern = @"
    (?(DEFINE)
        # An object is an unordered set of name/value pairs.
        (?<object> \{
            (?: (?&keyvalue) (?: , (?&keyvalue) )* )?
        (?&ws) \} )
        (?<keyvalue>
            (?&ws) (?&string) (?&ws) : (?&value)
        )

        # An array is an ordered collection of values.
        (?<array> \[
            (?: (?&value) (?: , (?&value) )* )?
        (?&ws) \] )

        # A value can be a string in double quotes, or a number,
        # or true or false or null, or an object or an array.
        (?<value> (?&ws)
            (?: (?&string) | (?&number) | (?&object) | (?&array) | true | false | null )
        )

        # A string is a sequence of zero or more Unicode characters,
        # wrapped in double quotes, using backslash escapes.
        (?<string>
            "" (?: [^""\\\p{Cc}]++ | \\u[0-9A-Fa-f]{4} | \\ [""\\/bfnrt] )* ""
            # \p{Cc} matches control characters
        )

        # A number is very much like a C or Java number, except that the octal
        # and hexadecimal formats are not used.
        (?<number>
            -? (?: 0 | [1-9][0-9]* ) (?: \. [0-9]+ )? (?: [Ee] [-+]? [0-9]+ )?
        )

        # Whitespace
        (?<ws> \s*+ )
    )

    \A (?&ws) (?&object) (?&ws) \z
";

var regex = new PcreRegex(jsonPattern, PcreOptions.IgnorePatternWhitespace);

const string subject = @"{
    ""hello"": ""world"",
    ""numbers"": [4, 8, 15, 16, 23, 42],
    ""foo"": null,
    ""bar"": -2.42e+17,
    ""baz"": true
}";

var isValidJson = regex.IsMatch(subject);
// result: true

To do

  • Implement a less verbose managed/native interface
  • Include more tests from the PCRE test suite
  • Documentation
  • XML doc comments