Introduction

almson-regex is a simple library for writing readable regular expressions.

The goals of this library are:

Descriptive syntax
Good documentation
Support of all Java regex features

The following are not goals:

Type safety (keep it simple, use Strings!)
Extreme brevity (if you want super-compact, illegible regular expressions, write them the old way!)
Letting users avoid learning regular expressions (fundamentally, you're still writing and reading regexes, but the descriptive names and good documentation make it much easier!)

almson-regex is based on string operations, so you can use it for all, some, or just parts of your regular expressions. almson-regex is compiled with Java 8 and supports all Java 17 regex features, such as named capturing groups and grapheme cluster matchers.
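To illustrate what "based on string operations" can mean in practice, here is a minimal sketch. The class `StringCombinatorSketch` and its three helpers are hypothetical and are not the library's source code; they only show that a builder of this kind can be a plain function from `String` to `String`, which is why its output mixes freely with hand-written fragments.

```java
import java.util.regex.Pattern;

// Illustrative only: these helpers are NOT almson-regex's source code.
// They show how regex builders can be plain String -> String functions.
class StringCombinatorSketch {

    // Wrap alternatives in a non-capturing group so "|" cannot leak outward.
    static String either(String... alternatives) {
        return "(?:" + String.join("|", alternatives) + ")";
    }

    // Group the operand so "+" applies to all of it, not just its last character.
    static String oneOrMore(String regex) {
        return "(?:" + regex + ")+";
    }

    // Quote a literal so characters such as "." lose their special meaning.
    static String text(String literal) {
        return Pattern.quote(literal);
    }

    public static void main(String[] args) {
        // Builder output and hand-written fragments ("^", "\\d", "$") combine with "+".
        String regex = "^" + oneOrMore(either(text("ab"), "\\d")) + "$";
        System.out.println(regex);
        System.out.println(Pattern.compile(regex).matcher("ab7ab").matches());
    }
}
```

Since every helper returns an ordinary string, nothing stops you from building one part of a pattern this way and writing the rest by hand.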

The documentation for the library doesn't replace knowledge of how regular expressions work. However, this library succeeds in making your regular expressions easy to read for those who do not have expert knowledge. For the best reference on Java regular expressions, see the java.util.regex.Pattern documentation.

Installation with Maven

<dependency>
    <groupId>net.almson</groupId>
    <artifactId>almson-regex</artifactId>
    <version>1.5.1</version>
</dependency>

Examples

Match leading or trailing whitespace:

String regex = "^[ \t]+|[ \t]+$";

becomes:

String regex = 
    either(sequence(START_BOUNDARY, oneOrMore(HORIZONTAL_WHITESPACE)),
           sequence(oneOrMore(HORIZONTAL_WHITESPACE), END_BOUNDARY));

but since sequence simply does string concatenation, we can also write:

either(START_BOUNDARY + oneOrMore(HORIZONTAL_WHITESPACE),
       oneOrMore(HORIZONTAL_WHITESPACE) + END_BOUNDARY)
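Whichever form you write, the result is an ordinary regex string that goes straight into java.util.regex. The sketch below uses the hand-written pattern so that it runs without the library on the classpath. The class `TrimExample` and the method `trimHorizontal` are names made up for this illustration, and the string the builder calls produce is only assumed to behave the same way for spaces and tabs; it may not be character-for-character identical.

```java
import java.util.regex.Pattern;

class TrimExample {
    // Hand-written form of the example: spaces/tabs at the start or end of the input.
    static final Pattern EDGE_WHITESPACE = Pattern.compile("^[ \t]+|[ \t]+$");

    // Removes leading and trailing spaces and tabs; inner whitespace is kept.
    static String trimHorizontal(String line) {
        return EDGE_WHITESPACE.matcher(line).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println("[" + trimHorizontal("  \thello world \t") + "]");
    }
}
```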

Match an IP address (simple version):

"\\b(\\d{1,3}\\.){3}\\d{1,3}\\b"

becomes:

WORD_BOUNDARY +
exactly(3, between(1, 3, DIGIT) + text(".")) +
between(1, 3, DIGIT) +
WORD_BOUNDARY
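As a usage sketch, the pattern can be used to pull addresses out of free text. The code uses the hand-written string so it is runnable on its own; `IpFinder` and `findAll` are hypothetical names. Note that this simple version does not range-check the octets, so a string such as 999.1.1.1 is also accepted.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IpFinder {
    // Four groups of 1-3 digits separated by dots, delimited by word boundaries.
    static final Pattern IP = Pattern.compile("\\b(\\d{1,3}\\.){3}\\d{1,3}\\b");

    // Collects every substring of the input that matches the simple IP pattern.
    static List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = IP.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(findAll("ping 192.168.0.1 then 10.0.0.255 ok"));
    }
}
```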

Match an email address (simple version):

"\\b(?<user>[a-zA-Z0-9._%+-]+)@(?<domain>[a-zA-Z0-9.-]+\\.\\p{L}{2,})\\b"

becomes:

WORD_BOUNDARY +
namedGroup("user",
           oneOrMore(charclassUnion(LETTER, DIGIT, charclass('.', '_', '%', '+', '-')))) +
text("@") +
namedGroup("domain",
           oneOrMore(charclassUnion(LETTER, DIGIT, charclass('.', '-'))) +
           text(".") +
           atLeast(2, LETTER)) +
WORD_BOUNDARY
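The named groups are what make this pattern convenient to consume: `Matcher.group(String)` returns each part by name. The following sketch uses a hand-written pattern with the same two group names so it runs without the library; `EmailGroups` and `firstAddress` are hypothetical names, and the ASCII letter ranges are an assumption that may be narrower than what the builder's `LETTER` matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EmailGroups {
    // Simple email pattern with named groups "user" and "domain".
    static final Pattern EMAIL = Pattern.compile(
        "\\b(?<user>[a-zA-Z0-9._%+-]+)@(?<domain>[a-zA-Z0-9.-]+\\.\\p{L}{2,})\\b");

    // Returns {user, domain} for the first address found, or null if there is none.
    static String[] firstAddress(String text) {
        Matcher m = EMAIL.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group("user"), m.group("domain") };
    }

    public static void main(String[] args) {
        String[] parts = firstAddress("Contact: jane.doe@example.com.");
        System.out.println(parts[0] + " at " + parts[1]);
    }
}
```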

Select consecutive duplicates from a comma-delimited list:

"(?<=,|^)([^,]*)(,\\1)+(?=,|$)"

becomes:

precededBy(either(START_BOUNDARY, text(","))) +
group(zeroOrMore(charclassComplement(charclass(',')))) +
oneOrMore(text(",") + backreference(1)) +
followedBy(either(text(","), END_BOUNDARY))
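One way to use this pattern is to collapse each run of duplicates down to a single item by replacing the whole match with group 1. The sketch below uses the hand-written string so it runs on its own; `DuplicateRuns`, `findRuns`, and `collapse` are names invented for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DuplicateRuns {
    // An item, followed by one or more ",item" repeats, bounded by commas or the input edges.
    static final Pattern RUN = Pattern.compile("(?<=,|^)([^,]*)(,\\1)+(?=,|$)");

    // Lists each run of consecutive duplicates, e.g. "b,b".
    static List<String> findRuns(String csvLine) {
        List<String> runs = new ArrayList<>();
        Matcher m = RUN.matcher(csvLine);
        while (m.find()) {
            runs.add(m.group());
        }
        return runs;
    }

    // Replaces each run with its first item (capturing group 1).
    static String collapse(String csvLine) {
        return RUN.matcher(csvLine).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(findRuns("a,b,b,c,c,c,d"));
        System.out.println(collapse("a,b,b,c,c,c,d"));
    }
}
```

The lookbehind and lookahead keep the match aligned to whole items, so an input such as "ab,b" is left unchanged.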
