Kexpresso ☕

A fluent Kotlin DSL that makes regular expressions readable.


Why kexpresso?

Raw regular expressions are write-only. A week after authoring one, even the writer struggles to remember what it does:

// Raw regex — what does this match?
val emailRegex = Regex("[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}")

With kexpresso the same constraint reads like English:

// kexpresso — self-documenting and composable
val emailPattern = kexpresso {
    email()
}

Or, for a richer pattern that you build up incrementally:

val strictEmail = kexpresso {
    startOfText()
    email()
    endOfText()
}
strictEmail.matches("barista@coffee.shop") // true
strictEmail.matches("not an email")        // false

Two equivalent entry points: kexpresso { } (top-level function) and Kexpresso.pattern { } (object-oriented style) produce the same KexpressoPattern. See Object-oriented entry point below.

Benefits at a glance:

  • Readable — the DSL reads top-to-bottom like a description of what you want to match.
  • Type-safe — the compiler catches typos that a raw string never would.
  • Composable — build complex patterns from simple named primitives.
  • Zero runtime overhead — the DSL compiles to a plain Regex at construction time (measured: 0 % match-time overhead vs raw Regex — see benchmarks).
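
To make the "compiles to a plain Regex at construction time" point concrete, here is a minimal sketch of how a builder-style DSL can assemble a regex string and compile it exactly once. MiniBuilder and miniPattern are hypothetical names used for illustration only; this is not kexpresso's implementation, just the general technique under the assumption that each DSL call appends a regex fragment.

```kotlin
// Hypothetical mini-DSL, for illustration only (not kexpresso's source).
class MiniBuilder {
    private val parts = StringBuilder()

    fun digit() { parts.append("\\d") }
    fun letter() { parts.append("[a-zA-Z]") }

    // Regex.escape quotes the text so metacharacters match literally.
    fun literal(text: String) { parts.append(Regex.escape(text)) }

    // Wrap the nested fragment in a non-capturing group, then quantify it.
    fun oneOrMore(block: MiniBuilder.() -> Unit) {
        parts.append("(?:").append(MiniBuilder().apply(block).source()).append(")+")
    }

    fun source(): String = parts.toString()
}

// All builder work happens here, once; callers only ever hold a plain Regex.
fun miniPattern(block: MiniBuilder.() -> Unit): Regex =
    Regex(MiniBuilder().apply(block).source())

val orderId = miniPattern {
    literal("#")
    oneOrMore { digit() }
}
// orderId.matches("#42") -> true
// orderId.matches("#4x") -> false
```

Because the builder runs only while the pattern is constructed, matching afterwards goes through the same Regex object a hand-written pattern would use.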

Is kexpresso right for your case? We're honest about it: it's great for complex, maintained patterns and a poor fit for trivial ones. Read When to use kexpresso — and when not to before adopting. Where we're headed: the Roadmap.


Try it in 30 seconds

Clone and run the guided-tour sample — no extra setup, no credentials:

git clone https://github.com/elzinko/kexpresso && cd kexpresso && ./gradlew :samples:run

The console output walks you through every headline feature: building patterns, domain helpers, typed captures, describe(), reverse-engineering a raw regex with Kexpresso.from(), and ReDoS analysis.

Once io.github.elzinko:kexpresso lands in your own project (see Install below), the same capabilities are one dependency away.


Install

Kexpresso is published to Maven Central — no repository configuration and no token required. Just add the dependency.

groupId is now io.github.elzinko (it was com.github.elzinko on JitPack/GitHub Packages). Maven Central requires the io.github.* namespace.

Gradle (Kotlin DSL)

// build.gradle.kts
dependencies {
    implementation("io.github.elzinko:kexpresso:0.9.0")
}

mavenCentral() is in your repositories by default in most projects; add it if needed:

repositories {
    mavenCentral()
}

Maven

<dependency>
    <groupId>io.github.elzinko</groupId>
    <artifactId>kexpresso</artifactId>
    <version>0.9.0</version>
</dependency>

Alternative repositories

Maven Central is the recommended source. Two alternatives remain available:

  • JitPack — builds on demand from a git tag/commit; coordinate com.github.elzinko:kexpresso:<tag>. Serves jvm, js, wasmJs, linuxX64, mingwX64 only (no Apple/iOS — JitPack builds on Linux). Add maven { url = uri("https://jitpack.io") } to your repositories.
  • GitHub Packages — all targets incl. Apple/iOS; coordinate io.github.elzinko:kexpresso:0.9.0. Requires a GitHub token (a GitHub limitation, even for public packages) — see Where the artifacts are hosted.

Multiplatform

Kexpresso is a Kotlin Multiplatform library. The full DSL is written in commonMain, so the builder, describe(), analyze(), captures, and the reverse (regex → DSL) API are available on every supported target:

| Target | Status |
| --- | --- |
| JVM | ✅ published |
| JS (IR, Node.js) | ✅ published |
| Wasm (wasmJs, Node.js) | ✅ published |
| Native (linuxX64, mingwX64) | ✅ published |
| Native (macosX64, macosArm64) | ✅ published |
| Native (iosArm64, iosX64, iosSimulatorArm64) | ✅ published |

Built per host; published from macOS. Kotlin/Native targets only cross-compile from a capable host, so the build registers them conditionally: the Linux CI gate builds linuxX64 + mingwX64 (fast), while macOS — the most capable host — builds every target. The release therefore runs on a macos-latest runner and publishes the complete, consistent multiplatform metadata (JVM, JS, Wasm, Linux, Windows, macOS, iOS) from that single host, so a consumer resolving the root module sees every variant. A dedicated Apple & Native workflow exercises the Apple/iOS targets on every PR.

(Building the Apple targets locally requires a full Xcode install — a Command-Line-Tools-only macOS box still builds jvm/js/wasmJs and skips the Apple/Native targets with a warning.)

For a Gradle Multiplatform consumer, the dependency resolves automatically per target via Gradle module metadata:

kotlin {
    sourceSets {
        commonMain.dependencies {
            implementation("io.github.elzinko:kexpresso:0.9.0")
        }
    }
}

A plain-Maven (JVM-only) consumer must use the target-suffixed coordinate instead:

<dependency>
    <groupId>io.github.elzinko</groupId>
    <artifactId>kexpresso-jvm</artifactId>
    <version>0.9.0</version>
</dependency>

Breaking change (since the multiplatform release): artifact coordinates now carry a target suffix. Gradle resolves io.github.elzinko:kexpresso:0.9.0 to the right target automatically through Gradle metadata, but tools that ignore Gradle metadata (e.g. plain Maven) must reference kexpresso-jvm directly.

Where the artifacts are hosted (important for Apple/iOS)

Not every target is available from every repository — choose your source accordingly:

| Repository | Targets served | Auth |
| --- | --- | --- |
| Maven Central (the Install section) | all targets, incl. macosX64/macosArm64/iosArm64/iosX64/iosSimulatorArm64 | none |
| GitHub Packages | all targets, incl. macosX64/macosArm64/iosArm64/iosX64/iosSimulatorArm64 | a GitHub token |
| JitPack | jvm, js, wasmJs, linuxX64, mingwX64 | none |

Maven Central is the recommended source for every target (including Apple/iOS) with no authentication — the release runs on macOS and publishes the complete, signed multiplatform metadata. GitHub Packages serves the same full set but requires a token. JitPack builds on demand on Linux, so it can never produce the Apple/iOS artifacts.

To consume from GitHub Packages instead (a personal-access token with read:packages is required even for public packages — a GitHub limitation):

// settings.gradle.kts
dependencyResolutionManagement {
    repositories {
        maven("https://maven.pkg.github.com/elzinko/kexpresso") {
            credentials {
                username = providers.gradleProperty("gpr.user").orNull ?: System.getenv("GITHUB_ACTOR")
                password = providers.gradleProperty("gpr.key").orNull ?: System.getenv("GITHUB_TOKEN")
            }
        }
    }
}

Honest platform caveats (JS / Wasm / Native)

The DSL builds the same regex string on every platform, but each non-JVM target uses its own regex engine rather than the JVM's java.util.regex (PCRE-like) engine, so the supported feature set narrows the further you get from the JVM. The portable common API — primitives, quantifiers, character classes, alternation, simple/named groups, named & numeric backreferences, lookahead, \b, literal escaping, describe(), toKexpressoCode(), captures, and analyze() — works everywhere. Some JVM-flavoured constructs remain JVM-only at runtime: they build fine but throw when compiled to a Regex on the smaller engines.

  • JS (ECMAScript engine): the most restrictive. startOfText() / endOfText() (the \A / \z anchors) are not valid ECMAScript — use startOfLine() / endOfLine() (^ / $) for portable code. Atomic groups (?>…), possessive quantifiers (a++, a*+), and some lookbehind forms are also JVM-only and only ever appear via raw(...) or Kexpresso.from(...).
  • Wasm (wasmJs): runs on the same ECMAScript engine via the host; same caveats as JS.
  • Native (kotlin.text.Regex): ships a capable pure-Kotlin engine that is actually a superset of ECMAScript here — it accepts the \A / \z / \Z / \G anchors, named groups, named/numeric backreferences, lookahead, lookbehind, and atomic groups. Even so, treat the JVM as the reference engine; exotic PCRE-only constructs reachable via raw(...) may still differ.
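
One practical consequence of swapping \A / \z for ^ / $ is worth seeing in isolation. The sketch below uses plain kotlin.text.Regex rather than the kexpresso DSL, and the expected results in the comments describe the JVM engine: there, a non-multiline $ also matches just before a final line terminator, while \z does not. The variable names are illustrative only.

```kotlin
// Plain kotlin.text.Regex; no kexpresso API is used here.
val lineAnchored = Regex("^Espresso\$")     // the ^ / $ anchors (startOfLine / endOfLine)
val textAnchored = Regex("\\AEspresso\\z")  // the \A / \z anchors (startOfText / endOfText)

val withTrailingNewline = "Espresso\n"

// On the JVM, $ tolerates one trailing line terminator; \z does not.
val lineAnchoredHit = lineAnchored.containsMatchIn(withTrailingNewline) // true on the JVM
val textAnchoredHit = textAnchored.containsMatchIn(withTrailingNewline) // false
```

If a trailing newline must be rejected in portable code, it is probably safest to trim or validate the input before matching instead of relying on the anchor alone.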

The whole commonTest portable suite (31 tests) passes identically on JVM, JS, Wasm, and the built native targets. JVM-only constructs are exercised in the JVM-only jvmTest suite.

Literal escaping is portable: literal("a.b") renders as a\.b (a per-character escaper) rather than the JVM-only \Qa.b\E; matching behaviour is identical everywhere.
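
As a rough illustration of what a per-character escaper does, the sketch below backslash-escapes each regex metacharacter individually. escapeLiteral and regexMetaChars are hypothetical names, and the exact set of characters kexpresso escapes is an assumption here; the point is only that the output avoids the \Q…\E quoting syntax.

```kotlin
// Hypothetical helper: escape every regex metacharacter, one character at a time.
// The metacharacter set below is an assumption, not kexpresso's actual list.
val regexMetaChars = "\\^\$.|?*+()[]{}"

fun escapeLiteral(text: String): String = buildString {
    for (c in text) {
        if (c in regexMetaChars) append('\\')
        append(c)
    }
}

val escaped = escapeLiteral("a.b")             // a\.b
val matchesItself = Regex(escaped).matches("a.b") // true
val matchesOther = Regex(escaped).matches("axb")  // false: the dot is now literal
```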

toPattern() (conversion to java.util.regex.Pattern) is a JVM-only extension and is not available on JS, Wasm, or Native.


Quickstart

1 — Compile a pattern and test a full match

val drinkName = kexpresso {
    uppercaseLetter()
    oneOrMore { letter() }
}

drinkName.matches("Espresso")   // true
drinkName.matches("espresso")   // false  (no capital first letter)
drinkName.matches("Espresso42") // false  (digit at the end)

2 — Extract all words from a coffee order

val wordPattern = kexpresso { word() }

val order = "Espresso Latte Cappuccino"
val drinks = wordPattern.findAll(order).map { it.value }.toList()
// ["Espresso", "Latte", "Cappuccino"]

3 — Validate an email address

val emailValidator = kexpresso {
    startOfText()
    email()
    endOfText()
}

emailValidator.matches("barista@coffee.shop")       // true
emailValidator.matches("barista@coffee.shop extra") // false
emailValidator.matches("not-an-email")              // false

4 — Match a well-formed sentence

val sentencePattern = kexpresso { sentence() }

sentencePattern.matches("Espresso is perfect!")       // true
sentencePattern.matches("espresso is lowercase.")     // false
sentencePattern.matches("No punctuation at the end")  // false

DSL reference

Primitives

| Method | Regex produced | Notes |
| --- | --- | --- |
| literal(text) | escaped text (e.g. a\.b) | Escapes each regex metacharacter |
| char(c) | escaped char | Escapes metacharacters |
| digit() | \d | Decimal digit 0–9 |
| nonDigit() | \D | Any non-digit |
| whitespace() | \s | Space, tab, newline, … |
| nonWhitespace() | \S | Any non-whitespace |
| wordChar() | \w | Letter, digit, or _ |
| nonWordChar() | \W | Not a word character |
| anyChar() | . | Any character except newline |
| letter() | [a-zA-Z] | ASCII letters only |
| uppercaseLetter() | [A-Z] | ASCII uppercase letters |
| lowercaseLetter() | [a-z] | ASCII lowercase letters |
| alphanumeric() | [a-zA-Z0-9] | ASCII letter or digit |
| tab() | \t | Horizontal tab |
| newline() | \n | Newline |
| carriageReturn() | \r | Carriage return |
| nonWordBoundary() | \B | Non-word boundary position |
| endPunctuation() | [.!?] | Sentence-ending punctuation |

Character classes

| Method | Regex produced | Notes |
| --- | --- | --- |
| anyOf(chars) | [chars] | One character from the given set; metacharacters escaped |
| noneOf(chars) | [^chars] | One character NOT in the given set |
| inRange(from, to) | [from-to] | One character in the inclusive range |

Anchors

| Method | Regex produced | Notes |
| --- | --- | --- |
| startOfLine() | ^ | Use with RegexOption.MULTILINE for per-line anchoring |
| endOfLine() | $ | Use with RegexOption.MULTILINE for per-line anchoring |
| startOfText() | \A | Anchors to the very beginning of the input |
| endOfText() | \z | Anchors to the very end of the input |
| wordBoundary() | \b | Transition between word and non-word character |

Quantifiers

All quantifiers accept an optional greedy: Boolean parameter (default true). Pass greedy = false to make the quantifier lazy (matches as few characters as possible).

| Method | Regex produced | Notes |
| --- | --- | --- |
| optional { } | (?:...)? | Zero or one occurrence |
| zeroOrMore { } | (?:...)* | Zero or more occurrences |
| oneOrMore { } | (?:...)+ | One or more occurrences |
| exactly(n) { } | (?:...){n} | Exactly n occurrences |
| atLeast(n) { } | (?:...){n,} | At least n occurrences |
| between(min, max) { } | (?:...){min,max} | Between min and max occurrences (inclusive) |

Lazy example:

val lazyDigits = kexpresso {
    startOfText()
    oneOrMore(greedy = false) { digit() }
    endOfText()
}
lazyDigits.matches("42") // true
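
Because the example above anchors both ends, greedy and lazy quantifiers give the same answer there. The difference only shows in an unanchored search, which the following sketch demonstrates with plain kotlin.text.Regex. It assumes that greedy = false renders as the standard lazy suffix (+?); the variable names are illustrative.

```kotlin
// Plain Regex equivalents; the "+?" rendering for greedy = false is an assumption.
val greedyRun = Regex("(?:\\d)+")   // greedy: take as many digits as possible
val lazyRun = Regex("(?:\\d)+?")    // lazy: take as few digits as possible

val greedyHit = greedyRun.find("2026-06-03")?.value // "2026"
val lazyHit = lazyRun.find("2026-06-03")?.value     // "2"
```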

Grouping and alternation

| Method | Regex produced | Notes |
| --- | --- | --- |
| group { } | (?:...) | Non-capturing group |
| capture { } | (...) | Numbered capturing group |
| capture("name") { } | (?<name>...) | Named capturing group |
| oneOf({ }, { }, …) | (?:a\|b\|…) | Alternation: matches any one of the given patterns |

Named capture example:

val orderPattern = kexpresso { literal(": "); capture("drink") { word() } }

val result = orderPattern.find("Order: Cappuccino please")
result?.groups?.get("drink")?.value // "Cappuccino"

Alternation example:

val drinkMenu = kexpresso {
    oneOf(
        { literal("Espresso") },
        { literal("Latte") },
        { literal("Cappuccino") },
    )
}

drinkMenu.matches("Latte")     // true
drinkMenu.matches("Americano") // false

Lookarounds

Lookarounds assert a condition at the current position without consuming any characters. They are zero-width: the matched text is not included in the result.

| Method | Regex produced | Notes |
| --- | --- | --- |
| followedBy { } | (?=...) | Positive lookahead: position must be followed by the pattern |
| notFollowedBy { } | (?!...) | Negative lookahead: position must NOT be followed by the pattern |
| precededBy { } | (?<=...) | Positive lookbehind: position must be preceded by the pattern |
| notPrecededBy { } | (?<!...) | Negative lookbehind: position must NOT be preceded by the pattern |

Example — extract the numeric part of a measurement:

// Match digits only when immediately followed by "ml"
val mlAmount = kexpresso {
    oneOrMore { digit() }
    followedBy { literal("ml") }
}

mlAmount.find("250ml")?.value // "250"  (lookahead consumed nothing: "ml" stays in input)
mlAmount.find("250g")         // null   (not followed by "ml")

Note: The JVM regex engine requires lookbehind patterns to be bounded in length. precededBy { oneOrMore { digit() } } (unbounded +) throws a PatternSyntaxException when the pattern is compiled, that is, at runtime when the Regex is constructed, not during Kotlin compilation. Use a bounded form instead: precededBy { between(1, 10) { digit() } }.
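
For reference, this is roughly what bounded lookbehinds look like at the raw-regex level. The sketch uses plain kotlin.text.Regex with the shapes from the tables above, (?<=...) and (?:...){min,max}; the variable names are made up for the example.

```kotlin
// Bounded lookbehind: between 1 and 10 digits must precede "ml".
val unitAfterDigits = Regex("(?<=(?:\\d){1,10})ml")
val unit = unitAfterDigits.find("250ml")?.value // "ml" (the digits are asserted, not consumed)
val noUnit = unitAfterDigits.find("ml")?.value  // null (nothing precedes "ml")

// Fixed-width lookbehind: digits that come right after a dollar sign.
val amountAfterDollar = Regex("(?<=\\\$)\\d+")
val amount = amountAfterDollar.find("Total: \$42")?.value // "42"
```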

Composition & escape hatch

| Method | Regex produced | Notes |
| --- | --- | --- |
| raw(pattern) | pattern verbatim | No escaping; use only for raw regex fragments the DSL cannot yet express |
| include(pattern) | (?:pattern.source) | Embed a compiled KexpressoPattern as a non-capturing group |
| backreference(n) | \n | Numeric back-reference to the nth capturing group (n ≥ 1) |
| backreference(name) | \k<name> | Named back-reference; name must start with a letter and contain only letters or digits |

raw example — inject a verbatim date fragment:

val datePattern = kexpresso { raw("\\d{4}-\\d{2}-\\d{2}") }
datePattern.matches("2026-06-03") // true

include example — compose a reusable octet pattern into an IP address:

val octet = kexpresso { between(1, 3) { digit() } }
val ip = kexpresso {
    include(octet)
    exactly(3) { char('.'); include(octet) }
}
ip.matches("192.168.1.1") // true

backreference example — detect repeated words:

val repeated = kexpresso {
    capture { oneOrMore { wordChar() } }
    whitespace()
    backreference(1)
}
repeated.containsMatchIn("latte latte") // true
repeated.containsMatchIn("latte mocha") // false

Domain helpers

These extension functions on KexpressoBuilder compose common real-world patterns from the primitives above.

Text helpers (Text.kt)

| Method | Pattern | Matches |
| --- | --- | --- |
| word() | [a-zA-Z0-9]+ | One or more alphanumeric characters (e.g. Espresso, Cappuccino42) |
| handle() | [a-zA-Z0-9_-]+ | Like word() but also allows _ and -, for usernames and slugs (e.g. cold-brew_2024) |
| email() | see source | A broadly valid email address (e.g. barista@coffee.shop) |
| url() | see source | An HTTP or HTTPS URL (e.g. https://coffee.shop/menu) |

email() and url() are intentionally permissive. Pair with startOfText()/endOfText() for strict whole-string validation.

Writing helpers (Writing.kt)

| Method | What it matches |
| --- | --- |
| sentence() | A capital-letter-led sequence of words ending with ., !, or ? |
| paragraph() | One or more sentences separated by single spaces |

val paragraphPattern = kexpresso { paragraph() }

paragraphPattern.matches("Latte is smooth. Espresso is bold!") // true
paragraphPattern.matches("latte is lowercase.")                 // false

Note: sentence() builds the first word as uppercaseLetter() + word(), so the first word must be at least two characters long (one uppercase letter followed by at least one alphanumeric character).

Ready-to-use patterns

These helpers in Domains.kt let you match common real-world formats in one call. Pair with startOfText()/endOfText() for whole-string validation.

| Helper | Matches | Caveats |
| --- | --- | --- |
| ipv4() | IPv4 address, e.g. 192.168.1.1 | Decimal only; no CIDR notation |
| uuid() | RFC 4122 UUID versions 1–5, e.g. 550e8400-e29b-41d4-a716-446655440000 | Nil UUID and versions 6+ rejected |
| slug() | URL/CMS slug, e.g. cold-brew | Lowercase only; no underscores |
| hexColor() | CSS hex color #RGB, #RGBA, #RRGGBB, #RRGGBBAA, e.g. #1a2b3c | 5- and 7-digit forms are invalid CSS and do not match |
| semanticVersion() | SemVer 2.0.0 string, e.g. 1.0.0-rc.1+build.42 | No leading v; partial forms like 1.0 rejected |
| isoDate() | ISO-8601 date YYYY-MM-DD, e.g. 2024-01-15 | Does NOT validate day-of-month (Feb 30 passes) |
| isoTime() | ISO-8601 time HH:MM[:SS] with an optional Z or ±HH:MM offset, e.g. 14:30:00Z | Leap seconds and fractional seconds not supported |
| integerNumber() | Signed/unsigned integer without leading zeros, e.g. -7, 42 | No upper bound on digit count |
| decimalNumber() | Decimal with optional fractional part, e.g. 3.14, -0.5 | Bare .5 and scientific notation not supported |
| hashtag() | Social-media hashtag #word, e.g. #Espresso | First char after # must be a letter, not a digit |
| mention() | @mention (Twitter/X), 1–50 chars, e.g. @barista | Other platforms may allow longer names |
| e164Phone() | E.164 phone number, e.g. +14155552671 | Compact form only: no separators; no country-code validation |
| ipv6() | IPv6 address, full or ::-compressed, e.g. 2001:db8::1, ::1 | Embedded IPv4 (::ffff:192.168.1.1) and zone IDs (%eth0) not supported |
| macAddress() | IEEE 802 MAC address, colon- or hyphen-separated, e.g. 01:23:45:67:89:AB | Cisco dot notation not supported; mixed separators rejected |
| base64() | Standard Base64 string with optional =/== padding, e.g. S2V4cHJlc3Nv | Also matches empty string; URL-safe Base64 (-/_) not matched |
| jwt() | JSON Web Token: three base64url segments separated by dots | Structural only: signature not verified, payload not decoded |
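
Since isoDate() checks shape only (Feb 30 passes), callers that need a real calendar date have to add their own check. The following is one possible sketch in pure Kotlin. It uses a stand-in shape regex rather than the actual isoDate() source, and isoDateShape, isLeapYear, daysInMonth, and isRealIsoDate are hypothetical names introduced for this example.

```kotlin
// Stand-in for a shape-only date pattern (an assumption, not isoDate()'s source).
val isoDateShape = Regex("(\\d{4})-(\\d{2})-(\\d{2})")

fun isLeapYear(year: Int): Boolean =
    (year % 4 == 0 && year % 100 != 0) || year % 400 == 0

fun daysInMonth(year: Int, month: Int): Int = when (month) {
    1, 3, 5, 7, 8, 10, 12 -> 31
    4, 6, 9, 11 -> 30
    2 -> if (isLeapYear(year)) 29 else 28
    else -> 0 // not a calendar month, so no day can be valid
}

// Shape check first, then a calendar check on the captured numbers.
fun isRealIsoDate(text: String): Boolean {
    val match = isoDateShape.matchEntire(text) ?: return false
    val (year, month, day) = match.destructured.toList().map { it.toInt() }
    return day in 1..daysInMonth(year, month)
}

// isRealIsoDate("2024-02-29") -> true  (2024 is a leap year)
// isRealIsoDate("2024-02-30") -> false (shape is fine, date is not)
```

On the JVM alone, java.time.LocalDate.parse would do the same job; the hand-rolled version above is shown because it stays within common Kotlin.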

Example — validate an IPv4 address:

val ipValidator = kexpresso {
    startOfText()
    ipv4()
    endOfText()
}

ipValidator.matches("192.168.1.1") // true
ipValidator.matches("256.0.0.1")   // false — octet out of range
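
The "octet out of range" rejection comes from encoding the 0–255 range in the regex itself. A common way to write that by hand is sketched below with plain kotlin.text.Regex. This formulation (including its rejection of leading zeros) is an assumption for illustration and may differ from what ipv4() actually emits.

```kotlin
// One octet in 0..255 without leading zeros (assumed formulation, not ipv4()'s source).
val octet = "(?:25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)"

// Four octets separated by literal dots, anchored to the whole input.
val ipv4Shape = Regex("\\A$octet(?:\\.$octet){3}\\z")

val inRange = ipv4Shape.matches("192.168.1.1")  // true
val outOfRange = ipv4Shape.matches("256.0.0.1") // false: 256 fits no alternative
```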

Example — extract all hashtags from a post:

val hashtagPattern = kexpresso { hashtag() }

val post = "Loving my #Espresso and #ColdBrew today! #Coffee"
val tags = hashtagPattern.findAll(post).map { it.value }.toList()
// ["#Espresso", "#ColdBrew", "#Coffee"]

Working with results

kexpresso { } returns a KexpressoPattern — an immutable, thread-safe wrapper around a compiled Regex.

Matching

val p = kexpresso { oneOrMore { letter() } }

p.matches("Espresso")                       // true  — entire string must match
p.containsMatchIn("Order: Espresso please") // true  — match anywhere in the string

Searching

val wordPattern = kexpresso { oneOrMore { letter() } }

// First match only
val first = wordPattern.find("Espresso Latte")
first?.value // "Espresso"

// Skip ahead with startIndex
val second = wordPattern.find("Espresso Latte", startIndex = 9)
second?.value // "Latte"

// All non-overlapping matches (returns a lazy Sequence)
val drinks = wordPattern.findAll("Espresso Latte Cappuccino").map { it.value }.toList()
// ["Espresso", "Latte", "Cappuccino"]

String operations

KexpressoPattern exposes convenience methods that delegate to the underlying Regex:

replaceFirst — replace the first match:

val drink = kexpresso { oneOrMore { letter() } }
drink.replaceFirst("espresso latte", "ESPRESSO") // "ESPRESSO latte"

replaceAll with a fixed string — replace every match:

val drink = kexpresso { oneOrMore { letter() } }
drink.replaceAll("espresso latte", "brew") // "brew brew"

replaceAll with a transform — compute the replacement per match:

val drink = kexpresso { oneOrMore { letter() } }
drink.replaceAll("espresso latte") { it.value.uppercase() } // "ESPRESSO LATTE"

split — split around matches:

val sep = kexpresso { literal(", ") }
sep.split("Espresso, Latte, Cappuccino") // ["Espresso", "Latte", "Cappuccino"]
sep.split("Espresso, Latte, Cappuccino", limit = 2) // ["Espresso", "Latte, Cappuccino"]

matchEntire — full-string match with group access:

val drinkOrder = kexpresso {
    capture("drink") { oneOrMore { letter() } }
    whitespace()
    capture("size") { oneOrMore { letter() } }
}
val result = drinkOrder.matchEntire("Latte Large")
result?.groups?.get("drink")?.value // "Latte"
result?.groups?.get("size")?.value  // "Large"

Typed captures

Reading captured groups from a MatchResult is normally verbose and stringly-typed: result.groups["year"]?.value?.toInt(). The Captures API wraps any MatchResult and provides type-safe accessors:

val datePattern = kexpresso {
    capture("year")  { exactly(4) { digit() } }
    literal("-")
    capture("month") { exactly(2) { digit() } }
    literal("-")
    capture("day")   { exactly(2) { digit() } }
}

val caps = datePattern.find("2026-06-03")?.captures
caps?.int("year")   // 2026
caps?.int("month")  // 6
caps?.int("day")    // 3
caps?.string("day") // "03"
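
To show what such accessors save you from writing, here is a small sketch of nullable typed accessors built directly on the stdlib MatchResult. stringAt and intAt are hypothetical extension functions for illustration; they are not part of kexpresso, whose Captures API may be implemented differently.

```kotlin
// Hypothetical helpers on the stdlib MatchResult (not kexpresso's Captures API).

// Returns the group's text, or null when the index is out of range or the group did not match.
fun MatchResult.stringAt(index: Int): String? =
    if (index in 0 until groups.size) groups[index]?.value else null

// Returns the group parsed as Int, or null when it is absent or not a valid Int.
fun MatchResult.intAt(index: Int): Int? = stringAt(index)?.toIntOrNull()

val dateMatch = Regex("(\\d{4})-(\\d{2})-(\\d{2})").find("2026-06-03")

val year = dateMatch?.intAt(1)       // 2026
val month = dateMatch?.intAt(2)      // 6
val dayText = dateMatch?.stringAt(3) // "03"
val missing = dateMatch?.intAt(9)    // null: no such group
```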

Use ...OrThrow variants when the group is guaranteed to be present — they give clear error messages instead of silent nulls:

val pricePattern = kexpresso {
    literal("\$")
    capture("dollars") { oneOrMore { digit() } }
}

val caps = pricePattern.find("\$42")?.captures ?: error("no match")
caps.intOrThrow("dollars")    // 42
caps.intOrThrow("missing")    // throws NoSuchElementException: "Named group 'missing'…"
// intOrThrow also throws NumberFormatException when the captured text is not a valid Int

By index — index 0 is the whole match, 1 is the first capturing group, etc.:

val pricePattern = kexpresso {
    literal("\$")
    capture { oneOrMore { digit() } }
}
val caps = pricePattern.find("\$42")?.captures
caps?.string(0) // "\$42"  — whole match
caps?.int(1)    // 42      — first capture group

Supported types: string, int, long, double, boolean (strict: "true"/"false" only). All nullable variants return null on absent/unparseable values; ...OrThrow variants throw NoSuchElementException, NumberFormatException, or IllegalArgumentException with a message that names the group and the offending value.

Inspecting the pattern

val p = kexpresso { digit(); letter() }

p.source  // "\\d[a-zA-Z]"   — raw regex string
p.options // emptySet()       — Set<RegexOption>

Explain a pattern (describe())

Every pattern can explain itself in plain English. describe() walks the internal AST (the same representation that renders the regex) and returns a deterministic, comma-joined phrase — handy for code review, logging, or learning what a pattern does:

val p = kexpresso { startOfText(); oneOrMore { digit() }; endOfText() }

p.source     // "\\A(?:\\d)+\\z"
p.describe() // "start of text, one or more of (a digit), end of text"

Domain helpers (e.g. email()) are emitted as raw fragments, so they describe as raw regex `…` rather than a fully decomposed phrase.

Generate matching examples

examples() walks the internal AST and produces strings that satisfy matches():

val drinkCode = kexpresso {
    uppercaseLetter()
    oneOrMore { lowercaseLetter() }
}

drinkCode.examples(3) // e.g. ["Ac", "Bs", "Ct"] — each passes drinkCode.matches(it)

// Deterministic: same seed → same list
drinkCode.examples(5, seed = 42)

// Exact repetition
val pinPattern = kexpresso { exactly(4) { digit() } }
pinPattern.examples(3) // e.g. ["5279", "1836", "4021"]

// Alternation
val drinkMenu = kexpresso { oneOf({ literal("Espresso") }, { literal("Latte") }) }
drinkMenu.examples(5) // ["Espresso", "Latte"]

Honesty contract — when examples are guaranteed to match: examples() guarantees that every returned string satisfies matches() when the pattern's AST contains only supported nodes: Sequence, Literal, Token primitives (digit, letter, whitespace, …), Quantifier, Group, and Alternation.

Best-effort cases (no match guarantee):

  • Raw fragments — including domain helpers (email(), ipv4(), isoDate(), …) and Kexpresso.from(rawRegex), which all use raw nodes internally.
  • Lookarounds (followedBy, precededBy, …) — zero-width; skipped during generation.
  • Backreferences — the captured text is not tracked; an empty string is emitted.

In best-effort mode examples() still returns without throwing — the results simply may not satisfy matches().

Interoperability

val p = kexpresso { literal("Cappuccino") }

val kotlinRegex:   Regex                  = p.toRegex()           // all targets
val javaPattern:   java.util.regex.Pattern = p.toPattern()        // JVM only

toRegex() is available on every target. toPattern() is a JVM-only extension (java.util.regex.Pattern does not exist on JS, Wasm, or Native).

RegexOption

Pass any number of RegexOption values to the kexpresso { } call or to Kexpresso.pattern { }:

val caseInsensitive = kexpresso(RegexOption.IGNORE_CASE) {
    literal("espresso")
}
caseInsensitive.matches("ESPRESSO") // true
caseInsensitive.matches("Espresso") // true

val multiline = kexpresso(RegexOption.MULTILINE) {
    startOfLine()
    literal("Espresso")
    endOfLine()
}
multiline.containsMatchIn("Espresso\nCappuccino") // true
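
To see why the MULTILINE option matters for ^ and $, compare the same raw pattern with and without it. This sketch uses plain kotlin.text.Regex instead of the DSL, and the variable names are illustrative.

```kotlin
val menu = "Espresso\nCappuccino"

val singleLine = Regex("^Espresso\$")                     // ^ and $ anchor to the whole input
val perLine = Regex("^Espresso\$", RegexOption.MULTILINE) // ^ and $ anchor to each line

val withoutOption = singleLine.containsMatchIn(menu) // false: "Espresso" is not the whole input
val withOption = perLine.containsMatchIn(menu)       // true: it is a whole line

// With MULTILINE, every line can be matched on its own.
val lines = Regex("^\\w+\$", RegexOption.MULTILINE)
    .findAll(menu)
    .map { it.value }
    .toList() // [Espresso, Cappuccino]
```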

Object-oriented entry point

If you prefer an object-oriented style, use Kexpresso.pattern { } — it is identical to the top-level kexpresso { } function:

val p = Kexpresso.pattern(RegexOption.IGNORE_CASE) { literal("Ristretto") }
p.matches("ristretto") // true

Safety: ReDoS analysis

Certain regex patterns can cause catastrophic backtracking — an attacker who controls input can make the regex engine take exponential time. The classic shape is nested unbounded quantifiers such as (?:a+)+.

Kexpresso provides a best-effort static analyzer to catch this shape at development time:

// DSL produces (?:(?:[a-zA-Z])+)+ — nested unbounded quantifiers
val risky = kexpresso { oneOrMore { oneOrMore { letter() } } }

val report = risky.analyze()
if (report.isPotentiallyVulnerable) {
    println("Findings:")
    report.findings.forEach { println("  [${it.severity}] ${it.message}") }
}
// Findings:
//   [WARNING] Nested unbounded quantifier at index 0: (?:(?:[a-zA-Z])+)+ …

// Convenience shorthand
if (risky.isPotentiallyVulnerable) { /* warn or reject */ }

This is a best-effort heuristic, not a guarantee. It detects the canonical "evil regex" shape — a group with an outer unbounded quantifier (*, +, or {n,}) whose body also contains an inner unbounded quantifier. It does NOT detect all ReDoS patterns (e.g. alternation-based catastrophic backtracking), and a clean result does not prove the pattern is safe. Use it as an early-warning signal alongside proper input constraints and performance testing.
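
The shape described above can be approximated with a short scan over the regex string. The sketch below is a simplified, hypothetical illustration of that idea: hasNestedUnboundedQuantifier and isUnboundedQuantifierAt are made-up names, and kexpresso's real analyzer is not shown in this README and may work quite differently. Like the real one, it is a heuristic and misses many ReDoS shapes.

```kotlin
// Simplified, hypothetical sketch of a "nested unbounded quantifier" check.

// True when position i starts *, + or an open-ended {n,} quantifier.
fun isUnboundedQuantifierAt(pattern: String, i: Int): Boolean {
    if (i >= pattern.length) return false
    return when (pattern[i]) {
        '*', '+' -> true
        '{' -> {
            val close = pattern.indexOf('}', i)
            close > i && Regex("\\d+,").matches(pattern.substring(i + 1, close))
        }
        else -> false
    }
}

fun hasNestedUnboundedQuantifier(pattern: String): Boolean {
    // One flag per open group: has its body contained an unbounded quantifier so far?
    val openGroups = mutableListOf<Boolean>()
    var i = 0
    while (i < pattern.length) {
        when (pattern[i]) {
            '\\' -> i++ // skip the escaped character
            '[' -> {    // skip a character class; quantifier symbols inside are literals
                i++
                while (i < pattern.length && pattern[i] != ']') {
                    if (pattern[i] == '\\') i++
                    i++
                }
            }
            '(' -> openGroups.add(false)
            ')' -> if (openGroups.isNotEmpty()) {
                val bodyUnbounded = openGroups.removeAt(openGroups.lastIndex)
                // Outer unbounded quantifier around an unbounded body: the risky shape.
                if (bodyUnbounded && isUnboundedQuantifierAt(pattern, i + 1)) return true
                // Otherwise the unbounded body still counts for the enclosing group.
                if (bodyUnbounded && openGroups.isNotEmpty()) {
                    openGroups[openGroups.lastIndex] = true
                }
            }
            else -> if (openGroups.isNotEmpty() && isUnboundedQuantifierAt(pattern, i)) {
                openGroups[openGroups.lastIndex] = true
            }
        }
        i++
    }
    return false
}

// hasNestedUnboundedQuantifier("(?:(?:[a-zA-Z])+)+") -> true
// hasNestedUnboundedQuantifier("(?:\\d)+")           -> false
```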


Reverse: read an existing regex

Inherited a cryptic regex? Kexpresso.from(...) reads it back: it compiles the regex and lets you explain it (describe()) or rewrite it as kexpresso DSL (toKexpressoCode()).

val pattern = Kexpresso.from("\\d{4}-\\d{2}-\\d{2}")

pattern.describe()
// "exactly 4 of (a digit), the literal "-", exactly 2 of (a digit), the literal "-", exactly 2 of (a digit)"

println(pattern.toKexpressoCode())
// kexpresso {
//     exactly(4) { digit() }
//     literal("-")
//     exactly(2) { digit() }
//     literal("-")
//     exactly(2) { digit() }
// }

toKexpressoCode() works on any KexpressoPattern — whether you built it with the DSL or parsed it with from — so you can round-trip between the two representations.

Matching is always exact; parsing is best-effort. Kexpresso.from(r) compiles r verbatim, so Kexpresso.from(r).matches(x) is always identical to Regex(r).matches(x). The structural parse that powers describe() and toKexpressoCode() models the common constructs (literals, predefined classes, anchors, quantifiers, groups, lookarounds, alternation, back-references) and honestly degrades anything it doesn't model to raw("…") (e.g. possessive quantifiers, atomic groups (?>…), inline-flag groups (?i)). The generated code stays compilable and never changes match behaviour. An invalid regex throws PatternSyntaxException, exactly as Regex(...) would.


Building and contributing

# Compile, run all tests, Detekt static analysis, and the Kover coverage check
./gradlew build

# Run tests only
./gradlew test

# Run Detekt only
./gradlew detekt

See CONTRIBUTING.md for the full contributor guide — including how to add a new DSL primitive — and docs/ARCHITECTURE.md for a map of how the codebase fits together.


Security

Found a vulnerability? Please report it privately — see SECURITY.md. The project also runs CodeQL static analysis and an OpenSSF Scorecard supply-chain check on every push, and uses Dependabot to keep dependencies and GitHub Actions current.

For catastrophic-backtracking (ReDoS) risk in your own patterns, kexpresso ships a best-effort analyzer — see Safety: ReDoS analysis.

Code of Conduct

This project follows the Contributor Covenant. By participating, you are expected to uphold it.

Support

kexpresso is free and open source. If it saves you time, you can say thanks:

https://buymeacoffee.com/elzinko — every coffee fuels another release.

License

MIT — Copyright (c) 2026 Thomas Couderc.
