feat(jq): replace regex backend with fancy-regex for advanced patterns #1508

Merged: chaliy merged 1 commit into main from feat/jq-fancy-regex, May 2, 2026

chaliy commented May 2, 2026

Summary

jaq-std uses the regex crate, which lacks lookahead, lookbehind, backreferences, and atomic groups — features real jq supports via Oniguruma and that LLM-generated filters frequently use. This swaps in fancy-regex (already a workspace dep, pure-Rust, no FFI) for the three native filters jaq-std exposes: matches, split_matches, split_.

The user-facing match/scan/test/capture/sub/gsub filters are defined in jaq-std's defs.jq on top of these natives, so they all benefit.

What works now

  • (?=...) positive lookahead
  • (?!...) negative lookahead
  • (?<=...) positive lookbehind
  • (?<!...) negative lookbehind
  • \1 backreferences
  • (?>...) atomic groups

Compatibility

  • Output shape is byte-for-byte identical to jaq-std's natives so compat.rs consumes the new output unchanged.
  • Offsets are character indices (matches jq docs and Oniguruma).
  • The l (swap-greed) flag is a silent no-op because fancy-regex doesn't expose it. All other jq flags (g, n, i, m, s, x, p) work.
  • Errors stay short and jq-shaped (TM-INF-022 invariant preserved); fancy-regex internals never reach stderr.
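The character-index point above can be illustrated with a small sketch. Rust regex engines report match positions as byte offsets into the UTF-8 string, so something has to translate them into the codepoint offsets jq reports. The helper below, `byte_to_char_offset`, is a hypothetical name chosen for illustration; it shows one way to do the conversion with the standard library only and is not necessarily how this PR implements it.

```rust
/// Convert a byte offset into `s` to a character (Unicode scalar value) index.
/// Hypothetical helper for illustration; not taken from the PR.
/// Returns `None` if the offset is out of range or not on a char boundary.
fn byte_to_char_offset(s: &str, byte_offset: usize) -> Option<usize> {
    // `str::get` returns None instead of panicking on a bad offset.
    s.get(..byte_offset).map(|prefix| prefix.chars().count())
}

fn main() {
    let s = "héllo wörld";
    // 'é' occupies two bytes in UTF-8, so byte and char offsets diverge after it.
    let byte_start = s.find("wörld").expect("substring is present");
    println!("byte offset: {byte_start}");
    println!("char offset: {:?}", byte_to_char_offset(s, byte_start));
}
```

Counting the characters of the prefix is linear in the offset, so an implementation that handles many matches per input would probably walk the string once instead of recounting from the start for each match.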

Tests

  • 25 new unit tests covering positive and negative cases of each new feature plus regression tests for basic patterns and unicode offsets.
  • 8 differential tests in the differential mod that byte-compare against the real jq binary when present in $PATH.
  • 2473 lib tests pass, 218 jq submodule tests pass.
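As a rough sketch of the "when present in $PATH" behaviour, a differential check can look up the real jq binary first and skip itself when none is found. The names `find_in_path` and `run_real_jq` are hypothetical, the lookup assumes a Unix-like system (no `.exe` handling), and the actual tests in the differential mod may be structured differently.

```rust
use std::env;
use std::io::Write;
use std::path::{Path, PathBuf};
use std::process::{Command, Stdio};

/// Look for a file named `name` in the directories listed in `$PATH`.
/// Hypothetical helper; assumes a Unix-like system.
fn find_in_path(name: &str) -> Option<PathBuf> {
    env::var_os("PATH").and_then(|paths| {
        env::split_paths(&paths)
            .map(|dir| dir.join(name))
            .find(|candidate| candidate.is_file())
    })
}

/// Run the real jq with compact output (`-c`) and return its raw stdout bytes.
fn run_real_jq(jq: &Path, filter: &str, input: &str) -> std::io::Result<Vec<u8>> {
    let mut child = Command::new(jq)
        .args(["-c", filter])
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::null())
        .spawn()?;
    // The stdin handle is dropped at the end of this statement, which closes
    // the pipe so jq sees end of input.
    child
        .stdin
        .take()
        .expect("stdin was piped")
        .write_all(input.as_bytes())?;
    Ok(child.wait_with_output()?.stdout)
}

fn main() -> std::io::Result<()> {
    let jq = match find_in_path("jq") {
        Some(path) => path,
        None => {
            println!("jq not found in PATH; skipping differential check");
            return Ok(());
        }
    };
    let expected = run_real_jq(&jq, r#"test("(?<=foo)bar")"#, r#""foobar""#)?;
    // A real differential test would byte-compare `expected` against the
    // output of the implementation under test; here it is only printed.
    println!("real jq output: {}", String::from_utf8_lossy(&expected));
    Ok(())
}
```

Skipping rather than failing when jq is missing keeps the suite usable in sandboxes without the binary, at the cost of the differential coverage silently not running there.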

Test plan

  • cargo build -p bashkit --features jq
  • cargo test -p bashkit --features "jq,http_client,ssh" --tests — only the unrelated SSH-network test fails (no network in sandbox)
  • cargo clippy --workspace --all-targets --features "http_client,ssh" -- -D warnings
  • cargo fmt --all -- --check
  • jq spec tests pass

Deferred follow-up

  • --stream mode for huge JSON inputs (still pending; this PR only addresses the regex backend).

Generated by Claude Code

jaq-std uses the `regex` crate, which lacks lookahead, lookbehind,
backreferences, and atomic groups — features real jq supports via
Oniguruma and that LLM-generated filters frequently use. This swaps
in fancy-regex (already a workspace dep, pure-Rust, no FFI) for the
three native filters jaq-std exposes — matches, split_matches, split_.
The user-facing match/scan/test/capture/sub/gsub filters are defined
in jaq-std's defs.jq on top of these natives, so they all benefit.

What works now:
- (?=...) positive lookahead
- (?!...) negative lookahead
- (?<=...) positive lookbehind
- (?<!...) negative lookbehind
- \1 backreferences
- (?>...) atomic groups

Output shape is byte-for-byte identical to jaq-std's natives so the
compat-defs in compat.rs consume the new output unchanged. Offsets
are character indices (matches jq docs and Oniguruma). The 'l' (swap-greed)
flag is a silent no-op since fancy-regex doesn't expose it.

Tests: 25 new unit tests covering lookahead/lookbehind/backref/atomic-
group positive and negative cases plus regression tests for basic
patterns and unicode offsets. Differential mod adds 8 byte-comparison
tests against the real jq binary.

@cloudflare-workers-and-pages

Deploying with Cloudflare Workers

Status: ✅ Deployment successful
Name: bashkit
Latest commit: e3c6afb
Updated (UTC): May 02 2026, 04:00 PM

chaliy merged commit debdcaa into main May 2, 2026 (34 checks passed)
chaliy deleted the feat/jq-fancy-regex branch May 2, 2026 16:11

codecov Bot commented May 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
