feat(jq): replace regex backend with fancy-regex for advanced patterns #1508
Merged
jaq-std uses the `regex` crate, which lacks lookahead, lookbehind, backreferences, and atomic groups. Real jq supports these features via Oniguruma, and LLM-generated filters frequently use them. This swaps in fancy-regex (already a workspace dep, pure-Rust, no FFI) for the three native filters jaq-std exposes: matches, split_matches, split_. The user-facing match/scan/test/capture/sub/gsub filters are defined in jaq-std's defs.jq on top of these natives, so they all benefit.

What works now:
- (?=...) positive lookahead
- (?!...) negative lookahead
- (?<=...) positive lookbehind
- (?<!...) negative lookbehind
- \1 backreferences
- (?>...) atomic groups

Output shape is byte-for-byte identical to jaq-std's natives, so the compat-defs in compat.rs consume the new output unchanged. Offsets are character indices (matches jq docs and Oniguruma). The 'l' (swap-greed) flag is a silent no-op since fancy-regex doesn't expose it.

Tests: 25 new unit tests covering lookahead, lookbehind, backreference, and atomic-group positive and negative cases, plus regression tests for basic patterns and Unicode offsets. The differential mod adds 8 byte-comparison tests against the real jq binary.
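The note that offsets are character indices matters because Rust regex engines report match positions as byte offsets into the UTF-8 string. The following is a minimal sketch of such a conversion using only the standard library. The helper name `byte_to_char_offset` is hypothetical and is not taken from this PR, and the match position is simulated with `str::find` rather than a regex engine.

```rust
/// Convert a byte offset into `s` to a character (code point) offset.
/// Returns `None` if `byte_offset` is out of range or falls inside a
/// multi-byte character. (Hypothetical helper, not the PR's actual code.)
fn byte_to_char_offset(s: &str, byte_offset: usize) -> Option<usize> {
    // `str::get` checks both the bounds and the UTF-8 boundary.
    s.get(..byte_offset).map(|prefix| prefix.chars().count())
}

fn main() {
    let text = "héllo wörld";
    // Stand-in for a regex match start: byte position of "wörld".
    let byte_start = text.find("wörld").expect("substring is present");
    let char_start = byte_to_char_offset(text, byte_start).expect("valid boundary");
    // "é" occupies two bytes, so the two offsets differ by one here.
    println!("byte offset {byte_start}, character offset {char_start}");
    // prints "byte offset 7, character offset 6"
}
```

Counting characters in the prefix is linear in the prefix length, which is acceptable for a sketch. An implementation that reports many matches per input would likely want to track the running character count incrementally instead.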
Deployment:

| Status | Name | Latest Commit | Updated (UTC) |
|---|---|---|---|
| ✅ Deployment successful | bashkit | e3c6afb | May 02 2026, 04:00 PM |
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Summary

jaq-std uses the `regex` crate, which lacks lookahead, lookbehind, backreferences, and atomic groups. Real jq supports these features via Oniguruma, and LLM-generated filters frequently use them. This swaps in `fancy-regex` (already a workspace dep, pure-Rust, no FFI) for the three native filters jaq-std exposes: `matches`, `split_matches`, `split_`.

The user-facing `match`/`scan`/`test`/`capture`/`sub`/`gsub` filters are defined in jaq-std's `defs.jq` on top of these natives, so they all benefit.

What works now

- `(?=...)` positive lookahead
- `(?!...)` negative lookahead
- `(?<=...)` positive lookbehind
- `(?<!...)` negative lookbehind
- `\1` backreferences
- `(?>...)` atomic groups

Compatibility

- Output shape is byte-for-byte identical to jaq-std's natives, so `compat.rs` consumes the new output unchanged.
- The `l` (swap-greed) flag is a silent no-op since fancy-regex doesn't expose it. All other jq flags (`g`, `n`, `i`, `m`, `s`, `x`, `p`) work.

Tests

- 25 new unit tests covering lookahead, lookbehind, backreference, and atomic-group positive and negative cases, plus regression tests for basic patterns and Unicode offsets.
- 8 new tests in the `differential` mod that byte-compare against the real `jq` binary when present in `$PATH`.

Test plan

- `cargo build -p bashkit --features jq`
- `cargo test -p bashkit --features "jq,http_client,ssh" --tests` (only the unrelated SSH-network test fails: no network in sandbox)
- `cargo clippy --workspace --all-targets --features "http_client,ssh" -- -D warnings`
- `cargo fmt --all -- --check`

Deferred follow-up

- `--stream` mode for huge JSON inputs (still pending; this PR only addresses the regex backend).

Generated by Claude Code
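The differential tests compare output bytes against the real `jq` binary only when it is installed. A minimal sketch of that skip-if-absent pattern follows, using only the standard library. The function name `run_filter` and the sample filter are assumptions made for illustration and are not taken from the PR's test module. The side that would produce bashkit's own output is omitted because its API is not shown here.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

/// Run `program -c <filter>` with `input` on stdin and return raw stdout bytes.
/// Returns `None` when the program cannot be started (for example, it is not
/// on `$PATH`) or exits unsuccessfully, so callers can skip the comparison.
/// (Hypothetical helper, not the PR's actual test code.)
fn run_filter(program: &str, filter: &str, input: &str) -> Option<Vec<u8>> {
    let mut child = Command::new(program)
        .arg("-c") // jq's compact-output flag
        .arg(filter)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::null())
        .spawn()
        .ok()?;
    {
        // Write the input, then drop the handle so the child sees EOF.
        let mut stdin = child.stdin.take()?;
        stdin.write_all(input.as_bytes()).ok()?;
    }
    let output = child.wait_with_output().ok()?;
    output.status.success().then_some(output.stdout)
}

fn main() {
    // A lookahead pattern of the kind this PR enables.
    let filter = r#"test("foo(?=bar)")"#;
    let input = r#""foobar""#;
    match run_filter("jq", filter, input) {
        // A differential test would assert these bytes equal its own output.
        Some(expected) => println!(
            "reference output: {}",
            String::from_utf8_lossy(&expected).trim_end()
        ),
        None => println!("jq not available; skipping comparison"),
    }
}
```

Returning `None` instead of panicking keeps the suite green on machines without `jq`, at the cost of silently skipping coverage there.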