feat(builtins): add shuf via codegen with helper-fn inlining #1538

Merged
chaliy merged 2 commits into main from fix/issue-1532-shuf
May 5, 2026

@chaliy chaliy commented May 5, 2026

Summary

Implements shuf end-to-end. This is the second genuinely-missing util
from #1532's list (truncate landed in #1536), so shuf invocations no
longer fail with "command not found".

Codegen tool extension

bashkit-coreutils-port now scans uu_app() for plain identifier
references (parse_range) and inlines any matching free fn defined at
the source file's top level. The inlined fn runs through the same
translate!() rewriter so its FTL keys resolve too. shuf's
parse_range was the motivating case; the same path generalizes to
any util whose value_parser refers to a local helper.
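
For context, the kind of helper this inlining targets is a small free function passed by name to `value_parser(...)`. Below is a minimal std-only sketch of such a function. The signature, error text, and the strict rejection of descending ranges are assumptions made for illustration; this is not the uutils source.

```rust
use std::ops::RangeInclusive;

/// Hypothetical `parse_range`-style helper: parses "LO-HI" into an
/// inclusive range. Signature and messages are assumed, not upstream's.
fn parse_range(input: &str) -> Result<RangeInclusive<usize>, String> {
    let invalid = || format!("invalid input range: '{input}'");

    // Split on the first '-' only; anything left over must parse as HI.
    let (lo, hi) = input.split_once('-').ok_or_else(invalid)?;
    let lo: usize = lo.parse().map_err(|_| invalid())?;
    let hi: usize = hi.parse().map_err(|_| invalid())?;

    // Simplification for this sketch: reject any descending range.
    if lo > hi {
        return Err(invalid());
    }
    Ok(lo..=hi)
}
```

Because the function returns `Result<_, String>` and depends only on `std`, it can be moved into a generated file unchanged as long as `RangeInclusive` is in scope there.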

The generated preamble now always includes clap::builder::ValueParser,
std::ops::RangeInclusive, and std::str::FromStr. These are the three
symbols that upstream utils reach for when a value_parser(...) call
passes either a path-style parser or a typed parser.
#![allow(unused_imports)] keeps the imports inert for utils that
don't use them.

syn dep gains the visit feature for the read-only AST walk.

All existing generated files regenerated via just regen-coreutils-args
to pick up the new imports — no semantic drift.

Builtin behaviour

  • -e ARG... shuffles positional args.
  • -i LO-HI produces the integer range, then shuffles.
  • -n N caps output to N lines.
  • -r samples with replacement (requires -n in bashkit's safe
    mode; GNU loops forever without -n — a non-starter inside an
    embedded VFS shell).
  • -z switches separator to NUL.
  • -o FILE writes to a VFS file.
  • --random-seed, --random-source rejected with explicit
    "not yet implemented" error, matching the existing tac -b/-r/-s
    pattern.
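
The `-r` rule above can be sketched as a small function that refuses to run without a count. This is an illustration only: the function name, the error text, the `pick` callback, and the handling of empty input are all assumptions, not bashkit's actual code.

```rust
/// Draws `count` items with replacement. `pick(n)` must return an index
/// in `0..n`. Hypothetical signature and error text, for illustration.
fn sample_with_replacement<T: Clone>(
    items: &[T],
    count: Option<usize>,
    mut pick: impl FnMut(usize) -> usize,
) -> Result<Vec<T>, String> {
    // Without an explicit count the loop would never end, so refuse up front.
    let n = count.ok_or_else(|| String::from("shuf: -r requires -n"))?;

    if items.is_empty() {
        // Nothing to draw from; this sketch returns no output.
        return Ok(Vec::new());
    }

    Ok((0..n).map(|_| items[pick(items.len())].clone()).collect())
}
```

Taking the index source as a closure keeps the sketch independent of any particular RNG and makes the bounded behaviour easy to test deterministically.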

Tiny xorshift64* RNG inlined to avoid pulling rand into the
always-on dep set (rand is currently feature-gated behind
bot-auth). Quality is fine for line shuffling; never used
cryptographically.
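
A minimal sketch of what such an inlined generator can look like, paired with a Fisher-Yates shuffle. The shift constants and multiplier are the standard xorshift64* parameters. The type name, the zero-seed fallback, and the modulo-based index reduction are assumptions for illustration; how the PR seeds its generator is not stated here.

```rust
/// Minimal xorshift64* generator. Not suitable for cryptographic use.
struct XorShift64Star {
    state: u64,
}

impl XorShift64Star {
    fn new(seed: u64) -> Self {
        // The all-zero state is a fixed point, so swap in a nonzero constant.
        let state = if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed };
        Self { state }
    }

    fn next_u64(&mut self) -> u64 {
        let mut x = self.state;
        x ^= x >> 12;
        x ^= x << 25;
        x ^= x >> 27;
        self.state = x;
        x.wrapping_mul(0x2545_F491_4F6C_DD1D)
    }

    /// Index in `0..bound` via modulo (slightly biased). `bound` must be nonzero.
    fn below(&mut self, bound: usize) -> usize {
        (self.next_u64() % bound as u64) as usize
    }
}

/// In-place Fisher-Yates shuffle driven by the generator above.
fn shuffle<T>(items: &mut [T], rng: &mut XorShift64Star) {
    for i in (1..items.len()).rev() {
        let j = rng.below(i + 1);
        items.swap(i, j);
    }
}
```

The same seed always yields the same permutation, which is convenient for unit tests even though the builtin itself rejects `--random-seed`.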

Test plan

  • cargo test -p bashkit --test spec_tests bash_spec_tests
    includes new shuf.test.sh (7 cases).
  • cargo test -p bashkit --lib builtins::shuf — 3 unit tests
    cover split_separated and the small RNG.
  • every_builtin_handles_bogus_flag_cleanly — passes (shuf added
    to the list).
  • generated_args_headers_match_pinned_uutils_revision — passes
    (all four <util>_args.rs align to the pinned rev).
  • cargo clippy -p bashkit -p bashkit-coreutils-port --all-targets --all-features -- -D warnings — green.
  • cargo fmt --check — green.
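
The unit tests mentioned above exercise `split_separated`. The name comes from this PR, but its signature and exact semantics are not shown here, so the following is only a guess at the shape: split the input on the active separator byte (newline by default, NUL under `-z`) and treat a trailing separator as a terminator rather than the start of an empty record.

```rust
/// Assumed shape of a `split_separated`-style helper; not bashkit's code.
fn split_separated(input: &[u8], sep: u8) -> Vec<&[u8]> {
    let mut parts: Vec<&[u8]> = input.split(|&b| b == sep).collect();

    // A trailing separator ends the last record; drop the empty tail piece.
    // This also turns empty input into zero records.
    if parts.last().map_or(false, |p| p.is_empty()) {
        parts.pop();
    }
    parts
}
```

Interior empty records (two separators in a row) are preserved in this sketch, so blank lines survive a shuffle.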

Scope (Part of #1532)

After this PR, both genuinely-missing utils on #1532's list are
implemented (truncate in #1536, shuf here). The remaining items in
that issue are codegen migrations of existing builtins (tee,
mktemp, realpath, readlink, stat, od) — separate PRs.


Generated by Claude Code

Implements `shuf` end-to-end. Adds the second genuinely-missing util
from #1532's list (truncate landed in #1536); drops "command not
found" for `shuf` invocations.

Codegen tool extension:

- bashkit-coreutils-port now scans uu_app() for plain identifier
  references (`parse_range`) and inlines any matching free fn defined
  at the source file's top level. The inlined fn runs through the
  same translate!() rewriter so its FTL keys resolve too. shuf's
  parse_range was the motivating case; the same path generalizes to
  any util whose value_parser refers to a local helper.
- Generated preamble now always includes `clap::builder::ValueParser`,
  `std::ops::RangeInclusive`, and `std::str::FromStr` — three
  symbols utils in the wild reach for via `value_parser(...)` calls
  that pass either a path-style parser or a typed parser.
  `#![allow(unused_imports)]` keeps the imports inert for utils that
  don't reach for them.
- syn dep gains the `visit` feature (in addition to existing
  `visit-mut`) for the read-only AST walk.
- All existing generated files regenerated via `just regen-coreutils-args`
  to pick up the new imports — no semantic drift.

Builtin behaviour (`crates/bashkit/src/builtins/shuf.rs`):

- `-e ARG...` shuffles positional args.
- `-i LO-HI` produces the integer range, then shuffles.
- `-n N` caps output to N lines.
- `-r` samples with replacement (requires `-n` in bashkit's safe mode;
  GNU loops forever without it — a non-starter inside an embedded VFS
  shell).
- `-z` switches separator to NUL.
- `-o FILE` writes to a VFS file.
- `--random-seed`, `--random-source` rejected with explicit
  "not yet implemented" error, matching the existing tac -b/-r/-s
  pattern.
- Tiny xorshift64* RNG inlined to avoid pulling rand into the
  always-on dep set (rand is currently feature-gated behind bot-auth).
  Quality is fine for line shuffling; never used cryptographically.

Tests:

- `crates/bashkit/tests/spec_cases/bash/shuf.test.sh` covers happy
  paths (`-e`, `-i`, `-n`, `-r -n`, `-z`), the `-r without -n`
  rejection, and unknown-flag rejection (with `### bash_diff` for the
  clap-vs-GNU exit-code divergence).
- 3 unit tests in `shuf.rs` cover `split_separated` and the small RNG.

Registration:

- interpreter/mod.rs dispatch table.
- compgen.rs allowlist (alphabetical).
- every_builtin_handles_bogus_flag_cleanly list.
- generated/mod.rs (`pub mod shuf_args;`).

Spec: implementation-status.md lists the new test file with summary.

Part of #1532.
cloudflare-workers-and-pages Bot commented May 5, 2026

Deploying with Cloudflare Workers

| Status | Name | Latest Commit | Updated (UTC) |
| --- | --- | --- | --- |
| ✅ Deployment successful! | bashkit | 657495b | May 05 2026, 09:38 PM |

The strict bash_comparison_tests step in CI runs every spec test
against real bash. GNU shuf -r without -n loops forever; bashkit's
safe mode rejects it with exit 1. Without ### bash_diff the parity
runner blocks until the GH Actions job timeout.

Add the directive so bashkit's expected output (exit=1) is still
asserted, but the test is excluded from real-bash comparison. Same
pattern as `shuf_unknown_flag_rejected` and the existing tac
unimplemented-flag rows.

Part of #1532.
@chaliy chaliy merged commit d1661a2 into main May 5, 2026
34 checks passed
@chaliy chaliy deleted the fix/issue-1532-shuf branch May 5, 2026 21:57
@chaliy chaliy mentioned this pull request May 6, 2026
7 tasks
chaliy added a commit that referenced this pull request May 6, 2026
Minor release `0.4.1` → `0.5.0`. Two new builtins (`shuf`, `truncate`),
a coreutils-codegen pipeline that ports uutils' `uu_app()` clap
definitions for `cat`/`tac`/`truncate`/`shuf`/`readlink`, and `tool_def`
flag-syntax improvements.

## Highlights

- **Coreutils argument surface via codegen** — Ports uutils' `uu_app()`
clap definitions into bashkit so builtins share the real coreutils
argument shape; `cat`, `tac`, `truncate`, `shuf`, and `readlink` now
flow through this surface, with a coreutils differential testing harness
to catch parity drift. Pipeline reads a single pinned uutils revision so
generated builtins, the differential harness, and CI all agree on the
upstream source of truth (#1529, #1535, #1536, #1537, #1538, #1542).
- **Site updates** — Bashkit agent skill is now published on the site,
alongside rustdoc guides and content signal declarations for
discoverability.

## What's Changed

* refactor(builtins): migrate readlink to codegen-ported argument
surface (#1542)
* chore(site): publish bashkit agent skill (#1541)
* chore(site): declare content signals
* docs(site): publish rustdoc guides
* feat(builtins): add shuf via codegen with helper-fn inlining (#1538)
* chore(builtins): pin uutils revision as single source of truth (#1537)
* feat(builtins): add truncate via codegen-ported argument surface
(#1536)
* test(builtins): add coreutils differential testing harness (#1535)
* feat(builtins): port uutils argument surfaces via codegen (POC: cat,
tac) (#1529)
* feat(tool_def): accept --flag key=value... syntax for object/array
flags (#1528)
* fix(tool_def): coerce stringified JSON for array/object flag schemas
(#1527)

## Publish-readiness report

Per the updated `specs/release-process.md`:

- [x] `cargo fmt --check` clean
- [x] `cargo clippy --all-targets --all-features -- -D warnings` clean
- [x] `cargo build` clean
- [x] Versions synced across `Cargo.toml`,
`crates/bashkit-cli/Cargo.toml`, `crates/bashkit-js/package.json`,
`package-lock.json`, `Cargo.lock`
- [x] `cargo publish --dry-run -p bashkit --allow-dirty` (after CI's
monty-strip): **success**
- [x] `cargo publish --dry-run -p bashkit-cli`: blocked only on ordering
(`bashkit 0.5.0` not yet on crates.io) — resolves at real publish time
when `publish-bashkit` runs first per `publish.yml`'s `needs:` chain.
- [x] New `0.5.0` > latest published versions on crates.io / PyPI / npm
(`0.4.1`).

## Companion change

This branch also includes `chore(specs): document publish verification
and post-merge monitoring`, codifying the verify-before-tag and
watch-after-merge flow that this release follows.

On merge, `release.yml` will create the GitHub Release `v0.5.0` and
dispatch publish workflows for crates.io, PyPI, npm, and Homebrew.

**Full Changelog**:
v0.4.1...v0.5.0

---
_Generated by [Claude
Code](https://claude.ai/code/session_01SvuLdA8pMAmP4woG2HqxKw)_