feature: ls walk glob filesystem builtins #407

Merged
danieljohnmorris merged 6 commits into main from feature/fs-builtins
May 18, 2026

@danieljohnmorris
Collaborator

Summary

Three new Result-typed builtins for directory enumeration, closing the categorical gap personas have hit repeatedly: rd exists for single files but there was no way to list a directory or recursively find files matching a pattern (a basic facility every shell and scripting language ships).

  • ls dir > R (L t) t - non-recursive listing, filenames only, sorted lexicographically. Empty dir returns [], missing dir / permission denied surface as Err.
  • walk dir > R (L t) t - depth-first recursive walk, paths relative to dir with forward slashes (deterministic across OSes). Symlinks not followed.
  • glob dir pat > R (L t) t - shell-style filter. */?/[abc] within a segment, ** recursive across segments. No matches returns [], not Err.
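
The listing and walk semantics described above can be sketched in plain Rust with only the standard library. This is an illustration, not the code in src/interpreter/mod.rs: the names `ls_sketch`, `walk_sketch` and `collect` are hypothetical, the `String` error payload is an assumption, and whether the real `walk` includes directory entries alongside files is also an assumption (this sketch includes both).

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Non-recursive listing: bare filenames, sorted lexicographically.
/// The `String` error payload is an assumption made for this sketch.
pub fn ls_sketch(dir: &Path) -> Result<Vec<String>, String> {
    let mut names = Vec::new();
    // A missing directory or a permission error surfaces here as Err.
    for entry in fs::read_dir(dir).map_err(|e| format!("ls {}: {}", dir.display(), e))? {
        let entry = entry.map_err(|e| e.to_string())?;
        names.push(entry.file_name().to_string_lossy().into_owned());
    }
    names.sort();
    Ok(names)
}

/// Recursive walk: paths relative to `root`, joined with '/', sorted.
pub fn walk_sketch(root: &Path) -> Result<Vec<String>, String> {
    let mut out = Vec::new();
    collect(root, "", &mut out).map_err(|e| format!("walk {}: {}", root.display(), e))?;
    out.sort();
    Ok(out)
}

/// Depth-first helper that accumulates relative paths into `out`.
fn collect(dir: &Path, prefix: &str, out: &mut Vec<String>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let name = entry.file_name().to_string_lossy().into_owned();
        // Always join with '/', regardless of the host OS separator.
        let rel = if prefix.is_empty() {
            name
        } else {
            format!("{}/{}", prefix, name)
        };
        // DirEntry::file_type does not traverse symlinks, so a symlinked
        // directory is recorded as an entry but never descended into.
        let is_dir = entry.file_type()?.is_dir();
        if is_dir {
            collect(&entry.path(), &rel, out)?;
        }
        out.push(rel);
    }
    Ok(())
}
```

Sorting the collected paths at the end, rather than relying on `read_dir` order, is what makes the output independent of the filesystem.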

All three are tree-bridge eligible so VM and Cranelift inherit them through the same path rd/rdb/sleep already use - no new opcodes, no Cranelift helpers, no AOT touch.

Why

Origin: the filesystem-walk persona reports in ilo_assessment_feedback.md. In token-cost terms, walk dir is 8 chars; the alternatives are asking the user to pre-compute the file list and pass it as an arg, or hand-rolling a recursive listing pass with rd on each file. The matcher in glob also collapses the most common shape (find . -name '*.ext') into a single call.

Repro before / after

Before (no way to list a directory at all):

```
$ ilo 'main>n;len ls "."'
ILO-T004: undefined variable 'ls'
```

After:

```
$ ilo 'main>R (L t) t;ls "examples"'
[02-with-dependencies.ilo, at-float-index.ilo, ...]

$ ilo 'main>R (L t) t;glob "examples" "**/*.ilo"'
[02-with-dependencies.ilo, ..., window-listview-perf.ilo]
```

What's in the diff

Four commits, one logical change per commit:

  1. free the ls binding name ahead of fs builtins - rename four pre-existing ls= bindings (two test fixtures, one example, plus the bindings inside the builtin-binding-name-rename.ilo example which itself uses ls as a local) to non-builtin names. Needed before ls lands as a reserved short name.
  2. add ls / walk / glob filesystem builtins - Builtin enum entries + from_name/name/ALL/round-trip-tests in src/builtins.rs, signatures in src/verify.rs, tree-walker impl + helper walk_collect + helper glob_match in src/interpreter/mod.rs, tree-bridge eligibility in src/vm/mod.rs.
  3. test ls / walk / glob across every engine - 11 cross-engine tests in tests/regression_fs_builtins.rs using a per-test tempdir fixture, plus examples/fs-builtins.ilo for the engine harness (uses the repo's own examples/ dir as a stable fixture).
  4. sync docs for ls / walk / glob - SPEC.md builtins table + reserved-names list updated, ai.txt and skills/ilo/SKILL.md regenerate from SPEC via build.rs.
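
The `from_name`/`name`/`ALL` round-trip mentioned in commit 2 can be illustrated with a cut-down enum. This is a sketch under assumptions: the real `Builtin` enum in src/builtins.rs has many more variants, and the exact signatures shown here are guesses, not the repository's API.

```rust
/// Cut-down stand-in for the real enum; only the three new entries appear.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Builtin {
    Ls,
    Walk,
    Glob,
}

impl Builtin {
    /// Every variant, listed once; the round-trip check iterates over this.
    pub const ALL: &'static [Builtin] = &[Builtin::Ls, Builtin::Walk, Builtin::Glob];

    /// Source-level name of the builtin.
    pub fn name(self) -> &'static str {
        match self {
            Builtin::Ls => "ls",
            Builtin::Walk => "walk",
            Builtin::Glob => "glob",
        }
    }

    /// Inverse of `name`; unknown identifiers return None.
    pub fn from_name(s: &str) -> Option<Builtin> {
        Self::ALL.iter().copied().find(|b| b.name() == s)
    }
}
```

A round-trip test then asserts that `from_name(b.name())` returns `Some(b)` for every `b` in `ALL`, which catches a variant added to the enum but left out of one of the tables.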

Test plan

  • cargo test --release --features cranelift - whole suite green
  • cargo fmt --check clean
  • cargo clippy --release --features cranelift -- -D warnings clean
  • tests/regression_fs_builtins.rs - 11 tests across --run-tree, --run-vm, --jit
  • tests/examples_engines.rs - new examples/fs-builtins.ilo runs through every engine
  • tests/regression_reserved_names_doc.rs - SPEC reserved-names list matches Builtin::ALL

Follow-ups

  • Site doc (docs/builtins/data-io.md in the ilo-lang/site repo) edit prepared at /Users/dan/code/ilo-lang/site-docs-refresh/src/content/docs/docs/builtins/data-io.md but couldn't push - the site main checkout is not present on this machine; needs landing in a separate PR over there.

`ls` is becoming a builtin (directory listing) alongside `walk` and
`glob`. Four spots in the tree still use `ls` as a local binding name -
two regression-test fixtures, one example, and the dedicated
builtin-rename example whose body itself uses `ls` as a list local.
Rename each to a non-builtin, non-reserved short name (`xs`, `ws`,
`lims`) so the upcoming reserved-namespace check stays green.

Three new Result-typed builtins for directory enumeration. Closes the
categorical gap personas have hit repeatedly: `rd` exists for single
files but there was no way to list a directory or recursively find
files matching a pattern - the kind of thing every shell and every
scripting language ships out of the box.

- `ls dir > R (L t) t` lists immediate entries (filenames only, not full
  paths). Sorted lexicographically so output is deterministic across
  runs and filesystems. Empty dirs return `[]`, not Err. Missing dirs
  and permission errors surface as Err so the caller can branch.
- `walk dir > R (L t) t` does a depth-first recursive walk, returning
  paths relative to `dir` with forward-slash separators (deterministic
  across OSes, composable with `cat`/`fmt` for downstream reads).
  Symlinks are not followed to avoid cycles on typical project trees.
- `glob dir pat > R (L t) t` filters `walk` output by a shell-style
  pattern. `*`/`?`/`[abc]` within a path segment, `**` recursive across
  segments. Hand-rolled matcher rather than the `glob` crate to keep
  the build lean - it's small enough that pulling a crate isn't worth
  it.
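
For a sense of how small such a matcher can be, here is a minimal sketch of segment-wise glob matching. It is not the `glob_match` helper from this PR: the function names are hypothetical, and details such as letting `**` match zero segments and treating an unclosed `[` as a literal are assumptions of the sketch.

```rust
/// Match one path segment against one pattern segment.
/// Supports `*`, `?` and simple character classes like `[abc]`.
fn match_segment(pat: &[char], text: &[char]) -> bool {
    match pat.first().copied() {
        None => text.is_empty(),
        // `*` matches any run of characters within the segment, including none.
        Some('*') => (0..=text.len()).any(|i| match_segment(&pat[1..], &text[i..])),
        Some('?') => !text.is_empty() && match_segment(&pat[1..], &text[1..]),
        Some('[') => match pat.iter().position(|&c| c == ']') {
            // Non-empty class: the next character must be one of its members.
            Some(end) if end > 1 => {
                !text.is_empty()
                    && pat[1..end].contains(&text[0])
                    && match_segment(&pat[end + 1..], &text[1..])
            }
            // Assumption: an unclosed or empty class is a literal '['.
            _ => !text.is_empty() && text[0] == '[' && match_segment(&pat[1..], &text[1..]),
        },
        Some(c) => !text.is_empty() && text[0] == c && match_segment(&pat[1..], &text[1..]),
    }
}

/// Match whole segment lists; `**` consumes zero or more path segments.
fn match_segments(pat: &[&str], path: &[&str]) -> bool {
    match pat.first().copied() {
        None => path.is_empty(),
        Some("**") => (0..=path.len()).any(|i| match_segments(&pat[1..], &path[i..])),
        Some(p) => {
            if path.is_empty() {
                return false;
            }
            let p: Vec<char> = p.chars().collect();
            let t: Vec<char> = path[0].chars().collect();
            match_segment(&p, &t) && match_segments(&pat[1..], &path[1..])
        }
    }
}

/// Entry point: both pattern and path use '/' as the separator.
pub fn glob_match_sketch(pattern: &str, path: &str) -> bool {
    let pat: Vec<&str> = pattern.split('/').collect();
    let segs: Vec<&str> = path.split('/').collect();
    match_segments(&pat, &segs)
}
```

Because `walk` already emits forward-slash relative paths, a matcher of this shape can be applied to each path as a plain filter.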

All three are tree-bridge eligible (`is_tree_bridge_eligible`), so VM
and Cranelift inherit them through the same path `rd`/`rdb`/`sleep`
already use. No new opcodes, no Cranelift helpers, no AOT changes.
`tree_bridge_returns_result` picks them up so `!`-unwrap composes
correctly through the bridge.

11 cross-engine cases against a per-test tempdir fixture:
- ls: basic listing (sorted, files + dirs), empty dir returns `[]`,
  missing dir surfaces as Err.
- walk: full recursive output sorted lexicographically, missing dir as
  Err, single-file dir (regression guard for an early DFS that pushed
  the file as a directory).
- glob: single-segment `*`, recursive `**`, character class `[ab]`,
  missing dir as Err, no-matches returns `[]` (not Err).

Every assertion runs against tree, VM, and Cranelift JIT. The bridge
is exactly where past cross-engine drift has hidden, so the assertions
care more about parity than about exhaustive matcher coverage.
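
The parity idea can be sketched as a small helper that runs the same program once per engine flag and fails on the first divergence. This is an illustration only: `check_parity` and `ENGINE_FLAGS` are hypothetical names, and the runner is passed in as a closure so the sketch stays self-contained. In a real suite the closure would presumably invoke the `ilo` binary with the given flag and capture its output.

```rust
/// Engine flags as listed in the test plan.
pub const ENGINE_FLAGS: [&str; 3] = ["--run-tree", "--run-vm", "--jit"];

/// Run `run` once per engine flag and require identical output everywhere.
/// Returns the shared output, or a message naming the first engine that diverged.
pub fn check_parity<F>(run: F) -> Result<String, String>
where
    F: Fn(&str) -> String,
{
    // The first engine's output is the reference the others are compared to.
    let reference = run(ENGINE_FLAGS[0]);
    for &flag in &ENGINE_FLAGS[1..] {
        let got = run(flag);
        if got != reference {
            return Err(format!(
                "{} diverged from {}: {:?} vs {:?}",
                flag, ENGINE_FLAGS[0], got, reference
            ));
        }
    }
    Ok(reference)
}
```

Comparing every engine against a single reference keeps the failure message specific about which engine drifted.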

`examples/fs-builtins.ilo` shows the four call shapes (`ls!`, `walk!`,
`glob!`, top-level Err propagation) using the repo's own `examples/`
directory as a stable fixture so the engine-harness can run it without
setup.

SPEC.md gets three new rows in the builtins table next to the rest of
the file-I/O family, plus `ls` added to the 2-char reserved-names list
and `walk` / `glob` mentioned in the longer-builtins prose. `ai.txt`
and `skills/ilo/SKILL.md` regenerate from SPEC.md via build.rs.
@codecov

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 78.78788% with 42 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/interpreter/mod.rs | 77.04% | 42 Missing ⚠️ |

…s-builtins

# Conflicts:
#	SPEC.md
#	ai.txt
#	skills/ilo/SKILL.md
#	src/builtins.rs
#	src/vm/mod.rs
@danieljohnmorris danieljohnmorris merged commit 73ca38c into main May 18, 2026
4 of 5 checks passed
@danieljohnmorris danieljohnmorris deleted the feature/fs-builtins branch May 18, 2026 23:02