feature: modular skills (Phase 1) #397

Merged
danieljohnmorris merged 9 commits into main from feature/modular-skills
May 18, 2026
@danieljohnmorris
Collaborator

Summary

Splits ilo's monolithic ~18,710-token ai.txt compact spec into six Anthropic-Skill-format modules so agents load only the slice their current task needs. Adds an ilo skill list/get/path/show CLI accessor, bundles the skills into the binary via include_str!, and gates the per-module / aggregate token budget in CI.

This is Phase 1 of the strategic work programme (zero-gap-specs/01-modular-skills.md). Without it, per-task token cost is dominated by the spec load and ilo is non-viable against Zero on cached agent workflows regardless of how dense the source is.

Token counts

  • before: 18,710 tokens (whole ai.txt, always fully loaded)
  • after: 4,906 tokens total across six modules
  • typical per-task load: 1-2 modules (~ 1,500-2,000 tokens)
  • reduction on typical agent load: ~9-12x
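
The reduction figures follow directly from the counts above. A minimal arithmetic sketch (the variable names are illustrative, not taken from the repository):

```python
# Token counts quoted in this PR description.
BEFORE_TOKENS = 18_710          # whole ai.txt, always fully loaded
ALL_MODULES_TOKENS = 4_906      # six modules combined
TYPICAL_LOAD = (1_500, 2_000)   # 1-2 modules per task


def reduction(before: int, after: int) -> float:
    """How many times smaller the new load is than the old one."""
    return before / after


low = reduction(BEFORE_TOKENS, TYPICAL_LOAD[1])   # heavier typical load
high = reduction(BEFORE_TOKENS, TYPICAL_LOAD[0])  # lighter typical load
full = reduction(BEFORE_TOKENS, ALL_MODULES_TOKENS)

print(f"typical load: {low:.1f}x to {high:.1f}x smaller")
print(f"all six modules loaded: {full:.1f}x smaller")
```

Even an agent that loads all six modules pays roughly a quarter of the old cost; the ~9-12x figure applies to the 1-2 module case.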

Per-module breakdown (cl100k_base):

| module | tokens |
| --- | ---: |
| ilo-language | 994 |
| ilo-builtins | 830 |
| ilo-engines | 801 |
| ilo-agent | 851 |
| ilo-tools | 745 |
| ilo-errors | 685 |
| TOTAL | 4,906 |

Every module is under the 1,000-token per-module cap. Total is under the 5,000-token aggregate cap. The hard limits are enforced by scripts/check-skill-tokens.py in CI, plus byte-proxy and structural checks in tests/skill_modular.rs in the Rust test suite.
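
As a rough illustration of the two caps, the check can be sketched over the counts from the table. This is not the contents of scripts/check-skill-tokens.py (which counts tokens with tiktoken); `check_budget` and the constants are hypothetical names, and the counts are supplied as data rather than measured:

```python
PER_MODULE_CAP = 1_000   # per-module limit stated in this PR
AGGREGATE_CAP = 5_000    # total limit stated in this PR

# Counts copied from the per-module breakdown (cl100k_base).
MODULE_TOKENS = {
    "ilo-language": 994,
    "ilo-builtins": 830,
    "ilo-engines": 801,
    "ilo-agent": 851,
    "ilo-tools": 745,
    "ilo-errors": 685,
}


def check_budget(counts, per_module_cap=PER_MODULE_CAP, aggregate_cap=AGGREGATE_CAP):
    """Return a list of violations; an empty list means the budget holds."""
    violations = [
        f"{name}: {tokens} > {per_module_cap}"
        for name, tokens in counts.items()
        if tokens > per_module_cap
    ]
    total = sum(counts.values())
    if total > aggregate_cap:
        violations.append(f"TOTAL: {total} > {aggregate_cap}")
    return violations


if __name__ == "__main__":
    problems = check_budget(MODULE_TOKENS)
    print("within budget" if not problems else "\n".join(problems))
```

A CI step built this way would exit non-zero whenever the returned list is non-empty.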

What's in the diff

8 commits, one per logical step:

  1. add ilo-language modular skill - core syntax: prefix notation, types, guards, match, pipes, records, Results, lambdas
  2. add ilo-builtins modular skill - signatures + examples for math, text, list, HOF, map, IO, HTTP, JSON, time
  3. add ilo-errors modular skill - ILO-X### codes with one-line cause + fix, JSON diagnostic shape
  4. add ilo-tools modular skill - tool keyword, MCP/HTTP providers, runtime flow
  5. add ilo-engines modular skill - tree/VM/JIT/AOT backend selection, feature matrix
  6. add ilo-agent modular skill - skill discovery, run/check invocation, repair loop, serv mode
  7. wire `ilo skill list/get/path/show` and update marketplace - CLI subcommand, marketplace.json lists all six
  8. enforce modular-skill token budget in CI and Rust tests - tiktoken script + Rust structural tests

ilo -ai still emits the full concatenated compact spec for back-compat, so existing agents that load everything keep working. The new ilo skill get <name> surface is the recommended path for token-sensitive agents.

skills/ilo/SKILL.md stays in place (still auto-regenerated from SPEC.md by build.rs) as the legacy single-skill entry point. The six modules sit alongside it. Retiring SKILL.md would require unwinding ~3 dozen test references and is deferred to Phase 1b.

Test plan

  • Six skill files exist at skills/ilo/ilo-*.md with valid Anthropic YAML frontmatter (name, description)
  • Every description starts with "Use this when..."
  • Each module <= 1,000 tokens (cl100k_base); total <= 5,000 tokens
  • ilo skill list prints all six with name + description
  • ilo skill get <name> prints content; an unknown name exits 1
  • ilo skill path <name> prints skills/ilo/<name>.md
  • ilo skill show <name> prints content with header
  • ilo -ai still works (back-compat); ai.txt unchanged on disk
  • marketplace.json lists all six skills
  • cargo fmt --all -- --check clean
  • cargo clippy --release --features cranelift --all-targets -- -D warnings clean
  • cargo test --release --features cranelift passes (the existing regression_inline_lambda failures pre-date this branch and are unrelated; they reproduce on main)
  • python3 scripts/check-skill-tokens.py exits 0
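
The frontmatter items in the plan above (valid `name` and `description`, description starting with "Use this when", name matching the filename) could be checked along these lines. This is a minimal sketch, not the code in tests/skill_modular.rs: it handles only flat `key: value` frontmatter rather than full YAML, and the sample description text is invented for illustration.

```python
def parse_frontmatter(text):
    """Parse a flat 'key: value' block delimited by '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---'")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            fields[key.strip()] = value.strip().strip('"')
    raise ValueError("missing closing '---'")


def check_skill(filename_stem, text):
    """Return a list of problems for one skill file; empty means it passes."""
    fields = parse_frontmatter(text)
    problems = []
    if fields.get("name") != filename_stem:
        problems.append("name does not match filename")
    if not fields.get("description", "").startswith("Use this when"):
        problems.append("description must start with 'Use this when'")
    return problems


# Hypothetical skill file content, for illustration only.
SAMPLE = """---
name: ilo-errors
description: Use this when a program fails and you need to read a diagnostic.
---
(module body)
"""

print(check_skill("ilo-errors", SAMPLE))
```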

Follow-ups

  • Phase 1b: retire skills/ilo/SKILL.md and rewire the ~3 dozen test references to point at the modular files, or replace SKILL.md with a thin router that links to the six.
  • Phase 2 (closed-loop benchmark): verify in a real agent session that the descriptions route reliably and that the agent loads 1-2 modules per task, not all six.

add ilo-language modular skill

First of six modules splitting the monolithic ai.txt into Anthropic Skill
format. Covers the core syntax + semantics reference: prefix notation,
type sigils, function decls, guards, match, pipes, records, Results.

Self-contained: an agent can write or review .ilo source from this module
alone. 994 tokens (cl100k_base), under the 1,000-token per-module cap.
Builtin signatures, error codes, tool decls, and engine choice live in
the other five modules.

add ilo-builtins modular skill

Signatures and one-line examples for the most-used builtins: math, text,
list, HOF, map, I/O, HTTP, JSON, time. No prose explanation, that lives
in SPEC.md and the docs site. 830 tokens.

Includes the asin/acos/sqrt/log silent-NaN gotcha because validating at
the call boundary belongs with the call list, not buried in errors.

add ilo-errors modular skill

ILO-L###/P###/T###/R### codes with one-line cause + fix, plus the JSON
diagnostic shape so an agent can route on code and patch at span without
loading the long-form explanation. 685 tokens.

Long-form explanations stay behind ilo --explain ILO-XXXX.

add ilo-tools modular skill

How to declare and call external tools (HTTP, MCP) from ilo: tool keyword,
provider config, discovery commands, runtime flow, failure handling. 745
tokens. Most programs don't touch tools, so this module is loaded only
when the task does.

add ilo-engines modular skill

Backend selection (tree, VM, JIT, AOT): when each matters, feature
matrix, benchmark + compile invocation. VM is the default since v0.11
(#390); JIT and tree are opt-in. 801 tokens.

This module is rarely needed by most agents, so keep it small and load
it only when picking execution mode actually matters.

add ilo-agent modular skill

The agent integration surface: skill discovery, running programs,
reading JSON diagnostics, the repair loop, top-level Result contract,
serv mode. Teaches an agent how to USE ilo rather than how the language
works. 851 tokens.

This is the entry-point skill an agent loads first; it then pulls the
others on demand.

wire `ilo skill list/get/path/show` and update marketplace

Bundles the six skill modules into the binary via include_str! so they
travel with every install and stay version-locked to the compiler. The
new `ilo skill` subcommand is the canonical surface for granular
loading:

  ilo skill list             name + description per skill
  ilo skill get <name>       full content (stdout)
  ilo skill path <name>      bundled filesystem path
  ilo skill show <name>      content with a formatted header
  exit 1 + 'unknown skill'   on bad names

`ilo -ai` still emits the full concatenated compact spec for
back-compat, so existing agents that load everything keep working.
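
A host-side wrapper for granular loading might look like the sketch below. Only the `ilo skill get <name>` command and the non-zero exit on unknown names come from this commit message; `skill_command`, `load_skill`, the injectable `run` parameter, and the assumption that the error text arrives on stderr are illustrative.

```python
import subprocess


def skill_command(name):
    """Build the argv for fetching one skill module."""
    return ["ilo", "skill", "get", name]


def load_skill(name, run=subprocess.run):
    """Return the module text, or raise LookupError on a non-zero exit.

    `run` is injectable so the wrapper can be exercised without an
    installed ilo binary.
    """
    result = run(skill_command(name), capture_output=True, text=True)
    if result.returncode != 0:
        # Assumption: the 'unknown skill' message is written to stderr.
        raise LookupError(f"ilo skill get {name!r} failed: {result.stderr.strip()}")
    return result.stdout
```

An agent host would call `load_skill("ilo-agent")` first and fetch further modules only when a task needs them.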

marketplace.json grows a `skills` array listing all six so Claude Code's
skill discovery picks them up automatically when the plugin is
installed.

enforce modular-skill token budget in CI and Rust tests

Two layers of defence so the per-task economics don't regress:

  1. scripts/check-skill-tokens.py uses tiktoken cl100k_base, the same
     tokeniser an agent host uses for context-window sizing, and is the
     canonical 1,000-tokens-per-module / 5,000-tokens-total gate.
  2. tests/skill_modular.rs runs inside cargo test, no Python required;
     uses a conservative byte proxy plus structural checks (every file
     exists, every frontmatter valid, every description starts with
     'Use this when', name matches filename, marketplace lists them
     all).

The Rust test catches drift even when CI is bypassed; the Python step
catches the edge cases the byte proxy can't see.
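
One way a byte proxy can stand in for a tokeniser is sketched below. The actual proxy in tests/skill_modular.rs is not shown in this PR; the ratio of 3 bytes per token is an assumption chosen for illustration (deliberately tighter than the common rule of thumb of about 4 bytes per English token, so the estimate errs high), and both function names are hypothetical.

```python
import math

ASSUMED_BYTES_PER_TOKEN = 3.0  # assumption, not a value from the repository


def estimate_tokens(text, bytes_per_token=ASSUMED_BYTES_PER_TOKEN):
    """Estimate a token count from UTF-8 byte length, rounding up."""
    return math.ceil(len(text.encode("utf-8")) / bytes_per_token)


def within_proxy_cap(text, token_cap=1_000):
    """True if the byte-based estimate stays at or under the token cap."""
    return estimate_tokens(text) <= token_cap
```

A proxy like this can reject a file that a real tokeniser would accept, or the reverse, which is why the tiktoken-based script remains the canonical gate.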
@codecov

codecov Bot commented May 18, 2026

Codecov Report

❌ Patch coverage is 0% with 46 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
| --- | ---: | --- |
| src/main.rs | 0.00% | 46 Missing ⚠️ |

@danieljohnmorris danieljohnmorris merged commit f596b08 into main May 18, 2026
4 checks passed
@danieljohnmorris danieljohnmorris deleted the feature/modular-skills branch May 18, 2026 21:19