feature: modular skills (Phase 1) by danieljohnmorris · Pull Request #397 · ilo-lang/ilo

danieljohnmorris · 2026-05-18T19:39:43Z

Summary

Splits ilo's monolithic ~18,710-token ai.txt compact spec into six Anthropic-Skill-format modules so agents load only the slice their current task needs. Adds an ilo skill list/get/path/show CLI accessor, bundles the skills into the binary via include_str!, and gates the per-module / aggregate token budget in CI.

This is Phase 1 of the strategic work programme (zero-gap-specs/01-modular-skills.md). Without it, per-task token cost is dominated by the spec load and ilo is non-viable against Zero on cached agent workflows regardless of how dense the source is.

Token counts

before: 18,710 tokens (whole ai.txt, always fully loaded)
after: 4,906 tokens total across six modules
typical per-task load: 1-2 modules (~ 1,500-2,000 tokens)
reduction on typical agent load: ~9-12x
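
The reduction figure follows directly from the numbers above. A minimal arithmetic sketch, using only the quoted token counts (the variable names are illustrative, not from the repo):

```python
# Token figures quoted in the PR description.
BEFORE_TOKENS = 18_710                # whole ai.txt, always fully loaded
TYPICAL_LOAD_RANGE = (1_500, 2_000)   # stated typical per-task load, in tokens


def reduction_factor(before: int, after: int) -> float:
    """How many times smaller the new load is than the old one."""
    return before / after


low_load, high_load = TYPICAL_LOAD_RANGE
best_case = reduction_factor(BEFORE_TOKENS, low_load)    # smaller load, larger reduction
worst_case = reduction_factor(BEFORE_TOKENS, high_load)  # larger load, smaller reduction

print(f"~{worst_case:.0f}x to ~{best_case:.0f}x")  # prints "~9x to ~12x"
```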

Per-module breakdown (cl100k_base):

module	tokens
ilo-language	994
ilo-builtins	830
ilo-engines	801
ilo-agent	851
ilo-tools	745
ilo-errors	685
TOTAL	4,906

Every module is under the 1,000-token per-module cap, and the total is under the 5,000-token aggregate cap. The hard limits are enforced by scripts/check-skill-tokens.py in CI, plus tests/skill_modular.rs, a byte-proxy and structural test in the Rust test suite.
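
The gate logic can be sketched as below. This is not the contents of scripts/check-skill-tokens.py: the real script counts tokens with tiktoken, whereas this sketch hard-codes the counts from the table so it runs without dependencies. The function name and the choice to treat a count exactly equal to a cap as passing are assumptions.

```python
PER_MODULE_CAP = 1_000
AGGREGATE_CAP = 5_000

# cl100k_base counts copied from the per-module table above.
MODULE_TOKENS = {
    "ilo-language": 994,
    "ilo-builtins": 830,
    "ilo-engines": 801,
    "ilo-agent": 851,
    "ilo-tools": 745,
    "ilo-errors": 685,
}


def budget_violations(counts, per_module_cap=PER_MODULE_CAP, aggregate_cap=AGGREGATE_CAP):
    """Return human-readable violations; an empty list means the budget holds."""
    problems = [
        f"{name}: {tokens} tokens exceeds per-module cap {per_module_cap}"
        for name, tokens in counts.items()
        if tokens > per_module_cap  # assumption: hitting the cap exactly still passes
    ]
    total = sum(counts.values())
    if total > aggregate_cap:
        problems.append(f"total: {total} tokens exceeds aggregate cap {aggregate_cap}")
    return problems


print(sum(MODULE_TOKENS.values()))       # prints 4906
print(budget_violations(MODULE_TOKENS))  # prints []
```

A CI step built on this would exit non-zero whenever the returned list is non-empty.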

What's in the diff

8 commits, one per logical step:

add ilo-language modular skill - core syntax: prefix notation, types, guards, match, pipes, records, Results, lambdas
add ilo-builtins modular skill - signatures + examples for math, text, list, HOF, map, IO, HTTP, JSON, time
add ilo-errors modular skill - ILO-X### codes with one-line cause + fix, JSON diagnostic shape
add ilo-tools modular skill - tool keyword, MCP/HTTP providers, runtime flow
add ilo-engines modular skill - tree/VM/JIT/AOT backend selection, feature matrix
add ilo-agent modular skill - skill discovery, run/check invocation, repair loop, serv mode
wire `ilo skill list/get/path/show` and update marketplace - CLI subcommand, marketplace.json lists all six
enforce modular-skill token budget in CI and Rust tests - tiktoken script + Rust structural tests

ilo -ai still emits the full concatenated compact spec for back-compat, so existing agents that load everything keep working. The new ilo skill get <name> surface is the recommended path for token-sensitive agents.

skills/ilo/SKILL.md stays in place (still auto-regenerated from SPEC.md by build.rs) as the legacy single-skill entry point. The six modules sit alongside it. Retiring SKILL.md would require unwinding ~3 dozen test references and is deferred to Phase 1b.

Test plan

Follow-ups

Phase 1b: retire skills/ilo/SKILL.md and rewire the ~3 dozen test references to point at the modular files, or replace SKILL.md with a thin router that links to the six.
Phase 2 (closed-loop benchmark): verify in a real agent session that the descriptions route reliably and that the agent loads 1-2 modules per task, not all six.

First of six modules splitting the monolithic ai.txt into Anthropic Skill format. Covers the core syntax + semantics reference: prefix notation, type sigils, function decls, guards, match, pipes, records, Results. Self-contained: an agent can write or review .ilo source from this module alone. 994 tokens (cl100k_base), under the 1,000-token per-module cap. Builtin signatures, error codes, tool decls, and engine choice live in the other five modules.

Signatures and one-line examples for the most-used builtins: math, text, list, HOF, map, I/O, HTTP, JSON, time. No prose explanation; that lives in SPEC.md and the docs site. 830 tokens. Includes the asin/acos/sqrt/log silent-NaN gotcha because validating at the call boundary belongs with the call list, not buried in errors.

ILO-L###/P###/T###/R### codes with one-line cause + fix, plus the JSON diagnostic shape so an agent can route on code and patch at span without loading the long-form explanation. 685 tokens. Long-form explanations stay behind ilo --explain ILO-XXXX.

How to declare and call external tools (HTTP, MCP) from ilo: tool keyword, provider config, discovery commands, runtime flow, failure handling. 745 tokens. Most programs don't touch tools, so this module is loaded only when the task does.

Backend selection (tree, VM, JIT, AOT): when each matters, feature matrix, benchmark + compile invocation. VM has been the default since v0.11 (#390); JIT and tree are opt-in. 801 tokens. Most agents rarely need this module, so it is kept small and should be loaded only when picking an execution mode actually matters.

The agent integration surface: skill discovery, running programs, reading JSON diagnostics, the repair loop, top-level Result contract, serv mode. Teaches an agent how to USE ilo rather than how the language works. 851 tokens. This is the entry-point skill an agent loads first; it then pulls the others on demand.

Bundles the six skill modules into the binary via include_str! so they travel with every install and stay version-locked to the compiler. The new `ilo skill` subcommand is the canonical surface for granular loading:

    ilo skill list           name + description per skill
    ilo skill get <name>     full content (stdout)
    ilo skill path <name>    bundled filesystem path
    ilo skill show <name>    content with a formatted header
    exit 1 + 'unknown skill' on bad names

`ilo -ai` still emits the full concatenated compact spec for back-compat, so existing agents that load everything keep working.

marketplace.json grows a `skills` array listing all six so Claude Code's skill discovery picks them up automatically when the plugin is installed.
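
From the agent side, granular loading might look like the sketch below. It relies only on the behaviour described above (`ilo skill get <name>` writes the module to stdout; a bad name exits non-zero with 'unknown skill'). The helper names and the injectable `runner` parameter are hypothetical, and whether the error message lands on stdout or stderr is not stated in the PR, so the sketch reads both.

```python
import subprocess


def skill_get_argv(name: str) -> list[str]:
    """Build the command line for fetching one skill module."""
    return ["ilo", "skill", "get", name]


def load_skill(name: str, runner=subprocess.run) -> str:
    """Return the module text, or raise LookupError if the CLI rejects the name.

    `runner` is injectable so the helper can be exercised without ilo installed.
    """
    result = runner(skill_get_argv(name), capture_output=True, text=True)
    if result.returncode != 0:
        # The PR only says "exit 1 + 'unknown skill'"; the stream is an assumption.
        message = (result.stderr or result.stdout).strip()
        raise LookupError(f"ilo skill get {name} failed: {message}")
    return result.stdout
```

An agent would typically call `load_skill("ilo-agent")` first and fetch further modules only when the task requires them.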

Two layers of defence so the per-task economics don't regress:

1. scripts/check-skill-tokens.py uses tiktoken cl100k_base, the same tokeniser an agent host uses for context-window sizing, and is the canonical 1,000-tokens-per-module / 5,000-tokens-total gate.
2. tests/skill_modular.rs runs inside cargo test, no Python required; uses a conservative byte proxy plus structural checks (every file exists, every frontmatter valid, every description starts with 'Use this when', name matches filename, marketplace lists them all).

The Rust test catches drift even when CI is bypassed; the Python step catches the edge cases the byte proxy can't see.
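
The structural checks are implemented in Rust in tests/skill_modular.rs; the Python sketch below only illustrates the kind of rules listed above. The 4,000-byte proxy threshold, the function names, and the sample description text are hypothetical, and the frontmatter parser is a deliberately minimal `key: value` reader rather than a full YAML parser.

```python
DESCRIPTION_PREFIX = "Use this when"
MAX_BYTES = 4_000  # hypothetical proxy: roughly 4 bytes per token against the 1,000-token cap


def parse_frontmatter(text):
    """Return key/value pairs between the leading '---' fences, or None if absent."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return None  # closing fence never found


def check_skill(expected_name, text):
    """Return a list of structural problems; an empty list means the module passes."""
    fields = parse_frontmatter(text)
    if fields is None:
        return ["missing or unterminated frontmatter"]
    problems = []
    if fields.get("name") != expected_name:
        problems.append(f"name {fields.get('name')!r} does not match {expected_name!r}")
    if not fields.get("description", "").startswith(DESCRIPTION_PREFIX):
        problems.append(f"description does not start with {DESCRIPTION_PREFIX!r}")
    if len(text.encode("utf-8")) > MAX_BYTES:
        problems.append(f"file exceeds {MAX_BYTES} bytes")
    return problems


# Hypothetical module text, not the real ilo-errors skill.
sample = (
    "---\n"
    "name: ilo-errors\n"
    "description: Use this when a diagnostic code needs decoding.\n"
    "---\n"
    "# body\n"
)
print(check_skill("ilo-errors", sample))  # prints []
```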

codecov · 2026-05-18T19:43:16Z

Codecov Report

❌ Patch coverage is 0% with 46 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
src/main.rs	0.00%	46 Missing ⚠️

danieljohnmorris added 8 commits May 18, 2026 20:38

add ilo-errors modular skill

9f5e60c

add ilo-tools modular skill

88d5181

Merge remote-tracking branch 'origin/main' into feature/modular-skills

26494f4

danieljohnmorris merged commit f596b08 into main May 18, 2026
4 checks passed

danieljohnmorris deleted the feature/modular-skills branch May 18, 2026 21:19

danieljohnmorris mentioned this pull request May 19, 2026

skills CLI Phase 2: add ilo-examples, ilo-edit-loop, thin-bootstrap SKILL.md #425

Merged

7 tasks

danieljohnmorris merged 9 commits into main from feature/modular-skills