Compile API specs into token-efficient contracts for LLM agents.
LLMs waste thousands of tokens parsing bloated API specs. Stripe's OpenAPI spec is 1M+ tokens — mostly YAML scaffolding, nested wrappers, and repeated schemas. LAP compiles any API spec into a typed, flat format that preserves every endpoint, parameter, and constraint in a fraction of the tokens.
Not minification — a purpose-built compiler with its own grammar.
In benchmarks, LAP Lean scored 0.851 (vs. 0.825 for raw specs) at 35% lower cost and in 29% less time: accuracy at least on par, with far fewer tokens.
Full benchmark report (500 runs, 50 specs, 5 formats) · Benchmark methodology and data
```bash
# Try without installing
npx @lap-platform/lapsh compile api.yaml

# Or install
pip install lapsh
lapsh compile api.yaml -o api.lap
```

- 🗜️ 5.2× median compression on OpenAPI, up to 39.6× on large specs: 35% cheaper, 29% faster (benchmarks)
- 📐 Typed contracts: `enum(a|b|c)`, `str(uuid)`, `int=10` prevent agent hallucination
- 🔌 6 input formats: OpenAPI, GraphQL, AsyncAPI, Protobuf, Postman, Smithy
- 🎯 Zero information loss: every endpoint, param, and type constraint preserved
- 🔁 Round-trip: convert back to OpenAPI with `lapsh convert`
- 📦 Registry: browse and install pre-compiled specs at lap.sh
- 🤖 Skill generation: `lapsh skill` creates agent-ready skills from any spec
- 🔗 Integrations: LangChain, CrewAI, OpenAI function calling, MCP
Five compression stages, each targeting a different source of token waste:
| Stage | What it does | Savings |
|---|---|---|
| Structural removal | Strip YAML scaffolding: `paths:`, `requestBody:`, `schema:` wrappers vanish | ~30% |
| Directive grammar | `@directives` replace nested structures with flat, single-line declarations | ~25% |
| Type compression | `type: string, format: uuid` → `str(uuid)` | ~10% |
| Redundancy elimination | Shared fields extracted once via `@common_fields` and `@type` | ~20% |
| Lean mode | Strip descriptions; LLMs infer meaning from well-named parameters | ~15% |
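
To make the Type compression stage concrete, here is a minimal Python sketch of how a schema fragment could collapse into a compact type string. It is an illustration under stated assumptions, not LAP's actual compiler code: the function name `compress_type` is hypothetical, and only the `str`, `int`, and `enum(...)` spellings are taken from this README (the `num` and `bool` short names are guesses).

```python
def compress_type(schema: dict) -> str:
    """Collapse an OpenAPI-style schema fragment into a compact type string."""
    # Short names: "str" and "int" appear in this README; "num" and "bool" are assumed.
    short = {"string": "str", "integer": "int", "number": "num", "boolean": "bool"}

    if "enum" in schema:
        # An enum lists its allowed values inline, separated by "|".
        compact = "enum(" + "|".join(str(v) for v in schema["enum"]) + ")"
    else:
        raw = schema.get("type", "any")
        base = short.get(raw, raw)
        fmt = schema.get("format")
        # A format becomes a parenthesised qualifier: string + uuid -> str(uuid).
        compact = f"{base}({fmt})" if fmt else base

    if "default" in schema:
        # A default value is appended after "=".
        compact += f"={schema['default']}"
    return compact


print(compress_type({"type": "string", "format": "uuid"}))         # str(uuid)
print(compress_type({"type": "integer", "default": 10}))           # int=10
print(compress_type({"type": "string", "enum": ["a", "b", "c"]}))  # enum(a|b|c)
```

Each multi-line YAML fragment becomes a single token-cheap expression, which is where this stage's savings would come from.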
162 specs · 5,228 endpoints · 4.37M → 423K tokens
| Format | Specs | Median | Best |
|---|---|---|---|
| OpenAPI | 30 | 5.2× | 39.6× |
| Postman | 36 | 4.1× | 24.9× |
| Protobuf | 35 | 1.5× | 60.1× |
| AsyncAPI | 31 | 1.4× | 39.1× |
| GraphQL | 30 | 1.3× | 40.9× |
Verbose formats compress most — they carry the most structural overhead. Already-concise formats like GraphQL still benefit from type deduplication.
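
As a quick sanity check on the headline totals, the aggregate ratio can be recomputed from the rounded figures in the summary line (4.37M → 423K tokens). The variable names below are only illustrative.

```python
tokens_before = 4_370_000  # 4.37M, as rounded in the summary line
tokens_after = 423_000     # 423K

ratio = tokens_before / tokens_after
reduction = 1 - tokens_after / tokens_before

print(f"aggregate compression: {ratio:.1f}x")  # aggregate compression: 10.3x
print(f"token reduction: {reduction:.1%}")     # token reduction: 90.3%
```

The aggregate figure is higher than any per-format median, presumably because it is weighted by token count, so the largest specs (which compress the most) dominate the total.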
LAP is more than a compiler:
| Component | What | Command |
|---|---|---|
| Compiler | Any spec → `.lap` | `lapsh compile api.yaml` |
| Registry | Browse & install pre-compiled specs | `lapsh skill-install stripe` |
| Skill Generator | Create agent-ready skills from any spec | `lapsh skill api.yaml --install` |
| API Differ | Detect breaking API changes | `lapsh diff old.lap new.lap` |
| Round-trip | Convert LAP back to OpenAPI | `lapsh convert api.lap -f openapi` |
| Publish | Share specs to the registry | `lapsh publish api.yaml --provider acme` |
```bash
lapsh compile api.yaml          # OpenAPI 3.x / Swagger
lapsh compile schema.graphql    # GraphQL SDL
lapsh compile events.yaml       # AsyncAPI
lapsh compile service.proto     # Protobuf / gRPC
lapsh compile collection.json   # Postman v2.1
lapsh compile model.smithy      # AWS Smithy
```

Format is auto-detected. Override with `-f openapi|graphql|asyncapi|protobuf|postman|smithy`.
```python
# LangChain
from lap.integrations import LAPLoader
docs = LAPLoader("stripe.lap").load()

# OpenAI function calling
from lap.integrations import to_openai_functions
functions = to_openai_functions("stripe.lap")
```

Also available: a CrewAI tool, an MCP server, and a compression proxy. See the integration docs.
How is this different from OpenAPI?
LAP doesn't replace OpenAPI — it compiles FROM it. Like TypeScript → JavaScript: you keep your OpenAPI specs, your existing tooling, everything. LAP adds a compilation step for the LLM runtime.
How is this different from MCP?
MCP defines how agents discover and invoke tools (the plumbing). LAP compresses the documentation those tools expose (the payload). They're complementary — LAP can compress MCP tool schemas.
Why not just minify the JSON?
Minification removes whitespace — that's ~10% savings. LAP performs semantic compression: flattening nested structures, deduplicating schemas, compressing type declarations, and stripping structural overhead. That's 5-40× savings. Different class of tool.
What about prompt caching?
Use both. Compress with LAP first, then cache the compressed version. LAP reduces the first-call cost and frees context window space. Caching reduces repeated-call cost. They stack.
Will LLMs understand this format?
Yes. LAP uses conventions LLMs already know — @directive syntax, {name: type} notation, HTTP methods and paths. In blind tests, agents produce identical correct output from LAP and raw OpenAPI. The typed contracts actually reduce hallucination.
What if token costs keep dropping?
Cost is the least important argument. The core value is typed contracts: enum(succeeded|pending|failed) prevents hallucinated values regardless of token price. Plus: formal grammar (parseable by code, not just LLMs), schema diffing, and faster inference from fewer input tokens.
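
The point that typed contracts are "parseable by code" can be illustrated with a short Python sketch that checks an agent-produced value against an enum declaration. This is a hypothetical validator written for this README, not part of the `lapsh` package; the helper names `allowed_values` and `is_valid` are assumptions, and the only LAP syntax it relies on is the `enum(a|b|c)` form shown above.

```python
import re

# Matches a declaration such as "enum(succeeded|pending|failed)".
ENUM_RE = re.compile(r"^enum\(([^()]*)\)$")


def allowed_values(type_decl: str) -> list[str]:
    """Return the values listed in an enum(...) declaration."""
    match = ENUM_RE.match(type_decl.strip())
    if match is None:
        raise ValueError(f"not an enum declaration: {type_decl!r}")
    return match.group(1).split("|")


def is_valid(type_decl: str, value: str) -> bool:
    """Check whether a value is one of the enum's declared members."""
    return value in allowed_values(type_decl)


status_type = "enum(succeeded|pending|failed)"
print(is_valid(status_type, "pending"))    # True
print(is_valid(status_type, "completed"))  # False
```

A check like this can reject a hallucinated value such as `completed` before the request is ever sent, independent of what tokens cost.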
See CONTRIBUTING.md. The test suite has 545 tests across 177 example specs in 6 formats.
```bash
git clone https://github.com/Lap-Platform/lap.git
cd lap
pip install -e ".[dev]"
pytest
```

Apache 2.0. See NOTICE for attribution.
lap.sh · Built by the LAP team

