A protocol adapter that turns a locally-executed Skill into GEP (Genome Evolution Protocol) assets:
- Gene -- a compact strategy template (`signals_match` + `strategy` + `AVOID:` + `validation`)
- Capsule -- an auditable record of one real execution of that Gene (`outcome` + `execution_trace` + `env_fingerprint`)
Skills are written for humans; GEP assets are written for model execution. skill2gep takes a Skill plus the evidence of one concrete run (trace, blast radius, exit codes) and packages both sides into GEP. Capsules fabricated from the document alone are rejected by the validator -- the Capsule only exists when a real execution backs it.
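As a rough illustration of that rule, the sketch below packages a Gene and an execution record into a Capsule and returns `null` when no trace is supplied. The function name `packageCapsule`, the `gene_id` field, the contents of `env_fingerprint`, the placement of `AVOID:` entries inside `strategy`, and all values are assumptions made for illustration. Only the top-level field names come from the description above, and the real GEP schema may differ.

```javascript
// Hypothetical sketch: not the actual skill2gep implementation.
// A Gene candidate built from the parts named above; every value is invented.
const gene = {
  type: "Gene",
  id: "gene_example_fix_failing_test", // hypothetical id
  signals_match: ["log_error"],
  strategy: [
    "Reproduce the failure with the narrowest test",
    "AVOID: editing the test to make it pass", // assumed placement of AVOID: entries
  ],
  validation: ["node --test test/foo.test.js"],
};

// Package a Capsule only when a real execution trace is supplied.
function packageCapsule(gene, execution) {
  const trace =
    execution && Array.isArray(execution.trace) ? execution.trace : [];
  if (trace.length === 0) return null; // no real run, no Capsule
  return {
    type: "Capsule",
    gene_id: gene.id, // assumed name for the Gene reference
    outcome: { status: execution.status, score: execution.score },
    execution_trace: trace,
    // Assumed fingerprint contents; the real field may record more.
    env_fingerprint: { platform: process.platform, node: process.version },
  };
}

const capsule = packageCapsule(gene, {
  status: "success",
  score: 0.9,
  trace: [{ step: 1, cmd: "node --test test/foo.test.js", exit: 0 }],
});
console.log(capsule.outcome.status); // prints "success"
console.log(packageCapsule(gene, undefined)); // prints null
```

Returning `null` rather than an empty Capsule keeps the "no run, no Capsule" rule visible at the call site.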
Based on Wang, Ren, Zhang, From Procedural Skills to Strategy Genes: Towards Experience-Driven Test-Time Evolution (Infinite Evolution Lab / EvoMap x Tsinghua University), arXiv:2604.15097.
Read this before you assume the tool does more than it does.
- This is a protocol adapter, not a magic distiller. Phases 1-8 of the workflow (read Skill, dedup, draft candidate Genes, score) are still executed by the calling agent against the SKILL.md text. `skill2gep` formats the result into GEP schema, runs hard-gate validators, and ships it. Output quality tracks the quality of the source Skill and of the execution evidence the agent provides.
- The paper's empirical scope is narrow. arXiv:2604.15097 validates Gene-as-control-interface vs. procedural Skills on 45 scientific code-solving scenarios with Gemini 3.1 Pro and Flash Lite. Claims about other agent domains (web automation, long tool chains, multi-agent negotiation, human support workflows) are extrapolation and are flagged as such in the Gene's `_source.claims_outside_scope: "assumption"` field.
- "Gene is better than Skill" is not the same as "skill2gep reliably produces high-quality Genes." The paper shows the target format outperforms procedural Skills on its task set. This tool tries to produce that format from an arbitrary Skill; whether the resulting Gene is high-quality on your workload is an empirical question that depends on your Skill, your execution trace, and the EvoMap community consensus review.
When you publish to EvoMap, community validators score the asset. Low-quality Genes get rejected or down-weighted. Think of skill2gep as the submission pipeline, not the grader.
Inside any Cursor-compatible agent session:
Apply the skill2gep skill to ~/.cursor/skills/<some_other_skill>
The agent will then:
- Read and analyze the source Skill
- Dedup against local + community Gene pools
- Draft candidate Genes, split by dimension
- Hard-gate validate each candidate (schema + dry-run + scenario replay)
- Install accepted Genes into the local Gene pool
- (Optional Path B) Run an accepted Gene on a real scenario and collect a Capsule
- (Optional Phase 10) Publish the Skill, Gene, or Capsule to EvoMap
See SKILL.md for the full 10-phase workflow and all the hard rules.
Clone this repository into your Cursor skills folder:
git clone https://github.com/EvoMap/skill2gep.git ~/.cursor/skills/skill2gep
The two validator scripts are zero-dependency Node.js programs. They require Node >= 18.
If you have @evomap/evolver installed, you can run the reverse-distillation pipeline directly, with sandboxed storage and forgery guards, without involving an agent loop:
# Minimal: Gene-only, no publish
evolver skill2gep ./path/to/skill --no-publish
# With real execution evidence -> Gene + Capsule
evolver skill2gep ./path/to/skill \
--execution=./skill-run.json \
--platform=cursor
# Strict mode: refuse Skills whose validation is not node/npm/npx
# (GEP only executes those prefixes; non-runnable validations would silently
# fall back to `node --version`, which defeats real coverage)
evolver skill2gep ./path/to/skill --strict
Execution JSON shape:
{
"status": "success",
"score": 0.9,
"started_at": "2026-04-21T00:00:00Z",
"trace": [{ "step": 1, "cmd": "node --test test/foo.test.js", "exit": 0 }],
"blast_radius": { "files": 3, "lines": 42 },
"trigger": ["log_error"],
"signals": ["log_error"]
}
Skills are fine prose. At execution time, agents pay for every token they read. GEP splits the experience into two addressable asset types:
| Asset | Serves | Derived from |
|---|---|---|
| Gene | Model execution | An abstracted strategy template |
| Capsule | Audit and reproducibility | Gene + at least one real run |
The paper reports that Gene-as-control-interface outperforms procedural Skills on its 45-scenario benchmark; the rationale is that a Gene encodes the minimum decision surface an agent needs at runtime, without narrative scaffolding. Whether that generalizes to your workload is the empirical question that the scope notes above address. What the adapter itself guarantees is:
- every published Gene carries its source Skill hash and the paper scope disclaimer in `_source.paper_scope`,
- every published Capsule is backed by a real `execution_trace` whose exit codes cover the Gene's declared `validation`,
- fabricated successes are rejected locally before they ever reach the hub.
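The first guarantee can be pictured with a minimal sketch that stamps a Gene's `_source` block with a hash of the Skill text. The helper name `buildSource`, the `skill_sha256` field name, the choice of SHA-256, and the placeholder wording of `paper_scope` are all assumptions; the text above only states that a source Skill hash and a scope disclaimer are carried.

```javascript
// Hypothetical sketch: field names other than paper_scope and
// claims_outside_scope are invented for illustration.
const { createHash } = require("node:crypto");

function buildSource(skillText) {
  return {
    // Assumed algorithm and field name for the source Skill hash.
    skill_sha256: createHash("sha256").update(skillText, "utf8").digest("hex"),
    // Placeholder wording; the actual disclaimer text is not given here.
    paper_scope: "arXiv:2604.15097, 45 scientific code-solving scenarios",
    claims_outside_scope: "assumption",
  };
}

const source = buildSource("abc");
console.log(source.skill_sha256.length); // prints 64
```

Hashing the full Skill text means any later edit to the Skill produces a different stamp, so a Gene can be traced back to the exact version it was derived from.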
Skills and Genes are not replacements for each other. The Skill keeps serving human readers; the Gene serves model execution. They coexist.
# Check a Gene candidate file (JSON array of Gene objects)
node scripts/validate_gene.js candidates.json
# Check a Capsule file; pass the Gene pool to enable dangling-reference
# and validation-coverage checks
node scripts/validate_capsule.js capsules.json \
--genes-jsonl path/to/genes.jsonl
Both validators enforce:
- Required fields, correct `type` tag, id regex, category enum
- At least one `AVOID:` entry in every Gene (unless explicitly waived) and a real `execution_trace` with integer exit codes in every Capsule
- No private-path literals leaked (`/home/<user>`, internal repo names, etc.)
- Token ceilings (Gene <= 500 estimated tokens)
- Capsule forgery guard: `outcome.status=success` with zero blast radius and no execution trace is rejected
- Capsule validation coverage: every `validation` command declared by the referenced Gene must appear in `execution_trace` with a numeric exit code
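The last two rules can be sketched as follows. This is not the code in `scripts/validate_capsule.js`: the function name `checkCapsule`, the error strings, the `blast_radius` location on the Capsule, and the `cmd` / `exit` step fields (borrowed from the Execution JSON shape) are assumptions for illustration.

```javascript
// Hypothetical sketch of the forgery guard and the validation-coverage check.
function checkCapsule(capsule, gene) {
  const errors = [];
  const trace = Array.isArray(capsule.execution_trace)
    ? capsule.execution_trace
    : [];
  const blast = capsule.blast_radius || {}; // assumed location of blast radius
  const blastIsZero = !blast.files && !blast.lines;
  const status = capsule.outcome ? capsule.outcome.status : undefined;

  // Forgery guard: a claimed success with nothing touched and nothing run.
  if (status === "success" && blastIsZero && trace.length === 0) {
    errors.push("forgery_guard: success without blast radius or trace");
  }

  // Coverage: every declared validation command needs a traced integer exit code.
  for (const cmd of gene.validation || []) {
    const covered = trace.some(
      (step) => step.cmd === cmd && Number.isInteger(step.exit)
    );
    if (!covered) errors.push(`uncovered_validation: ${cmd}`);
  }
  return errors;
}

const exampleGene = { validation: ["node --test test/foo.test.js"] };
const realCapsule = {
  outcome: { status: "success" },
  blast_radius: { files: 3, lines: 42 },
  execution_trace: [{ step: 1, cmd: "node --test test/foo.test.js", exit: 0 }],
};
const forgedCapsule = {
  outcome: { status: "success" },
  blast_radius: { files: 0, lines: 0 },
  execution_trace: [],
};

console.log(checkCapsule(realCapsule, exampleGene).length); // prints 0
console.log(checkCapsule(forgedCapsule, exampleGene).length); // prints 2
```

The forged Capsule fails both checks: it trips the forgery guard and leaves the Gene's only validation command uncovered.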
Two surfaces, two different paths:
| Surface | What to publish | Endpoint |
|---|---|---|
| Skill Store (human-facing, downloadable) | The full SKILL.md plus bundled files | POST https://evomap.ai/a2a/skill/store/publish (spec) |
| GEP Hub (agent-facing evolution memory) | Individual Genes and Capsules | @evomap/evolver runtime + GEP MCP tools (spec) |
See SKILL.md Phase 10 for full payload shapes, moderation rules, and authentication.
MIT. See LICENSE.
- Paper: Wang, Ren, Zhang, From Procedural Skills to Strategy Genes: Towards Experience-Driven Test-Time Evolution. arXiv:2604.15097
- GEP Protocol
- Skill Store
- Evolver runtime