research(skills): Bilevel MCTS skill optimization — outer structure search + inner content refinement (arXiv:2604.15709) #4065

@bug-ops

Description

Formulates agent skill optimization as a bilevel problem. The outer loop runs Monte Carlo tree search (MCTS) over skill structure (which instructions and tools to include); the inner loop applies LLM-guided refinement to the content of each component within the chosen structure. Because a structure can only be scored after its content has been refined, MCTS works from delayed feedback to balance exploitation and exploration of edit paths.
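
The two loops can be sketched roughly as below. This is a minimal, deterministic illustration of the idea, not the paper's algorithm and not a Zeph API. The component names, the `Node` type, the toy `inner_refine` scoring (a stand-in for an LLM refinement call plus evaluation), and the UCB1 constant are all assumptions made for the example.

```python
import math
from dataclasses import dataclass, field

# Hypothetical skill components; a "structure" is the subset that is included.
COMPONENTS = ("plan", "tool:search", "tool:shell", "examples")


@dataclass(eq=False)
class Node:
    structure: frozenset
    parent: "Node | None" = None
    children: dict = field(default_factory=dict)  # toggled component -> Node
    visits: int = 0
    total: float = 0.0


def ucb1(child, parent_visits, c=1.4):
    """Mean reward plus an exploration bonus (standard UCB1)."""
    if child.visits == 0:
        return math.inf
    bonus = c * math.sqrt(math.log(parent_visits) / child.visits)
    return child.total / child.visits + bonus


def inner_refine(structure, rounds=3):
    """Inner loop stand-in: refine content inside a fixed structure.

    Assumption: each component has a made-up usefulness, and every
    refinement round closes half the gap to the structure's ceiling.
    A real system would call an LLM and an evaluator here.
    """
    usefulness = {"plan": 0.4, "tool:search": 0.3, "examples": 0.2}
    ceiling = sum(usefulness.get(name, -0.1) for name in structure)
    quality = 0.0
    for _ in range(rounds):
        quality += 0.5 * (ceiling - quality)
    return quality


def search(root, iterations=200, max_depth=4):
    """Outer loop: MCTS over structures; each edit toggles one component."""
    best = (inner_refine(root.structure), root.structure)
    for _ in range(iterations):
        node, depth = root, 0
        while depth < max_depth:
            if len(node.children) < len(COMPONENTS):
                # Expansion: try the next untried toggle from this node.
                comp = next(c for c in COMPONENTS if c not in node.children)
                child = Node(node.structure ^ {comp}, parent=node)
                node.children[comp] = child
                node = child
                break
            # Selection: follow the child with the highest UCB1 score.
            node = max(node.children.values(),
                       key=lambda ch: ucb1(ch, node.visits))
            depth += 1
        # Delayed feedback: the reward exists only after inner refinement.
        reward = inner_refine(node.structure)
        best = max(best, (reward, node.structure), key=lambda t: t[0])
        # Backpropagation along the edit path.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return best


if __name__ == "__main__":
    reward, structure = search(Node(frozenset()))
    print(sorted(structure), round(reward, 3))
```

Toggling a single component per edge keeps the branching factor equal to the number of components. A real structure space would likely need richer edits such as reordering or splitting a skill.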

Relevance to Zeph

zeph-skills has a self-learning path (hot-reload, embedding evolution) but no principled structure-space search. Bilevel MCTS could drive autonomous skill restructuring in response to performance signals from zeph-experiments.

Proposed Action

Evaluate MCTS-guided skill structure optimization as a post-hoc improvement layer on top of the existing SkillEvolution pipeline. Start with a small skill library and measure quality delta.
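
One possible way to measure the quality delta is a paired comparison per skill, before and after restructuring. The sketch below assumes each skill already has a scalar quality score from some evaluation run. The function name, the score dictionaries, and the returned fields are hypothetical and do not reflect the zeph-experiments API.

```python
from statistics import mean


def quality_delta(baseline, optimized):
    """Compare per-skill scores for skills present in both runs.

    `baseline` and `optimized` map skill name -> quality score
    (hypothetical inputs; higher is assumed to be better).
    """
    shared = sorted(baseline.keys() & optimized.keys())
    if not shared:
        raise ValueError("no skills in common between the two runs")
    deltas = {name: optimized[name] - baseline[name] for name in shared}
    return {
        "n": len(shared),
        "mean_delta": mean(deltas.values()),
        "wins": sum(d > 0 for d in deltas.values()),
        "losses": sum(d < 0 for d in deltas.values()),
        "per_skill": deltas,
    }
```

Reporting wins and losses next to the mean helps on a small skill library, where a single large outlier can dominate the average.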

Reference

arXiv:2604.15709

Metadata

Labels

research (Research-driven improvement), skills (zeph-skills crate)
