Fix/issue 30 docs and template by Yif-Yang · Pull Request #32 · microsoft/SkillOpt

Yif-Yang · 2026-06-01T20:20:09Z

Summary

Issue #30 reported that docs/reference/api.md, docs/guide/new-benchmark.md,
and skillopt/envs/_template/ all described a fictional EnvAdapter API
(execute / evaluate / build_prompt + DataItem / TaskResult +
BENCHMARK_REGISTRY + skillopt.data.base) that does not exist in the
codebase. Following the guide ended in ImportError, TypeError: Can't instantiate abstract class, or ValueError: Unknown environment.

This PR rewrites the docs + template to match the real ABC in
skillopt/envs/base.py and skillopt/datasets/base.py. No production
code is touched.

Changes

docs/reference/api.md — replaces the fictional API with the real 5
abstract methods (build_train_env / build_eval_env / rollout / reflect / get_task_types), correct BatchSpec shape, real
SplitDataLoader contract, and a note that registration lives in
_ENV_REGISTRY (in scripts/train.py), not a BENCHMARK_REGISTRY.
docs/guide/new-benchmark.md — full worked docfaithful example
including loader, rollout helper (using skillopt.model.chat_target),
adapter, registration block, and a _base_: string-form config.
skillopt/envs/_template/ — template now subclasses the real ABCs
and is instantiable out of the box.

Verified

pytest tests/ → 60/60 passed
python -c "from skillopt.envs._template.env_template import TemplateBenchmarkEnv; TemplateBenchmarkEnv()" succeeds
python scripts/train.py --help unchanged
Cross-reviewed twice with codex exec --sandbox read-only (second pass: APPROVE)

Closes #30.

…ter ABC docs/reference/api.md previously documented a fictional EnvAdapter API (execute / evaluate / build_prompt + DataItem / TaskResult) and a BENCHMARK_REGISTRY that never existed in code. Anyone following the documented contract would hit ImportError or TypeError on the first instantiation. Replace both pages with the real shape from skillopt/envs/base.py and skillopt/datasets/base.py: - EnvAdapter: build_train_env, build_eval_env, rollout, reflect, get_task_types (the 5 actual abstract methods). - Rollout dicts: id / hard / soft required; everything else preserved into RolloutResult.extras. - Reflect dicts: {patch, source_type} schema as consumed by run_minibatch_reflect. - BatchSpec: slotted-but-mutable dataclass matching the actual definition (payload defaults to None, metadata to dict()). - SplitDataLoader.load_split_items as the one mandatory loader method. - Registry: _ENV_REGISTRY in scripts/train.py (lazy try/except ImportError block), not a non-existent BENCHMARK_REGISTRY in skillopt/envs/__init__.py. - _base_: documented as a string path, since the current YAML loader only accepts strings. The new-benchmark.md guide now walks through a docfaithful worked example with a real rollout helper (chat_target + scorer) instead of hand-waving over the rollout step. Refs microsoft#30. Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

The shipped env_template.py and loader_template.py described the same fictional async execute / evaluate / build_prompt API documented in docs/reference/api.md. As a result TemplateBenchmarkEnv(cfg) raised 'TypeError: Can't instantiate abstract class' for every copy-and-paste user who followed the in-tree scaffold. Rewrite the template so it's a working starting point: - env_template.py: TemplateBenchmarkEnv(EnvAdapter) now implements all five real abstract methods (build_train_env, build_eval_env, rollout, reflect, get_task_types) with no-op defaults documented as TODO. Instantiable today; pytest 60/60 still passes. - loader_template.py: TemplateBenchmarkLoader(SplitDataLoader) implements load_split_items for .json / .jsonl input and explains the optional load_raw_items override for split_mode="ratio". - README.md: usage steps now point at scripts/train.py's _ENV_REGISTRY (the real registry) instead of a non-existent BENCHMARK_REGISTRY in skillopt/envs/__init__.py, and link to the rewritten new-benchmark guide. - config_template.yaml: _base_ is a string path (not a list, which the loader rejects); skill_init is commented out with a note so the template config doesn't reference a file the user hasn't created. Verified locally: 'from skillopt.envs._template.env_template import TemplateBenchmarkEnv; TemplateBenchmarkEnv()' succeeds. Refs microsoft#30. Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

Yif-Yang and others added 2 commits June 1, 2026 20:14

Cuzyoung merged commit 25da7cb into microsoft:main Jun 2, 2026
1 check was pending

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix/issue 30 docs and template#32

Fix/issue 30 docs and template#32
Cuzyoung merged 2 commits into
microsoft:mainfrom
Yif-Yang:fix/issue-30-docs-and-template

Yif-Yang commented Jun 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

Yif-Yang commented Jun 1, 2026

Summary

Changes

Verified

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants