v2.0.0
Tagline: comprehensive release — end-to-end coherent system, field-validated
by 20 independent agents across 4 perspectives × 5 scenarios, with measurable
improvements vs the v1.11.0 baseline on every cell of the matrix.
Highlights
- Tool surface collapsed 344 → 146 live (-58%) with 80 backward-compat
aliases + 78 deprecated aliases + 24 hard-removed (return `_REMOVED_TOOLS`
envelope). 25-family consolidation pass via the proven `_ALIAS_PARAM_INJECTION`
pattern; every legacy name keeps dispatching for the v2.0.x runway.
- `server.py` 7,499-line monolith dissolved into 32-module
`src/research_os/server/` package (largest module: `tool_definitions/meta.py`
at 579 lines, -92% from peak). Public API preserved end-to-end via
`__init__.py` re-exports.
- MCP `instructions` field on the `initialize` handshake names the
canonical per-turn sequence
(`sys_boot → tool_route → sys_protocol_get(format=summary) → sys_active_tools`)
at the protocol layer so any MCP client surfaces the right startup ritual.
- `sys_protocol_get` default `format` flipped `"full"` → `"summary"`:
the single biggest token-cost win (5–10× cheaper per-turn at ~300 tokens
vs ~1.5–3K). MAJOR-breaking; pass `format="full"` to opt back in.
- 5 CRAFT-inspired structural additions drove the rating lift beyond
surface cleanup: (1) audit-as-data (every audit emits a JSON companion +
`.audit_findings.jsonl` ledger queryable via
`tool_audit_findings(operation=query|diff)`); (2) drafter review-rewrite
loops on paper/slides/poster; (3) `research-os doctor` install + workspace
health checks; (4) `docs/CONTRACT.md` stable-surface promise;
(5) audience-segmented `docs/README.md` four-audience router.
- Validation: 20 agents × 4 perspectives × 5 scenarios. Mean
`final_rating` moved 6.35 → 7.70 (+1.35; +21%), total HIGH-friction
items 124 → 63 (-49%), first-5-turn HIGH 66 → 42 (-36%), deliverable
rate 11/20 → 14/20 (+15 pp). Every cell improved, no regressions.
Full report: `docs/V2_VALIDATION_REPORT.md`.
- YELLOW recommendation: ship with documented caveats. The Phase 15
GREEN gate targets (avg ≥ 9.5, HIGH ≤ 5, all four perspectives ≥ 9.0) are
not met; they were calibrated against a hypothetical v3-grade product.
Deeper structural gaps (domain-pack coverage for bioinformatics + systems
benchmarks; pack-aware audit gates) carry over to v2.0.x patch +
v2.1.0 minor per `docs/V2_RELEASE_NOTES.md` §"Deferred".
- Upgrade path: most projects work unchanged via alias dispatch.
Full instructions, breaking-change details, per-surface recipes, and the
complete old→new tool table at `docs/MIGRATION_v1_to_v2.md`.
- Two v2.0.1 BLOCKER regressions surfaced by Phase 15b re-validation
were fixed before tagging (commits `0c45b79` + `b3b24a0`):
`sys_tool_describe` `NameError` (`_resolve_tool_name` missing import
after the Phase 10 server-package split) and
`tool_audit(scope='synthesis', dimension='all')` bare `KeyError` on
`paper_path` (now defaults to `'synthesis/paper.md'`). `Server()` also
now reports the canonical `__version__` at the MCP initialize handshake.
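The `format` default flip is the one highlight that changes behaviour for callers who change nothing. A minimal sketch of the effect, assuming call arguments travel as a plain dict: `effective_format` and the two `*_DEFAULT_FORMAT` constants are hypothetical names used only for illustration (they are not part of the server), and the token figures are the approximate ones quoted in the highlight.

```python
# Illustrative only: models how an omitted `format` argument resolves
# before and after the default flip. None of these names exist in the server.
V1_DEFAULT_FORMAT = "full"
V2_DEFAULT_FORMAT = "summary"

# Approximate per-call token costs quoted in the release notes (low, high).
APPROX_TOKENS = {"summary": (300, 300), "full": (1500, 3000)}


def effective_format(arguments: dict, default: str) -> str:
    """Return the format a call resolves to, given the server-side default."""
    return arguments.get("format", default)


implicit_call = {"protocol_name": "guidance/project_startup"}
pinned_call = {"protocol_name": "guidance/project_startup", "format": "full"}

# A caller that never passed `format` silently moves from full to summary.
before = effective_format(implicit_call, V1_DEFAULT_FORMAT)
after = effective_format(implicit_call, V2_DEFAULT_FORMAT)

# A caller that pins format="full" sees no change across the upgrade.
pinned_after = effective_format(pinned_call, V2_DEFAULT_FORMAT)

# Ratio of full-payload cost to summary cost at both ends of the range.
low, high = APPROX_TOKENS["full"]
summary_cost = APPROX_TOKENS["summary"][0]
savings_range = (low / summary_cost, high / summary_cost)
print(before, after, pinned_after, savings_range)
```

The practical consequence is that only call sites that rely on the full YAML need editing; everything else gets the cheaper view automatically.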
Added
- `tool_audit(scope=, dimension=)` unified per-dimension audit dispatcher and `tool_audit_findings(operation=query|diff)` ledger reader. Both share the existing `_ALIAS_PARAM_INJECTION` / `_DEPRECATED_ALIASES` machinery so the prior tool surface (`tool_audit_synthesis`, `tool_audit_step_completeness`, `tool_audit_findings_query`, etc.) keeps dispatching with the legacy behaviour preserved end-to-end. `tool_audit_quality_full` stays separate as the canonical aggregator.
- `tool_dashboard(operation=create|story_generate|story_edit|story_quality_bar|reviewer_sim|test_generate|test_run)` unified dashboard dispatcher. Single entry point for the seven previously per-operation dashboard tools; each legacy name routes through alias + param injection so existing callers, scripts, and protocols keep working unchanged.
- `tool_step(operation=iterate|iterations_list|revision_options|env_lock)` unified step-lifecycle dispatcher and `tool_step_pipeline(operation=define|run|status|diagram)` unified step sub-task pipeline dispatcher. Single entry points for the eight previously per-operation step tools; each legacy name routes through alias + param injection so existing callers, scripts, and protocols keep working unchanged. `tool_step_complete` stays standalone as the top-level end-of-step bundle.
- `tool_lessons(operation=record|consult|failure_record|failure_check|failure_list|dead_end|mistake_replay)` extended dispatcher for the entire "what went wrong / what did we learn" family, and `tool_reliability(operation=log_event|report)` unified reliability-log dispatcher. Single entry points for the seven lessons / failure-memory / dead-end / mistake-replay tools and the two reliability-log tools; each legacy name routes through alias + param injection so existing callers, scripts, and protocols keep working unchanged.
- `tool_sensitivity(operation=define|run)` unified sensitivity dispatcher and `tool_preregister(operation=freeze|diff)` unified preregistration dispatcher. Single entry points for the two previously per-operation sensitivity tools and the two previously per-operation preregister tools; each legacy name routes through alias + param injection so existing callers, scripts, and protocols keep working unchanged.
- `tool_reviewer(operation=simulate|response|rebuttal|compile)` unified reviewer-response dispatcher. Single entry point for the four previously standalone reviewer-response scaffold tools (`tool_reviewer_simulate`, `tool_response_to_reviewers`, `tool_rebuttal_draft`, `tool_reviewer_response_compile`); each legacy name routes through alias + param injection so existing callers, scripts, and protocols keep working unchanged.
- `tool_data(operation=sample|profile|convert)`, `tool_figure(operation=palette|caption_synthesise|interactive_autogen|paper_autoembed)`, and `tool_thought(operation=log|trace)` unified dispatchers. Single entry points for the three data tools, four figure helpers, and two ReAct trace tools previously exposed as per-operation surface; each legacy name routes through alias + param injection so existing callers, scripts, and protocols keep working unchanged.
- `tool_scratch(operation=write|run|list|clear)` and `tool_task(operation=run|status|list|kill)` unified dispatchers. Single entry points for the four scratch-sandbox tools and the four background-task tools previously exposed as per-operation surface; each legacy name routes through alias + param injection so existing callers, scripts, and protocols keep working unchanged.
- `sys_config(operation=get|set|validate)` and `sys_env(operation=snapshot|docker_generate)` unified dispatchers. Single entry points for the three researcher-config tools and the two environment tools previously exposed as per-operation surface; each legacy name routes through alias + param injection so existing callers, scripts, and protocols keep working unchanged.
- `docs/V2_MIGRATION_TABLE.md`: running ledger of every old→new tool consolidation (old name, new name, dispatch kwarg, value, status). First entry: the 26→3 audit-family collapse (phase-9-c1).
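Every dispatcher listed above follows the same shape: one entry point that routes on an `operation` kwarg. A minimal sketch of that shape, assuming handlers are plain callables keyed by operation name; `make_dispatcher`, the `_scratch_*` stand-ins, and the error-envelope fields are hypothetical, and the `run` operation is left out to keep the example short. The real server internals are not shown in these notes.

```python
from typing import Any, Callable, Dict

Handler = Callable[..., Dict[str, Any]]


def make_dispatcher(tool_name: str, operations: Dict[str, Handler]) -> Handler:
    """Build a single entry point that routes on the `operation` kwarg."""

    def dispatch(operation: str, **kwargs: Any) -> Dict[str, Any]:
        handler = operations.get(operation)
        if handler is None:
            # Report the valid operations instead of raising, so the caller
            # can self-correct. The envelope shape is an assumption.
            return {
                "status": "error",
                "message": f"{tool_name}: unknown operation {operation!r}",
                "valid_operations": sorted(operations),
            }
        return handler(**kwargs)

    return dispatch


# Hypothetical in-memory stand-ins for three of the scratch-sandbox handlers.
_scratch_files: Dict[str, str] = {}


def _scratch_write(name: str, content: str) -> Dict[str, Any]:
    _scratch_files[name] = content
    return {"status": "ok", "written": name}


def _scratch_list() -> Dict[str, Any]:
    return {"status": "ok", "files": sorted(_scratch_files)}


def _scratch_clear() -> Dict[str, Any]:
    _scratch_files.clear()
    return {"status": "ok", "files": []}


tool_scratch = make_dispatcher(
    "tool_scratch",
    {"write": _scratch_write, "list": _scratch_list, "clear": _scratch_clear},
)

write_result = tool_scratch(operation="write", name="notes.py", content="x = 1")
list_result = tool_scratch(operation="list")
bad_result = tool_scratch(operation="rename")  # not a registered operation
```

Keeping the operation table as data is what makes the legacy names cheap to preserve: an alias only needs to name the dispatcher and the operation value to inject.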
Changed
- Audit family consolidated 26 → 3 tools (phase-9-c1). 23 per-dimension `tool_audit_*` tools collapse into `tool_audit(scope=, dimension=)`; the 2 findings-ledger tools collapse into `tool_audit_findings(operation=)`. Every legacy name is aliased + parameter-injected so older scripts, protocols, and researcher commands continue to produce identical output. `_ALIAS_PARAM_INJECTION` now accepts multi-kwarg specs (tuple of `(key, value)` pairs) so the audit family can inject both `scope` and `dimension` from a single alias.
- Dashboard family consolidated 7 → 1 tool (phase-9-c2). The seven per-operation `tool_dashboard_*` tools (`create`, `story_generate`, `story_edit`, `story_quality_bar`, `reviewer_sim`, `test_generate`, `test_run`) collapse into `tool_dashboard(operation=...)`. Legacy aliases continue to dispatch through the consolidated handler via `_ALIAS_PARAM_INJECTION`. Shipped protocol YAMLs (`synthesis/synthesis_dashboard`, `visualization/interactive_figure_design`, `audit/pre_submission_checklist`, `guidance/autopilot`) and `_router_index.yaml` rewritten to the canonical `tool_dashboard(operation='…')` surface so reviewer-facing guidance stays on the live names.
- Step family consolidated 8 → 2 tools (phase-9-c3). The four step-lifecycle tools (`tool_step_iterate`, `tool_step_iterations_list`, `tool_step_revision_options`, `tool_step_env_lock`) collapse into `tool_step(operation=...)`. The four step sub-task pipeline tools (`tool_step_pipeline_define`, `tool_step_pipeline_run`, `tool_step_pipeline_status`, `tool_step_pipeline_diagram`) collapse into `tool_step_pipeline(operation=...)`. Every legacy name is aliased + parameter-injected so older scripts, protocols, and researcher commands continue to produce identical output. `tool_step_complete` stays standalone as the top-level end-of-step bundle; `tool_step_literature_list` belongs to the literature/search family and is not consolidated here. Shipped protocol YAMLs (`guidance/analysis_plan`, `methodology/deep_domain_research`) and `_router_index.yaml` rewritten to the canonical `tool_step(operation='…')` / `tool_step_pipeline(operation='…')` surface so reviewer-facing guidance stays on the live names.
- Lessons + reliability family consolidated 10 → 2 tools (phase-9-c4). The pre-existing `tool_lessons` (which already absorbed `tool_lessons_record` + `tool_lessons_consult`) is extended to cover `tool_failure_record` (`operation=failure_record`), `tool_failure_check` (`failure_check`), `tool_failure_list` (`failure_list`), `tool_dead_end_lessons` (`dead_end`), and `tool_mistake_replay` (`mistake_replay`). The two reliability-log tools (`tool_reliability_log_event`, `tool_reliability_report`) collapse into a separate `tool_reliability(operation=log_event|report)` entry point. Every legacy name remains callable via alias + param injection. Shipped protocol YAMLs (`audit/audit_and_validation`, `guidance/session_resume`, `literature/literature_search`, `synthesis/synthesis_progress_update`) and `_router_index.yaml` rewritten to the canonical `tool_lessons(operation='…')` / `tool_reliability(operation='…')` surface so reviewer-facing guidance stays on the live names.
- Sensitivity + preregister families consolidated 4 → 2 tools (phase-9-c5). The two sensitivity tools (`tool_sensitivity_define`, `tool_sensitivity_run`) collapse into `tool_sensitivity(operation=define|run)`. The two preregister tools (`tool_preregister_freeze`, `tool_preregister_diff`) collapse into `tool_preregister(operation=freeze|diff)`. Every legacy name is aliased + parameter-injected so older scripts, protocols, and researcher commands continue to produce identical output. Shipped protocol YAMLs (`methodology/preregistration`, `methodology/missing_data_strategy`, `methodology/method_comparison`, `audit/audit_and_validation`, `audit/provenance_completeness`, `synthesis/synthesis_null_findings`) and `_router_index.yaml` rewritten to the canonical `tool_sensitivity(operation='…')` / `tool_preregister(operation='…')` surface so reviewer-facing guidance stays on the live names.
- Reviewer family consolidated 4 → 1 tool (phase-9-c6). The four reviewer-response scaffold tools (`tool_reviewer_simulate`, `tool_response_to_reviewers`, `tool_rebuttal_draft`, `tool_reviewer_response_compile`) collapse into `tool_reviewer(operation=simulate|response|rebuttal|compile)`. Every legacy name is aliased + parameter-injected so older scripts, protocols, and researcher commands continue to produce identical output.
- Data + figure + thought families consolidated 9 → 3 tools (phase-9-c7). The three data tools (`tool_data_sample`, `tool_data_profile`, `tool_data_convert`) collapse into `tool_data(operation=sample|profile|convert)`. The four figure helpers (`tool_figure_palette`, `tool_figure_caption_synthesise`, `tool_figure_interactive_autogen`, `tool_paper_figures_autoembed`) collapse into `tool_figure(operation=palette|caption_synthesise|interactive_autogen|paper_autoembed)`. The two ReAct trace tools (`tool_thought_log`, `tool_thought_trace`) collapse into `tool_thought(operation=log|trace)`. Every legacy name is aliased + parameter-injected so older scripts, protocols, and researcher commands continue to produce identical output. Shipped protocol YAMLs (`audit/pre_submission_checklist`, `guidance/analysis_plan`, `guidance/project_startup`, `methodology/data_management_plan`, `methodology/data_quality_audit`, `methodology/exploratory_data_analysis`, `methodology/missing_data_strategy`, `synthesis/synthesis_paper`, `visualization/figure_guidelines`, `visualization/interactive_figure_design`, `visualization/visualization_workflow`) and `_router_index.yaml` rewritten to the canonical `tool_data(operation='…')` / `tool_figure(operation='…')` / `tool_thought(operation='…')` surface so reviewer-facing guidance stays on the live names.
- Misc-family audit (phase-9-c8). The remaining `tool_<verb>_*` families flagged as natural sub-systems consolidate 8 → 2 tools: the four scratch-sandbox tools (`tool_scratch_write`, `tool_scratch_run`, `tool_scratch_list`, `tool_scratch_clear`) collapse into `tool_scratch(operation=write|run|list|clear)`; the four background-task tools (`tool_task_run`, `tool_task_status`, `tool_task_list`, `tool_task_kill`) collapse into `tool_task(operation=run|status|list|kill)`. The two `tool_quick_*` tools (`tool_quick_review` stages a paper-review markdown; `tool_quick_route` is the throwaway-intent classifier used to short-circuit protocol load) share a prefix only, with no functional overlap, and are kept standalone. Every legacy name is aliased + parameter-injected so older scripts, protocols, and researcher commands continue to produce identical output. Shipped protocol YAMLs (`guidance/casual_exploration`, `guidance/chat_handoff`, `guidance/session_resume`, `guidance/autopilot`, `guidance/analysis_plan`, `methodology/deep_domain_research`, `methodology/reproduction_attempt`, `methodology/simulation_studies`) and `_router_index.yaml` rewritten to the canonical `tool_scratch(operation='…')` / `tool_task(operation='…')` surface so reviewer-facing guidance stays on the live names.
- SYS_* judgment pass (phase-9-c9). The `sys_*` MCP-level primitives were reviewed family-by-family. Two genuinely over-fragmented families consolidate 5 → 2 tools: `sys_config_get` / `sys_config_set` / `sys_config_validate` collapse into `sys_config(operation=get|set|validate)`, and `sys_env_snapshot` / `sys_env_docker_generate` collapse into `sys_env(operation=snapshot|docker_generate)`. Every other `sys_*` family is intentionally kept separate because each is a distinct primitive AIs need to find by name: discovery surface (`sys_boot`, `sys_active_tools`, `sys_protocol_*` ×6, `sys_help`, `sys_tool_describe`, `sys_state_get`), high-frequency file I/O (`sys_file_read/_write/_list/_delete/_validate_md`), checkpoint trio (`sys_checkpoint_create/_rollback/_list`), workspace pair (`sys_workspace_scaffold/_tree`), and the standalone interaction tools (`sys_session_handoff`, `sys_export_share_archive`, `sys_notify`, `sys_active_project`, `sys_dep_inventory`, `sys_semantic_tool_search`, `sys_packs_installed`, `sys_adapters_installed`). Every legacy name remains callable via alias + param injection. Shipped protocol YAMLs (`guidance/session_boot`, `guidance/project_startup`, `guidance/analysis_plan`, `methodology/cox_ph_diagnostics`, `methodology/pick_tool_stack`, `methodology/mixed_language_orchestration`, `methodology/reproduction_attempt`, `reproducibility/reproducibility`) and `_router_index.yaml` rewritten to the canonical `sys_config(operation='…')` / `sys_env(operation='…')` surface so reviewer-facing guidance stays on the live names.
- All shipped protocol YAMLs that previously referenced per-dimension audit tool names now reference the consolidated `tool_audit(scope='…', dimension='…')` / `tool_audit_findings(operation='…')` form (96 substitutions across 33 protocol files) so reviewer-facing guidance stays on the canonical surface.
- MAJOR-breaking: `sys_protocol_get` default `format` flipped from `"full"` to `"summary"` (phase-9-cross-cutting). Callers that previously did not pass a `format` argument received the entire YAML (~1.5-3K tokens per call); they now receive the ~300-token summary view. This is the single biggest token-cost win identified in the Phase-15a baseline. Callers who genuinely need the bulk payload must now pass `format="full"` explicitly. The schema default + handler default + AI orientation docs all updated together; the `inputSchema` now declares `"default": "summary"` so well-behaved clients see the new default automatically.
- Every `TOOL_DEFINITIONS` entry now carries two introspection fields, `status` (`live|alias|deprecated`) and `pack` (`core|<pack_name>`), set automatically by `_annotate_core_tool_metadata()` for core tools and by `plugins/loader.py` + `adapters/loader.py` at registration time for pack / adapter tools. `sys_tool_describe` surfaces both so the router, `list_tools`, and any external tooling can filter without re-deriving the answer. 167 entries annotated (146 live + 21 alias from the consolidation overlap; 144 core + 23 from packs / adapters).
- Every shipped protocol YAML (153 files across core + the 5 bundled packs) now carries a `scope_tags` block: `domain` (e.g. `[biology, wet_lab]`, `[qualitative]`, `[any]`), `audience` (e.g. `[researcher]`, `[auditor]`, `[naive_ai]`), and `workflow_shape` (e.g. `[experiment_pipeline]`, `[proof]`, `[linear_essay]`, `[interview_study]`, `[systems_benchmark]`, `[any]`). The router uses these as a soft filter so the embedding-similarity ranking only competes within the project's declared scope.
- `tool_route` now returns a `recommended_action` field, a single-string hint naming the exact next tool to call (e.g. `"sys_protocol_get(protocol_name='guidance/project_startup', format='summary')"`). Saves the AI one round-trip of reasoning per turn.
- MCP `Server` instantiation now passes the `instructions=` field with the canonical per-turn sequence (`sys_boot → tool_route → sys_protocol_get format=summary → sys_active_tools`), so any MCP client that surfaces server-supplied instructions sees the right startup ritual without having to call `sys_help`.
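The alias + parameter-injection mechanism that the consolidation entries above rely on can be sketched as follows. This is an illustration under stated assumptions, not the server's code: the two tables borrow the names `_DEPRECATED_ALIASES` and `_ALIAS_PARAM_INJECTION` from these notes (without the leading underscore), but their real layout is not shown here beyond "tuple of `(key, value)` pairs"; `resolve_call` is a hypothetical helper; and the specific `scope` / `dimension` values attached to `tool_audit_synthesis` are a guess for demonstration.

```python
from typing import Any, Dict, Tuple

# Alias -> canonical tool name. Example entries only.
DEPRECATED_ALIASES: Dict[str, str] = {
    "tool_audit_synthesis": "tool_audit",
    "tool_audit_findings_query": "tool_audit_findings",
}

# Alias -> tuple of (key, value) pairs to inject. A tuple of pairs lets one
# alias set several kwargs at once, which the audit family needs.
# The injected values below are assumptions made for illustration.
ALIAS_PARAM_INJECTION: Dict[str, Tuple[Tuple[str, Any], ...]] = {
    "tool_audit_synthesis": (("scope", "synthesis"), ("dimension", "all")),
    "tool_audit_findings_query": (("operation", "query"),),
}


def resolve_call(name: str, arguments: Dict[str, Any]) -> Tuple[str, Dict[str, Any]]:
    """Map a possibly-legacy call onto its canonical name and arguments."""
    canonical = DEPRECATED_ALIASES.get(name, name)
    merged = dict(arguments)  # copy so the caller's dict is never mutated
    for key, value in ALIAS_PARAM_INJECTION.get(name, ()):
        # Assumption: injected values win, since they define what the alias means.
        merged[key] = value
    return canonical, merged


# A legacy call gains the injected kwargs and the canonical name.
legacy = resolve_call("tool_audit_synthesis", {"paper_path": "synthesis/paper.md"})

# A canonical call passes through untouched.
canonical = resolve_call("tool_audit", {"scope": "synthesis", "dimension": "all"})
```

Resolving the alias before dispatch means the consolidated handler only ever sees canonical names, which is consistent with the claim that legacy callers get identical output.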
Deprecated
- Legacy per-dimension audit tool names: `tool_audit_assumptions`, `tool_audit_citations`, `tool_audit_claims`, `tool_audit_cliches`, `tool_audit_code_quality`, `tool_audit_coherence`, `tool_audit_cross_deliverable_consistency`, `tool_audit_dashboard_content`, `tool_audit_evalue`, `tool_audit_figure`, `tool_audit_figure_coverage`, `tool_audit_figure_full`, `tool_audit_figure_interactivity`, `tool_audit_figure_quality`, `tool_audit_power`, `tool_audit_prose`, `tool_audit_reproducibility`, `tool_audit_reviewer_responses`, `tool_audit_statistical_power`, `tool_audit_step_completeness`, `tool_audit_step_literature`, `tool_audit_synthesis`, `tool_audit_version_coherence`, `tool_audit_findings_query`, `tool_audit_findings_diff`. All continue to work via alias dispatch through v2.0.x; scheduled for hard-removal in v2.1.0 per `docs/V2_MIGRATION_TABLE.md`.
- Legacy dashboard tool names: `tool_dashboard_create`, `tool_dashboard_story_generate`, `tool_dashboard_story_edit`, `tool_dashboard_story_quality_bar`, `tool_dashboard_reviewer_sim`, `tool_dashboard_test_generate`, `tool_dashboard_test_run`. All continue to work via alias dispatch through v2.0.x; scheduled for hard-removal in v2.1.0 per `docs/V2_MIGRATION_TABLE.md`.
- Legacy step tool names: `tool_step_iterate`, `tool_step_iterations_list`, `tool_step_revision_options`, `tool_step_env_lock`, `tool_step_pipeline_define`, `tool_step_pipeline_run`, `tool_step_pipeline_status`, `tool_step_pipeline_diagram`. All continue to work via alias dispatch through v2.0.x; scheduled for hard-removal in v2.1.0 per `docs/V2_MIGRATION_TABLE.md`.
- Legacy sensitivity + preregister tool names: `tool_sensitivity_define`, `tool_sensitivity_run`, `tool_preregister_freeze`, `tool_preregister_diff`. All continue to work via alias dispatch through v2.0.x; scheduled for hard-removal in v2.1.0 per `docs/V2_MIGRATION_TABLE.md`.
- Legacy reviewer tool names: `tool_reviewer_simulate`, `tool_response_to_reviewers`, `tool_rebuttal_draft`, `tool_reviewer_response_compile`. All continue to work via alias dispatch through v2.0.x; scheduled for hard-removal in v2.1.0 per `docs/V2_MIGRATION_TABLE.md`.
- Legacy data + figure + thought tool names: `tool_data_sample`, `tool_data_profile`, `tool_data_convert`, `tool_figure_palette`, `tool_figure_caption_synthesise`, `tool_figure_interactive_autogen`, `tool_paper_figures_autoembed`, `tool_thought_log`, `tool_thought_trace`. All continue to work via alias dispatch through v2.0.x; scheduled for hard-removal in v2.1.0 per `docs/V2_MIGRATION_TABLE.md`.
- Legacy scratch + task tool names: `tool_scratch_write`, `tool_scratch_run`, `tool_scratch_list`, `tool_scratch_clear`, `tool_task_run`, `tool_task_status`, `tool_task_list`, `tool_task_kill`. All continue to work via alias dispatch through v2.0.x; scheduled for hard-removal in v2.1.0 per `docs/V2_MIGRATION_TABLE.md`.
- Legacy sys_config + sys_env tool names: `sys_config_get`, `sys_config_set`, `sys_config_validate`, `sys_env_snapshot`, `sys_env_docker_generate`. All continue to work via alias dispatch through v2.0.x; scheduled for hard-removal in v2.1.0 per `docs/V2_MIGRATION_TABLE.md`.
Removed
- Phase 14a: first-wave consolidation aliases hard-removed. The 21 legacy tool names introduced as consolidation aliases in v1.6.1 have expired their 4-minor-version deprecation runway and are now removed. Calling any of them returns a friendly `_REMOVED_TOOLS` error envelope naming the canonical v2 entry point. Old plans, scripts, or third-party callers that still name these will see a clear migration message instead of a generic "unknown tool" error.
  - Search cluster (5): `tool_search_semantic_scholar`, `tool_search_pubmed`, `tool_search_crossref`, `tool_search_arxiv`, `tool_search_web` → call `tool_search(query=..., source='semantic_scholar'|'pubmed'|'crossref'|'arxiv'|'web')` instead.
  - Plan cluster (3): `tool_plan_turn`, `tool_plan_advance`, `tool_plan_clear` → call `tool_plan(operation='turn'|'advance'|'clear')` instead.
  - Grounding / verify cluster (4): `tool_grounding_register`, `tool_ground_from_context` → call `tool_ground(mode='explicit'|'from_context', ...)`. `tool_claim_verify`, `tool_grounding_verify` → call `tool_verify(scope='claim'|'project', ...)`.
  - Lessons cluster (2): `tool_lessons_record`, `tool_lessons_consult` → call `tool_lessons(operation='record'|'consult', ...)` instead. (Other lessons-family aliases remain deprecated for the v2.0.x runway.)
  - Path cluster (3): `sys_path_create`, `sys_path_abandon`, `sys_path_list` → call `sys_path(operation='create'|'abandon'|'list', ...)` instead.
  - Memory cluster (4): `mem_methods_append`, `mem_decision_log`, `mem_hypothesis_update`, `mem_analysis_log` → call `mem_log(kind='methods'|'decision'|'hypothesis'|'analysis', ...)` instead.
- The corresponding `TOOL_DEFINITIONS` entries and `_HANDLERS` entries were dropped (handler functions like `_handle_sys_path_create` remain in the module; they're called internally by the consolidated dispatchers via the legacy fallback path). `tool_log_decision`, the silent pre-v1.6.1 nickname that previously chained through `mem_decision_log → mem_log`, now resolves directly to `mem_log` with `kind='decision'` injected so the nickname keeps working.
- Phase 14b: tikzposter LaTeX poster path hard-removed. The legacy `create_poster()` + `_poster_tex_escape()` functions under `src/research_os/tools/actions/synthesis/latex.py` (387 lines) are deleted. `tool_poster_create` is unchanged on the surface (Typst engine is the only path); the `engine='latex'` branch in `_handle_tool_poster_create` now returns a structured error pointing callers at the Typst surface. The legacy `layout` / `audience` LaTeX-only kwargs are no longer documented (`engine` is retained on the schema for back-compat with a hard-error guard). `researcher_config.synthesis.poster_engine` is pinned to `"typst"`; the validator enum now rejects `"latex"`. Protocol `synthesis/printable` updated (`template: tikzposter` → `academic_36x48` / `academic_48x36` / `public_24x36` per audience; description text re-pointed at Typst). Router index summary + triggers updated. Docs (`PROTOCOLS.md`, `RESEARCHER_GUIDE.md`, `TOOLS.md`, `ROADMAP.md`) re-pointed at Typst. Test `test_legacy_tikzposter_path_still_works` replaced by `test_legacy_tikzposter_create_poster_is_gone` and the enum-shape test now asserts `"latex" not in synthesis.poster_engine`. `_REMOVED_TOOLS` entries added for `tool_poster_create_latex` / `tool_poster_compile_latex` (nicknames that were never real tools but a future caller might try). Audit of all other handlers in `server.py` found zero truly orphan handlers; every handler is referenced either by the dispatcher map or called internally by a Phase 9 consolidator.
- Phase 14d: dead config fields removed. Five `researcher_config.yaml` fields identified by the v1.9.2 Lens-7 audit as declared-but-never-read are removed from the on-disk template, the in-code `CONFIG_TEMPLATE` constant, and `docs/RESEARCHER_GUIDE.md`. None of these fields had any consumer in `src/`, `tests/`, or shipped protocols; the comments that claimed they were "Read by `methodology/pick_tool_stack`" were inaccurate. The `pick_tool_stack` protocol picks language + library purely from method + field-practice + literature signal; it never consulted these fields. Existing projects on prior versions that hand-set these keys are unaffected (the keys silently become unknown extras; `validate_config` does not enforce key membership).
  - Runtime cluster (1): `runtime.default_n_for_sampling` has no caller anywhere; `tool_data(operation='sample')` takes its `n` argument from the tool call, not from config.
  - Tool-stack cluster (4): entire `tool_stack:` block removed: `tool_stack.preferred_languages`, `tool_stack.allow_mixed_language_steps`, `tool_stack.field_practice_overrides_preference`, `tool_stack.cite_field_practice_when_choosing`. The `methodology/pick_tool_stack` protocol itself is unchanged: it asks the AI to pick based on method + literature signal + env compatibility, never consulting these config fields.
- Trivial dead-variable cleanup in `tools/actions/audit/audit.py:651` (`f_stat`, `f_p` from the `het_breuschpagan` unpack are now `_f_stat`, `_f_p`; they were tuple-discard placeholders flagged by Lens-9 as the one real `vulture --min-confidence 80` finding still present after v1.9.3's larger sweep).
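The removed-tool behaviour described under Phase 14a (a structured error naming the canonical replacement rather than a generic "unknown tool" failure) might look roughly like this. The replacement strings come from the clusters listed above; the table layout, the `removed_tool_envelope` helper, and the envelope's field names are assumptions, since the notes do not show the real `_REMOVED_TOOLS` structure.

```python
from typing import Any, Dict, Optional

# Removed name -> canonical replacement call, taken from the clusters above.
# Only a few entries are shown; the real table layout is an assumption.
REMOVED_TOOLS: Dict[str, str] = {
    "tool_search_pubmed": "tool_search(query=..., source='pubmed')",
    "tool_plan_advance": "tool_plan(operation='advance')",
    "sys_path_create": "sys_path(operation='create', ...)",
    "mem_decision_log": "mem_log(kind='decision', ...)",
}


def removed_tool_envelope(name: str) -> Optional[Dict[str, Any]]:
    """Return a structured error for a hard-removed tool, else None."""
    replacement = REMOVED_TOOLS.get(name)
    if replacement is None:
        return None  # not a removed tool; normal dispatch should continue
    return {
        "status": "error",
        "error": "tool_removed",
        "tool": name,
        "replacement": replacement,
        "message": f"{name} was removed in v2.0.0; call {replacement} instead.",
    }


envelope = removed_tool_envelope("tool_plan_advance")
passthrough = removed_tool_envelope("tool_plan")
```

Checking this table before the normal "unknown tool" path is what lets old plans and scripts fail with an actionable message.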
Fixed
- `sys_tool_describe` `NameError` regression introduced by the Phase 10
server-package split: `meta_routing.py` referenced `_resolve_tool_name`,
which wasn't re-exported from `_handlers_runtime.py`. Fixed in commit
`0c45b79`: added the import + `__all__` entry so the introspection path
works from the first `list_tools()` call.
- `tool_audit(scope='synthesis', dimension='all')` raised a bare
`KeyError` on `paper_path`. Fixed in commit `0c45b79`: handler now
defaults to `'synthesis/paper.md'` (matches what the `audit_synthesis`
worker already assumes when callers omit the kwarg).
- MCP `Server()` instance now reports the canonical `__version__` at the
MCP initialize handshake instead of hard-coding `'0.1.0'`. Fixed in
commit `b3b24a0` (phase-13 follow-up).
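The `paper_path` fix is an instance of a common pattern: indexing a kwargs dict raises a bare `KeyError` when an optional argument is omitted, while `.get()` with a default does not. A minimal sketch of the before/after shape, assuming the handler receives its arguments as a dict; both function names and the `drafts/v2.md` path are hypothetical, and the actual handler code is not shown in these notes.

```python
from typing import Any, Dict

DEFAULT_PAPER_PATH = "synthesis/paper.md"  # default named in the fix above


def paper_path_strict(arguments: Dict[str, Any]) -> str:
    """Pre-fix shape: indexing raises KeyError when the kwarg is omitted."""
    return arguments["paper_path"]


def paper_path_with_default(arguments: Dict[str, Any]) -> str:
    """Post-fix shape: fall back to the documented default."""
    return arguments.get("paper_path", DEFAULT_PAPER_PATH)


call = {"scope": "synthesis", "dimension": "all"}  # no paper_path supplied

try:
    paper_path_strict(call)
    strict_failed = False
except KeyError:
    strict_failed = True

resolved = paper_path_with_default(call)
# An explicit value still takes precedence over the default.
explicit = paper_path_with_default({**call, "paper_path": "drafts/v2.md"})
```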
Validation
- Phase 15b re-validation: 20 agents across 4 perspectives × 5 scenarios
(20 independent runs) against the v2.0.0 candidate. Mean `final_rating`
moved 6.35 → 7.70 (+1.35; +21%), total HIGH-friction items
124 → 63 (-49%), first-5-turn HIGH 66 → 42 (-36%), deliverable
rate 11/20 → 14/20 (+15 pp). Every cell of the 4×5 matrix moved up by
+0.7 to +1.9 points; no regressions in any of the 20 runs. The full
per-perspective × per-scenario rating table, friction-event delta, the
carryover deferral list (v2.0.1 patch / v2.1.0 minor / v3.0.0 major),
and the YELLOW shipping recommendation are in
`docs/V2_VALIDATION_REPORT.md`.
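The headline deltas can be reproduced from the before/after figures quoted above. The short check below uses only numbers stated in these notes; `pct_change` is a helper defined here for the purpose.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


rating_delta = round(7.70 - 6.35, 2)
rating_pct = round(pct_change(6.35, 7.70))
high_friction_pct = round(pct_change(124, 63))
first5_high_pct = round(pct_change(66, 42))

# Deliverable rate is a difference of two proportions, so it is reported
# in percentage points rather than as a relative change.
deliverable_pp = round((14 / 20 - 11 / 20) * 100)

# Tool-surface reduction from the Highlights section.
tool_surface_pct = round(pct_change(344, 146))

print(rating_delta, rating_pct, high_friction_pct,
      first5_high_pct, deliverable_pp, tool_surface_pct)
```

Note the distinction between the relative figures (for example, -49% on HIGH-friction items) and the deliverable rate, which moves from 55% to 70% and is therefore stated as +15 percentage points.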
Migration
- Most projects work unchanged: every consolidated tool name keeps
dispatching via `_DEPRECATED_ALIASES` + `_ALIAS_PARAM_INJECTION` for
the v2.0.x runway. Hard removal of the v2.0 deprecated aliases is
scheduled for v2.1.0. Full instructions, breaking-change details,
per-surface recipes, and the complete old→new tool table are in
`docs/MIGRATION_v1_to_v2.md` (the v1→v2 upgrade guide) plus
`docs/V2_MIGRATION_TABLE.md` (the running ledger of every old→new
consolidation).
- Release-shaped overview, headline numbers, and YELLOW caveat at
`docs/V2_RELEASE_NOTES.md`.
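Because the deprecated aliases are scheduled for hard removal in v2.1.0, it can help to locate legacy names in your own scripts and protocols ahead of time. The scanner below is a rough sketch, not a shipped tool: `find_legacy_calls` is hypothetical, the mapping is a small hand-picked subset of the old→new pairs named in these notes (the complete table lives in `docs/V2_MIGRATION_TABLE.md`), and the YAML-like sample is invented for the demonstration.

```python
import re
from typing import Dict, List, Tuple

# A small subset of the legacy -> canonical pairs named in these notes.
LEGACY_TO_CANONICAL: Dict[str, str] = {
    "tool_scratch_write": "tool_scratch(operation='write')",
    "tool_task_kill": "tool_task(operation='kill')",
    "sys_config_get": "sys_config(operation='get')",
    "tool_preregister_freeze": "tool_preregister(operation='freeze')",
}

# \b treats "_" as a word character, so "tool_task_kill" will not match
# inside a longer identifier such as "tool_task_killer".
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, LEGACY_TO_CANONICAL)) + r")\b"
)


def find_legacy_calls(text: str) -> List[Tuple[int, str, str]]:
    """Return (line_number, legacy_name, canonical_call) for each hit."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            name = match.group(1)
            hits.append((line_number, name, LEGACY_TO_CANONICAL[name]))
    return hits


# Invented sample input; real protocol files may be shaped differently.
sample = """steps:
  - call: tool_scratch_write
  - call: tool_task(operation='status')
  - call: sys_config_get
"""
report = find_legacy_calls(sample)
for line_number, legacy, canonical_call in report:
    print(f"line {line_number}: {legacy} -> {canonical_call}")
```

The scanner only reports; rewriting call sites is left manual because the canonical form moves information from the tool name into a kwarg.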