ingest: scaffold mcp/ alongside toolkit.yaml so 'validate' passes #2

@tonymenzo

Problem

scitoolkit ingest writes toolkit.yaml but doesn't scaffold the
mcp/ directory. scitoolkit validate requires mcp/__init__.py
and mcp/server_stdio.py to exist
(scitoolkit/validation.py:469-479), so the next step printed by
ingest's own summary —

Next steps:
  - Edit toolkit.yaml metadata ...
  - Run scitoolkit validate.

— is misleading: validate fails out of the box on a freshly-ingested
toolkit. The author has to know to copy mcp/ from somewhere
(typically a toolkit they previously created with scitoolkit init).
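To reproduce the failure without running the full CLI, here is a minimal sketch. It assumes the check boils down to "both files exist". `missing_mcp_files` is a hypothetical stand-in for illustration, not the code in scitoolkit/validation.py.

```python
from pathlib import Path

# The two files this issue says validate requires.
REQUIRED_MCP_FILES = ("mcp/__init__.py", "mcp/server_stdio.py")


def missing_mcp_files(toolkit_path: Path) -> list[str]:
    """Return the required mcp/ files that are absent under toolkit_path.

    Hypothetical helper: it only tests for file existence, which is an
    assumption about what the real validation check does.
    """
    return [
        rel for rel in REQUIRED_MCP_FILES
        if not (toolkit_path / rel).is_file()
    ]
```

On a toolkit directory that contains only toolkit.yaml, this returns both paths, which is the state ingest currently leaves behind.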

Evidence the gap is known

tests/e2e/run_ingest_e2e.py:144-148 papers over it explicitly:

# mcp/ dir scaffolded to satisfy the existing validation check.
(repo / "mcp").mkdir(exist_ok=True)
(repo / "mcp" / "__init__.py").write_text("", encoding="utf-8")
(repo / "mcp" / "server_stdio.py").write_text("", encoding="utf-8")

Note that the e2e test writes empty files just to clear the validation
check. That's not a real MCP server; it's a workaround that masks the gap.

Suggested fix

scitoolkit init already does the right thing in
scitoolkit/toolkit.py:196-205: it renders
scitoolkit/templates/mcp/__init__.py.template and
scitoolkit/templates/mcp/server_stdio.py.template with the toolkit's
name/version/description. Extract that block into a small helper —
e.g. _scaffold_mcp_dir(toolkit_path, substitutions) — and call it
from both init and ingest.

For ingest specifically:

  • Pull name / version / description from the existing
    toolkit.yaml (so the rendered server_stdio.py uses real values, not
    placeholders). If those keys are still TODO placeholders, render with
    the placeholders unchanged, since the author will edit them anyway.
  • Skip writing if mcp/server_stdio.py already exists, so re-running
    ingest doesn't clobber a customized server. (Match the
    refuse-to-overwrite shape ingest already uses for toolkit.yaml.)
  • Mention the scaffolded files in the Next-steps summary so the user
    knows new files appeared.
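The three points above could be wired together roughly as follows. This is a sketch under stated assumptions: `scaffold_mcp_if_absent` and `summary_lines` are hypothetical names, `metadata` is assumed to be the already-parsed toolkit.yaml mapping, `scaffold` stands for whatever shared helper init and ingest end up calling, and the "TODO" fallback for missing keys is a guess at the placeholder value.

```python
from pathlib import Path
from typing import Callable

METADATA_KEYS = ("name", "version", "description")
PLACEHOLDER = "TODO"  # assumption: fallback when a key is missing


def scaffold_mcp_if_absent(
    toolkit_path: Path,
    metadata: dict,
    scaffold: Callable[[Path, dict], list],
) -> list[Path]:
    """Scaffold mcp/ unless a server already exists; return created paths."""
    # Refuse to overwrite: a customized server must survive a re-run.
    if (toolkit_path / "mcp" / "server_stdio.py").exists():
        return []
    # Values are passed through as-is, so placeholder values stay placeholders.
    substitutions = {
        key: str(metadata.get(key, PLACEHOLDER)) for key in METADATA_KEYS
    }
    return scaffold(toolkit_path, substitutions)


def summary_lines(created: list[Path], toolkit_path: Path) -> list[str]:
    """Build extra Next-steps lines naming each newly scaffolded file."""
    return [
        f"  - Review scaffolded {path.relative_to(toolkit_path).as_posix()}."
        for path in created
    ]
```

An empty return value doubles as the "nothing was written" signal, so the summary adds no lines when ingest is re-run on a toolkit that already has a server.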

Relevant files

  • scitoolkit/ingest.py — ingest() entry point, currently yaml-only
  • scitoolkit/toolkit.py:196-205 — mcp scaffolding to extract
  • scitoolkit/templates/mcp/*.template — the templates themselves
  • scitoolkit/validation.py:469-479 — what validate checks for
  • scitoolkit/cli.py:556-667 — ingest command + Next-steps text
  • tests/e2e/run_ingest_e2e.py:144-148 — workaround that should
    be deleted once this is fixed

Related

#1 (silent-drop warning for unresolvable module paths) — separate
issue, also surfaced while onboarding heptapod.
