Draft Hatch-Schemas v2.0.0 extension schema proposal #27

@LittleCoinCoin

Context

Hatch-Schemas v1.2.2 defines the CrackingShells package extension schema. The TPC hackathon alignment strategy requires a v2.0.0 release that narrows the schema to the io.crackingshells.hatch extension block in server.json._meta and introduces three additions.

Your task is to propose a refined schema as a draft PR. The proposal will be reviewed and ratified at the Day 1 kickoff meeting with all hackathon teams. The branch must not be merged before that meeting — ratification may adjust field names, optionality, or enum values.


Existing Schema — v1.2.2 Baseline

Source: package/v1.2.2/hatch_pkg_metadata_schema.json
Architecture reference: __reports__/resources/00-hatch_schemas_architecture_v0.md

Current extension structure (simplified)

{
  "package_schema_version": "1.2.2",
  "name": "<reverse-dns>",
  "version": "...",
  "description": "...",
  "tags": [...],
  "author": { "name": "..." },
  "license": { "name": "..." },
  "entry_point": {
    "mcp_server": "<path>",
    "hatch_mcp_server": "<path>"
  },
  "dependencies": {
    "python": [
      {
        "name": "...",
        "version_constraint": ">=1.0.0",
        "package_manager": "pip | conda",
        "channel": "conda-forge"
      }
    ],
    "system": [
      { "name": "...", "version_constraint": "...", "package_manager": "apt | brew" }
    ],
    "docker": [
      { "name": "...", "tag": "...", "digest": "sha256:..." }
    ],
    "hatch": [
      { "name": "io.github.owner/pkg", "version_constraint": "==0.3.1" }
    ]
  }
}

v1.2.2 invariants to preserve

  • entry_point.mcp_server and entry_point.hatch_mcp_server are both required (dual entry points since v1.2.1)
  • channel only valid when package_manager == "conda"
  • docker.digest required (sha256:... format)
  • All schema files are JSON Schema Draft-07
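
The channel invariant above could be carried into Draft-07 with an if/then pair. The sketch below holds one possible fragment as a Python dict and mirrors it with a stdlib-only check. The names CHANNEL_RULE and channel_allowed are hypothetical, and the v1.2.2 schema file may express the rule differently.

```python
# Hypothetical Draft-07 fragment for one entry of dependencies.python:
# an entry that carries "channel" must also declare package_manager == "conda".
CHANNEL_RULE = {
    "if": {"required": ["channel"]},
    "then": {
        "required": ["package_manager"],
        "properties": {"package_manager": {"const": "conda"}},
    },
}


def channel_allowed(dep: dict) -> bool:
    """Stdlib mirror of CHANNEL_RULE for a single python dependency entry."""
    if "channel" not in dep:
        # No channel given, so the rule does not apply.
        return True
    return dep.get("package_manager") == "conda"
```

With this formulation an entry that sets channel but omits package_manager is rejected, which is one reading of "only valid when package_manager == conda"; whether an omitted package_manager should default to something else is a question for the schema itself.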

Proposed v2.0.0 Additions

Reference: __reports__/crackingshells_alignment_strategy/01-alignment_strategy_v1.md § The Scientific Extension Contract

The three additions below are proposals — this issue asks you to turn them into actual JSON Schema Draft-07 definitions and flag any concerns before the meeting.

1. schema_version (new required top-level field)

Allows Hatch CLI to detect which schema era a package uses and select the right parser.

schema_version: string   // must match a Hatch-Schemas release tag, e.g. "2.0.0"

Suggested JSON Schema constraint: { "type": "string", "pattern": "^\\d+\\.\\d+\\.\\d+$" }
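
To see what the suggested pattern accepts and rejects, it can be exercised outside a JSON Schema validator. This is a minimal sketch using only the Python standard library; the helper name is_schema_version is hypothetical.

```python
import re

# Pattern copied from the suggested constraint. JSON Schema patterns follow
# ECMA 262, where \d is ASCII-only, so re.ASCII keeps Python aligned with it.
SCHEMA_VERSION_RE = re.compile(r"^\d+\.\d+\.\d+$", re.ASCII)


def is_schema_version(value: object) -> bool:
    """Return True if value is a string shaped like MAJOR.MINOR.PATCH."""
    if not isinstance(value, str):
        return False
    # fullmatch sidesteps Python's "$ also matches before a trailing newline".
    return SCHEMA_VERSION_RE.fullmatch(value) is not None
```

Note that the pattern checks shape only: "2.0.0" passes, "v2.0.0" and "2.0.0-rc1" do not, and any three-part number passes whether or not a matching Hatch-Schemas release tag exists.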

2. citations[] — discriminated union (new optional array)

Replaces the previous flat doi / url fields. Each entry identifies one academic or software reference in a specific citation format.

Citation {
  format:  enum   // "doi" | "arxiv" | "pmid" | "isbn" | "url"
                  // | "bibtex" | "ris" | "csl-json" | "formatted"
  value:   string // identifier or full citation text; semantics depend on format
  note?:   string // "primary paper" | "dataset" | "software" | free text
}

Value patterns by format (for Hatch-Validator to enforce):

format                            expected value pattern
doi                               ^10\.\d{4,}/\S+$
arxiv                             ^\d{4}\.\d{4,5}(v\d+)?$ or https://arxiv.org/abs/...
pmid                              ^\d+$
isbn                              ^(97[89])?\d{9}[\dX]$
url                               valid URI
bibtex, ris, csl-json, formatted  free string

The format enum is owned by Hatch-Schemas — new formats are added via schema minor version bump, not by package authors.
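
One way to turn the Citation union into Draft-07 is an enum on format plus one if/then branch per patterned format. The sketch below builds such a fragment as a Python dict and pairs it with a stdlib-only checker. Everything named here (FORMATS, VALUE_PATTERNS, citation_errors, citation_schema) is hypothetical. Two assumptions are baked in: the arXiv URL form is treated as the bare identifier behind the https://arxiv.org/abs/ prefix, and "valid URI" is narrowed to "has a scheme and a host" in the checker.

```python
import re
from urllib.parse import urlparse

# The nine format values listed in the proposal.
FORMATS = ("doi", "arxiv", "pmid", "isbn", "url",
           "bibtex", "ris", "csl-json", "formatted")

# Patterns from the proposal. re.ASCII keeps \d ASCII-only, as in ECMA 262.
VALUE_PATTERNS = {
    "doi": re.compile(r"^10\.\d{4,}/\S+$", re.ASCII),
    # Assumption: the URL form is the bare identifier behind a fixed prefix.
    "arxiv": re.compile(
        r"^(https://arxiv\.org/abs/)?\d{4}\.\d{4,5}(v\d+)?$", re.ASCII),
    "pmid": re.compile(r"^\d+$", re.ASCII),
    "isbn": re.compile(r"^(97[89])?\d{9}[\dX]$", re.ASCII),
}


def citation_errors(entry: dict) -> list:
    """Return the problems found in one citations[] entry (empty if none)."""
    fmt, value = entry.get("format"), entry.get("value")
    if not isinstance(fmt, str) or fmt not in FORMATS:
        return [f"unknown format: {fmt!r}"]
    if not isinstance(value, str):
        return ["value must be a string"]
    errors = []
    pattern = VALUE_PATTERNS.get(fmt)
    if pattern is not None and pattern.fullmatch(value) is None:
        errors.append(f"value does not match the {fmt} pattern")
    if fmt == "url":
        # Assumption: "valid URI" is narrowed to scheme plus host.
        try:
            parts = urlparse(value)
            ok = bool(parts.scheme and parts.netloc)
        except ValueError:
            ok = False
        if not ok:
            errors.append("value is not an absolute URL")
    if "note" in entry and not isinstance(entry["note"], str):
        errors.append("note must be a string")
    return errors


def citation_schema() -> dict:
    """Build a candidate Draft-07 fragment for one Citation object."""
    def when(fmt: str, value_rule: dict) -> dict:
        # Apply value_rule only to entries whose format equals fmt.
        return {
            "if": {"required": ["format"],
                   "properties": {"format": {"const": fmt}}},
            "then": {"properties": {"value": value_rule}},
        }

    branches = [when(f, {"pattern": rx.pattern})
                for f, rx in VALUE_PATTERNS.items()]
    branches.append(when("url", {"format": "uri"}))
    return {
        "type": "object",
        "required": ["format", "value"],
        "properties": {
            "format": {"enum": list(FORMATS)},
            "value": {"type": "string"},
            "note": {"type": "string"},
        },
        "allOf": branches,
    }
```

Two points that may be worth raising in the PR notes: many validators treat "format": "uri" as an annotation unless format assertion is switched on, and the isbn pattern accepts only unhyphenated ISBNs.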

3. provenance{} — optional object

Reproducibility metadata. Distinct from version (semver): two builds tagged v1.0.0 can differ if the lockfile changed without a tag bump. git_sha is the precise reproducibility anchor; build_env tells air-gap tooling which locking strategy was used.

Provenance {
  git_sha?:   string   // 40-char hex or short SHA
  build_env?: enum     // "conda-lock" | "pip-compile" | "manual"
}

Constraint: if provenance is present, at least one of git_sha or build_env must be set.
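
The "at least one of" constraint maps naturally onto anyOf with two required clauses. Below is a minimal sketch of a candidate fragment as a Python dict, with a stdlib-only mirror for quick checks. The names PROVENANCE_SCHEMA and provenance_errors are hypothetical, and the git_sha pattern assumes "short SHA" means at least 7 lowercase hex characters, which the proposal does not specify.

```python
import re

# The three build_env values listed in the proposal.
BUILD_ENVS = ("conda-lock", "pip-compile", "manual")

# Assumption: a short SHA is 7 to 39 lowercase hex chars; a full SHA is 40.
GIT_SHA_RE = re.compile(r"^[0-9a-f]{7,40}$")

# Candidate Draft-07 fragment for the provenance object.
PROVENANCE_SCHEMA = {
    "type": "object",
    "properties": {
        "git_sha": {"type": "string", "pattern": GIT_SHA_RE.pattern},
        "build_env": {"enum": list(BUILD_ENVS)},
    },
    # At least one of the two keys must be present.
    "anyOf": [{"required": ["git_sha"]}, {"required": ["build_env"]}],
}


def provenance_errors(prov: dict) -> list:
    """Stdlib mirror of PROVENANCE_SCHEMA; returns an empty list when valid."""
    errors = []
    if "git_sha" not in prov and "build_env" not in prov:
        errors.append("provenance needs git_sha or build_env")
    if "git_sha" in prov:
        sha = prov["git_sha"]
        if not isinstance(sha, str) or GIT_SHA_RE.fullmatch(sha) is None:
            errors.append("git_sha must be 7 to 40 lowercase hex characters")
    if "build_env" in prov and prov["build_env"] not in BUILD_ENVS:
        errors.append("build_env is not a known locking strategy")
    return errors
```

A minProperties: 1 rule would only be equivalent if additionalProperties were set to false, so the anyOf form is the safer reading of the stated constraint.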


Your Tasks

  1. Create branch feat/schema-v2.0.0-extension from main
  2. Create package/v2.0.0/hatch_pkg_metadata_schema.json as a valid JSON Schema Draft-07 document incorporating all v1.2.2 fields plus the three proposed additions
  3. Propose refinements — if you see problems with the discriminated union design, enum values, or field names, add your notes to the PR description before the Day 1 meeting
  4. Open a draft PR targeting main with title feat(schema): v2.0.0 extension schema — citations, provenance, schema_version
  5. Mark the PR ready for review only after the Day 1 kickoff meeting ratifies the schema
  6. Complete the sub-issue below alongside this work

Acceptance Criteria

  • Branch feat/schema-v2.0.0-extension exists
  • package/v2.0.0/hatch_pkg_metadata_schema.json is valid JSON Schema Draft-07
  • schema-validation.yml CI passes on the new schema file
  • citations array accepts all 9 format values and rejects unknown format strings
  • citations[].value is pattern-validated for structured formats (doi, arxiv, pmid, isbn)
  • provenance is optional; when present, at least one of git_sha or build_env is required (anyOf constraint)
  • Draft PR is open with your refinement notes before the Day 1 kickoff
  • Sample server.json (Issue Workflow updates #4) validates successfully against the new schema
