Define the schema v2 tap catalog contract #16

@eric-tramel

Parent epic: #15

Why

catalog/plugins.json currently uses schema v1. It carries enough metadata for read-only listing and compatibility preflight, but it defines neither the install source nor the docs URL that the data-designer plugins info and data-designer plugins install commands require.

This issue is the concrete schema contract. Implementers should not treat the schema shape as open-ended: schema v2 should be the JSON contract below unless a later review deliberately changes it.

Proposed schema v2

The top-level document must have the following shape (shown here with a single plugin entry):

{
  "schema_version": 2,
  "plugins": [
    {
      "name": "document-chunker",
      "plugin_type": "seed-reader",
      "description": "Read local documents as chunked seed records",
      "package": {
        "name": "data-designer-retrieval-sdg",
        "version": "0.1.0",
        "path": "plugins/data-designer-retrieval-sdg"
      },
      "entry_point": {
        "group": "data_designer.plugins",
        "name": "document-chunker",
        "value": "data_designer_retrieval_sdg.plugins:document_chunker_plugin"
      },
      "compatibility": {
        "python": {"specifier": ">=3.10"},
        "data_designer": {
          "requirement": "data-designer>=0.5.7",
          "specifier": ">=0.5.7",
          "marker": null
        }
      },
      "source": {
        "type": "pypi",
        "package": "data-designer-retrieval-sdg"
      },
      "docs": {
        "url": "https://nvidia-nemo.github.io/DataDesignerPlugins/plugins/data-designer-retrieval-sdg/"
      }
    }
  ]
}

Required top-level fields:

  • schema_version: integer literal 2.
  • plugins: array of plugin entries sorted deterministically by package name, then runtime plugin name.
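
The two top-level requirements can be checked mechanically. The following is a minimal sketch, not the project's actual validator: the function names check_top_level and sort_key are hypothetical, and it assumes that "sorted deterministically" means plain string comparison on package.name and then name.

```python
from typing import Any


def sort_key(entry: dict[str, Any]) -> tuple[str, str]:
    """Order entries by package name, then runtime plugin name."""
    # Assumption: plain string comparison is the intended ordering.
    return (entry["package"]["name"], entry["name"])


def check_top_level(catalog: dict[str, Any]) -> list[str]:
    """Return problems found in the top-level document (empty if none)."""
    problems: list[str] = []

    version = catalog.get("schema_version")
    # type(...) is int rejects both 2.0 (float) and True (bool).
    if type(version) is not int or version != 2:
        problems.append(f"schema_version must be the integer 2, got {version!r}")

    plugins = catalog.get("plugins")
    if not isinstance(plugins, list):
        problems.append("plugins must be an array")
    elif plugins != sorted(plugins, key=sort_key):
        problems.append("plugins must be sorted by package name, then plugin name")

    return problems
```

The sketch assumes every entry already has package.name and name; a real validator would check entry fields before checking order.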

Required plugin entry fields:

  • name: non-empty runtime plugin name, equal to Plugin.name from the loaded entry point.
  • plugin_type: one of column-generator, seed-reader, or processor.
  • description: package-level [project].description string.
  • package.name: PEP 503-compatible package name from [project].name.
  • package.version: PEP 440 version from [project].version.
  • package.path: repo-relative package directory, for example plugins/data-designer-template. This is repo-local metadata only and must not be treated as the install source for remote taps.
  • entry_point.group: literal data_designer.plugins.
  • entry_point.name: installed entry-point key from [project.entry-points."data_designer.plugins"].
  • entry_point.value: import target from the same entry-point table.
  • compatibility.python.specifier: validated [project].requires-python specifier.
  • compatibility.data_designer.requirement: exact direct dependency string for data-designer.
  • compatibility.data_designer.specifier: parsed version specifier from the direct data-designer dependency.
  • compatibility.data_designer.marker: string marker from the direct dependency, or null.
  • source: one install source object, using the source union below.
  • docs.url: absolute HTTP(S) URL for the plugin docs page.
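
As an illustration of the field list above, a structural check for one plugin entry might look like the sketch below. It uses only the standard library, so it verifies presence, types, and literal values but does not parse PEP 440 versions or specifiers. The names lookup, check_entry, and STRING_FIELDS are hypothetical and not part of any existing DataDesigner API.

```python
from typing import Any
from urllib.parse import urlsplit

PLUGIN_TYPES = {"column-generator", "seed-reader", "processor"}
ENTRY_POINT_GROUP = "data_designer.plugins"

# Dotted paths of the required string-valued fields from the list above.
STRING_FIELDS = (
    "name", "plugin_type", "description",
    "package.name", "package.version", "package.path",
    "entry_point.group", "entry_point.name", "entry_point.value",
    "compatibility.python.specifier",
    "compatibility.data_designer.requirement",
    "compatibility.data_designer.specifier",
    "docs.url",
)

_MISSING = object()


def lookup(entry: Any, dotted: str) -> Any:
    """Walk a dotted path through nested dicts; return _MISSING if absent."""
    node = entry
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return _MISSING
        node = node[part]
    return node


def check_entry(entry: dict[str, Any]) -> list[str]:
    """Return human-readable problems for one plugin entry (empty if none)."""
    problems = [
        f"{path} must be a string"
        for path in STRING_FIELDS
        if not isinstance(lookup(entry, path), str)
    ]
    if lookup(entry, "name") == "":
        problems.append("name must be non-empty")

    plugin_type = lookup(entry, "plugin_type")
    if isinstance(plugin_type, str) and plugin_type not in PLUGIN_TYPES:
        problems.append(f"plugin_type {plugin_type!r} is not a known type")

    group = lookup(entry, "entry_point.group")
    if isinstance(group, str) and group != ENTRY_POINT_GROUP:
        problems.append(f"entry_point.group must be {ENTRY_POINT_GROUP!r}")

    # marker must be present and be either a string or null (None).
    marker = lookup(entry, "compatibility.data_designer.marker")
    if marker is not None and not isinstance(marker, str):
        problems.append("compatibility.data_designer.marker must be a string or null")

    if not isinstance(lookup(entry, "source"), dict):
        problems.append("source must be an object")

    url = lookup(entry, "docs.url")
    if isinstance(url, str):
        parts = urlsplit(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append("docs.url must be an absolute HTTP(S) URL")

    return problems
```

Returning a list of problems rather than raising on the first one lets catalog generation report every defect in a single pass.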

Source union:

{"type": "pypi", "package": "data-designer-example"}

Use for released packages on PyPI. DataDesigner should derive the default exact install target from source.package plus package.version, for example data-designer-example==0.1.0, unless the user explicitly requests a resolver-driven/latest workflow.
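
The default pinning rule can be sketched in a few lines. The function name pypi_install_target and its pin parameter are hypothetical; pin=False stands in for the explicit resolver-driven/latest workflow mentioned above.

```python
from typing import Any


def pypi_install_target(entry: dict[str, Any], pin: bool = True) -> str:
    """Build the install requirement for an entry with a pypi source."""
    source = entry["source"]
    if source.get("type") != "pypi":
        raise ValueError(f"expected a pypi source, got {source.get('type')!r}")
    if not pin:
        # Hypothetical opt-out: let the resolver choose the version.
        return source["package"]
    # Default: exact pin from source.package plus package.version.
    return f"{source['package']}=={entry['package']['version']}"
```

For an entry with source.package set to data-designer-example and package.version set to 0.1.0, this yields data-designer-example==0.1.0.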

{
  "type": "git",
  "url": "https://github.com/NVIDIA-NeMo/DataDesignerPlugins.git",
  "ref": "data-designer-example/v0.1.0",
  "subdirectory": "plugins/data-designer-example"
}

Use for direct Git subdirectory installs. DataDesigner should derive a PEP 508-style direct-reference install target such as data-designer-example @ git+https://github.com/NVIDIA-NeMo/DataDesignerPlugins.git@data-designer-example/v0.1.0#subdirectory=plugins/data-designer-example.
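
A sketch of that derivation follows. The function name git_install_target is hypothetical, and taking the distribution name from package.name is an assumption; the issue does not say which field supplies it.

```python
from typing import Any


def git_install_target(entry: dict[str, Any]) -> str:
    """Build a 'name @ git+url@ref#subdirectory=path' requirement string."""
    source = entry["source"]
    if source.get("type") != "git":
        raise ValueError(f"expected a git source, got {source.get('type')!r}")
    # Assumption: the distribution name comes from package.name.
    name = entry["package"]["name"]
    return (
        f"{name} @ git+{source['url']}@{source['ref']}"
        f"#subdirectory={source['subdirectory']}"
    )
```

Applied to the git source object shown above, with package.name set to data-designer-example, the function reproduces the install target quoted in the text.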

{"type": "path", "path": "plugins/data-designer-example", "editable": true}

Use only for local/catalog-file authoring workflows. The default NVIDIA raw catalog must not use path sources.
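
The restriction on path sources could be enforced at publish time with a check like the one below. This is a sketch only; the function name path_sourced_plugins is hypothetical, and it assumes the entries have already passed structural validation.

```python
from typing import Any


def path_sourced_plugins(catalog: dict[str, Any]) -> list[str]:
    """Return runtime names of plugins whose source type is 'path'."""
    return [
        entry["name"]
        for entry in catalog["plugins"]
        if entry["source"]["type"] == "path"
    ]
```

A publishing step for the default remote catalog would fail whenever this list is non-empty, while local authoring workflows would skip the check.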

Validation rules

  • Consumers must reject any catalog whose schema_version they do not support.
  • entry_point.group must be exactly data_designer.plugins.
  • Runtime plugin names must be unique within one catalog.
  • Multiple entries may share the same package and source; this is how multi-plugin packages are represented.
  • Invalid PEP 440 versions, invalid specifiers, missing direct data-designer dependency, stale installed entry points, and malformed source objects must fail catalog generation or schema validation.
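
The uniqueness rule and the multi-plugin representation interact: names must be unique, but package names may repeat. A minimal sketch with hypothetical helper names:

```python
from collections import Counter, defaultdict
from typing import Any


def duplicate_plugin_names(plugins: list[dict[str, Any]]) -> list[str]:
    """Return runtime plugin names that appear more than once, sorted."""
    counts = Counter(entry["name"] for entry in plugins)
    return sorted(name for name, count in counts.items() if count > 1)


def plugins_by_package(plugins: list[dict[str, Any]]) -> dict[str, list[str]]:
    """Group runtime plugin names by package name, preserving catalog order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for entry in plugins:
        grouped[entry["package"]["name"]].append(entry["name"])
    return dict(grouped)
```

A catalog is valid with respect to these two rules when duplicate_plugin_names returns an empty list, even if plugins_by_package maps one package to several plugin names.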

Acceptance criteria

  • A docs page or schema file defines the contract above without leaving source/schema shape as an open decision.
  • The contract includes examples for pypi, git, and path source objects.
  • The contract states that tap discovery and runtime entry-point discovery are separate layers.
  • The contract states that package.path is not an install source for remote taps.
  • The contract states that multi-plugin packages are multiple plugin entries sharing one package/source.

Labels: documentation, enhancement, plugin tap
