Model Registry

sea-shunned edited this page Dec 18, 2023 · 3 revisions

To unify model definitions, simplify providing information to the Napari plugin, and allow for community contributions, we maintain a model registry repository that defines the models available within AI OnDemand (AIoD).

Contributing

This section covers how to contribute to the registry, whether the contribution is a completely new model, a new model version, or a model for a completely new task (e.g. a new organelle). Depending on what is contributed, additional work may be needed (e.g. updating or adding a Python script in the Segment-Flow pipeline). If so, this is outlined with relevant links in the sections below.

Contribute a New Model

To add a new model, a new manifest needs to be added to the manifest directory in the repo. Note that if this includes a new task, you'll also need to add a new task.

The schema section below outlines the bare minimum needed, but you're encouraged to look at previously created manifests for reference, and to check that your new manifest passes validation locally with the Pydantic model before opening a pull request.

Note

Adding a new (base) model will also need an accompanying Python script and process in the Segment-Flow pipeline. For further details, see the relevant page.

Contribute a New Model Version

To add a new model version, add the version to the appropriate existing manifest, then open a pull request, where the updated manifest will be validated against the schema (you can also run the validation locally beforehand to make sure it passes).

See the schema section below for further guidelines on what is needed to define a model version.

Contribute a Model with a New Task

To contribute a model or model version with a new task, the list of available tasks will need to be updated, as these are used to constrain model schemas and define what is available in the Napari plugin.

In the model schema, TASK_NAMES is a dictionary that defines the short-hand name (key) and the display name (value) for a given task. Simply add the new key: value pair, and make a pull request to add this new task.
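
As a rough illustration of that step, the sketch below shows what adding a task might look like. The entries here are hypothetical placeholders, not the registry's actual contents, and `display_name` is an illustrative helper that is not part of the registry.

```python
# Hypothetical excerpt: the real TASK_NAMES lives in the registry's model schema,
# and its actual keys and values may differ from these placeholders.
TASK_NAMES = {
    "mito": "Mitochondria",
    "er": "Endoplasmic Reticulum",
}

# Adding a new task is a single key: value pair
# (short-hand name -> display name shown in the plugin).
TASK_NAMES["ne"] = "Nuclear Envelope"


def display_name(task: str) -> str:
    """Illustrative helper: look up a task's display name, rejecting unknown tasks."""
    try:
        return TASK_NAMES[task]
    except KeyError:
        raise ValueError(
            f"Unknown task {task!r}; expected one of {sorted(TASK_NAMES)}"
        ) from None


print(display_name("ne"))
```

Because the task list constrains the model schemas, a manifest that refers to a short-hand name missing from TASK_NAMES would be expected to fail validation, which is why the task needs to be added first.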

Schema

While schemas are not the most readable format for humans, the following resources each give a different view of what is expected:

  • The Pydantic model used for parsing and validation can be found (here)
  • The generated JSON schema from the Pydantic model (here)
  • Existing manifests, all in the manifests directory, which should help clarify what is needed!

Overall, models are specified hierarchically, from a base model to a model version to a task-specific version. The following information is required for a schema:

  • A model name
  • Model versions
    • For each version, its name and each of the tasks that model is trained for (normally one), and the model location (either a filepath or a URL)
  • Relevant metadata
    • A DOI is not a requirement, but some basic information about the model is needed, and will be reviewed upon a PR. See the relevant contribution section.
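
The hierarchy listed above can be sketched with plain dataclasses. This is only a stand-in for the registry's Pydantic model, written under assumptions: every class name, field name, task key, and URL below is hypothetical, and the real field names should be taken from the Pydantic model and the existing manifests.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical set of known task short-hand names (see TASK_NAMES in the real schema).
KNOWN_TASKS = {"mito", "er"}


@dataclass
class TaskEntry:
    """Task-specific level: where the model for this task can be found."""
    location: str  # either a filepath or a URL

    def __post_init__(self) -> None:
        if not self.location:
            raise ValueError("location must be a non-empty filepath or URL")


@dataclass
class ModelVersion:
    """Version level: a name plus one entry per task the version is trained for."""
    name: str
    tasks: dict[str, TaskEntry]

    def __post_init__(self) -> None:
        if not self.tasks:
            raise ValueError(f"version {self.name!r} needs at least one task")
        unknown = set(self.tasks) - KNOWN_TASKS
        if unknown:
            raise ValueError(f"unknown tasks: {sorted(unknown)}")


@dataclass
class Manifest:
    """Base model level: a name, its versions, and descriptive metadata."""
    name: str
    versions: list[ModelVersion]
    metadata: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if not self.versions:
            raise ValueError("a manifest needs at least one model version")


manifest = Manifest(
    name="example-model",
    versions=[
        ModelVersion(
            name="v1",
            tasks={"mito": TaskEntry(location="https://example.com/example-model-v1-mito.pt")},
        )
    ],
    metadata={"description": "Placeholder description of the model."},
)
```

Constructing a version with a task that is not in the known set raises a `ValueError`, mirroring how the task list is described as constraining model schemas.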

Each model version represents a variant of a base model. Versions may differ in their input data (e.g. for a different task), in their checkpoints or hyperparameters for the same model, or even in architecture (e.g. varying sizes). Ultimately, as long as the underlying Python script that runs the model handles everything needed for that version (if anything extra is needed), that is enough.

Note

Any parameters given at the root input level apply to all model versions. However, a config_path or list of params can be given to specific model-task-versions if they differ from the root.
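
One way to picture this is as root-level defaults with per-task overrides. The sketch below assumes that params can be treated as name-to-value mappings and that task-level values are merged over the root ones by name. Both are assumptions made for illustration; the registry and the Segment-Flow pipeline may resolve them differently (for example, by replacing the root params wholesale). `resolve_params` and the parameter names are hypothetical.

```python
from typing import Any, Dict, Optional


def resolve_params(
    root_params: Dict[str, Any],
    task_params: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: return the effective params for one model-task-version.

    Root-level params act as defaults; any task-level params with the same
    name take precedence (an assumed merge rule, not confirmed by the registry).
    """
    effective = dict(root_params)        # copy so the root defaults stay untouched
    effective.update(task_params or {})  # task-specific values override by name
    return effective


# Placeholder parameter names and values, for illustration only.
root = {"threshold": 0.5, "tile_size": 256}
print(resolve_params(root))                      # no overrides: root params apply as-is
print(resolve_params(root, {"threshold": 0.7}))  # this task-version overrides one value
```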