Conversation
First draft of .claude/skills/add-inference-model/SKILL.md, capturing the end-to-end playbook for shipping a new pre-trained core model through inference_models + the model-registry-sdk. Covers the four surfaces a new model can touch (inference_models class + registry entry, weight zips + registration script, workflow block, legacy /infer adapter), notes that inference/models/<family>/ is deprecated, and instructs the skill to survey same-(backend, task) siblings in inference_models before scaffolding rather than copying any single model's idioms. Gotchas section captures the handful of real surprises collected while shipping the first model through this pattern; this file is explicitly a living document — future additions should extend it rather than rewrite.
Pull request overview
Adds a new Claude “skill” document that serves as an end-to-end playbook for adding new core/pre-trained models, including required integration points (model class/registry, weight zips + model-registry registration) and optional wiring (workflow blocks, /infer adapters).
Changes:
- Introduces `.claude/skills/add-inference-model/SKILL.md` with step-by-step guidance for implementing, registering, testing, and packaging new core models
- Documents the “four surfaces” a new model can touch, plus a verification checklist and “gotchas” captured from recent model work
- Includes operational notes for staging/prod registration and local execution pitfalls
> **Run from `inference_models/` cwd** — the package has its own `pytest.ini` and a path-sensitive `conftest.py`:
>
> ```bash
> cd inference_models
> python -m pytest tests/unit_tests/models/test_<family>.py -W ignore
> ```
>
> Running from the repo root silently mis-collects.
This section says there is a “path-sensitive conftest.py” under inference_models/, but there is no inference_models/tests/conftest.py in this repo. If the real issue is picking up the correct pytest.ini, consider rephrasing to: “run from inference_models/ so pytest uses inference_models/pytest.ini (or use pytest -c inference_models/pytest.ini ... from repo root)”.
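Written out concretely, the two equivalent invocations the comment describes might look like this (the repo-root test path in option B is an assumption derived from the excerpt above, not a verified path):

```shell
# Option A: run from the package directory so inference_models/pytest.ini applies
cd inference_models
python -m pytest tests/unit_tests/models/test_<family>.py -W ignore

# Option B (the comment's alternative): stay at the repo root and point
# pytest at the package's config explicitly
python -m pytest -c inference_models/pytest.ini \
    inference_models/tests/unit_tests/models/test_<family>.py -W ignore
```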
> ### 8. Inference-models adapter (surface 4, optional)
>
> Skip unless the user wants plain `/infer` endpoint support. Add a subclass of `Model` to `inference/core/models/inference_models_adapters.py` matching your task (there are per-task parents: object detection, instance segmentation, classification, keypoints, semantic segmentation, etc. — read the existing adapters in that file). The adapter's `__init__` calls `AutoModel.from_pretrained(model_id, api_key=api_key)` and stores the result; predict / infer methods delegate.
The adapter guidance here doesn’t match the actual adapter pattern in inference/core/models/inference_models_adapters.py: adapters call AutoModel.from_pretrained(model_id_or_path=..., ...) and pass additional flags like allow_untrusted_packages, allow_direct_local_storage_loading, backend selection, etc. Recommend updating this sentence to reference the real signature/keyword (model_id_or_path) and to explicitly say “follow the existing adapter constructors in that file” to avoid copy/paste mistakes.
Suggested change:

> Skip unless the user wants plain `/infer` endpoint support. Add a subclass of `Model` to `inference/core/models/inference_models_adapters.py` matching your task (there are per-task parents: object detection, instance segmentation, classification, keypoints, semantic segmentation, etc. — read the existing adapters in that file). In the adapter `__init__`, follow the existing adapter constructors in that file: they call `AutoModel.from_pretrained(model_id_or_path=..., ...)` and pass through the additional flags they need (for example `allow_untrusted_packages`, `allow_direct_local_storage_loading`, backend selection, etc.), then store the result; predict / infer methods delegate.
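A minimal sketch of the delegation shape this review describes. Everything here is a stand-in: the real `AutoModel`, the per-task `Model` parents, and their constructor flags live in `inference_models` / `inference` and have richer signatures, so follow the actual adapter file rather than this skeleton.

```python
# Hypothetical sketch of the adapter-delegation pattern; NOT the real
# inference_models / inference APIs.

class AutoModel:
    """Stand-in for the real auto-loader."""

    @classmethod
    def from_pretrained(cls, model_id_or_path, **kwargs):
        # The real loader resolves weights, backend, trust flags, etc.
        model = cls()
        model.model_id_or_path = model_id_or_path
        model.options = kwargs
        return model

    def predict(self, image):
        return {"predictions": [], "source": self.model_id_or_path}


class MyDetectorAdapter:
    """Hypothetical adapter: __init__ loads via AutoModel, predict delegates."""

    def __init__(self, model_id, api_key=None, **kwargs):
        # Real adapters pass model_id_or_path plus flags such as
        # allow_untrusted_packages; mirror the existing constructors.
        self._model = AutoModel.from_pretrained(
            model_id_or_path=model_id, api_key=api_key, **kwargs
        )

    def predict(self, image):
        # Delegate to the loaded model.
        return self._model.predict(image)


adapter = MyDetectorAdapter("my-arch/base", api_key="stub-key")
print(adapter.predict(None)["source"])  # → my-arch/base
```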
> Run `AutoModel.from_pretrained("<arch>/<variant>", api_key=<staging-key>)` against staging (set `ROBOFLOW_ENVIRONMENT=staging` or `ROBOFLOW_API_HOST=https://api.roboflow.one`) and exercise the model with a real input. If surface 3 was built, also run it through `debugrun.py` / a short MP4 via `InferencePipeline`. If surface 4 was built, hit `/infer?model_id=<arch>/<variant>`.
>
> **When running `debugrun.py` or the inference server from the repo root**, pass `PYTHONSAFEPATH=1`. The `inference_models/` project dir (at repo root, no `__init__.py`) is treated by Python as an empty namespace package and shadows the editable-installed `inference_models`. `PYTHONSAFEPATH=1` (or `python -P`) stops Python from auto-adding the script dir to `sys.path` and the shadow disappears.
PYTHONSAFEPATH=1 / python -P only works on newer Python versions (and -P isn’t available on Python 3.10, which this repo still tests in CI). Suggest noting the minimum Python version for this workaround and/or providing an alternative for 3.10 (e.g., run via python -m ... from an installed env, or adjust PYTHONPATH/working directory) so contributors don’t hit an “unknown option”/no-op.
Suggested change:

> **When running `debugrun.py` or the inference server from the repo root**, avoid letting the repo-root `inference_models/` directory shadow the editable-installed `inference_models` package. On newer Python versions that support it, you can use `PYTHONSAFEPATH=1` (or `python -P`) so Python does not auto-add the script directory to `sys.path`. **Do not rely on `python -P` on Python 3.10**. For Python 3.10, prefer running from an installed environment via `python -m ...` instead of invoking a repo-root script directly, or adjust your `PYTHONPATH` / working directory so the repo-root namespace package is not on `sys.path`.
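The version gate behind this suggestion can be sketched as follows. The helper names are hypothetical (not part of the repo); the fact they encode — `PYTHONSAFEPATH` / `python -P` first appearing in Python 3.11 — is the reason the 3.10 fallback is needed.

```python
import sys

def supports_safe_path() -> bool:
    # PYTHONSAFEPATH and the -P flag were introduced in Python 3.11;
    # on 3.10 the env var is ignored and -P is an unknown option.
    return sys.version_info >= (3, 11)

def strip_shadowing_entries(path_entries, shadow_dirs):
    # Fallback idea for 3.10: drop sys.path entries (e.g. "" or the
    # repo root) that would let a repo-root namespace package shadow
    # the installed one.
    return [p for p in path_entries if p not in shadow_dirs]

cleaned = strip_shadowing_entries(
    ["", "/repo", "/venv/site-packages"], {"", "/repo"}
)
print(cleaned)  # → ['/venv/site-packages']
```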
> Before touching files, get concrete answers:
>
> 1. **Architecture name** (registry key string) — lower-case, hyphens OK, no slashes. This is the string matched in `models_registry.py`.
> 2. **Task type** — must match both the `TaskType` enum in `model_registry_utils.http.weights_service` **and** the task-type constant in `inference_models/.../models_registry.py`.
The guidance for task types references a TaskType enum in model_registry_utils.http.weights_service, but that module doesn’t exist in this repo and TaskType in inference_models.models.auto_loaders.entities is just Optional[str] (not an enum). Suggest rewording to point to the concrete task constants defined in inference_models/inference_models/models/auto_loaders/models_registry.py (e.g. OBJECT_DETECTION_TASK, etc.) and to the model-registry API/schema docs (or the SDK) for the service-side accepted strings, rather than an in-repo enum that isn’t present.
Suggested change:

> 2. **Task type** — choose one of the concrete task constants defined in `inference_models/inference_models/models/auto_loaders/models_registry.py` (for example `OBJECT_DETECTION_TASK`, etc.), and verify the exact service-side accepted string against the model-registry API/schema docs or the model-registry SDK.
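The naming convention from item 1 above ("lower-case, hyphens OK, no slashes") could be encoded as a small validator. This is a hypothetical helper, not repo code, and the registry's actual enforcement rules may differ:

```python
import re

# Hypothetical check for the stated convention: lower-case segments,
# hyphens allowed as separators, no slashes or upper-case letters.
ARCH_NAME_RE = re.compile(r"^[a-z0-9]+(-[a-z0-9]+)*$")

def is_valid_architecture_name(name: str) -> bool:
    return ARCH_NAME_RE.fullmatch(name) is not None

print(is_valid_architecture_name("sam2-video"))  # → True
print(is_valid_architecture_name("SAM2/Video"))  # → False
```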
> For **shared plumbing across several HF models** (sessioned video trackers, etc.), check `inference_models/inference_models/models/common/` for reusable bases before writing your own.
>
> Read `inference_models/docs/contributors/adding-model.md` and `writing-tests.md` — they cover the `from_pretrained` contract in more depth than this skill.
writing-tests.md is referenced without a path, but the doc lives under inference_models/docs/contributors/writing-tests.md. Consider linking to the full path (and keeping the two links consistent) so readers can navigate directly from GitHub.
Suggested change:

> Read `inference_models/docs/contributors/adding-model.md` and `inference_models/docs/contributors/writing-tests.md` — they cover the `from_pretrained` contract in more depth than this skill.
Summary
- `.claude/skills/add-inference-model/SKILL.md` — an end-to-end playbook Claude invokes when the user asks to add a new core/pre-trained model.
- Integration points covered: `roboflow/model-registry-sdk` (always), workflow block (optional), `inference/core/models/inference_models_adapters.py` (only if plain `/infer` endpoint serving is needed).
- `inference/models/<family>/` is deprecated — new core models always go in `inference_models/`.
- Survey siblings in `inference_models/.../models/` before scaffolding rather than copying any one model's idioms.
- Written right after adding SAM2 Video (PRs #2245 + roboflow/model-registry-sdk#5) so the gotchas section captures real surprises from that work. Framed as a living document — future model additions should extend the file, not rewrite it.
Test plan
- Surveyed same-(backend, task) siblings in `inference_models/` before scaffolding