
Add add-inference-model Claude skill #2247

Open
hansent wants to merge 2 commits into main from claude-skill/add-inference-model

Conversation


@hansent hansent commented Apr 20, 2026

Summary

  • Instructs Claude that this is a living skill: it should update the skill itself when it learns new things while adding a model.
  • First draft of .claude/skills/add-inference-model/SKILL.md — an end-to-end playbook Claude invokes when the user asks to add a new core/pre-trained model.
  • Four surfaces a new model can touch: inference_models class + registry entry (always), weight zips + registration script via roboflow/model-registry-sdk (always), workflow block (optional), inference/core/models/inference_models_adapters.py (only if plain /infer endpoint serving is needed).
  • Explicit note that inference/models/<family>/ is deprecated — new core models always go in inference_models/.
  • Tells the skill to survey same-(backend, task) siblings in inference_models/.../models/ before scaffolding rather than copying any one model's idioms.

Written right after adding SAM2 Video (PRs #2245 + roboflow/model-registry-sdk#5) so the gotchas section captures real surprises from that work. Framed as a living document — future model additions should extend the file, not rewrite it.

Test plan

  • Trigger the skill on a new model-adding prompt and confirm it surveys inference_models/ before scaffolding
  • On the next model ship, confirm the gotchas / step notes hold and add anything new that surfaces

First draft of .claude/skills/add-inference-model/SKILL.md, capturing
the end-to-end playbook for shipping a new pre-trained core model
through inference_models + the model-registry-sdk.

Covers the four surfaces a new model can touch (inference_models
class + registry entry, weight zips + registration script, workflow
block, legacy /infer adapter), notes that inference/models/<family>/
is deprecated, and instructs the skill to survey same-(backend, task)
siblings in inference_models before scaffolding rather than copying
any single model's idioms.

Gotchas section captures the handful of real surprises collected
while shipping the first model through this pattern; this file is
explicitly a living document — future additions should extend it
rather than rewrite.

Copilot AI left a comment


Pull request overview

Adds a new Claude “skill” document that serves as an end-to-end playbook for adding new core/pre-trained models, including required integration points (model class/registry, weight zips + model-registry registration) and optional wiring (workflow blocks, /infer adapters).

Changes:

  • Introduces .claude/skills/add-inference-model/SKILL.md with step-by-step guidance for implementing, registering, testing, and packaging new core models
  • Documents the “four surfaces” a new model can touch, plus a verification checklist and “gotchas” captured from recent model work
  • Includes operational notes for staging/prod registration and local execution pitfalls


Comment on lines +101 to +108
**Run from `inference_models/` cwd** — the package has its own `pytest.ini` and a path-sensitive `conftest.py`:

```bash
cd inference_models
python -m pytest tests/unit_tests/models/test_<family>.py -W ignore
```

Running from the repo root silently mis-collects.

Copilot AI Apr 20, 2026


This section says there is a “path-sensitive conftest.py” under inference_models/, but there is no inference_models/tests/conftest.py in this repo. If the real issue is picking up the correct pytest.ini, consider rephrasing to: “run from inference_models/ so pytest uses inference_models/pytest.ini (or use pytest -c inference_models/pytest.ini ... from repo root)”.


### 8. Inference-models adapter (surface 4, optional)

Skip unless the user wants plain `/infer` endpoint support. Add a subclass of `Model` to `inference/core/models/inference_models_adapters.py` matching your task (there are per-task parents: object detection, instance segmentation, classification, keypoints, semantic segmentation, etc. — read the existing adapters in that file). The adapter's `__init__` calls `AutoModel.from_pretrained(model_id, api_key=api_key)` and stores the result; predict / infer methods delegate.

Copilot AI Apr 20, 2026


The adapter guidance here doesn’t match the actual adapter pattern in inference/core/models/inference_models_adapters.py: adapters call AutoModel.from_pretrained(model_id_or_path=..., ...) and pass additional flags like allow_untrusted_packages, allow_direct_local_storage_loading, backend selection, etc. Recommend updating this sentence to reference the real signature/keyword (model_id_or_path) and to explicitly say “follow the existing adapter constructors in that file” to avoid copy/paste mistakes.

Suggested change
Skip unless the user wants plain `/infer` endpoint support. Add a subclass of `Model` to `inference/core/models/inference_models_adapters.py` matching your task (there are per-task parents: object detection, instance segmentation, classification, keypoints, semantic segmentation, etc. — read the existing adapters in that file). The adapter's `__init__` calls `AutoModel.from_pretrained(model_id, api_key=api_key)` and stores the result; predict / infer methods delegate.
Skip unless the user wants plain `/infer` endpoint support. Add a subclass of `Model` to `inference/core/models/inference_models_adapters.py` matching your task (there are per-task parents: object detection, instance segmentation, classification, keypoints, semantic segmentation, etc. — read the existing adapters in that file). In the adapter `__init__`, follow the existing adapter constructors in that file: they call `AutoModel.from_pretrained(model_id_or_path=..., ...)` and pass through the additional flags they need (for example `allow_untrusted_packages`, `allow_direct_local_storage_loading`, backend selection, etc.), then store the result; predict / infer methods delegate.
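The delegation pattern described above can be sketched in miniature. Everything here is illustrative: `Model` and `AutoModel` are stand-in stubs for the real classes in `inference` / `inference_models`, and `MyFamilyAdapter`, the flag names, and the return value are assumptions based on the review comment, not the actual implementation:

```python
class Model:
    """Stub standing in for the per-task adapter parent class (assumption)."""


class AutoModel:
    """Stub standing in for inference_models' AutoModel loader (assumption)."""

    @classmethod
    def from_pretrained(cls, model_id_or_path, api_key=None, **flags):
        # The real loader resolves weights and builds a model; this stub
        # just records what it was asked to load.
        return {"model_id_or_path": model_id_or_path, "api_key": api_key, **flags}


class MyFamilyAdapter(Model):
    """Hypothetical adapter showing the constructor/delegation shape."""

    def __init__(self, model_id, api_key=None, **flags):
        # Follow the existing adapter constructors: the keyword is
        # model_id_or_path, and extra flags (e.g. allow_untrusted_packages)
        # are passed through to the loader.
        self._model = AutoModel.from_pretrained(
            model_id_or_path=model_id, api_key=api_key, **flags
        )

    def infer(self, *args, **kwargs):
        # Real adapters delegate predict/infer calls to the wrapped model;
        # the stub returns it so the wiring is visible.
        return self._model
```

The only point the sketch is meant to make is the shape: construct via `from_pretrained(model_id_or_path=..., ...)` with pass-through flags, store the result, delegate everything else.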


Run `AutoModel.from_pretrained("<arch>/<variant>", api_key=<staging-key>)` against staging (set `ROBOFLOW_ENVIRONMENT=staging` or `ROBOFLOW_API_HOST=https://api.roboflow.one`) and exercise the model with a real input. If surface 3 was built, also run it through `debugrun.py` / a short MP4 via `InferencePipeline`. If surface 4 was built, hit `/infer?model_id=<arch>/<variant>`.

**When running `debugrun.py` or the inference server from the repo root**, pass `PYTHONSAFEPATH=1`. The `inference_models/` project dir (at repo root, no `__init__.py`) is treated by Python as an empty namespace package and shadows the editable-installed `inference_models`. `PYTHONSAFEPATH=1` (or `python -P`) stops Python from auto-adding the script dir to `sys.path` and the shadow disappears.

Copilot AI Apr 20, 2026


PYTHONSAFEPATH=1 / python -P only works on newer Python versions (and -P isn’t available on Python 3.10, which this repo still tests in CI). Suggest noting the minimum Python version for this workaround and/or providing an alternative for 3.10 (e.g., run via python -m ... from an installed env, or adjust PYTHONPATH/working directory) so contributors don’t hit an “unknown option”/no-op.

Suggested change
**When running `debugrun.py` or the inference server from the repo root**, pass `PYTHONSAFEPATH=1`. The `inference_models/` project dir (at repo root, no `__init__.py`) is treated by Python as an empty namespace package and shadows the editable-installed `inference_models`. `PYTHONSAFEPATH=1` (or `python -P`) stops Python from auto-adding the script dir to `sys.path` and the shadow disappears.
**When running `debugrun.py` or the inference server from the repo root**, avoid letting the repo-root `inference_models/` directory shadow the editable-installed `inference_models` package. On newer Python versions that support it, you can use `PYTHONSAFEPATH=1` (or `python -P`) so Python does not auto-add the script directory to `sys.path`. **Do not rely on `python -P` on Python 3.10**. For Python 3.10, prefer running from an installed environment via `python -m ...` instead of invoking a repo-root script directly, or adjust your `PYTHONPATH` / working directory so the repo-root namespace package is not on `sys.path`.
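For the Python 3.10 case, the "adjust `sys.path`" workaround can be sketched as a small helper that drops entries resolving to the repo root (the empty string means the current working directory, which is what Python auto-prepends when `-P`/`PYTHONSAFEPATH` aren't available). `strip_repo_root` is a hypothetical name, not a function in the repo:

```python
import os
import sys


def strip_repo_root(path_entries, repo_root):
    """Return path_entries without any entry that resolves to repo_root.

    '' resolves to the current working directory, matching how Python
    interprets the auto-prepended sys.path entry.
    """
    root = os.path.abspath(repo_root)
    return [
        p for p in path_entries
        if os.path.abspath(p or os.getcwd()) != root
    ]


# Typical use near the top of a repo-root script on Python 3.10
# (assumes the script lives in the repo root):
# sys.path[:] = strip_repo_root(sys.path, os.path.dirname(os.path.abspath(__file__)))
```

Running from a different working directory, or via `python -m` from an installed environment, remains the simpler option when it's available.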

Before touching files, get concrete answers:

1. **Architecture name** (registry key string) — lower-case, hyphens OK, no slashes. This is the string matched in `models_registry.py`.
2. **Task type** — must match both the `TaskType` enum in `model_registry_utils.http.weights_service` **and** the task-type constant in `inference_models/.../models_registry.py`.
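The naming constraint in step 1 can be encoded as a quick pre-flight check. The regex below is one assumed reading of "lower-case, hyphens OK, no slashes"; the real registry may accept more, so treat it as a sanity check, not the authoritative rule:

```python
import re

# Assumed encoding of "lower-case, hyphens OK, no slashes":
# lower-case alphanumeric segments joined by single hyphens.
ARCH_KEY = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def is_valid_arch_name(name):
    """True if `name` looks like a registry architecture key."""
    return ARCH_KEY.fullmatch(name) is not None
```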

Copilot AI Apr 20, 2026


The guidance for task types references a TaskType enum in model_registry_utils.http.weights_service, but that module doesn’t exist in this repo and TaskType in inference_models.models.auto_loaders.entities is just Optional[str] (not an enum). Suggest rewording to point to the concrete task constants defined in inference_models/inference_models/models/auto_loaders/models_registry.py (e.g. OBJECT_DETECTION_TASK, etc.) and to the model-registry API/schema docs (or the SDK) for the service-side accepted strings, rather than an in-repo enum that isn’t present.

Suggested change
2. **Task type** — must match both the `TaskType` enum in `model_registry_utils.http.weights_service` **and** the task-type constant in `inference_models/.../models_registry.py`.
2. **Task type** — choose one of the concrete task constants defined in `inference_models/inference_models/models/auto_loaders/models_registry.py` (for example `OBJECT_DETECTION_TASK`, etc.), and verify the exact service-side accepted string against the model-registry API/schema docs or the model-registry SDK.


For **shared plumbing across several HF models** (sessioned video trackers, etc.), check `inference_models/inference_models/models/common/` for reusable bases before writing your own.

Read `inference_models/docs/contributors/adding-model.md` and `writing-tests.md` — they cover the `from_pretrained` contract in more depth than this skill.

Copilot AI Apr 20, 2026


writing-tests.md is referenced without a path, but the doc lives under inference_models/docs/contributors/writing-tests.md. Consider linking to the full path (and keeping the two links consistent) so readers can navigate directly from GitHub.

Suggested change
Read `inference_models/docs/contributors/adding-model.md` and `writing-tests.md` — they cover the `from_pretrained` contract in more depth than this skill.
Read `inference_models/docs/contributors/adding-model.md` and `inference_models/docs/contributors/writing-tests.md` — they cover the `from_pretrained` contract in more depth than this skill.

