Adding a new model
Step-by-step. Roughly the order to follow on a fresh contribution.
Read AI model policy. If the model fails any of the open-source compliance checks, stop here. Open a discussion or issue if you're unsure – we'd rather have the conversation early than have you do the conversion work for a model we can't ship.
mkdir -p models/<model-id>
touch models/<model-id>/{model.yaml,convert.py,demo.py,README.md}

Conventions:
- Model id is lowercase kebab-case.
- Naming pattern is usually <task>-<arch>[-variant] – e.g. denoise-nind, mask-object-sam21-small.
- The id must match the directory name and the top-level directory inside the .dtmodel archive (the packaging step takes care of that automatically).
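The naming conventions above can be checked mechanically before running the pipeline. The following is a minimal sketch, not part of the dtai tooling: the function names are hypothetical, and the regular expression is one reasonable reading of "lowercase kebab-case" (lowercase letters and digits, joined by single hyphens).

```python
import re
from pathlib import Path

# Assumed definition of kebab-case: lowercase alphanumeric words joined by single hyphens.
KEBAB_CASE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_model_id(model_id: str) -> bool:
    """Return True if model_id looks like lowercase kebab-case."""
    return KEBAB_CASE.fullmatch(model_id) is not None


def id_matches_directory(model_id: str, model_dir: str) -> bool:
    """Return True if the last path component of model_dir equals model_id."""
    return Path(model_dir).name == model_id
```

For example, `is_valid_model_id("mask-object-sam21-small")` passes, while an id containing uppercase letters or underscores does not.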
Minimum viable schema for model.yaml:
id: <model-id>
name: "<human-readable name>"
description: "<one-line description>"
task: denoise              # denoise | rawdenoise | mask | upscale | embed | depth | erase
version: "0.1"             # bump on every meaningful change
backend: onnx
tiling: true               # true if the runtime should tile input
type: single               # single | split | multi – affects demo.py signature
dep_group: <group-in-pyproject>
coreml_format: mlprogram   # optional, opt into CoreML MLProgram
attributes:
  input_sizes: [768]       # tile size(s) baked into the ONNX
repo:
  submodule: vendor/<repo> # optional, if external code is needed
checkpoints:
  - url: "https://..."
    path: "temp/<model-id>/checkpoint.pth"
model_card:
  long_description: ...
  scope: ...
  author: ...
  source: ...
  paper: ...
  license: ...
  training_data: ...
  training_data_license: ...
  notes: ...
convert:
  - script: convert.py
    args:
      checkpoint: "{temp}/checkpoint.pth"
      output: "{output}/model.onnx"
      opset: 20

The full reference is in Pipeline reference.
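A quick sanity check of the parsed file can catch typos before the pipeline does. The sketch below works on an already-parsed dictionary (for example the result of a YAML loader) and is not the pipeline's own validator. The function name is hypothetical, and the set of required keys is an assumption based on the fields above that are not marked optional; the allowed task and type values are taken from the schema comments.

```python
# Assumption: these top-level keys are mandatory; the schema above does not say so explicitly.
REQUIRED_KEYS = ("id", "name", "description", "task", "version", "backend", "type")
KNOWN_TASKS = {"denoise", "rawdenoise", "mask", "upscale", "embed", "depth", "erase"}
KNOWN_TYPES = {"single", "split", "multi"}


def check_model_config(config: dict, directory_name: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    if "id" in config and config["id"] != directory_name:
        problems.append(f"id {config['id']!r} does not match directory {directory_name!r}")
    if "task" in config and config["task"] not in KNOWN_TASKS:
        problems.append(f"unknown task: {config['task']!r}")
    if "type" in config and config["type"] not in KNOWN_TYPES:
        problems.append(f"unknown type: {config['type']!r}")
    return problems
```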
convert.py must export a convert() function – the pipeline calls it directly. Keep main() with argparse for standalone use.
def convert(checkpoint, output, opset=17, fp16=False):
    # load_model and export_to_onnx stand in for model-specific loading and export code
    model = load_model(checkpoint)
    export_to_onnx(model, output, opset_version=opset, fp16=fp16)


def main():
    import argparse

    # Command-line flags mirror the parameters of convert()
    parser = argparse.ArgumentParser()
    parser.add_argument("--checkpoint", required=True)
    parser.add_argument("--output", required=True)
    parser.add_argument("--opset", type=int, default=17)
    parser.add_argument("--fp16", action="store_true")
    args = parser.parse_args()
    convert(args.checkpoint, args.output, args.opset, args.fp16)


if __name__ == "__main__":
    main()

Argument names must match the keys in model.yaml's convert.args.
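Because the pipeline calls convert() with the keys from convert.args, a mismatch only shows up at run time. The sketch below shows one way to detect it earlier using the standard library's inspect module. The helper name unknown_convert_args is hypothetical, and the convert() defined here is only a stand-in with the same signature as the template above.

```python
import inspect


def unknown_convert_args(convert_fn, yaml_args: dict) -> list[str]:
    """Return the keys in yaml_args that convert_fn does not accept as parameters."""
    params = inspect.signature(convert_fn).parameters
    # A **kwargs parameter accepts any keyword, so nothing can be flagged.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return []
    return sorted(key for key in yaml_args if key not in params)


def convert(checkpoint, output, opset=17, fp16=False):
    """Stand-in with the template's signature; a real convert() exports the model."""
```

With the args from the schema example (checkpoint, output, opset) the helper returns an empty list; a misspelled key such as `out` would be reported.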
demo.py must export a demo() function. The first arguments depend on the model type:
- type: single → demo(model, image, output, **kwargs)
- type: split → demo(encoder, decoder, image, output, **kwargs)
- type: multi → demo(model_dir, image, output, **kwargs)
The pipeline passes image and output paths automatically. Per-image arguments go via demo.image_args in model.yaml.
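To make the three signatures concrete, here is a rough sketch of how a caller could assemble the positional arguments for each model type. It illustrates the calling convention only and is not the pipeline's actual code: build_demo_args, the model_paths dictionary and its keys, and the `strength` keyword are all hypothetical.

```python
def build_demo_args(model_type, model_paths, image, output):
    """Return the positional arguments for demo() for the given model type."""
    if model_type == "single":
        leading = (model_paths["model"],)
    elif model_type == "split":
        leading = (model_paths["encoder"], model_paths["decoder"])
    elif model_type == "multi":
        leading = (model_paths["model_dir"],)
    else:
        raise ValueError(f"unknown model type: {model_type!r}")
    # Image and output paths always come last, after the model arguments.
    return leading + (image, output)


def demo(model, image, output, **kwargs):
    """Example demo() for type: single; a real one would run inference here."""
    return {"model": model, "image": image, "output": output, "extra": kwargs}


# Per-image arguments (here a made-up 'strength') arrive as keyword arguments.
args = build_demo_args("single", {"model": "model.onnx"}, "in.png", "out.png")
result = demo(*args, strength=0.5)
```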
models/<model-id>/README.md must include:
- Source – repo URL, paper, license
- Architecture – brief description
- ONNX I/O – input/output tensors, shapes, dtypes, normalization, tiling
- Selection criteria – table with every row from AI model policy
Existing cards (e.g. denoise-nind) are good templates.
Drop one or two representative images into samples/<task>/. These are what dtai demo runs against.
If the model needs Python packages beyond core, add a group in pyproject.toml:
[dependency-groups]
mymodel = ["torch>=2.0", "some-other-lib"]

Reference it via dep_group: mymodel in model.yaml. CI installs only the model's group when checking it.
If the model needs external code, add the upstream repository as a submodule:
git submodule add https://github.com/upstream/repo vendor/<name>

Pin to a specific tag or commit – don't track main. The pinned SHA lives in the same commit that adds the model.
uv sync --group <dep_group>
uv run dtai run <model-id>

Verify the resulting output/<model-id>.dtmodel works in darktable via preferences → AI → import from file….
CI runs the full setup → convert → validate → demo chain. Once everything is green, a maintainer reviews against the policy and merges. The next nightly picks it up; it ships in the next even-minor release.
darktable-ai wiki is licensed under the Creative Commons BY-SA 4.0 terms.