Describe the bug
When trying to export any of the ViT-based DINOv3 models using coremltools, I get the following error:

```
RuntimeError: PyTorch convert function for op 'tensor_split' not implemented.
```
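For context, a plausible (unconfirmed) explanation for why only these models fail: their attention blocks split a fused qkv projection with `torch.tensor_split`, an op coremltools has no conversion rule for, while an equivalent `reshape` + `unbind` formulation uses ops the converter does handle. A minimal sketch (the tensor shapes are illustrative, not taken from the actual model):

```python
import torch

# Hypothetical shapes standing in for a fused qkv projection output
B, N, dim = 2, 5, 64
qkv = torch.randn(B, N, 3 * dim)

# tensor_split along the last dim (the unsupported op)
q1, k1, v1 = qkv.tensor_split(3, dim=-1)

# equivalent split using reshape + unbind (ops coremltools can convert)
q2, k2, v2 = qkv.reshape(B, N, 3, dim).unbind(2)

assert torch.equal(q1, q2) and torch.equal(k1, k2) and torch.equal(v1, v2)
```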
To Reproduce
Steps to reproduce the behavior:
```python
import torch
import timm
import coremltools as ct

model_name = 'vit_small_patch16_dinov3'
model = timm.create_model(model_name, pretrained=True)
model.eval()

dummy_input = torch.randn(1, 3, 224, 224)

with torch.no_grad():
    traced_model = torch.jit.trace(model, dummy_input)
traced_model.eval()

coreml_model = ct.convert(
    traced_model,
    inputs=[ct.ImageType(
        name="input",
        shape=dummy_input.shape,
        bias=[-0.485/0.229, -0.456/0.224, -0.406/0.225],  # ImageNet normalization
        scale=1.0/(0.229*255.0)  # ImageNet normalization
    )],
    outputs=[ct.TensorType(name="output")],
    convert_to="mlprogram"  # Use ML Program format for better compatibility
)
```

Expected behavior
Successful export.
Desktop (please complete the following information):
- OS: macOS Sequoia 15.5
- timm 1.0.20
- torch 2.8.0
- torch_xla2 0.0.1.dev202412041639
- torchvision 0.23.0
Additional context
After some digging, the issue appears to originate from the attention implementation used by this timm model. Running:

```python
for name, module in model.named_modules():
    if 'attn' in name:
        print(f"Attention layer: {name} - {type(module)}")
```

shows that the attention blocks use timm.models.eva.EvaAttention.
The DINOv2 models appear to use timm.layers.attention.Attention and export successfully. Finally, exporting DINOv3 models loaded through the transformers library also succeeds, so I'm wondering if swapping timm.models.eva.EvaAttention for timm.layers.attention.Attention might fix the issue.
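In the meantime, one possible workaround (a sketch, not a confirmed fix) is to monkeypatch `torch.Tensor.tensor_split` with a `torch.split`-based equivalent before tracing, so the traced graph only contains ops coremltools already converts. This only covers the "int number of equal sections" call form and falls back to the original op otherwise; registering a proper `tensor_split` conversion via coremltools' `register_torch_op` would be the cleaner route.

```python
import torch

_orig_tensor_split = torch.Tensor.tensor_split

def _tensor_split_as_split(self, sections, dim=0):
    # Handle only the "int number of equal sections" case; tensor_split
    # also accepts index lists and uneven splits, which this sketch skips.
    if isinstance(sections, int) and self.shape[dim] % sections == 0:
        return torch.split(self, self.shape[dim] // sections, dim=dim)
    return _orig_tensor_split(self, sections, dim=dim)

torch.Tensor.tensor_split = _tensor_split_as_split

# Sanity check on a dummy tensor: the patched op round-trips correctly
x = torch.arange(12.0).reshape(2, 6)
a, b, c = x.tensor_split(3, dim=1)
assert torch.equal(torch.cat([a, b, c], dim=1), x)
```

With the patch applied, `torch.jit.trace` would record `split` nodes instead of `tensor_split`, which the converter understands.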