1.3.0.dev20260626003333
Installation
Via PyPI
pip install pjrt-plugin-tt==1.3.0.dev20260626003333 --extra-index-url https://pypi.eng.aws.tenstorrent.com/
pip install vllm-tt==1.3.0.dev20260626003333 --extra-index-url https://pypi.eng.aws.tenstorrent.com/
Via Docker
docker pull ghcr.io/tenstorrent/tt-xla-slim:1.3.0.dev20260626003333
What's Changed
- Xfail Nightly Failed tests by @ctr-pmuruganTT in #5347
- [CI] Use manylinux wheel for tests on uplifts/wheel build changes by @nsumrakTT in #5364
- ci: add manual-build.yml dispatchable build-only workflow by @kmabeeTT in #5365
- Add PCC check to DeepSeek-V4 streaming test by @sshonTT in #5356
- fix manual-build.yml startup_failure (missing permissions) by @kmabeeTT in #5384
- Uplift third_party/tt_forge_models to 8ac99db296d3e5dd87456bb5ff840a1b25fa33ee 2026-06-25 by @vmilosevic in #5374
- Uplift third_party/tt-mlir to aae51da94d33f0de5046338fa3e8a2d663e4081c 2026-06-25 by @vmilosevic in #5302
Full Changelog: 1.3.0.dev20260625003219...1.3.0.dev20260626003333
LLM Performance
| Model | Token/sec/user | Batch | Token/sec | TTFT (ms) | Hardware |
|---|---|---|---|---|---|
| Qwen/Qwen2.5-0.5B-Instruct | 228.0 | 32 | 7296.0 | 828.13 | n150 |
| Qwen/Qwen2.5-1.5B-Instruct | 193.0 | 32 | 6176.0 | 1450.65 | n150 |
| Qwen/Qwen2.5-3B-Instruct | 185.0 | 32 | 5920.0 | 2380.85 | n150 |
| Qwen/Qwen2.5-7B-Instruct | 157.0 | 32 | 5024.0 | 4866.43 | n150 |
| Qwen/Qwen3-0.6B | 218.0 | 32 | 6976.0 | 1387.01 | n150 |
| Qwen/Qwen3-1.7B | 199.0 | 32 | 6368.0 | 1718.19 | n150 |
| Qwen/Qwen3-4B | 174.0 | 32 | 5568.0 | 3133.68 | n150 |
| Qwen/Qwen3-8B | 141.0 | 32 | 4512.0 | 4207.92 | n150 |
| meta-llama/Llama-3.1-8B-Instruct | 164.0 | 32 | 5248.0 | 3893.39 | n150 |
| meta-llama/Llama-3.2-1B-Instruct | 248.0 | 32 | 7936.0 | 978.41 | n150 |
| meta-llama/Llama-3.2-3B-Instruct | 187.0 | 32 | 5984.0 | 2243.65 | n150 |
| microsoft/phi-1 | 401.0 | 32 | 12832.0 | 1872.8 | n150 |
| microsoft/phi-1_5 | 394.0 | 32 | 12608.0 | 1879.76 | n150 |
| microsoft/phi-2 | 284.0 | 32 | 9088.0 | 4113.29 | n150 |
| mistralai/Ministral-8B-Instruct-2410 | 156.0 | 32 | 4992.0 | 4090.52 | n150 |
| mistralai/Mistral-7B-Instruct-v0.3 | 281.0 | 32 | 8992.0 | 3670.15 | n150 |
| pytorch_DeepSeek-V3.1_deepseek_v3_1_modified_nlp_causal_lm_custom | 3.0 | 64 | 192.0 | 4160.07 | n150 |
| pytorch_DeepSeek-V3.2_deepseek_v3_2_exp_modified_nlp_causal_lm_custom | 2.0 | 128 | 256.0 | 7492.82 | n150 |
| pytorch_Falcon_3_10B_Base_nlp_causal_lm_huggingface | 5.0 | 32 | 160.0 | 1912.48 | n150 |
| pytorch_Falcon_3_10B_Base_nlp_causal_lm_huggingface | 11.0 | 32 | 352.0 | 984.2 | p150 |
| pytorch_Falcon_3_1B_Base_nlp_causal_lm_huggingface | 7.0 | 32 | 224.0 | 871.19 | n150 |
| pytorch_Falcon_3_1B_Base_nlp_causal_lm_huggingface | 14.0 | 32 | 448.0 | 436.97 | p150 |
| pytorch_Falcon_3_3B_Base_nlp_causal_lm_huggingface | 7.0 | 32 | 224.0 | 1015.13 | n150 |
| pytorch_Falcon_3_3B_Base_nlp_causal_lm_huggingface | 13.0 | 32 | 416.0 | 507.14 | p150 |
| pytorch_Falcon_3_7B_Base_nlp_causal_lm_huggingface | 6.0 | 32 | 192.0 | 1332.25 | n150 |
| pytorch_Falcon_3_7B_Base_nlp_causal_lm_huggingface | 11.0 | 32 | 352.0 | 624.36 | p150 |
| pytorch_GLM_4.7_nlp_causal_lm_huggingface | 7.0 | 64 | 448.0 | 1590.4 | n150 |
| pytorch_Gemma_1.1_2B_IT_nlp_causal_lm_huggingface | 5.0 | 32 | 160.0 | 1238.92 | n150 |
| pytorch_Kimi-K2.5_kimi_k2_5_modified_nlp_causal_lm_custom | 3.0 | 64 | 192.0 | 4843.86 | n150 |
| pytorch_Kimi-K2_kimi_k2_instruct_modified_nlp_causal_lm_custom | 3.0 | 64 | 192.0 | 4633.34 | n150 |
| pytorch_Llama_3.1_70B_Instruct_nlp_causal_lm_huggingface | 2.0 | 32 | 64.0 | 6284.5 | n150 |
| pytorch_Llama_3.1_8B_Instruct_nlp_causal_lm_huggingface | 4.0 | 32 | 128.0 | 2306.68 | n150 |
| pytorch_Llama_3.1_8B_Instruct_nlp_causal_lm_huggingface | 6.0 | 32 | 192.0 | 789.29 | p150 |
| pytorch_Llama_3.2_1B_Instruct_nlp_causal_lm_huggingface | 8.0 | 32 | 256.0 | 753.85 | n150 |
| pytorch_Llama_3.2_1B_Instruct_nlp_causal_lm_huggingface | 13.0 | 32 | 416.0 | 402.48 | p150 |
| pytorch_Llama_3.2_3B_Instruct_nlp_causal_lm_huggingface | 7.0 | 32 | 224.0 | 775.13 | n150 |
| pytorch_Llama_3.2_3B_Instruct_nlp_causal_lm_huggingface | 11.0 | 32 | 352.0 | 422.54 | p150 |
| pytorch_Mistral_7B_INSTRUCT_v03_nlp_causal_lm_huggingface | 12.0 | 32 | 384.0 | 1278.41 | n150 |
| pytorch_Mistral_7B_INSTRUCT_v03_nlp_causal_lm_huggingface | 21.0 | 32 | 672.0 | 597.0 | p150 |
| pytorch_Mistral_Ministral_8B_Instruct_nlp_causal_lm_huggingface | 11.0 | 32 | 352.0 | 899.43 | p150 |
| pytorch_Mistral_Nemo_INSTRUCT_2407_nlp_causal_lm_huggingface | 11.0 | 32 | 352.0 | 991.94 | p150 |
| pytorch_Mistral_Small_24B_INSTRUCT_2501_nlp_causal_lm_huggingface | 5.0 | 32 | 160.0 | 1929.07 | n150 |
| pytorch_Mistral_Small_24B_INSTRUCT_2501_nlp_causal_lm_huggingface | 9.0 | 32 | 288.0 | 1009.42 | p150 |
| pytorch_Phi-1.5_Phi_1_5_nlp_causal_lm_huggingface | 12.0 | 32 | 384.0 | 728.86 | n150 |
| pytorch_Phi-1.5_Phi_1_5_nlp_causal_lm_huggingface | 20.0 | 32 | 640.0 | 349.8 | p150 |
| pytorch_Phi-1_Phi_1_nlp_causal_lm_huggingface | 11.0 | 32 | 352.0 | 700.03 | n150 |
| pytorch_Phi-2_Phi_2_nlp_causal_lm_huggingface | 15.0 | 32 | 480.0 | 605.62 | p150 |
| pytorch_Qwen 2.5_0.5B_Instruct_nlp_causal_lm_huggingface | 7.0 | 32 | 224.0 | 738.71 | n150 |
| pytorch_Qwen 2.5_0.5B_Instruct_nlp_causal_lm_huggingface | 11.0 | 32 | 352.0 | 430.99 | p150 |
| pytorch_Qwen 2.5_1.5B_Instruct_nlp_causal_lm_huggingface | 7.0 | 32 | 224.0 | 785.81 | n150 |
| pytorch_Qwen 2.5_1.5B_Instruct_nlp_causal_lm_huggingface | 10.0 | 32 | 320.0 | 451.76 | p150 |
| pytorch_Qwen 2.5_14B_Instruct_nlp_causal_lm_huggingface | 5.0 | 32 | 160.0 | 1150.86 | p150 |
| pytorch_Qwen 2.5_3B_Instruct_nlp_causal_lm_huggingface | 6.0 | 32 | 192.0 | 922.21 | n150 |
| pytorch_Qwen 2.5_3B_Instruct_nlp_causal_lm_huggingface | 10.0 | 32 | 320.0 | 492.64 | p150 |
| pytorch_Qwen 2.5_7B_Instruct_nlp_causal_lm_huggingface | 5.0 | 32 | 160.0 | 1123.5 | n150 |
| pytorch_Qwen 2.5_7B_Instruct_nlp_causal_lm_huggingface | 8.0 | 32 | 256.0 | 568.6 | p150 |
| pytorch_Qwen 3_0_6B_nlp_causal_lm_huggingface | 6.0 | 32 | 192.0 | 1398.32 | n150 |
| pytorch_Qwen 3_0_6B_nlp_causal_lm_huggingface | 10.0 | 32 | 320.0 | 720.72 | p150 |
| pytorch_Qwen 3_14B_nlp_causal_lm_huggingface | 5.0 | 32 | 160.0 | 1295.86 | p150 |
| pytorch_Qwen 3_1_7B_nlp_causal_lm_huggingface | 6.0 | 32 | 192.0 | 970.65 | n150 |
| pytorch_Qwen 3_1_7B_nlp_causal_lm_huggingface | 10.0 | 32 | 320.0 | 503.08 | p150 |
| pytorch_Qwen 3_4B_nlp_causal_lm_huggingface | 5.0 | 32 | 160.0 | 1268.91 | n150 |
| pytorch_Qwen 3_4B_nlp_causal_lm_huggingface | 9.0 | 32 | 288.0 | 645.31 | p150 |
| pytorch_Qwen 3_8B_nlp_causal_lm_huggingface | 5.0 | 32 | 160.0 | 1838.86 | n150 |
| pytorch_Qwen 3_8B_nlp_causal_lm_huggingface | 8.0 | 32 | 256.0 | 907.22 | p150 |
| tiiuae/Falcon3-1B-Base | 249.0 | 32 | 7968.0 | 1241.16 | n150 |
| tiiuae/Falcon3-3B-Base | 200.0 | 32 | 6400.0 | 1974.86 | n150 |
| tiiuae/Falcon3-7B-Base | 270.0 | 32 | 8640.0 | 1729.49 | n150 |
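In the rows above, the Token/sec column matches the reported Token/sec/user value multiplied by Batch (for example, 228.0 × 32 = 7296.0 for Qwen/Qwen2.5-0.5B-Instruct). The following is a minimal Python sketch of that relationship using three rows copied from the table. The function name `aggregate_tokens_per_sec` is illustrative and is not part of any tt-xla tooling, and the assumption that every user in a batch decodes at the same rate is ours.

```python
def aggregate_tokens_per_sec(tokens_per_sec_per_user: float, batch: int) -> float:
    """Aggregate throughput, assuming every user in the batch decodes at the same rate."""
    return tokens_per_sec_per_user * batch


# (model, Token/sec/user, Batch, Token/sec) copied from the table above
rows = [
    ("Qwen/Qwen2.5-0.5B-Instruct", 228.0, 32, 7296.0),
    ("pytorch_DeepSeek-V3.1_deepseek_v3_1_modified_nlp_causal_lm_custom", 3.0, 64, 192.0),
    ("pytorch_DeepSeek-V3.2_deepseek_v3_2_exp_modified_nlp_causal_lm_custom", 2.0, 128, 256.0),
]

for model, per_user, batch, reported in rows:
    computed = aggregate_tokens_per_sec(per_user, batch)
    print(f"{model}: computed {computed}, reported {reported}")
```

Because the per-user figures are reported as whole numbers, comparisons between rows with different batch sizes are best made on the per-user column rather than the aggregate.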
Non-LLM Performance
| Model | Batch | Sample/sec | Hardware |
|---|---|---|---|
| BAAI/bge-m3 | 32 | 91.0 | n150 |
| Qwen/Qwen3-Embedding-4B | 1 | 9.0 | n150 |
| playground-v2.5 | 1 | 0.0 | n150 |
| pytorch_BERT_emrecan/bert-base-turkish-cased-mean-nli-stsb-tr_nlp_embed_gen_huggingface | 8 | 44.0 | n150 |
| pytorch_BERT_emrecan/bert-base-turkish-cased-mean-nli-stsb-tr_nlp_embed_gen_huggingface | 8 | 114.0 | p150 |
| pytorch_BGE-M3_Base_nlp_embed_gen_custom | 4 | 9.0 | n150 |
| pytorch_BGE-M3_Base_nlp_embed_gen_custom | 4 | 19.0 | p150 |
| pytorch_EfficientNet_Timm_B0_cv_image_cls_timm | 8 | 348.0 | n150 |
| pytorch_EfficientNet_Timm_B0_cv_image_cls_timm | 8 | 797.0 | p150 |
| pytorch_MNIST_Cnn_Dropout_cv_image_cls_custom | 32 | 14201.0 | n150 |
| pytorch_MNIST_Cnn_Dropout_cv_image_cls_custom | 32 | 32686.0 | p150 |
| pytorch_MobileNetV2_Mobilenet_v2_cv_image_cls_torch_hub | 12 | 1237.0 | n150 |
| pytorch_MobileNetV2_Mobilenet_v2_cv_image_cls_torch_hub | 12 | 3026.0 | p150 |
| pytorch_Qwen 3_Embedding_4B_nlp_embed_gen_huggingface | 32 | 46.0 | n150 |
| pytorch_Qwen 3_Embedding_4B_nlp_embed_gen_huggingface | 32 | 105.0 | p150 |
| pytorch_ResNet_ResNet50_HuggingFace_cv_image_cls_huggingface | 8 | 1345.0 | n150 |
| pytorch_ResNet_ResNet50_HuggingFace_cv_image_cls_huggingface | 8 | 2839.0 | p150 |
| pytorch_SegFormer_B0_Finetuned_Ade_512_512_cv_image_seg_huggingface | 1 | 38.0 | n150 |
| pytorch_SegFormer_B0_Finetuned_Ade_512_512_cv_image_seg_huggingface | 1 | 85.0 | p150 |
| pytorch_Swin_S_cv_image_cls_torchvision | 1 | 10.0 | n150 |
| pytorch_Swin_S_cv_image_cls_torchvision | 1 | 23.0 | p150 |
| pytorch_U-Net for Conditional Generation_Base_conditional_generation_huggingface | 1 | 5.0 | n150 |
| pytorch_U-Net for Conditional Generation_Base_conditional_generation_huggingface | 1 | 9.0 | p150 |
| pytorch_Ultra-Fast Lane Detection v2_TuSimple_ResNet34_Backbone_cv_image_seg_github | 1 | 136.0 | n150 |
| pytorch_Ultra-Fast Lane Detection v2_TuSimple_ResNet34_Backbone_cv_image_seg_github | 1 | 252.0 | p150 |
| pytorch_VGG19-UNet_base_cv_image_seg_custom | 1 | 153.0 | n150 |
| pytorch_VGG19-UNet_base_cv_image_seg_custom | 1 | 308.0 | p150 |
| pytorch_ViT_Base_cv_image_cls_huggingface | 8 | 226.0 | n150 |
| pytorch_ViT_Base_cv_image_cls_huggingface | 8 | 548.0 | p150 |
| pytorch_VoVNet_Ese_Vovnet19b_Dw.ra_In1k_cv_image_cls_timm | 8 | 664.0 | n150 |
| pytorch_VoVNet_Ese_Vovnet19b_Dw.ra_In1k_cv_image_cls_timm | 8 | 1518.0 | p150 |
| sdxl-lightning | 1 | 0.0 | n150 |
| sdxl-lightning | 1 | 0.0 | p150 |
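Where a model appears on both n150 and p150 at the same batch size, the two Sample/sec values can be compared as a ratio. The sketch below is one hedged way to do that in Python with rows copied from the table above. The helper `speedup` is a hypothetical name, and returning `None` for a zero baseline (as with the sdxl-lightning rows, which are reported as 0.0) is a choice made here, not something the release tooling defines.

```python
from typing import Optional


def speedup(baseline: float, candidate: float) -> Optional[float]:
    """Return candidate / baseline, or None when the baseline is zero."""
    if baseline == 0:
        return None
    return candidate / baseline


# (model, n150 Sample/sec, p150 Sample/sec) copied from the table above
rows = [
    ("pytorch_ResNet_ResNet50_HuggingFace_cv_image_cls_huggingface", 1345.0, 2839.0),
    ("pytorch_ViT_Base_cv_image_cls_huggingface", 226.0, 548.0),
    ("sdxl-lightning", 0.0, 0.0),
]

for model, n150, p150 in rows:
    ratio = speedup(n150, p150)
    label = "n/a" if ratio is None else f"{ratio:.2f}x"
    print(f"{model}: p150 vs n150 = {label}")
```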
Model coverage
Info: The full list of supported models is available in the assets section.
| Model task | Model architecture | Model variant | Model framework | Inference | Training | n150 | n300 | p150 | Single device | Data parallel | Tensor parallel | Model source |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| conditional generation | U-Net for Conditional Generation | Base | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image cls | AlexNet | Custom 1x2 | jax | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | View Source |
| cv image cls | DINOv2 | Small | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image cls | EfficientNet | B0 | pytorch | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | View Source |
| cv image cls | MNIST | Cnn Batchnorm | jax | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image cls | MNIST | Cnn Dropout | jax | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image cls | MNIST | Cnn Dropout | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image cls | MNIST | Cnn Nodropout | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image cls | MNIST | Mlp Custom | jax | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image cls | MNIST | Mlp Custom | jax | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image cls | MNIST | Mlp Custom 1x2 | jax | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | View Source |
| cv image cls | MobileNetV1 | Mobilenet v1 | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image cls | MobileNetV2 | Mobilenet v2 | pytorch | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | View Source |
| cv image cls | ResNet | ResNet50 HuggingFace High Resolution | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image cls | SegFormer | Mit B0 | pytorch | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | View Source |
| cv image cls | Swin | S | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image cls | VGG | HF Vgg19 | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image cls | ViT | Base | pytorch | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | View Source |
| cv image cls | VoVNet | Ese Vovnet19b Dw.ra In1k | pytorch | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | View Source |
| cv image seg | Ultra-Fast Lane Detection | TuSimple ResNet18 Backbone | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv image seg | VGG19-UNet | base | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv img to img | Autoencoder | linear | pytorch | ❌ | ✅ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv object det | Attention DenseUNet | Base | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv object det | DETR | ResNet50 Backbone | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv object det | OWL-ViT | Base Patch32 | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv object det | PointPillars | pointpillars | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv object det | YOLOP | Default | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv object det | YOLOS Small | Small | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv object det | YOLOv4 | Base | pytorch | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ | View Source |
| cv object det | YOLOv7 | Default | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv object det | YOLOv9 | T | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| cv object det | ssd512 | ssd512 | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| mm action prediction | OpenVLA-OFT | Finetuned Libero 10 | pytorch | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| mm action prediction | pi_0 | pi0 base | pytorch | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| mm image text similarity | CLIP | Base Patch16 | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| mm image text similarity | SigLIP | Base Patch16 224 | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| mm visual qa | Mistral | base | pytorch | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | ALLaM | 7B Instruct | pytorch | ✅ | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Command_A_Reasoning | command-a-reasoning-08-2025 | pytorch | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | View Source |
| nlp causal lm | Falcon | 3 10B Base | pytorch | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | Falcon | 3 1B Base | pytorch | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | Falcon | 3 3B Base | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Falcon | 3 7B Base | pytorch | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | GPT-2 | Base | jax | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | GPT-2 | Xl | jax | ❌ | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | GPT-OSS | 20B | pytorch | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | View Source |
| nlp causal lm | Gemma | 1.1 2B IT | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Gemma | 1.1 7B IT | pytorch | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | Gemma | 2 27B IT | pytorch | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | View Source |
| nlp causal lm | Gemma | 2 2B IT | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Gemma | 2 9B IT | pytorch | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | Llama | 3.1 70B | pytorch | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | View Source |
| nlp causal lm | Llama | 3.1 8B Instruct | pytorch | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | Llama | 3.2 1B | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Llama | 3.2 3B | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Llama | 3.3 70B Instruct | pytorch | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | View Source |
| nlp causal lm | Mistral | 7B INSTRUCT v03 | pytorch | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | Mistral | Devstral Small 2505 | pytorch | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | View Source |
| nlp causal lm | Mistral | Large INSTRUCT 2411 | pytorch | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | View Source |
| nlp causal lm | Mistral | Magistral Small 2506 | pytorch | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | View Source |
| nlp causal lm | Mistral | Ministral 8B Instruct | pytorch | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | Mistral | Nemo INSTRUCT 2407 | pytorch | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | View Source |
| nlp causal lm | Mistral | Small 24B INSTRUCT 2501 | pytorch | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | View Source |
| nlp causal lm | Phi-1 | Phi 1 | jax | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | Phi-1 | Phi 1 | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Phi-1 | Phi 1 | pytorch | ❌ | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Phi-1 LoRA | Phi 1 | pytorch | ❌ | ✅ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Phi-1.5 | Phi 1 5 | jax | ✅ | ❌ | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | Phi-1.5 | Phi 1 5 | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Phi-2 | Phi 2 | jax | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | View Source |
| nlp causal lm | Phi-2 | Phi 2 | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Phi-3 | Mini 128K Instruct | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Phi-3 | Mini 4K Instruct | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Phi-3 | Mini Instruct | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Phi-4 | Phi 4 | pytorch | ✅ | ❌ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | Qwen 2 | Qwq 32B | pytorch | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ✅ | View Source |
| nlp causal lm | Qwen 2.5 | 0.5B | jax | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | Qwen 2.5 | 0.5B Instruct | jax | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | View Source |
| nlp causal lm | Qwen 2.5 | 0.5B Instruct | pytorch | ✅ | ❌ | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | View Source |
| nlp causal lm | Qwen 2.5 | 1.5B Instruct | jax | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ | ✅ | View Source |