Release 1.3.0 · tenstorrent/tt-xla

Installation

Via PyPI

pip install pjrt-plugin-tt==1.3.0 --extra-index-url https://pypi.eng.aws.tenstorrent.com/
pip install vllm-tt==1.3.0 --extra-index-url https://pypi.eng.aws.tenstorrent.com/
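
After installing, one way to confirm that the expected versions landed in the active environment is to query the installed package metadata. The sketch below uses only the Python standard library. The helper names `installed_version` and `check_versions` are hypothetical, and it assumes the distribution names match the pip package names above.

```python
from importlib import metadata

# Distribution names and expected version taken from the pip commands above.
EXPECTED = {"pjrt-plugin-tt": "1.3.0", "vllm-tt": "1.3.0"}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(expected):
    """Map each distribution name to (installed_version, matches_expected)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        report[name] = (found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_versions(EXPECTED).items():
        status = "ok" if ok else "mismatch or not installed"
        print(f"{name}: {found} ({status})")
```

A mismatch usually means a different version was already present in the environment, in which case reinstalling with the pinned version is the simplest fix.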

Via Docker

docker pull ghcr.io/tenstorrent/tt-xla-slim:1.3.0

What's Changed

Uplift third_party/tt-mlir to 297f7eb6c0c07b5d3d16a9f5eec807cbe0edd4c1 2026-05-24 by @vmilosevic in #4843
[vLLM plugin] Use [1, num_devices] shape for 1D mesh by @mmanzoorTT in #4591
Add framework column and --framework filter to JUnit XML summarizer by @devisettymahidhar608 in #4845
Uplift third_party/tt-mlir to b4871ad192c5783a58a09e1b0627d9cf1227c5f4 2026-05-25 by @vmilosevic in #4896
Nightly Maintenance may24 by @devisettymahidhar608 in #4898
Add kimi-k2.5 benchmark by @gengelageTT in #4802
[vLLM]: skip wasted profile_run() in determine_available_memory by @kmabeeTT in #4893
Add triage skill for bfloat16 dtype-mismatch FE failures by @agobeljicTT in #4574
Uplift third_party/tt-mlir to 3ac5318a23224a280aa926b42b5bdcf11aefe12a 2026-05-25 by @vmilosevic in #4936
Add benchmark report to On Nightly summary by @vkovacevicTT in #4831
Enable group_norm composite (tt-metal#40916 fixed) by @kamalrajkannan78 in #4868
Uplift third_party/tt_forge_models to a797c897100bddeb77d43fe61fd7f2746a7246c2 2026-05-25 by @vmilosevic in #4899
decompositions: fix scales passed as exact in upsample_nearest callers by @Dev-X25874 in #4655
[CI] Replace fetch job id action with param on perf benchmark by @nsumrakTT in #4938
Uplift third_party/tt_forge_models to 8fb89846e04493da0f1dae0c656b59c4a50eddf3 2026-05-26 by @vmilosevic in #4945
Uplift third_party/tt-mlir to c5f398432a61100da79b7b9f2941130496092287 2026-05-26 by @vmilosevic in #4941
Add Mochi-1 component tests at original resolution (text encoder, DiT, VAE decoder) by @kamalrajkannan78 in #4641
HiDream_I1: add sharded test for text_encoder_3 by @kamalrajkannan78 in #4848
Infer device_type from CI inputs by @vkovacevicTT in #3721
Update Forge version in inference server when releasing monthly by @vvukomanTT in #4962
Set optimization level to 1 for SDXL-Lightning, HiDreamI1 and Playground v2.5 VAE decoder tests by @kamalrajkannan78 in #4946
[Cog VideoX 5B] Add initial tests for each part of the pipeline by @meenakshiramanathan1 in #4558
[Hunyuan Video] Add initial tests for each part of the pipeline by @meenakshiramanathan1 in #4517
Uplift third_party/tt_forge_models to 3b4e360baa9457776e9b97a1137f4ccabb70d3f9 2026-05-27 by @vmilosevic in #4964
Add QB2 to weekly runs by @devisettymahidhar608 in #4937
[OmniGen] Add initial tests for each part of the pipeline by @meenakshiramanathan1 in #4749
[vLLM] Add perf benchmarks for Qwen3-Embedding-4B and BGE-m3 at batch_size=1 and 32 by @alinakhanTT in #4840
Wire up D2M Fusion Option into LLM Benchmarks + Test GPT-OSS-20B with D2M Fusion Enabled by @brapananTT in #4534
Add Deepseek V4 Flash E2E changes to nightly CI by @hshahTT in #4841
Uplift Transformers to 5.5.1 by @ssaliceTT in #4272
[Lumina Image] Add initial tests for each part of the pipeline by @meenakshiramanathan1 in #4775
Update DeepSeek-OCR single-device test config for ~0.94 PCC by @ashokkumarkannan1 in #4951
Remove model-specific install requirements from perf benchmarks by @odjuricicTT in #4939
pjrt+vllm_plugin: expose dram_size_bytes; use it for KV cache sizing by @kmabeeTT in #4960
Fix tiktoken pyreq for kimi benchmarks by @gengelageTT in #4989
uplift torch_xla by @mstojkovicTT in #4979
[CI] Add QB2-Blackhole TP benchmarks to nightly by @rpavlovicTT in #4983
Update vLLM benchmark CODEOWNERS by @vkovacevicTT in #4984
Add training mode and LoRA backward tests for LLM torch models by @agobeljicTT in #4219
[Benchmarks] Remove hardcoded arch report parameter by @vkovacevicTT in #4972
Add Gemma4 e4b, 31b support for vLLM by @sshonTT in #4889
[Composite] Add nn.RMSNorm module-form support by @kamalrajkannan78 in #4985
[Playground v2.5] Add end-to-end pipeline example by @kamalrajkannan78 in #4992
Reduce vLLM decode graphs from 5 to 2 by @alinakhanTT in #4789
Add Gemma-4-31B-it to vLLM benchmarks on QB2 Blackhole by @kmabeeTT in #5012
Lower gemma 1.1_7B_IT inference pcc threshold by @vzeljkovicTT in #5026
[Tests] Xfail training PCC failures for phi1, phi1_lora, gemma_lora by @vzeljkovicTT in #5027
Lower sdxl clip threshold nightly by @vzeljkovicTT in #5023
Set targetModule path as a default for emitPy testing by @amilovanovicTT in #4819
Add Qwen3-32B vLLM perf benchmark for QuietBox2 (batch 1) by @ssaliceTT in #5030
[Benchmark] Fix multichip arch, perf regression check and qb2 transformers pin failures by @vkovacevicTT in #5020
Add streaming inference for DeepSeek-V4-Flash by @sshonTT in #4811
[vLLM] Improve test diagnostics by enabling basic logs by @mmanzoorTT in #4879
Uplift third_party/tt_forge_models to 363958eba679bef0cf12fe6ed39e22e917048851 2026-06-02 by @vmilosevic in #5049
Bump version to 1.3.0 by @vvukomanTT in #4991
[wheel] Support bundling libtt-alchemist-lib.so into manylinux wheel by @svuckovicTT in #5050
Set default kv cache dtype to bfp_bf8 by @kdimicTT in #4613
[vLLM] Pin input shardings in the execution path to match warmup by @sshonTT in #5035
assert_pcc=false for failing tests (phi1, phi1_lora, gemma_lora on p150) by @agobeljicTT in #5061
[CI] Add nightly run for vLLM QB2 tests by @mmanzoorTT in #5055
Uplift PJRT C API header from v0.106 to v0.110 by @acicovicTT in #4503
Docs review skill by @acicovicTT in #4718
[CI] Perf benchmark simplification by @vvukomanTT in #5078
Uplift third_party/tt_forge_models to 6eadd3f27fa7819e6f6619484a61371d2fa44983 2026-06-04 by @vmilosevic in #5060
Disable kv cache dtype conversion when MLA cache is used by @kdimicTT in #5082
Fix JAX optimization_level not reaching PJRT plugin by @aorlovicTT in #4857
Fix project name for vLLM perf tests by @mmanzoorTT in #5089
Fix runtime errors encountered due to transformers uplift by @devisettymahidhar608 in #5053
Uplift third_party/tt_forge_models to 6b4b47a7c419cdc2713ddfc6e3179f61012c12f4 2026-06-05 by @vmilosevic in #5101
[vLLM] Skip KV cache initialization by @mmanzoorTT in #5095
[Benchmark] Run single-chip benchmark models through vllm benchmark by @vkovacevicTT in #5056
[CI] Multihost CI integration by @nsumrakTT in #4971
Add relative L2 error similarity metric to op-tests and benchmarks by @dgolubovicTT in #4676
Add vLLM Llama-3.1 TP benchmarks (n300-llmbox) by @alinakhanTT in #5073
[vLLM] Expose experimental_kv_cache_dtype + add xfailed BFP8 repro test by @kmabeeTT in #5007
MoE backend from huggingface by @sshonTT in #4988
Refactor Sliding Attention Overrides with Generic Model Rewrite Support by @devisettymahidhar608 in #4975
[Perf tests] Fix benchmark TP config to avoid duplicate attributes by @mmanzoorTT in #5111
Uplift third_party/tt_forge_models to 71584c597ec304999080596fccefea1becdd73f1 2026-06-07 by @vmilosevic in #5113
[CI] Add option to run perf benchmark tests with custom torch-xla build by @mmanzoorTT in #4829
Fix GPT-OSS 20B example segfault by selecting tt_dense experts backend by @devisettymahidhar608 in #5109
[perf] Run single-chip benchmarks on stable qb2-blackhole via p150-perf label by @rpavlovicTT in #5103
Move Training MoE Tests to QB2 by @pglusacTT in #5104
[vLLM] Add support for Rotary Embedding with Multimodal Sections by @mmanzoorTT in #5107
Add deepseek-v3.1 and glm4.7 benchmarks by @gengelageTT in #5097
Add Janus-Pro T2I component bring-up tests by @ashokkumarkannan1 in #4810
[CI] Add CPU only tests by @vvukomanTT in #5127
Remove logical mapping of p150-perf -> qb2 by @vvukomanTT in #5126
[GLM Image] Add initial tests for each part of the pipeline by @meenakshiramanathan1 in #4800
Update code owner for vLLM by @mmanzoorTT in #5128
[vLLM] Fix num_hidden_layers override for text-only models by @mmanzoorTT in #5125
Add chisel context pytest fixture --enable-chisel by @ndrakulicTT in #5025
Fix rel_l2 comparison crash on rank-2 training tensors by @agobeljicTT in #5136
Temporarily disable gpt_oss_20b_tp_d2m test from llm benchmarks by @brapananTT in #5135
Fix vLLM perf tokens/second value by @jazpurTT in #4953
Add missed torch tests to CI by @gengelageTT in #5131
Implement deferred transfer of host tensor data to multihost workers by @jameszianxuTT in #4617
[CI] Re-enable t3k multihost test by @nsumrakTT in #5143
Don't install requirements for skipped tests by @sdjukicTT in #5138
Uplift third_party/tt-mlir to 464a5f341908d5a3f790e697be5ffa6fd3fb6f32 2026-06-02 by @vmilosevic in #4957
[Test] Add CPU compile-only tests by @jasonmacTT in #5033
Fix main build break - Make host tensor shell represent strides as i64 by @jameszianxuTT in #5156
[vLLM] batch paged_fill_cache across users in prefill (compile/perf improvement) by @kmabeeTT in #4955
Update scatter add tests based on tt-mlir PR by @ddilbazTT in #5028
[Nightly] Fix BH galaxy nightly failure by @sshonTT in #5094
[Claude] Skill for model sharding by @vkovinicTT in #5134
[CI] Check if (torch) test is added without device marker by @nsumrakTT in #5182
[Benchmark] Add custom sharding for glm_4_7 by @mvasiljevicTT in #5139
Ignore runtime-generated debug artifacts under generated/ by @mmanzoorTT in #5192
Set --no-rosegment linker flag in manylinux builds to mitigate patchelf corruption of binaries by @acicovicTT in #5181
[pjrt] Route non-contiguous host buffers through the owned-tensor path by @mstojkovicTT in #5059
[CI] Use manylinux wheel for testing on push event by @nsumrakTT in #5193
Wan5b Tests by @ppadjinTT in #4965
Remove xfail from mistral test for vllm by @sshonTT in #5194
Remove optimizer submesh device; let optimizer use mock device by @rpavlovicTT in #5146
[vLLM] Remove xfails for fixed bugs (#4570, #5006); re-point logprobs xfail by @kmabeeTT in #5186
[Playground v2.5] Wire e2e pipeline into nightly and benchmark CI by @kamalrajkannan78 in #5044
Wan14b tests by @ppadjinTT in #5036
Uplift third_party/tt_forge_models to 09239ae98eb4f0b03abe5240aca9418eb3131717 2026-06-13 by @vmilosevic in #5133
Add a claude skill to compare failures between 2 nightlies by @ctr-pmuruganTT in #4358
Uplift third_party/tt_forge_models to b27be2665e4f0b7773f5addd1879c2d90f77ce51 2026-06-15 by @vmilosevic in #5200
[CI] Fix on push multihost test fail by @nsumrakTT in #5220
[CI] Fix wheel build selector by @nsumrakTT in #5221
Remove some n300-llmbox models from benchmark CI by @vkovacevicTT in #5226
Clamp out-of-range negative aten.slice starts to -dim_size (#5199) by @kamalrajkannan78 in #5211
Uplift third_party/tt_forge_models to 2fa8c5686d64d51ec4a8d30e21cde86a3d776bf3 2026-06-16 by @vmilosevic in #5231
Add Lazy Execution Option in Legacy Compile Path by @pglusacTT in #5235
[CI] Fix p150-perf shared runner name by @nsumrakTT in #5243
[CI] Replace custom job_id action with built-in job.check_run_id by @vmilosevic in #5253
Implementation of PJRT_Client_CreateUninitializedBuffer by @acicovicTT in #5080
Bring back gpt_oss_20b_tp_batch_size_1 and qwen_2_5_coder_32b_instruct_tp to benchmark CI by @vkovacevicTT in #5239
Uplift third_party/tt_forge_models to 0d3ee26c4e8be082876ff6fb04f65dc33e96189f 2026-06-17 by @vmilosevic in #5252
Fix inflated number of devices when no tensors are sharded by @acicovicTT in #5236
[Benchmark] Fix GLM4.7 and Deepseek-v3.1 MoE shard specs by @gengelageTT in #5240
Updates Config for gemma4-12B model by @saiarthiraguram in #5258
[Test] Blackhole galaxy simple test by @vkovinicTT in #5261
[vLLM] Add Gemma 4-31B Blackhole Galaxy test through vLLM by @ddilbazTT in #5224
[FX Fusing] Add RMSNorm fusion patterns for Llama, GPT-OSS, Gemma family, and vLLM by @alinakhanTT in #5140
Adds sliding-window attention support for Gemma3 multimodal models by @devisettymahidhar608 in #5115
[Test] Mark sana/1600M_1024px inference EXPECTED_PASSING by @saiarthiraguram in #5188
[EmitPy] Add regression test for TP/EP when export_tensors=True by @amilovanovicTT in #5219
Uplift third_party/tt_forge_models to 79ef7852d1d1c664917a6984a0d4c527825a6142 2026-06-18 by @vmilosevic in #5274
[SDXL Lightning] Add e2e pipeline in nightly and benchmark CI by @kamalrajkannan78 in #5244
[vLLM] Warmup phase optimization by @mmanzoorTT in #5129
[vLLM] Fix batch-32 TP benchmark failure; set all TP tests to bs32 by @ssaliceTT in #5159
Add generality models to tensor parallel inference test config by @devisettymahidhar608 in #5201
Skip gpt_oss_20b_tp perf test on n300-llmbox by @vmilosevic in #5287
move cpu_compile_only to experimental nightly by @jameszianxuTT in #5268
Uplift third_party/tt_forge_models to 2fd9c86262aa7059a2152af43c7286d41f7a3edf 2026-06-19 by @vmilosevic in #5286
Add triage skill for missing-input FE failures by @agobeljicTT in #4611
Uplift third_party/tt-mlir to 70ff200c7d2fa8d8401f316a0a0b35ee88cbfb72 2026-06-18 by @vmilosevic in #5164
[QoL] Set a default controller hostname when env var not set by @jasonmacTT in #5293
vLLM Falcon3-7B removing num_hidden_layers config from test by @ssaliceTT in #5301
Uplift third_party/tt_forge_models to 32d5c2e4a8cfd55b0f2ec99b3ec8d1b217fcb742 2026-06-20 by @vmilosevic in #5304
vllm: Skip extract_nodes_info unless XLA_HLO_DEBUG=1 for compile time speedup by @kmabeeTT in #5299
Fix runner label mapping logic in perf tests by @vvukomanTT in #5288
Reduce DeepSeek-V4 e2e PCC test to 10 layers on BH Galaxy by @sshonTT in #5296
Uplift third_party/tt_forge_models to 6400d1eb60ba1ca2f7ea37f8c3e613a2d744c301 2026-06-22 by @vmilosevic in #5305
Add dependencies commits to release notes by @vvukomanTT in #5311
[WAN 2.2] Path for sp DiT sharding by @vkovinicTT in #5309
Fix TT_RUNTIME_DEBUG compile-time variable propagation to PJRT callers by @jameszianxuTT in #5242
Expand perf reports to include p150 benchmarks by @vvukomanTT in #5318
[vLLM] Fix runner to use correct sampling graph for cpu sampling by @mmanzoorTT in #5316
Bring up Mixtral models in the vLLM plugin by @devisettymahidhar608 in #3523
Update test config set2 by @devisettymahidhar608 in #5320
Bring up the Pixtral model in the vLLM plugin by @devisettymahidhar608 in #3996
Fix libtt-alchemist-lib.so bundling into manylinux wheel by @amilovanovicTT in #5333
Add Gemma-4 26B-A4B MoE support for vLLM on 2D mesh by @sshonTT in #5141
[pjrt] fix prepare inputs for codegen by @pilkicTT in #5322
[vLLM] Pin FastAPI in deps to avoid route-tree regression by @mmanzoorTT in #5335
[vLLM] Set default optimization level to 1 by @mmanzoorTT in #5327
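
Several entries above refer to PCC thresholds and to a relative L2 error metric (#4676). These notes do not give the formulas tt-xla uses, so the following is only a sketch of the commonly used definitions: the Pearson correlation coefficient between two flattened tensors, and the L2 norm of the difference divided by the L2 norm of the reference. The function names and sample values are hypothetical, and zero-variance or all-zero inputs are not handled.

```python
import math


def pcc(actual, expected):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_e = sum(expected) / n
    cov = sum((a - mean_a) * (e - mean_e) for a, e in zip(actual, expected))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_e = sum((e - mean_e) ** 2 for e in expected)
    return cov / math.sqrt(var_a * var_e)


def rel_l2(actual, expected):
    """L2 norm of (actual - expected) divided by the L2 norm of expected."""
    diff_norm = math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, expected)))
    ref_norm = math.sqrt(sum(e ** 2 for e in expected))
    return diff_norm / ref_norm


if __name__ == "__main__":
    # Hypothetical reference (e.g. CPU) and device outputs, flattened.
    golden = [1.0, 2.0, 3.0, 4.0]
    device = [1.01, 1.98, 3.02, 3.99]
    print("pcc    =", pcc(device, golden))
    print("rel_l2 =", rel_l2(device, golden))
```

The two metrics complement each other: PCC is insensitive to a uniform scale or offset in the output, whereas relative L2 error grows with any deviation from the reference.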

New Contributors

@Dev-X25874 made their first contribution in #4655
@brapananTT made their first contribution in #4534
@jasonmacTT made their first contribution in #5033

Full Changelog: 1.2.0...1.3.0

LLM Performance

Model	Tokens/sec/user	Batch	Tokens/sec	TTFT (ms)	Hardware
pytorch_DeepSeek-V3.1_deepseek_v3_1_modified_nlp_causal_lm_custom	3.0	64	192.0	4159.47	n150
pytorch_Falcon_3_1B_Base_nlp_causal_lm_huggingface	57.0	32	1824.0	663.53	n150
pytorch_Falcon_3_3B_Base_nlp_causal_lm_huggingface	38.0	32	1216.0	865.52	n150
pytorch_Falcon_3_7B_Base_nlp_causal_lm_huggingface	19.0	32	608.0	1186.81	n150
pytorch_GLM_4.7_nlp_causal_lm_huggingface	7.0	64	448.0	1590.81	n150
pytorch_Gemma_1.1_2B_IT_nlp_causal_lm_huggingface	40.0	32	1280.0	638.1	n150
pytorch_Kimi-K2.5_kimi_k2_5_modified_nlp_causal_lm_custom	3.0	64	192.0	4587.63	n150
pytorch_Kimi-K2_kimi_k2_instruct_modified_nlp_causal_lm_custom	3.0	64	192.0	4808.17	n150
pytorch_Llama_3.1_8B_Instruct_nlp_causal_lm_huggingface	23.0	32	736.0	2230.65	n150
pytorch_Llama_3.2_1B_Instruct_nlp_causal_lm_huggingface	67.0	32	2144.0	575.95	n150
pytorch_Llama_3.2_3B_Instruct_nlp_causal_lm_huggingface	31.0	32	992.0	605.5	n150
pytorch_Mistral_7B_INSTRUCT_v03_nlp_causal_lm_huggingface	20.0	32	640.0	1252.08	n150
pytorch_Mistral_Ministral_8B_Instruct_nlp_causal_lm_huggingface	12.0	32	384.0	550.28	n150
pytorch_Mistral_Small_24B_INSTRUCT_2501_nlp_causal_lm_huggingface	16.0	32	512.0	1790.87	n150
pytorch_Phi-1.5_Phi_1_5_nlp_causal_lm_huggingface	22.0	32	704.0	617.58	n150
pytorch_Phi-1_Phi_1_nlp_causal_lm_huggingface	22.0	32	704.0	632.51	n150
pytorch_Qwen 2.5_0.5B_Instruct_nlp_causal_lm_huggingface	78.0	32	2496.0	408.36	n150
pytorch_Qwen 2.5_1.5B_Instruct_nlp_causal_lm_huggingface	38.0	32	1216.0	461.04	n150
pytorch_Qwen 2.5_3B_Instruct_nlp_causal_lm_huggingface	31.0	32	992.0	697.85	n150
pytorch_Qwen 2.5_7B_Instruct_nlp_causal_lm_huggingface	16.0	32	512.0	859.7	n150
pytorch_Qwen 3_0_6B_nlp_causal_lm_huggingface	36.0	32	1152.0	1163.37	n150
pytorch_Qwen 3_1_7B_nlp_causal_lm_huggingface	30.0	32	960.0	746.03	n150
pytorch_Qwen 3_4B_nlp_causal_lm_huggingface	17.0	32	544.0	976.23	n150
pytorch_Qwen 3_8B_nlp_causal_lm_huggingface	13.0	32	416.0	1673.32	n150
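
In the rows above, the aggregate rate equals the per-user rate multiplied by the batch size. This is an observation about the reported numbers, not a statement about how the benchmark computes them. The short check below copies a few rows from the table by hand; the shortened model labels and the helper name `aggregate_rate` are illustrative only.

```python
# (per-user tokens/sec, batch size, aggregate tokens/sec) copied from the table.
ROWS = {
    "DeepSeek-V3.1": (3.0, 64, 192.0),
    "Falcon 3 1B Base": (57.0, 32, 1824.0),
    "GLM 4.7": (7.0, 64, 448.0),
    "Llama 3.1 8B Instruct": (23.0, 32, 736.0),
    "Qwen 2.5 0.5B Instruct": (78.0, 32, 2496.0),
}


def aggregate_rate(per_user, batch):
    """Aggregate throughput implied by a per-user rate and a batch size."""
    return per_user * batch


def inconsistent_rows(rows):
    """Return the labels whose reported aggregate differs from per_user * batch."""
    return [
        label
        for label, (per_user, batch, reported) in rows.items()
        if aggregate_rate(per_user, batch) != reported
    ]


if __name__ == "__main__":
    for label, (per_user, batch, reported) in ROWS.items():
        print(f"{label}: {per_user} x {batch} = {aggregate_rate(per_user, batch)}"
              f" (reported {reported})")
    print("inconsistent rows:", inconsistent_rows(ROWS))
```

Because the per-user figures appear rounded to whole numbers, the aggregate column carries no extra precision beyond that rounding.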

Non-LLM Performance

Model	Batch	Samples/sec	Hardware
pytorch_BERT_emrecan/bert-base-turkish-cased-mean-nli-stsb-tr_nlp_embed_gen_huggingface	8	44.0	n150
pytorch_BGE-M3_Base_nlp_embed_gen_custom	4	9.0	n150
pytorch_BGE-M3_Base_nlp_embed_gen_custom	4	19.0	p150
pytorch_EfficientNet_Timm_B0_cv_image_cls_timm	8	349.0	n150
pytorch_MNIST_Cnn_Dropout_cv_image_cls_custom	32	14643.0	n150
pytorch_MobileNetV2_Mobilenet_v2_cv_image_cls_torch_hub	12	1237.0	n150
pytorch_Qwen 3_Embedding_4B_nlp_embed_gen_huggingface	32	46.0	n150
pytorch_ResNet_ResNet50_HuggingFace_cv_image_cls_huggingface	8	1339.0	n150
pytorch_SegFormer_B0_Finetuned_Ade_512_512_cv_image_seg_huggingface	1	39.0	n150
pytorch_Swin_S_cv_image_cls_torchvision	1	10.0	n150
pytorch_U-Net for Conditional Generation_Base_conditional_generation_huggingface	1	5.0	n150
pytorch_Ultra-Fast Lane Detection v2_TuSimple_ResNet34_Backbone_cv_image_seg_github	1	136.0	n150
pytorch_VGG19-UNet_base_cv_image_seg_custom	1	147.0	n150
pytorch_ViT_Base_cv_image_cls_huggingface	8	229.0	n150
pytorch_VoVNet_Ese_Vovnet19b_Dw.ra_In1k_cv_image_cls_timm	8	667.0	n150

Model coverage

Info: The full list of supported models is available in the assets section.

Model task	Model architecture	Model variant	Model framework	Inference	Training	n150	n300	p150	Single device	Data parallel	Tensor parallel	Model source
conditional generation	U-Net for Conditional Generation	Base	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	AlexNet	Custom 1x2	jax	✅	❌	❌	✅	❌	❌	❌	✅	View Source
cv image cls	DINOv2	Small	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	EfficientNet	B0	pytorch	✅	❌	✅	✅	✅	✅	✅	❌	View Source
cv image cls	MNIST	Cnn Batchnorm	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	MNIST	Cnn Dropout	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	MNIST	Cnn Dropout	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	MNIST	Cnn Nodropout	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	MNIST	Mlp Custom	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	MNIST	Mlp Custom	jax	❌	✅	✅	❌	✅	✅	❌	❌	View Source
cv image cls	MNIST	Mlp Custom 1x2	jax	✅	❌	❌	✅	❌	❌	❌	✅	View Source
cv image cls	MobileNetV1	Mobilenet v1	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	MobileNetV2	Mobilenet v2	pytorch	✅	❌	✅	✅	✅	✅	✅	❌	View Source
cv image cls	ResNet	ResNet50 HuggingFace High Resolution	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	SegFormer	Mit B0	pytorch	✅	❌	✅	✅	✅	✅	✅	❌	View Source
cv image cls	Swin	S	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	VGG	HF Vgg19	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	ViT	Base	pytorch	✅	❌	✅	✅	✅	✅	✅	❌	View Source
cv image cls	VoVNet	Ese Vovnet19b Dw.ra In1k	pytorch	✅	❌	✅	✅	✅	✅	✅	❌	View Source
cv image seg	Ultra-Fast Lane Detection	TuSimple ResNet18 Backbone	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image seg	VGG19-UNet	base	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv img to img	Autoencoder	linear	pytorch	❌	✅	✅	❌	✅	✅	❌	❌	View Source
cv object det	Attention DenseUNet	Base	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	DETR	ResNet50 Backbone	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	OWL-ViT	Base Patch32	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	PointPillars	pointpillars	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	YOLOP	Default	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	YOLOS Small	Small	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	YOLOv4	Base	pytorch	✅	❌	✅	✅	✅	✅	✅	❌	View Source
cv object det	YOLOv7	Default	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	YOLOv9	T	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	ssd512	ssd512	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
mm action prediction	OpenVLA-OFT	Finetuned Libero 10	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
mm action prediction	pi_0	pi0 base	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
mm image text similarity	CLIP	Base Patch16	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
mm image text similarity	SigLIP	Base Patch16 224	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
mm visual qa	Llama	3.2 11B Vision Instruct	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
mm visual qa	Mistral	base	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	ALLaM	7B Instruct	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
nlp causal lm	Command_A_Reasoning	command-a-reasoning-08-2025	pytorch	✅	❌	❌	❌	❌	❌	❌	✅	View Source
nlp causal lm	Falcon	3 10B Base	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Falcon	3 1B Base	pytorch	✅	❌	✅	✅	✅	✅	❌	✅	View Source
nlp causal lm	Falcon	3 3B Base	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Falcon	3 7B Base	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	GPT-2	Base	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	GPT-2	Xl	jax	❌	✅	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	GPT-OSS	20B	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Gemma	1.1 2B IT	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Gemma	1.1 7B IT	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Gemma	2 27B IT	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Gemma	2 2B IT	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Gemma	2 9B IT	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Llama	3.1 70B	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Llama	3.1 8B Instruct	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Llama	3.2 1B	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Llama	3.2 3B	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Llama	3.3 70B Instruct	pytorch	✅	❌	❌	❌	❌	❌	❌	✅	View Source
nlp causal lm	Mistral	7B INSTRUCT v03	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Mistral	Devstral Small 2505	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Mistral	Large INSTRUCT 2411	pytorch	✅	❌	❌	❌	❌	❌	❌	✅	View Source
nlp causal lm	Mistral	Magistral Small 2506	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Mistral	Ministral 8B Instruct	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Mistral	Nemo INSTRUCT 2407	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Mistral	Small 24B INSTRUCT 2501	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Phi-1	Phi 1	jax	✅	❌	✅	✅	✅	✅	❌	✅	View Source
nlp causal lm	Phi-1	Phi 1	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-1	Phi 1	pytorch	❌	✅	❌	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-1 LoRA	Phi 1	pytorch	❌	✅	❌	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-1.5	Phi 1 5	jax	✅	❌	✅	✅	✅	✅	❌	✅	View Source
nlp causal lm	Phi-1.5	Phi 1 5	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-2	Phi 2	jax	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Phi-2	Phi 2	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-3	Mini 128K Instruct	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-3	Mini 4K Instruct	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-3	Mini Instruct	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-4	Phi 4	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
nlp causal lm	Qwen 2	Qwq 32B	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Qwen 2.5	0.5B	jax	✅	❌	✅	✅	✅	✅	❌	✅	View Source
nlp causal lm	Qwen 2.5	0.5B Instruct	jax	✅	❌	✅	✅	✅	✅	❌	✅	View Source
nlp causal lm	Qwen 2.5	0.5B Instruct	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
