Release 1.2.0 · tenstorrent/tt-xla

Installation

Via PyPI

pip install pjrt-plugin-tt==1.2.0 --extra-index-url https://pypi.eng.aws.tenstorrent.com/
pip install vllm-tt==1.2.0 --extra-index-url https://pypi.eng.aws.tenstorrent.com/
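
After installing, it can help to confirm which versions actually landed in the active environment. The following is a minimal sketch using only the Python standard library. The helper name `installed_version` is hypothetical; the distribution names are taken from the pip commands above, and the reported version string may differ in form from `1.2.0` (for example, it may carry a local suffix).

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent.

    `installed_version` is an illustrative helper, not part of tt-xla.
    """
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names as they appear in the pip commands for this release.
    for name in ("pjrt-plugin-tt", "vllm-tt"):
        found = installed_version(name)
        print(f"{name}: {found if found is not None else 'not installed'}")
```

A result of `not installed` usually points at a different virtual environment or a missing `--extra-index-url` during installation.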

Via Docker

docker pull ghcr.io/tenstorrent/tt-xla-slim:1.2.0

What's Changed

Uplift PJRT C API header from v0.103 to v0.106 by @ajakovljevicTT in #4313
Reland EmitPy testing by @sgligorijevicTT in #4261
Uplift third_party/tt_forge_models to cae9ccbc67a318736f656bee9a9ea776eb73e69c 2026-04-28 by @acicovicTT in #4423
XFail sdpa_decode affected tests by @acicovicTT in #4425
Nightly maintenance by @sgligorijevicTT in #4426
Disable kimi_k2 in CI by @vkovacevicTT in #4432
[vLLM plugin] Fix 1D mesh shape and tensor sharding for tensor parallelism by @mmanzoorTT in #4421
Uplift third_party/tt-mlir to 5b85073695682d062a0ac7fe5888bfb5b410853d 2026-04-28 by @acicovicTT in #4416
Lower PCC threshold for siglip variant achieving PCC greater than 0.98 in N150 by @meenakshiramanathan1 in #4415
Removing monkeypatch for jax.random.uniform by @aorlovicTT in #4427
Remove duplicate umd shared lib files by @vvukomanTT in #4451
Handle PCC errors in benchmark by @vkovacevicTT in #4438
Manual tools test workflow by @acicovicTT in #4428
Codegen multiple dirs on graph breaks instead of clobbering by @svuckovicTT in #4429
Simplify call-test.yml by moving conditions into test matrix config by @vmilosevic in #4456
Add initial support for composite torch.gather by @hshahTT in #4431
update analyze-nightly skill to be reused by other skills by @ctr-pmuruganTT in #4448
[vLLM plugin] Fix 2D-mesh device-sampler garbage output (and add coherence check) by @mmanzoorTT in #4442
Compile only improved by @AleksKnezevic in #4271
Make updating release notes robust by @vvukomanTT in #4501
Create large llm test matrix, move a llama to large matrix by @sgligorijevicTT in #4454
[vLLM plugin] vllm v0.19.0 uplift by @mmanzoorTT in #4443
Update PyTorch tag in torch-xla build script by @mmanzoorTT in #4492
Uplift third_party/tt-mlir to f8d3bf0e97dee04ea1783b00304b37b48d446c62 2026-05-04 by @acicovicTT in #4450
Weekly Maintenance May2 by @devisettymahidhar608 in #4497
Nightly Maintenance May1 by @devisettymahidhar608 in #4498
Remove known failure entries for fixed training tests by @agobeljicTT in #4512
[vLLM] Replace sort-based sampling with multi-core topk for 2x non-greedy speedup by @kmabeeTT in #4334
Add accuracy regression check by @vvukomanTT in #4522
Prioritize runtime_reason over static_reason in record_model_test_properties by @agobeljicTT in #4524
Revert "[vLLM plugin] vllm v0.19.0 uplift" by @mmanzoorTT in #4516
Uplift third_party/tt_forge_models to eba69819e9d4e5b7bd3c818656120d2c09b1a679 2026-05-05 by @acicovicTT in #4502
[CI] Fix uplift automerge by @nsumrakTT in #4533
Add passing p150 vLLM single_device tests to basic-test matrices by @kmabeeTT in #4515
[vLLM plugin] Cleanly shut down after a failed test to prevent device hang by @mmanzoorTT in #4511
Fix mesh shape when the graph has no inputs by @pglusacTT in #4439
All-to-All Dispatch and Combine Backward by @pglusacTT in #4386
[Krea Realtime 14B] Add initial tests for each part of the pipeline by @kamalrajkannan78 in #4513
Add KV cache dtype conversion option by @kdimicTT in #4140
Update status on perceiverio models by @saiarthiraguram in #4555
Add finding-missed-fusions skill by @ppadjinTT in #4545
Add claude skill for benchmark reports by @vkovacevicTT in #4514
Default batch_size=2 for training in test_all_models_torch by @agobeljicTT in #4535
Uplift third_party/tt-mlir to bb1deb417cef0e2c60147072cf7b6926a49ccca7 2026-05-06 by @acicovicTT in #4509
[Benchmark] Add check for fused ops in benchmarks by @vkovacevicTT in #4434
[vLLM] Fix flaky output coherence assertion in TP generation test by @kmabeeTT in #4563
Fix tt-triage install by @vvukomanTT in #4553
Uplift third_party/tt_forge_models to 9c743461b7fe91bd33683f66c9a456f0a22e1634 2026-05-07 by @acicovicTT in #4544
Add --force-run option to pytest for debugging skipped tests by @agobeljicTT in #4572
Handle ComplexType in simplifyMainFuncOp zero-attr creation by @kamalrajkannan78 in #4556
[Benchmark] Bring back gpt_oss_120b in accuracy tests by @vkovacevicTT in #4530
Fix masked_scatter decomposition to resolve OOM error in gemma3 multimodal models by @sonalibaskaran2499 in #4315
sparse_mlp: pad batch to tile-align tokens for bsz<32 support by @sshonTT in #4537
PyTorch and vLLM uplift by @mmanzoorTT in #4543
Update the status of Vilt model in inference test config by @kamalrajkannan78 in #4576
Lower PCC thresholds after nightly run 25529490187 by @vzeljkovicTT in #4586
Allow overriding tt-mlir and tt-metal source dirs in third_party build by @nsmithtt in #4528
Allow TT_RUNTIME_USING_DUALT3K to force fabric 2D init in non-distributed context by @jameszianxuTT in #4419
[vLLM plugin] vLLM 0.19.1 uplift by @mmanzoorTT in #4588
Add 4 layer deepseek and glm tests with weight caching by @gengelageTT in #4538
Uplift third_party/tt_forge_models to f224af305a10d38acb9fbd72c0c3514b26ec4544 2026-05-09 by @acicovicTT in #4598
[HunyuanVideo-1.5-Diffusers-480p_t2v_distilled] Add initial tests for each part of the pipeline by @kamalrajkannan78 in #4531
Add model-test-emitpy preset to manual-test.yml by @svuckovicTT in #4605
Add support for new ForgePrefillModel class in testing infra by @umalesTT in #4390
Add llama 3.1 70b and gpt oss 120b dp by @vvukomanTT in #4606
Testing models using sliding attention by @devisettymahidhar608 in #4287
Nightly Maintenance may9 by @devisettymahidhar608 in #4601
Weekly maintenance may9 by @devisettymahidhar608 in #4599
Don't download weights in GPT OSS training test by @pglusacTT in #4610
xfail olmo and mistral_8b examples added in #4287 by @jameszianxuTT in #4625
[vLLM plugin] Enable TP pooling test for intfloat/e5-mistral-7b-instruct by @mmanzoorTT in #4612
[vLLM + Benchmark] 3 perf improvements: ttnn.sampling fused op, pad-batch-to-32, skip greedy on all_random by @kmabeeTT in #4536
Add deepseek-v3.2-exp benchmark by @gengelageTT in #4557
Add kimi-k2 benchmark back into CI by @gengelageTT in #4566
Switch accuracy metrics to quantile-based (p5) + mean over full batch by @dgolubovicTT in #4362
Add galaxy-wh-6u vLLM support with Mistral-Large model test by @devisettymahidhar608 in #3814
Fix RMS norm batch parallel test by @acicovicTT in #4637
[pjrt] expose const-eval-inputs-to-system-memory pipeline option by @sshonTT in #4593
Uplift third_party/tt_forge_models to 7477c75d5b02f21cefb28def2f8023260fe1bb09 2026-05-12 by @acicovicTT in #4634
sparse_mlp: add DeepSeek-V4 MoE support by @sshonTT in #4622
Add retry on release notes update by @vvukomanTT in #4649
Uplift third_party/tt_forge_models to 7c4207b180592babb6f472764fec3bfc99577118 2026-05-13 by @vmilosevic in #4658
[HunyuanVideo-1.5-Diffusers-480p_t2v_distilled] Add tiled VAE decoder test by @kamalrajkannan78 in #4621
Update Mistral model config and bump torch version in CLAUDE.md by @devisettymahidhar608 in #4656
(Nightly) Test DeepSeek-V3.2-Exp MoE block with real HF weights by @sshonTT in #4680
Uplift third_party/tt_forge_models to 93218a34fc9fc6a671e0e41101da470c80891b2a 2026-05-14 by @vmilosevic in #4684
Add triage skill for unpack_forward_output FE failures by @agobeljicTT in #4549
Update transfuser config based on latest main by @saiarthiraguram in #4700
add configs for openlem jax model by @ctr-pmuruganTT in #4666
add dump_irs option to upload pytest IR artifacts by @ndrakulicTT in #4506
Align vllm_benchmark with llm_benchmark and add opt-125m tests by @alinakhanTT in #4654
Uplift third_party/tt-mlir to eb9005fa360a80e44607e2dfd4404137b510092e 2026-05-14 by @acicovicTT in #4569
Add required runtime debug tools to tt-xla explorer wheel by @nsumrakTT in #4385
Add deepseek v3.2 prefill and indexer tests to nightly by @gengelageTT in #4669
[pjrt] release host source after layout migration by @sshonTT in #4594
Uplift third_party/tt_forge_models to a64a98131c35b010895198f489355d0e6306934f 2026-05-15 by @vmilosevic in #4715
Update test config statuses for YOLOv9, YOLOS, and OLM OCR by @agobeljicTT in #4723
Update the inference test config of d_fine variants based on latest main results by @kamalrajkannan78 in #4717
[Benchmark] Make sure decode PCC comparison uses same input as golden by @odjuricicTT in #4661
[vLLM tests] Fix INTERNALERROR in cleanup hook; relax test_seed_mixed_batch xfail to strict=False (onPR Flaky Fail) by @kmabeeTT in #4736
[Composite OPs] Register autograd for xla::mark_tensor by @umalesTT in #4731
Enable optimization level configuration for tests in test infra by @acicovicTT in #4636
Migrate tt-xla docs to the sphinx backend by @acicovicTT in #4686
Set opt. lvl. in fusion tests for concatenate_heads to 1 by @acicovicTT in #4763
[PJRT] stop clearing program cache on executable destroy by @mstojkovicTT in #4734
Apply torch.manual_seed fixture to benchmark tests by @gengelageTT in #4713
[CI] Pull sfpi for local builds by @nsumrakTT in #4769
Fix: correct image path in README by @devisettymahidhar608 in #4770
Enable ttmlir python tools in debug build by @ndrakulicTT in #4776
Nightly Maintenance may16 by @devisettymahidhar608 in #4739
Uplift third_party/tt_forge_models to 6519407b21b991539aa75880f5b9333c80475991 2026-05-20 by @vmilosevic in #4793
Adds config for motif model by @saiarthiraguram in #4745
Update the inference test config of panoptic segmentation variants w.r.t current main by @kamalrajkannan78 in #4795
Nightly Maintenance may19 by @devisettymahidhar608 in #4796
Add Playground v2.5 component tests by @kamalrajkannan78 in #4711
Add HiDream-I1-Fast component tests by @kamalrajkannan78 in #4759
Add SDXL-Lightning component tests by @kamalrajkannan78 in #4730
Add working vLLM benchmarks for single-chip and TP models to CI by @alinakhanTT in #4732
[vLLM plugin] Unify pooling runner input layout to [num_reqs, tokens] by @mmanzoorTT in #4673
Fix off by 1 in accuracy benchmark loop by @odjuricicTT in #4807
[CI] Move p150 perf benchmarks from experimental to main nightly by @rpavlovicTT in #4808
Add scatter add tests for ttir.embedding_backward coverage by @ddilbazTT in #4826
Update Pi0, GR00T, and DeepSeek OCR inference configs by @ashokkumarkannan1 in #4683
Uplift third_party/tt-mlir to 2bd67018499ffa0ed9a0aee507325d75a8e46b84 2026-05-21 by @vmilosevic in #4716
Uplift third_party/tt_forge_models to f96d6a82a01cb2fe2133d45431b2a6620fc7c792 2026-05-21 by @vmilosevic in #4817
Bump opt. lvl. for Qwen 3 4B to 1 by @acicovicTT in #4867
Add HunyuanImage 2.1 component tests (text_encoder, text_encoder_2, transformer, vae) by @kamalrajkannan78 in #4782
Update kimi-k2 mla cache test to use tt-forge-models by @gengelageTT in #4830
Add LLMBox Deepseek v4 tests by @hshahTT in #4743
Uplift third_party/tt_forge_models to 7201811e7020d0e35e908df47a9e57926ba0aa1c 2026-05-23 by @vmilosevic in #4882

Full Changelog: 1.1.0...1.2.0

LLM Performance

Model	Tokens/sec/user	Batch	Tokens/sec	TTFT (ms)
facebook/opt-125m	6.0	1	6.0	175.07
pytorch_Falcon_3_1B_Base_nlp_causal_lm_huggingface	57.0	32	1824.0	281.39
pytorch_Falcon_3_3B_Base_nlp_causal_lm_huggingface	37.0	32	1184.0	385.3
pytorch_Gemma_1.1_2B_IT_nlp_causal_lm_huggingface	40.0	32	1280.0	428.1
pytorch_Llama_3.1_8B_Instruct_nlp_causal_lm_huggingface	22.0	32	704.0	655.24
pytorch_Llama_3.2_1B_Instruct_nlp_causal_lm_huggingface	68.0	32	2176.0	248.85
pytorch_Mistral_7B_INSTRUCT_v03_nlp_causal_lm_huggingface	20.0	32	640.0	638.69
pytorch_Mistral_Ministral_8B_Instruct_nlp_causal_lm_huggingface	12.0	32	384.0	304.81
pytorch_Phi-1.5_Phi_1_5_nlp_causal_lm_huggingface	24.0	32	768.0	462.4
pytorch_Phi-1_Phi_1_nlp_causal_lm_huggingface	24.0	32	768.0	457.77
pytorch_Phi-2_Phi_2_nlp_causal_lm_huggingface	11.0	32	352.0	1002.6
pytorch_Qwen 2.5_0.5B_Instruct_nlp_causal_lm_huggingface	81.0	32	2592.0	286.17
pytorch_Qwen 2.5_1.5B_Instruct_nlp_causal_lm_huggingface	39.0	32	1248.0	350.69
pytorch_Qwen 2.5_3B_Instruct_nlp_causal_lm_huggingface	33.0	32	1056.0	531.72
pytorch_Qwen 2.5_7B_Instruct_nlp_causal_lm_huggingface	16.0	32	512.0	759.47
pytorch_Qwen 3_0_6B_nlp_causal_lm_huggingface	36.0	32	1152.0	451.46
pytorch_Qwen 3_1_7B_nlp_causal_lm_huggingface	30.0	32	960.0	490.97
pytorch_Qwen 3_4B_nlp_causal_lm_huggingface	18.0	32	576.0	683.49
pytorch_Qwen 3_8B_nlp_causal_lm_huggingface	13.0	32	416.0	806.92
tiiuae/Falcon3-1B-Base	32.0	1	32.0	50.67
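
In the table above, the aggregate tokens-per-second figure equals the per-user rate multiplied by the batch size in every row (for example, 57.0 × 32 = 1824.0). The sketch below checks that relationship on a few rows copied from the table. It reflects a pattern observed in the reported numbers rather than a documented definition of the metric, and the function name `aggregate_throughput` is hypothetical.

```python
# Rows copied from the LLM Performance table:
# (model, tokens/sec/user, batch, tokens/sec)
ROWS = [
    ("facebook/opt-125m", 6.0, 1, 6.0),
    ("pytorch_Falcon_3_1B_Base_nlp_causal_lm_huggingface", 57.0, 32, 1824.0),
    ("pytorch_Llama_3.1_8B_Instruct_nlp_causal_lm_huggingface", 22.0, 32, 704.0),
    ("pytorch_Qwen 2.5_0.5B_Instruct_nlp_causal_lm_huggingface", 81.0, 32, 2592.0),
    ("tiiuae/Falcon3-1B-Base", 32.0, 1, 32.0),
]


def aggregate_throughput(tokens_per_sec_per_user, batch):
    """Total tokens per second across all users in the batch."""
    return tokens_per_sec_per_user * batch


def mismatches(rows):
    """Return the models whose reported aggregate differs from per-user * batch."""
    return [
        model
        for model, per_user, batch, reported in rows
        if aggregate_throughput(per_user, batch) != reported
    ]


if __name__ == "__main__":
    bad = mismatches(ROWS)
    print("all rows consistent" if not bad else f"inconsistent rows: {bad}")
```

Read this way, a batch-32 row with a lower per-user rate can still deliver far more total throughput than a batch-1 row with a higher one.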

Non-LLM Performance

Model	Batch	Samples/sec
pytorch_BERT_emrecan/bert-base-turkish-cased-mean-nli-stsb-tr_nlp_embed_gen_huggingface	8	44.0
pytorch_BGE-M3_Base_nlp_embed_gen_custom	4	9.0
pytorch_EfficientNet_Timm_B0_cv_image_cls_timm	8	332.0
pytorch_MNIST_Cnn_Dropout_cv_image_cls_custom	32	14688.0
pytorch_MobileNetV2_Mobilenet_v2_cv_image_cls_torch_hub	12	1252.0
pytorch_Qwen 3_Embedding_4B_nlp_embed_gen_huggingface	32	46.0
pytorch_ResNet_ResNet50_HuggingFace_cv_image_cls_huggingface	8	1353.0
pytorch_SegFormer_B0_Finetuned_Ade_512_512_cv_image_seg_huggingface	1	38.0
pytorch_Swin_S_cv_image_cls_torchvision	1	9.0
pytorch_U-Net for Conditional Generation_Base_conditional_generation_huggingface	1	3.0
pytorch_Ultra-Fast Lane Detection v2_TuSimple_ResNet34_Backbone_cv_image_seg_github	1	143.0
pytorch_VGG19-UNet_base_cv_image_seg_custom	1	151.0
pytorch_ViT_Base_cv_image_cls_huggingface	8	237.0
pytorch_VoVNet_Ese_Vovnet19b_Dw.ra_In1k_cv_image_cls_timm	8	713.0

Model coverage

Info: The full list of supported models is available in the assets section.

Model task	Model architecture	Model variant	Model framework	Inference	Training	n150	n300	p150	Single device	Data parallel	Tensor parallel	Model source
conditional generation	U-Net for Conditional Generation	Base	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	AlexNet	Custom 1x2	jax	✅	❌	❌	✅	❌	❌	❌	✅	View Source
cv image cls	DINOv2	Small	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
cv image cls	EfficientNet	B0	pytorch	✅	❌	✅	✅	✅	✅	✅	❌	View Source
cv image cls	MNIST	Cnn Batchnorm	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	MNIST	Cnn Dropout	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	MNIST	Cnn Dropout	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	MNIST	Cnn Nodropout	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
cv image cls	MNIST	Mlp Custom	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	MNIST	Mlp Custom	jax	❌	✅	✅	❌	✅	✅	❌	❌	View Source
cv image cls	MNIST	Mlp Custom 1x2	jax	✅	❌	❌	✅	❌	❌	❌	✅	View Source
cv image cls	MobileNetV1	Mobilenet v1	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
cv image cls	MobileNetV2	Mobilenet v2	pytorch	✅	❌	✅	✅	✅	✅	✅	❌	View Source
cv image cls	ResNet	ResNet50 HuggingFace High Resolution	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	SegFormer	Mit B0	pytorch	✅	❌	✅	✅	✅	✅	✅	❌	View Source
cv image cls	Swin	S	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	VGG	HF Vgg19	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image cls	ViT	Base	pytorch	✅	❌	✅	✅	✅	✅	✅	❌	View Source
cv image cls	VoVNet	Ese Vovnet19b Dw.ra In1k	pytorch	✅	❌	✅	✅	✅	✅	✅	❌	View Source
cv image seg	MaskFormer Swin-B	Swin Base Coco	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image seg	Ultra-Fast Lane Detection	TuSimple ResNet18 Backbone	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv image seg	VGG19-UNet	base	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
cv img to img	Autoencoder	linear	pytorch	❌	✅	✅	❌	✅	✅	❌	❌	View Source
cv object det	Attention DenseUNet	Base	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	DETR	ResNet50 Backbone	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
cv object det	OWL-ViT	Base Patch32	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	PointPillars	pointpillars	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	YOLOP	Default	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	YOLOS Small	Small	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	YOLOv4	Base	pytorch	✅	❌	✅	✅	✅	✅	✅	❌	View Source
cv object det	YOLOv7	Default	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
cv object det	YOLOv9	T	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
cv object det	ssd512	ssd512	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
cv panoptic seg	Panoptic Segmentation	ResNet50 Backbone 1x COCO	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
mm action prediction	OpenVLA-OFT	Finetuned Libero 10	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
mm action prediction	pi_0	pi0 base	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
mm image text similarity	CLIP	Base Patch16	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
mm image text similarity	SigLIP	Base Patch16 224	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
mm visual qa	Mistral	base	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
nlp causal lm	ALLaM	7B Instruct	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
nlp causal lm	Command_A_Reasoning	command-a-reasoning-08-2025	pytorch	✅	❌	❌	❌	❌	❌	❌	✅	View Source
nlp causal lm	Falcon	3 10B Base	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Falcon	3 1B Base	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Falcon	3 3B Base	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Falcon	3 7B Base	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	GPT-2	Base	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	GPT-2	Xl	jax	❌	✅	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	GPT-OSS	20B	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Gemma	1.1 2B IT	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Gemma	1.1 7B IT	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Gemma	2 27B IT	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Gemma	2 2B IT	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Gemma	2 9B IT	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Llama	3.1 70B	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Llama	3.1 70B Instruct	pytorch	✅	❌	❌	❌	❌	❌	❌	✅	View Source
nlp causal lm	Llama	3.1 8B Instruct	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Llama	3.2 1B	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Llama	3.2 3B	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Llama	3.3 70B Instruct	pytorch	✅	❌	❌	❌	❌	❌	❌	✅	View Source
nlp causal lm	Mistral	7B INSTRUCT v03	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Mistral	Devstral Small 2505	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Mistral	Large INSTRUCT 2411	pytorch	✅	❌	❌	❌	❌	❌	❌	✅	View Source
nlp causal lm	Mistral	Magistral Small 2506	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Mistral	Ministral 8B Instruct	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Mistral	Nemo INSTRUCT 2407	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Mistral	Small 24B INSTRUCT 2501	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Phi-1	Phi 1	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-1	Phi 1	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-1.5	Phi 1 5	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-1.5	Phi 1 5	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-2	Phi 2	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-3	Mini 128K Instruct	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-3	Mini 4K Instruct	pytorch	✅	❌	❌	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-3	Mini Instruct	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Phi-4	Phi 4	pytorch	✅	❌	❌	✅	✅	✅	❌	✅	View Source
nlp causal lm	Qwen 2	Qwq 32B	pytorch	✅	❌	❌	✅	❌	❌	❌	✅	View Source
nlp causal lm	Qwen 2.5	0.5B	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Qwen 2.5	0.5B Instruct	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Qwen 2.5	0.5B Instruct	pytorch	✅	❌	✅	❌	✅	✅	❌	❌	View Source
nlp causal lm	Qwen 2.5	1.5B Instruct	jax	✅	❌	✅	❌	✅	✅	❌	❌	View Source
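
The coverage table is tab-separated, so it can be filtered with the standard library. The sketch below assumes the table has been saved as tab-separated text with the header row shown above, and embeds four rows copied from it. The helper names `load_rows` and `supported` are hypothetical.

```python
import csv
import io

# Header plus four rows copied from the coverage table, tab-separated.
TABLE = (
    "Model task\tModel architecture\tModel variant\tModel framework\t"
    "Inference\tTraining\tn150\tn300\tp150\tSingle device\t"
    "Data parallel\tTensor parallel\tModel source\n"
    "cv image cls\tEfficientNet\tB0\tpytorch\t"
    "✅\t❌\t✅\t✅\t✅\t✅\t✅\t❌\tView Source\n"
    "cv image cls\tMNIST\tMlp Custom\tjax\t"
    "❌\t✅\t✅\t❌\t✅\t✅\t❌\t❌\tView Source\n"
    "nlp causal lm\tGPT-OSS\t20B\tpytorch\t"
    "✅\t❌\t❌\t✅\t❌\t❌\t❌\t✅\tView Source\n"
    "nlp causal lm\tLlama\t3.2 1B\tpytorch\t"
    "✅\t❌\t✅\t❌\t✅\t✅\t❌\t❌\tView Source\n"
)


def load_rows(text):
    """Parse tab-separated table text into a list of dicts keyed by header."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def supported(rows, column):
    """Return 'architecture variant' names for rows marked with a check in `column`."""
    return [
        f"{row['Model architecture']} {row['Model variant']}"
        for row in rows
        if row[column] == "✅"
    ]


rows = load_rows(TABLE)

if __name__ == "__main__":
    for column in ("n150", "Tensor parallel", "Training"):
        print(column, "->", supported(rows, column))
```

The same filter applies to any of the check-mark columns, such as `p150` or `Data parallel`.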
