The PyTorch 2.0 TT-NN Compiler enables seamless execution of PyTorch models on Tenstorrent AI accelerators. By leveraging the TT-NN backend, you can achieve significant performance improvements while maintaining PyTorch's familiar API.
Install from the repo:
pip install git+https://github.com/tenstorrent/pytorch2.0_ttnn
or as an editable package from source:
git clone https://github.com/tenstorrent/pytorch2.0_ttnn.git
cd pytorch2.0_ttnn
pip install -e .
Option 1: Eager Mode: get your model running by switching to a TT device
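import torch
import ttnn
import torch_ttnn
model = YourModel()  # placeholder: any torch.nn.Module
device = ttnn.open_device(device_id=0)
model.to(torch_ttnn.ttnn_device_as_torch_device(device))
output = model(input_data)  # input_data is a placeholder for your input tensor(s)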
Option 2: Compilation Mode (Recommended): get more perf with a JIT compiler
import torch
import ttnn
import torch_ttnn
model = YourModel()  # placeholder: any torch.nn.Module
device = ttnn.open_mesh_device(ttnn.MeshShape(1, 2))  # 1x2 device grid
option = torch_ttnn.TorchTtnnOption(device=device, data_parallel=2)
model = torch.compile(model, backend=torch_ttnn.backend, options=option)
output = model(input_data)  # the first call triggers compilation
We've extensively tested the compiler across a diverse range of model architectures. Here's a summary of our validation results:
Model | Status | Batch | Compiled First Run (ms) | Original Throughput (Inferences Per Second) | Compiled Throughput (Inferences Per Second) | Accuracy (%) | Torch Ops Before (Unique Ops) | Torch Ops Remain (Unique Ops) | To/From Device Ops |
---|---|---|---|---|---|---|---|---|---|
Autoencoder (linear) | ✅ | 1 | 380.4 | 0.466888 | 526.3157894736842 | 100.0 | 22 (3) | 0 (0) | 0 |
BERT | ✅ | 8 | 45896.43 | 0.0107214 | 39.95205753096284 | 99.69 | 1465 (22) | 0 (0) | 0 |
DPR | ✅ | 1 | 18587.31 | 0.354789 | 72.30657989877079 | 99.38 | 720 (22) | 0 (0) | 1 |
HardNet | ✅ | 1 | 168441.12 | 0.196658 | 19.98001998001998 | 98.45 | 245 (10) | 0 (0) | 124 |
MLPMixer | ✅ | 1 | 18600.14 | 0.201063 | 79.1139240506329 | 99.99 | 253 (11) | 0 (0) | 0 |
Mnist | ✅ | 1 | 5871.58 | 30.1296 | 408.1632653061224 | 99.42 | 14 (8) | 0 (0) | 1 |
MobileNetV2 | ✅ | 1 | 94348.17 | 1.12857 | 38.37298541826554 | 99.09 | 154 (9) | 0 (0) | 0 |
OpenPose V2 | ✅ | 1 | 19618.0 | 0.334242 | 35.67606136282554 | 91.49 | 155 (7) | 0 (0) | 6 |
Perceiver IO | ✅ | 1 | 51594.21 | 0.0204124 | 19.227071716977502 | 99.95 | 1531 (20) | 0 (0) | 1 |
ResNet18 | ✅ | 1 | 50784.1 | 0.420679 | 73.20644216691069 | 99.27 | 70 (9) | 0 (0) | 1 |
ResNet50 | ✅ | 4 | 77262.36 | 0.752556 | 46.576618537494184 | 98.61 | 176 (9) | 0 (0) | 1 |
RoBERTa | ✅ | 1 | 35202.52 | 0.0768288 | 22.10921954455008 | 28.56 | 719 (21) | 0 (0) | 3 |
U-Net | ✅ | 1 | 74530.51 | 0.0159431 | 67.52194463200539 | 100.0 | 68 (6) | 0 (0) | 12 |
Unet-brain | ✅ | 1 | 3374.58 | 0.0164509 | 55.2791597567717 | N/A | 68 (6) | 0 (0) | 12 |
Unet-carvana | ✅ | 1 | 32995.1 | 0.0118069 | 30.59039461609055 | 99.69 | 67 (5) | 0 (0) | 12 |
albert/albert-base-v2 | ✅ | 1 | 26818.2 | 0.734258 | 42.15851602023609 | 98.82 | 791 (21) | 0 (0) | 3 |
albert/albert-base-v2-classification | ✅ | 1 | 10319.82 | 0.740883 | 46.66355576294914 | 99.97 | 779 (21) | 0 (0) | 2 |
albert/albert-large-v2 | ✅ | 1 | 20157.95 | 0.392332 | 24.236548715462916 | 98.95 | 1547 (21) | 0 (0) | 3 |
albert/albert-xlarge-v2 | ✅ | 1 | 44869.53 | 0.105447 | 12.891581797086504 | 97.36 | 1547 (21) | 0 (0) | 3 |
densenet121 | ✅ | 1 | 143878.54 | 0.260364 | 13.027618551328816 | 99.74 | 432 (10) | 0 (0) | 597 |
densenet161 | ✅ | 1 | 193868.05 | 0.102238 | 9.669309611293752 | 99.49 | 572 (10) | 0 (0) | 1147 |
densenet169 | ✅ | 1 | 252974.82 | 0.258978 | 9.337940050424876 | 99.58 | 600 (10) | 0 (0) | 1241 |
densenet201 | ✅ | 1 | 87818.07 | 0.209147 | 7.605141075366947 | 99.39 | 712 (10) | 0 (0) | 1905 |
distilbert-base-uncased | ✅ | 1 | 27401.45 | 0.730663 | 85.39709649871904 | 72.37 | 361 (16) | 0 (0) | 1 |
dla34.in1k | ✅ | 1 | 75717.13 | 0.270115 | 37.579857196542655 | 99.48 | 135 (9) | 0 (0) | 23 |
ese_vovnet19b_dw.ra_in1k | ✅ | 1 | 64235.77 | 0.541729 | 51.09862033725089 | 99.44 | 111 (12) | 0 (0) | 19 |
ghostnet_100.in1k | ✅ | 1 | 191671.49 | 0.662032 | 18.195050946142647 | 99.6 | 515 (14) | 0 (0) | 64 |
mobilenet_v2 | ✅ | 1 | 82737.52 | 1.0306 | 32.44646333549643 | 99.09 | 154 (9) | 0 (0) | 0 |
mobilenet_v3_large | ✅ | 1 | 88933.92 | 1.25044 | 32.66906239790919 | 99.15 | 188 (11) | 0 (0) | 0 |
mobilenet_v3_small | ✅ | 1 | 118022.69 | 1.78333 | 34.81894150417828 | 99.09 | 158 (11) | 0 (0) | 0 |
mobilenetv1_100.ra4_e3600_r224_in1k | ✅ | 1 | 66679.44 | 0.857074 | 59.98800239952009 | 96.04 | 85 (7) | 0 (0) | 0 |
regnet_x_16gf | ✅ | 1 | 63865.16 | 0.0664586 | 15.651901706057284 | 99.56 | 235 (8) | 0 (0) | 0 |
regnet_x_1_6gf | ✅ | 1 | 56965.52 | 0.41886 | 27.540622418066647 | 99.47 | 195 (8) | 0 (0) | 0 |
regnet_x_32gf | ✅ | 1 | 107482.93 | 0.0281106 | 8.057368463459834 | 99.27 | 245 (8) | 0 (0) | 0 |
regnet_x_3_2gf | ✅ | 1 | 91311.5 | 0.25224 | 22.629554197782305 | 99.5 | 265 (8) | 0 (0) | 0 |
regnet_x_400mf | ✅ | 1 | 82798.78 | 0.968251 | 26.96144513345915 | 99.66 | 235 (8) | 0 (0) | 0 |
regnet_x_800mf | ✅ | 1 | 54691.23 | 0.671384 | 34.17634996582365 | 99.44 | 175 (8) | 0 (0) | 0 |
regnet_x_8gf | ✅ | 1 | 56008.59 | 0.127511 | 18.086453246518357 | 98.99 | 245 (8) | 0 (0) | 0 |
regnet_y_16gf | ✅ | 1 | 148402.5 | 0.065168 | 11.763321962122102 | 99.71 | 303 (10) | 0 (0) | 0 |
regnet_y_1_6gf | ✅ | 1 | 132566.84 | 0.435114 | 15.586034912718205 | 99.65 | 447 (10) | 0 (0) | 0 |
regnet_y_32gf | ✅ | 1 | 128920.33 | 0.0341463 | 7.935878104912309 | 99.72 | 335 (10) | 0 (0) | 0 |
regnet_y_3_2gf | ✅ | 1 | 90827.03 | 0.27943 | 18.90359168241966 | 99.82 | 351 (10) | 0 (0) | 0 |
regnet_y_400mf | ✅ | 1 | 95965.88 | 1.12252 | 25.64102564102564 | 99.64 | 271 (10) | 0 (0) | 0 |
regnet_y_800mf | ✅ | 1 | 72225.13 | 0.636991 | 28.481913984619766 | 99.59 | 239 (10) | 0 (0) | 0 |
regnet_y_8gf | ✅ | 1 | 84200.02 | 0.1165 | 17.46419839329375 | 99.82 | 287 (10) | 0 (0) | 0 |
resnet101 | ✅ | 1 | 16838.34 | 0.105869 | 18.057060310581434 | 99.28 | 346 (9) | 0 (0) | 1 |
resnet152 | ✅ | 1 | 100397.93 | 0.0872946 | 10.925379656943079 | 99.14 | 516 (9) | 0 (0) | 1 |
resnet18 | ✅ | 1 | 24661.18 | 0.421944 | 76.68711656441718 | 99.63 | 70 (9) | 0 (0) | 1 |
resnet34 | ✅ | 1 | 53904.96 | 0.208159 | 44.662795891022775 | 98.9 | 126 (9) | 0 (0) | 1 |
resnet50 | ✅ | 1 | 54402.09 | 0.194155 | 33.82949932341001 | 98.61 | 176 (9) | 0 (0) | 1 |
resnext101_32x8d | ✅ | 1 | 53438.11 | 0.0634725 | 8.774238834781084 | 99.57 | 346 (9) | 0 (0) | 1 |
resnext101_64x4d | ✅ | 1 | 97216.09 | 0.0642695 | 8.939746111210441 | 99.65 | 346 (9) | 0 (0) | 1 |
resnext50_32x4d | ✅ | 1 | 32819.5 | 0.218632 | 29.7000297000297 | 99.44 | 176 (9) | 0 (0) | 1 |
textattack/albert-base-v2-imdb | ✅ | 1 | 35501.15 | 0.735051 | 45.41326067211626 | 100.0 | 782 (22) | 0 (0) | 2 |
tf_efficientnet_lite0.in1k | ✅ | 1 | 122760.64 | 0.459021 | 24.764735017335312 | 99.3 | 149 (9) | 0 (0) | 5 |
tf_efficientnet_lite1.in1k | ✅ | 1 | 77887.44 | 0.564844 | 18.695083193120208 | 99.56 | 194 (9) | 0 (0) | 5 |
tf_efficientnet_lite2.in1k | ✅ | 1 | 147601.07 | 0.43213 | 13.222266296443212 | 99.21 | 194 (9) | 0 (0) | 5 |
twmkn9/albert-base-v2-squad2 | ✅ | 1 | 22227.95 | 0.0400934 | 39.635354736424894 | 99.86 | 783 (23) | 0 (0) | 2 |
vgg11 | ✅ | 1 | 40769.16 | 0.0835439 | 98.42519685039369 | 99.65 | 33 (8) | 0 (0) | 5 |
vgg11_bn | ✅ | 1 | 2696.0 | 0.0778108 | 81.9000819000819 | 98.93 | 41 (9) | 0 (0) | 5 |
vgg13 | ✅ | 1 | 59941.63 | 0.0518468 | 81.43322475570034 | 99.35 | 37 (8) | 0 (0) | 5 |
vgg13_bn | ✅ | 1 | 6718.52 | 0.0608414 | 72.72727272727272 | 97.31 | 47 (9) | 0 (0) | 5 |
vgg16 | ✅ | 1 | 2328.64 | 0.0386412 | 70.57163020465774 | 99.44 | 43 (8) | 0 (0) | 5 |
vgg16_bn | ✅ | 1 | 66344.2 | 0.0388076 | 62.18905472636817 | 98.37 | 56 (9) | 0 (0) | 5 |
vgg19 | ✅ | 1 | 2638.23 | 0.0312266 | 62.26650062266501 | 99.24 | 49 (8) | 0 (0) | 5 |
vgg19_bn | ✅ | 1 | 4137.37 | 0.0351847 | 54.975261132490374 | 96.97 | 65 (9) | 0 (0) | 5 |
wide_resnet101_2 | ✅ | 1 | 92672.72 | 0.0437005 | 16.960651289009498 | 99.2 | 346 (9) | 0 (0) | 1 |
wide_resnet50_2 | ✅ | 1 | 83865.69 | 0.0805556 | 32.67973856209151 | 98.8 | 176 (9) | 0 (0) | 1 |
xception71.tf_in1k | ✅ | 1 | 118354.65 | 0.059471 | 4.079468037367928 | 99.21 | 393 (9) | 0 (0) | 0 |
Autoencoder (conv) | 🚧 | 1 | 3813.67 | 0.491205 | 335.5704697986577 | 100.0 | 9 (3) | 1 (1) | 1 |
Autoencoder (conv)-train | 🚧 | 1 | 14060.64 | 0.336507 | 155.27950310559004 | 100.0 | 24 (7) | 11 (4) | 0 |
Autoencoder (linear)-train | 🚧 | 1 | 14020.84 | 0.322749 | 76.62835249042145 | 100.0 | 104 (8) | 14 (2) | 0 |
Bloom | 🚧 | 1 | 46306.88 | 0.0334381 | 1.453361625439642 | 98.86 | 1405 (27) | 2 (2) | 0 |
CLIP | 🚧 | 1 | 59223.2 | 0.0273692 | 5.929439667951378 | 99.56 | 1397 (30) | 7 (6) | 2 |
CLIP-train | 🚧 | 1 | 84239.57 | 0.0427298 | 0.672888643658361 | 100.0 | 3944 (44) | 265 (16) | 5 |
DETR | 🚧 | 1 | 155683.9 | 0.00907839 | 0.1873753953620842 | 94.02 | 1663 (42) | 9 (6) | 3 |
DINOv2 | 🚧 | 1 | 32610.64 | 0.0517643 | 14.684287812041118 | 98.99 | 928 (25) | 16 (1) | 2 |
GLPN-KITTI | 🚧 | 1 | 272913.25 | 0.00816913 | 0.016969772253777514 | 99.77 | 2959 (26) | 22 (2) | 6 |
GPT-2 | 🚧 | 1 | 28662.68 | 0.391319 | 32.31017770597738 | 99.98 | 745 (29) | 2 (2) | 2 |
HardNet-train | 🚧 | 1 | 142217.49 | 0.0778058 | 0.12227375381646953 | 100.0 | 867 (21) | 412 (9) | 120 |
MLPMixer-train | 🚧 | 1 | 39609.41 | 0.0566745 | 0.10479917334412067 | 100.0 | 616 (19) | 100 (5) | 0 |
Mnist-train | 🚧 | 1 | 17872.95 | 0.289311 | 35.448422545196735 | 100.0 | 46 (15) | 10 (6) | 0 |
MobileNetSSD | 🚧 | 1 | 246416.16 | 1.41758 | 0.4738078993252976 | 43.63 | 522 (31) | 7 (4) | 32 |
OpenPose V2-train | 🚧 | 1 | 78649.12 | 0.0949146 | 0.12672761420058953 | 100.0 | 523 (14) | 246 (7) | 6 |
ResNet18-train | 🚧 | 1 | 43716.08 | 0.172851 | 0.24379897311872523 | 100.0 | 241 (19) | 121 (9) | 0 |
ResNet50-train | 🚧 | 1 | 75338.18 | 0.0588952 | 0.08341640216715814 | 100.0 | 616 (19) | 318 (9) | 0 |
SegFormer | 🚧 | 1 | 27513.11 | 0.0242198 | 3.8147554741741057 | 99.86 | 676 (22) | 16 (1) | 4 |
SegFormer-train | 🚧 | 1 | 176857.68 | 0.01241 | 0.02995324896900917 | 100.0 | 1794 (36) | 156 (12) | 4 |
U-Net-train | 🚧 | 1 | 92426.66 | 0.0124864 | 0.02529950822815906 | 100.0 | 236 (15) | 122 (8) | 8 |
Unet-brain-train | 🚧 | 1 | 61166.68 | 0.00956468 | 0.021203808034292494 | 100.0 | 236 (15) | 122 (8) | 8 |
Unet-carvana-train | 🚧 | 1 | 151978.47 | 0.00584343 | 0.011400970655839697 | 100.0 | 232 (13) | 121 (7) | 8 |
YOLOS | 🚧 | 1 | 41291.88 | 0.0496856 | 3.5066802258302063 | 98.46 | 952 (27) | 17 (2) | 2 |
YOLOv3 | 🚧 | 1 | 80197.6 | 0.00506433 | 17.7367860943597 | 98.63 | 250 (7) | 2 (1) | 4 |
albert/albert-xxlarge-v2 | 🚧 | 1 | 19281.55 | 0.0532404 | 6.837606837606837 | 98.54 | 791 (21) | 24 (1) | 3 |
dla34.in1k-train | 🚧 | 1 | 51271.09 | 0.104225 | 0.15712696487269573 | 100.0 | 469 (18) | 230 (8) | 17 |
ese_vovnet19b_dw.ra_in1k-train | 🚧 | 1 | 98798.79 | 0.207539 | 0.31701448439179186 | 100.0 | 383 (25) | 176 (10) | 16 |
facebook/deit-base-patch16-224 | 🚧 | 1 | 28152.33 | 0.0645301 | 8.766546857192951 | 98.34 | 685 (17) | 1 (1) | 1 |
facebook/deit-base-patch16-224-train | 🚧 | 1 | 34051.86 | 0.0135963 | 0.8779862507353134 | 100.0 | 1854 (27) | 127 (8) | 2 |
ghostnet_100.in1k-train | 🚧 | 1 | 164724.83 | 0.665053 | 0.5503092738118822 | 100.0 | 1469 (33) | 562 (12) | 64 |
ghostnetv2_100.in1k | 🚧 | 1 | 81170.24 | 0.694666 | 8.968609865470851 | 99.65 | 683 (18) | 24 (2) | 68 |
ghostnetv2_100.in1k-train | 🚧 | 1 | 72813.84 | 0.451241 | 0.2580358824698163 | 100.0 | 2001 (39) | 852 (17) | 68 |
googlenet | 🚧 | 1 | 147537.26 | 0.449198 | 24.078979051288226 | 99.67 | 214 (15) | 1 (1) | 51 |
hrnet_w18.ms_aug_in1k | 🚧 | 1 | 122150.89 | 0.203783 | 5.1652892561983474 | 99.65 | 1209 (11) | 31 (1) | 0 |
hrnet_w18.ms_aug_in1k-train | 🚧 | 1 | 220129.59 | 0.068379 | 0.09654159059993841 | 100.0 | 3998 (21) | 1973 (9) | 0 |
inception_v4.tf_in1k | 🚧 | 1 | 134730.62 | 0.0824491 | 6.294454585510165 | 99.09 | 495 (11) | 14 (1) | 84 |
inception_v4.tf_in1k-train | 🚧 | 1 | 194641.97 | 0.0222046 | 0.03174487280623088 | 100.0 | 1851 (24) | 932 (11) | 80 |
mixer_b16_224.goog_in21k | 🚧 | 1 | 17043.65 | 0.0919227 | 9.143275121148394 | 3.65 | 356 (11) | 1 (1) | 0 |
mixer_b16_224.goog_in21k-train | 🚧 | 1 | 36026.15 | 0.0168269 | 0.8453299745555678 | 100.0 | 959 (18) | 101 (6) | 0 |
mobilenetv1_100.ra4_e3600_r224_in1k-train | 🚧 | 1 | 60787.42 | 0.270605 | 0.34602555052665085 | 100.0 | 258 (16) | 164 (7) | 0 |
regnet_y_128gf | 🚧 | 1 | 344402.67 | 0.00192155 | 0.01659929841405323 | 98.91 | 447 (10) | 3 (1) | 0 |
ssd300_vgg16 | 🚧 | 1 | 157675.17 | 0.277711 | 0.6674631727194452 | N/A | 332 (30) | 8 (5) | 37 |
ssdlite320_mobilenet_v3_large | 🚧 | 1 | 187960.71 | 1.06538 | 0.41905528177277146 | 41.24 | 522 (31) | 7 (4) | 32 |
swin_b | 🚧 | 1 | 131255.31 | 0.0782158 | 3.903810118675827 | 99.54 | 2492 (32) | 110 (2) | 479 |
swin_s | 🚧 | 1 | 30085.54 | 0.0933425 | 3.5318217136398955 | 99.68 | 2492 (32) | 110 (2) | 479 |
swin_t | 🚧 | 1 | 127891.92 | 0.175615 | 7.341065922771986 | 99.76 | 1238 (32) | 50 (2) | 227 |
swin_v2_b | 🚧 | 1 | 155984.62 | 0.0539692 | 2.893518518518518 | 23.56 | 3140 (40) | 158 (3) | 473 |
swin_v2_s | 🚧 | 1 | 29427.15 | 0.108186 | 3.4775351231047433 | 36.71 | 3140 (40) | 158 (3) | 473 |
swin_v2_t | 🚧 | 1 | 97601.96 | 0.201151 | 6.237135907191417 | 54.34 | 1562 (40) | 74 (3) | 221 |
tf_efficientnet_lite0.in1k-train | 🚧 | 1 | 136011.23 | 0.292251 | 0.13154917571286498 | 100.0 | 452 (17) | 285 (8) | 5 |
tf_efficientnet_lite1.in1k-train | 🚧 | 1 | 142142.64 | 0.230875 | 0.10081011004431613 | 100.0 | 587 (17) | 370 (8) | 5 |
tf_efficientnet_lite2.in1k-train | 🚧 | 1 | 124647.59 | 0.192633 | 0.07623084223646046 | 100.0 | 587 (17) | 370 (8) | 5 |
tf_efficientnet_lite3.in1k | 🚧 | 1 | 108309.6 | 0.353552 | 3.7418147801683816 | 99.15 | 221 (9) | 5 (1) | 5 |
tf_efficientnet_lite3.in1k-train | 🚧 | 1 | 114089.02 | 0.120481 | 0.05423225832343143 | 100.0 | 668 (17) | 426 (9) | 5 |
tf_efficientnet_lite4.in1k | 🚧 | 1 | 155553.34 | 0.188086 | 2.314814814814815 | 99.21 | 275 (9) | 6 (1) | 5 |
tf_efficientnet_lite4.in1k-train | 🚧 | 1 | 140577.17 | 0.0684103 | 0.03741863784932637 | 100.0 | 830 (17) | 529 (9) | 5 |
vit_b_16 | 🚧 | 1 | 33951.28 | 0.064918 | 6.692992436918547 | 99.52 | 552 (17) | 1 (1) | 1 |
vit_b_32 | 🚧 | 1 | 18068.67 | 0.185095 | 6.989097008666479 | 98.73 | 552 (17) | 1 (1) | 1 |
vit_h_14 | 🚧 | 1 | 695117.12 | 0.00130755 | 0.3776506353971941 | 98.14 | 1452 (17) | 1 (1) | 1 |
vit_l_16 | 🚧 | 1 | 56903.86 | 0.0196752 | 3.623844899438304 | 99.73 | 1092 (17) | 1 (1) | 1 |
vit_l_32 | 🚧 | 1 | 38114.73 | 0.0561787 | 4.78423117405033 | 99.06 | 1092 (17) | 1 (1) | 1 |
xception71.tf_in1k-train | 🚧 | 1 | 150427.21 | 0.0179414 | 0.01836345950682726 | 100.0 | 1378 (18) | 806 (7) | 0 |
FLAN-T5 | ❌ | N/A | N/A | 0.231409 | N/A | N/A | 20020 (38) | N/A | N/A |
Falcon-7B | ❌ | N/A | N/A | 0.0159877 | N/A | N/A | 2600 (27) | N/A | N/A |
GPTNeo | ❌ | N/A | N/A | 0.0948277 | N/A | N/A | 2733 (35) | N/A | N/A |
Llama | ❌ | N/A | N/A | 0.00554626 | N/A | N/A | 3690 (35) | N/A | N/A |
OPT | ❌ | N/A | N/A | 0.0400944 | N/A | N/A | 4003 (32) | N/A | N/A |
Stable Diffusion V2 | ❌ | N/A | N/A | 0.00134308 | N/A | N/A | 1870 (29) | N/A | N/A |
ViLT | ❌ | N/A | N/A | 0.0625589 | N/A | N/A | 766 (29) | N/A | N/A |
Whisper | ❌ | N/A | N/A | 0.00319721 | N/A | N/A | 4310 (21) | N/A | N/A |
YOLOv5 | ❌ | N/A | N/A | 0.0541112 | N/A | N/A | 236 (13) | N/A | N/A |
codegen | ❌ | N/A | N/A | 0.141725 | N/A | N/A | 9183 (37) | N/A | N/A |
speecht5-tts | ❌ | N/A | N/A | 0.0190955 | N/A | N/A | 6942 (40) | N/A | N/A |
t5-base | ❌ | N/A | N/A | 0.161917 | N/A | N/A | 14681 (38) | N/A | N/A |
t5-large | ❌ | N/A | N/A | 0.0217367 | N/A | N/A | 22696 (38) | N/A | N/A |
t5-small | ❌ | N/A | N/A | 0.392642 | N/A | N/A | 6118 (38) | N/A | N/A |
Model: Name of the model.
Status: Indicates whether the model is:
- ✅ End-to-end on device: All PyTorch operations have been converted to TT-NN operations.
- 🚧 Compiled: The converted model runs, but some operations still fall back to PyTorch. This may be due to an unsupported operation or configuration.
- ❌ Traced: The model does not run but its PyTorch operations are traced for future development. This may indicate a temporary incompatibility with a compiler pass.
Batch: Batch size used for inference.
Compiled First Run (ms): Time until the first compiled run finishes (ms), including compilation time and warming caches.
Original Throughput (Inferences Per Second): Execution throughput (in inferences per second) of the model before conversion.
Compiled Throughput (Inferences Per Second): Execution throughput (in inferences per second) of the model after conversion, once caches are warm.
Accuracy (%): Model accuracy on a predefined test dataset after conversion.
Torch Ops Before (Unique Ops): The total number of operations used by the model in the original Torch implementation. The number in parentheses represents the total unique ops.
Torch Ops Remain (Unique Ops): The total number of Torch operations that remain after conversion to TT-NN. The number in parentheses represents the total unique ops.
To/From Device Ops: The number of `to/from_device` operations (data transfer to/from the device).
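
To make the two timing columns concrete, here is a minimal sketch of how a first-run latency and a warm throughput could be measured for any callable. It is not the project's benchmarking harness: `measure_first_run_and_throughput`, `warm_runs`, and `fake_model` are hypothetical names introduced for illustration, and counting `batch_size` inferences per call is an assumption about how inferences are tallied.

```python
import time
from typing import Callable, Tuple


def measure_first_run_and_throughput(
    run_once: Callable[[], object],
    batch_size: int = 1,
    warm_runs: int = 10,
) -> Tuple[float, float]:
    """Return (first_run_ms, inferences_per_second) for a callable.

    The first call is timed on its own, since for a compiled model it
    would include compilation and cache warm-up. The following
    `warm_runs` calls are timed together to estimate steady-state
    throughput.
    """
    # First run: with a compiled model this includes compilation time.
    start = time.perf_counter()
    run_once()
    first_run_ms = (time.perf_counter() - start) * 1000.0

    # Warm runs: caches are assumed to be populated by the first call.
    start = time.perf_counter()
    for _ in range(warm_runs):
        run_once()
    elapsed = time.perf_counter() - start

    # Assumption: each call processes `batch_size` inferences.
    inferences = warm_runs * batch_size
    throughput = inferences / elapsed if elapsed > 0 else float("inf")
    return first_run_ms, throughput


def fake_model() -> int:
    """Hypothetical stand-in for `model(input_data)`."""
    return sum(i * i for i in range(10_000))


first_ms, ips = measure_first_run_and_throughput(fake_model, batch_size=1, warm_runs=20)
print(f"first run: {first_ms:.2f} ms, warm throughput: {ips:.1f} inferences/s")
```

With a real model, `run_once` would be something like `lambda: model(input_data)`, called once on the original model and once on the compiled one to obtain the "Original" and "Compiled" figures.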
Whether you are new to Tenstorrent hardware or an experienced developer, there are many ways to contribute.
Start with our high-level Contribution guide. You can find more information here:
- Discussions
- Operations Report
- Lowering TT-NN Operation to PyTorch
- Native Device Integration Extension
- Build with Metal from Source
- Known Issues
- Problem Solving
We encourage contributions and offer 🤑 Bounties for some issues.
To get started with development, you'll need a Wormhole or Blackhole Tenstorrent accelerator card, which:
- can be ordered on the Tenstorrent website
- can be requested on Koyeb
Install the development dependencies:
pip install -r requirements-dev.txt
pip install -e .
You can build the wheel file with
python -m build
- `torch_ttnn/`: Main package directory containing the core implementation
- `tests/`: Test files for the project, including model suites. We use `pytest` as our testing framework.
- `tools/`: Development and utility scripts
- `docs/`: Project documentation and reports
- `demo/`: Example code and usage demonstrations
If you have questions or need help getting started, please:
- Review the existing documentation
- Ask PyTorch TT-NN DeepWiki or TT-Metal DeepWiki
- Ask on Discord
- Open an issue on GitHub