PyTorch 2.0 TTNN Compiler

The PyTorch 2.0 TT-NN Compiler enables seamless execution of PyTorch models on Tenstorrent AI accelerators. By leveraging the TT-NN backend, you can achieve significant performance improvements while maintaining PyTorch's familiar API.

🚀 Quick Start

Installation

Install from the repo:

pip install git+https://github.com/tenstorrent/pytorch2.0_ttnn

or as an editable package from source:

git clone https://github.com/tenstorrent/pytorch2.0_ttnn.git
cd pytorch2.0_ttnn
pip install -e .

✨ Basic Usage

Option 1: Eager Mode: get your model running by switching to a TT device

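import torch
import ttnn  # provides open_device, used below
import torch_ttnn

model = YourModel()  # placeholder for your own torch.nn.Module

# Open a Tenstorrent device and move the model onto it
device = ttnn.open_device(device_id=0)
model.to(torch_ttnn.ttnn_device_as_torch_device(device))

output = model(input_data)  # input_data is a placeholder for your input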

Option 2: Compilation Mode (Recommended): get more perf with a JIT compiler

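import torch
import ttnn  # provides open_mesh_device and MeshShape, used below
import torch_ttnn

model = YourModel()  # placeholder for your own torch.nn.Module

# Open a 1x2 device grid and split the batch across its two devices
device = ttnn.open_mesh_device(ttnn.MeshShape(1, 2))
option = torch_ttnn.TorchTtnnOption(device=device, data_parallel=2)

# The first call triggers compilation; later calls reuse the compiled graph
model = torch.compile(model, backend=torch_ttnn.backend, options=option)
output = model(input_data)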
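
Because the first call to a compiled model also pays for compilation, it helps to time that call separately from later ones. The following is a minimal, device-independent sketch of such a measurement. `time_first_and_warm` and the stand-in workload are illustrative names, not part of `torch_ttnn`, and on real hardware you may also need to wait for the device to finish before stopping the clock.

```python
import time


def time_first_and_warm(fn, arg, warm_runs=10):
    """Return (first_run_ms, warm_calls_per_second) for a callable.

    first_run_ms covers only the first call, which for a compiled model
    would include compilation. The warm figure is measured afterwards.
    """
    start = time.perf_counter()
    fn(arg)
    first_run_ms = (time.perf_counter() - start) * 1000.0

    start = time.perf_counter()
    for _ in range(warm_runs):
        fn(arg)
    elapsed = time.perf_counter() - start
    warm_per_second = warm_runs / elapsed if elapsed > 0 else float("inf")
    return first_run_ms, warm_per_second


# Stand-in workload (hypothetical); swap in the compiled model and real input.
first_ms, per_second = time_first_and_warm(lambda n: sum(range(n)), 10_000)
print(f"first run: {first_ms:.2f} ms, warm: {per_second:.1f} calls/s")
```

To use it with the example above, pass the compiled `model` as `fn` and `input_data` as `arg`.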

📊 Model Support

We've extensively tested the compiler across a diverse range of model architectures. Here's a summary of our validation results:

Model	Status	Batch	Compiled First Run (ms)	Original Throughput (Inferences Per Second)	Compiled Throughput (Inferences Per Second)	Accuracy (%)	Torch Ops Before (Unique Ops)	Torch Ops Remain (Unique Ops)	To/From Device Ops
Autoencoder (linear)	✅	1	380.4	0.466888	526.3157894736842	100.0	22 (3)	0 (0)	0
BERT	✅	8	45896.43	0.0107214	39.95205753096284	99.69	1465 (22)	0 (0)	0
DPR	✅	1	18587.31	0.354789	72.30657989877079	99.38	720 (22)	0 (0)	1
HardNet	✅	1	168441.12	0.196658	19.98001998001998	98.45	245 (10)	0 (0)	124
MLPMixer	✅	1	18600.14	0.201063	79.1139240506329	99.99	253 (11)	0 (0)	0
Mnist	✅	1	5871.58	30.1296	408.1632653061224	99.42	14 (8)	0 (0)	1
MobileNetV2	✅	1	94348.17	1.12857	38.37298541826554	99.09	154 (9)	0 (0)	0
OpenPose V2	✅	1	19618.0	0.334242	35.67606136282554	91.49	155 (7)	0 (0)	6
Perceiver IO	✅	1	51594.21	0.0204124	19.227071716977502	99.95	1531 (20)	0 (0)	1
ResNet18	✅	1	50784.1	0.420679	73.20644216691069	99.27	70 (9)	0 (0)	1
ResNet50	✅	4	77262.36	0.752556	46.576618537494184	98.61	176 (9)	0 (0)	1
RoBERTa	✅	1	35202.52	0.0768288	22.10921954455008	28.56	719 (21)	0 (0)	3
U-Net	✅	1	74530.51	0.0159431	67.52194463200539	100.0	68 (6)	0 (0)	12
Unet-brain	✅	1	3374.58	0.0164509	55.2791597567717	N/A	68 (6)	0 (0)	12
Unet-carvana	✅	1	32995.1	0.0118069	30.59039461609055	99.69	67 (5)	0 (0)	12
albert/albert-base-v2	✅	1	26818.2	0.734258	42.15851602023609	98.82	791 (21)	0 (0)	3
albert/albert-base-v2-classification	✅	1	10319.82	0.740883	46.66355576294914	99.97	779 (21)	0 (0)	2
albert/albert-large-v2	✅	1	20157.95	0.392332	24.236548715462916	98.95	1547 (21)	0 (0)	3
albert/albert-xlarge-v2	✅	1	44869.53	0.105447	12.891581797086504	97.36	1547 (21)	0 (0)	3
densenet121	✅	1	143878.54	0.260364	13.027618551328816	99.74	432 (10)	0 (0)	597
densenet161	✅	1	193868.05	0.102238	9.669309611293752	99.49	572 (10)	0 (0)	1147
densenet169	✅	1	252974.82	0.258978	9.337940050424876	99.58	600 (10)	0 (0)	1241
densenet201	✅	1	87818.07	0.209147	7.605141075366947	99.39	712 (10)	0 (0)	1905
distilbert-base-uncased	✅	1	27401.45	0.730663	85.39709649871904	72.37	361 (16)	0 (0)	1
dla34.in1k	✅	1	75717.13	0.270115	37.579857196542655	99.48	135 (9)	0 (0)	23
ese_vovnet19b_dw.ra_in1k	✅	1	64235.77	0.541729	51.09862033725089	99.44	111 (12)	0 (0)	19
ghostnet_100.in1k	✅	1	191671.49	0.662032	18.195050946142647	99.6	515 (14)	0 (0)	64
mobilenet_v2	✅	1	82737.52	1.0306	32.44646333549643	99.09	154 (9)	0 (0)	0
mobilenet_v3_large	✅	1	88933.92	1.25044	32.66906239790919	99.15	188 (11)	0 (0)	0
mobilenet_v3_small	✅	1	118022.69	1.78333	34.81894150417828	99.09	158 (11)	0 (0)	0
mobilenetv1_100.ra4_e3600_r224_in1k	✅	1	66679.44	0.857074	59.98800239952009	96.04	85 (7)	0 (0)	0
regnet_x_16gf	✅	1	63865.16	0.0664586	15.651901706057284	99.56	235 (8)	0 (0)	0
regnet_x_1_6gf	✅	1	56965.52	0.41886	27.540622418066647	99.47	195 (8)	0 (0)	0
regnet_x_32gf	✅	1	107482.93	0.0281106	8.057368463459834	99.27	245 (8)	0 (0)	0
regnet_x_3_2gf	✅	1	91311.5	0.25224	22.629554197782305	99.5	265 (8)	0 (0)	0
regnet_x_400mf	✅	1	82798.78	0.968251	26.96144513345915	99.66	235 (8)	0 (0)	0
regnet_x_800mf	✅	1	54691.23	0.671384	34.17634996582365	99.44	175 (8)	0 (0)	0
regnet_x_8gf	✅	1	56008.59	0.127511	18.086453246518357	98.99	245 (8)	0 (0)	0
regnet_y_16gf	✅	1	148402.5	0.065168	11.763321962122102	99.71	303 (10)	0 (0)	0
regnet_y_1_6gf	✅	1	132566.84	0.435114	15.586034912718205	99.65	447 (10)	0 (0)	0
regnet_y_32gf	✅	1	128920.33	0.0341463	7.935878104912309	99.72	335 (10)	0 (0)	0
regnet_y_3_2gf	✅	1	90827.03	0.27943	18.90359168241966	99.82	351 (10)	0 (0)	0
regnet_y_400mf	✅	1	95965.88	1.12252	25.64102564102564	99.64	271 (10)	0 (0)	0
regnet_y_800mf	✅	1	72225.13	0.636991	28.481913984619766	99.59	239 (10)	0 (0)	0
regnet_y_8gf	✅	1	84200.02	0.1165	17.46419839329375	99.82	287 (10)	0 (0)	0
resnet101	✅	1	16838.34	0.105869	18.057060310581434	99.28	346 (9)	0 (0)	1
resnet152	✅	1	100397.93	0.0872946	10.925379656943079	99.14	516 (9)	0 (0)	1
resnet18	✅	1	24661.18	0.421944	76.68711656441718	99.63	70 (9)	0 (0)	1
resnet34	✅	1	53904.96	0.208159	44.662795891022775	98.9	126 (9)	0 (0)	1
resnet50	✅	1	54402.09	0.194155	33.82949932341001	98.61	176 (9)	0 (0)	1
resnext101_32x8d	✅	1	53438.11	0.0634725	8.774238834781084	99.57	346 (9)	0 (0)	1
resnext101_64x4d	✅	1	97216.09	0.0642695	8.939746111210441	99.65	346 (9)	0 (0)	1
resnext50_32x4d	✅	1	32819.5	0.218632	29.7000297000297	99.44	176 (9)	0 (0)	1
textattack/albert-base-v2-imdb	✅	1	35501.15	0.735051	45.41326067211626	100.0	782 (22)	0 (0)	2
tf_efficientnet_lite0.in1k	✅	1	122760.64	0.459021	24.764735017335312	99.3	149 (9)	0 (0)	5
tf_efficientnet_lite1.in1k	✅	1	77887.44	0.564844	18.695083193120208	99.56	194 (9)	0 (0)	5
tf_efficientnet_lite2.in1k	✅	1	147601.07	0.43213	13.222266296443212	99.21	194 (9)	0 (0)	5
twmkn9/albert-base-v2-squad2	✅	1	22227.95	0.0400934	39.635354736424894	99.86	783 (23)	0 (0)	2
vgg11	✅	1	40769.16	0.0835439	98.42519685039369	99.65	33 (8)	0 (0)	5
vgg11_bn	✅	1	2696.0	0.0778108	81.9000819000819	98.93	41 (9)	0 (0)	5
vgg13	✅	1	59941.63	0.0518468	81.43322475570034	99.35	37 (8)	0 (0)	5
vgg13_bn	✅	1	6718.52	0.0608414	72.72727272727272	97.31	47 (9)	0 (0)	5
vgg16	✅	1	2328.64	0.0386412	70.57163020465774	99.44	43 (8)	0 (0)	5
vgg16_bn	✅	1	66344.2	0.0388076	62.18905472636817	98.37	56 (9)	0 (0)	5
vgg19	✅	1	2638.23	0.0312266	62.26650062266501	99.24	49 (8)	0 (0)	5
vgg19_bn	✅	1	4137.37	0.0351847	54.975261132490374	96.97	65 (9)	0 (0)	5
wide_resnet101_2	✅	1	92672.72	0.0437005	16.960651289009498	99.2	346 (9)	0 (0)	1
wide_resnet50_2	✅	1	83865.69	0.0805556	32.67973856209151	98.8	176 (9)	0 (0)	1
xception71.tf_in1k	✅	1	118354.65	0.059471	4.079468037367928	99.21	393 (9)	0 (0)	0
Autoencoder (conv)	🚧	1	3813.67	0.491205	335.5704697986577	100.0	9 (3)	1 (1)	1
Autoencoder (conv)-train	🚧	1	14060.64	0.336507	155.27950310559004	100.0	24 (7)	11 (4)	0
Autoencoder (linear)-train	🚧	1	14020.84	0.322749	76.62835249042145	100.0	104 (8)	14 (2)	0
Bloom	🚧	1	46306.88	0.0334381	1.453361625439642	98.86	1405 (27)	2 (2)	0
CLIP	🚧	1	59223.2	0.0273692	5.929439667951378	99.56	1397 (30)	7 (6)	2
CLIP-train	🚧	1	84239.57	0.0427298	0.672888643658361	100.0	3944 (44)	265 (16)	5
DETR	🚧	1	155683.9	0.00907839	0.1873753953620842	94.02	1663 (42)	9 (6)	3
DINOv2	🚧	1	32610.64	0.0517643	14.684287812041118	98.99	928 (25)	16 (1)	2
GLPN-KITTI	🚧	1	272913.25	0.00816913	0.016969772253777514	99.77	2959 (26)	22 (2)	6
GPT-2	🚧	1	28662.68	0.391319	32.31017770597738	99.98	745 (29)	2 (2)	2
HardNet-train	🚧	1	142217.49	0.0778058	0.12227375381646953	100.0	867 (21)	412 (9)	120
MLPMixer-train	🚧	1	39609.41	0.0566745	0.10479917334412067	100.0	616 (19)	100 (5)	0
Mnist-train	🚧	1	17872.95	0.289311	35.448422545196735	100.0	46 (15)	10 (6)	0
MobileNetSSD	🚧	1	246416.16	1.41758	0.4738078993252976	43.63	522 (31)	7 (4)	32
OpenPose V2-train	🚧	1	78649.12	0.0949146	0.12672761420058953	100.0	523 (14)	246 (7)	6
ResNet18-train	🚧	1	43716.08	0.172851	0.24379897311872523	100.0	241 (19)	121 (9)	0
ResNet50-train	🚧	1	75338.18	0.0588952	0.08341640216715814	100.0	616 (19)	318 (9)	0
SegFormer	🚧	1	27513.11	0.0242198	3.8147554741741057	99.86	676 (22)	16 (1)	4
SegFormer-train	🚧	1	176857.68	0.01241	0.02995324896900917	100.0	1794 (36)	156 (12)	4
U-Net-train	🚧	1	92426.66	0.0124864	0.02529950822815906	100.0	236 (15)	122 (8)	8
Unet-brain-train	🚧	1	61166.68	0.00956468	0.021203808034292494	100.0	236 (15)	122 (8)	8
Unet-carvana-train	🚧	1	151978.47	0.00584343	0.011400970655839697	100.0	232 (13)	121 (7)	8
YOLOS	🚧	1	41291.88	0.0496856	3.5066802258302063	98.46	952 (27)	17 (2)	2
YOLOv3	🚧	1	80197.6	0.00506433	17.7367860943597	98.63	250 (7)	2 (1)	4
albert/albert-xxlarge-v2	🚧	1	19281.55	0.0532404	6.837606837606837	98.54	791 (21)	24 (1)	3
dla34.in1k-train	🚧	1	51271.09	0.104225	0.15712696487269573	100.0	469 (18)	230 (8)	17
ese_vovnet19b_dw.ra_in1k-train	🚧	1	98798.79	0.207539	0.31701448439179186	100.0	383 (25)	176 (10)	16
facebook/deit-base-patch16-224	🚧	1	28152.33	0.0645301	8.766546857192951	98.34	685 (17)	1 (1)	1
facebook/deit-base-patch16-224-train	🚧	1	34051.86	0.0135963	0.8779862507353134	100.0	1854 (27)	127 (8)	2
ghostnet_100.in1k-train	🚧	1	164724.83	0.665053	0.5503092738118822	100.0	1469 (33)	562 (12)	64
ghostnetv2_100.in1k	🚧	1	81170.24	0.694666	8.968609865470851	99.65	683 (18)	24 (2)	68
ghostnetv2_100.in1k-train	🚧	1	72813.84	0.451241	0.2580358824698163	100.0	2001 (39)	852 (17)	68
googlenet	🚧	1	147537.26	0.449198	24.078979051288226	99.67	214 (15)	1 (1)	51
hrnet_w18.ms_aug_in1k	🚧	1	122150.89	0.203783	5.1652892561983474	99.65	1209 (11)	31 (1)	0
hrnet_w18.ms_aug_in1k-train	🚧	1	220129.59	0.068379	0.09654159059993841	100.0	3998 (21)	1973 (9)	0
inception_v4.tf_in1k	🚧	1	134730.62	0.0824491	6.294454585510165	99.09	495 (11)	14 (1)	84
inception_v4.tf_in1k-train	🚧	1	194641.97	0.0222046	0.03174487280623088	100.0	1851 (24)	932 (11)	80
mixer_b16_224.goog_in21k	🚧	1	17043.65	0.0919227	9.143275121148394	3.65	356 (11)	1 (1)	0
mixer_b16_224.goog_in21k-train	🚧	1	36026.15	0.0168269	0.8453299745555678	100.0	959 (18)	101 (6)	0
mobilenetv1_100.ra4_e3600_r224_in1k-train	🚧	1	60787.42	0.270605	0.34602555052665085	100.0	258 (16)	164 (7)	0
regnet_y_128gf	🚧	1	344402.67	0.00192155	0.01659929841405323	98.91	447 (10)	3 (1)	0
ssd300_vgg16	🚧	1	157675.17	0.277711	0.6674631727194452	N/A	332 (30)	8 (5)	37
ssdlite320_mobilenet_v3_large	🚧	1	187960.71	1.06538	0.41905528177277146	41.24	522 (31)	7 (4)	32
swin_b	🚧	1	131255.31	0.0782158	3.903810118675827	99.54	2492 (32)	110 (2)	479
swin_s	🚧	1	30085.54	0.0933425	3.5318217136398955	99.68	2492 (32)	110 (2)	479
swin_t	🚧	1	127891.92	0.175615	7.341065922771986	99.76	1238 (32)	50 (2)	227
swin_v2_b	🚧	1	155984.62	0.0539692	2.893518518518518	23.56	3140 (40)	158 (3)	473
swin_v2_s	🚧	1	29427.15	0.108186	3.4775351231047433	36.71	3140 (40)	158 (3)	473
swin_v2_t	🚧	1	97601.96	0.201151	6.237135907191417	54.34	1562 (40)	74 (3)	221
tf_efficientnet_lite0.in1k-train	🚧	1	136011.23	0.292251	0.13154917571286498	100.0	452 (17)	285 (8)	5
tf_efficientnet_lite1.in1k-train	🚧	1	142142.64	0.230875	0.10081011004431613	100.0	587 (17)	370 (8)	5
tf_efficientnet_lite2.in1k-train	🚧	1	124647.59	0.192633	0.07623084223646046	100.0	587 (17)	370 (8)	5
tf_efficientnet_lite3.in1k	🚧	1	108309.6	0.353552	3.7418147801683816	99.15	221 (9)	5 (1)	5
tf_efficientnet_lite3.in1k-train	🚧	1	114089.02	0.120481	0.05423225832343143	100.0	668 (17)	426 (9)	5
tf_efficientnet_lite4.in1k	🚧	1	155553.34	0.188086	2.314814814814815	99.21	275 (9)	6 (1)	5
tf_efficientnet_lite4.in1k-train	🚧	1	140577.17	0.0684103	0.03741863784932637	100.0	830 (17)	529 (9)	5
vit_b_16	🚧	1	33951.28	0.064918	6.692992436918547	99.52	552 (17)	1 (1)	1
vit_b_32	🚧	1	18068.67	0.185095	6.989097008666479	98.73	552 (17)	1 (1)	1
vit_h_14	🚧	1	695117.12	0.00130755	0.3776506353971941	98.14	1452 (17)	1 (1)	1
vit_l_16	🚧	1	56903.86	0.0196752	3.623844899438304	99.73	1092 (17)	1 (1)	1
vit_l_32	🚧	1	38114.73	0.0561787	4.78423117405033	99.06	1092 (17)	1 (1)	1
xception71.tf_in1k-train	🚧	1	150427.21	0.0179414	0.01836345950682726	100.0	1378 (18)	806 (7)	0
FLAN-T5	❌	N/A	N/A	0.231409	N/A	N/A	20020 (38)	N/A	N/A
Falcon-7B	❌	N/A	N/A	0.0159877	N/A	N/A	2600 (27)	N/A	N/A
GPTNeo	❌	N/A	N/A	0.0948277	N/A	N/A	2733 (35)	N/A	N/A
Llama	❌	N/A	N/A	0.00554626	N/A	N/A	3690 (35)	N/A	N/A
OPT	❌	N/A	N/A	0.0400944	N/A	N/A	4003 (32)	N/A	N/A
Stable Diffusion V2	❌	N/A	N/A	0.00134308	N/A	N/A	1870 (29)	N/A	N/A
ViLT	❌	N/A	N/A	0.0625589	N/A	N/A	766 (29)	N/A	N/A
Whisper	❌	N/A	N/A	0.00319721	N/A	N/A	4310 (21)	N/A	N/A
YOLOv5	❌	N/A	N/A	0.0541112	N/A	N/A	236 (13)	N/A	N/A
codegen	❌	N/A	N/A	0.141725	N/A	N/A	9183 (37)	N/A	N/A
speecht5-tts	❌	N/A	N/A	0.0190955	N/A	N/A	6942 (40)	N/A	N/A
t5-base	❌	N/A	N/A	0.161917	N/A	N/A	14681 (38)	N/A	N/A
t5-large	❌	N/A	N/A	0.0217367	N/A	N/A	22696 (38)	N/A	N/A
t5-small	❌	N/A	N/A	0.392642	N/A	N/A	6118 (38)	N/A	N/A

Explanation of Metrics

Model: Name of the model.
Status: Indicates whether the model is:

✅ End-to-end on device: All PyTorch operations have been converted to TT-NN operations.
🚧 Compiled: The converted model runs, but some operations still fall back to PyTorch. This may be due to an unsupported operation or configuration.
❌ Traced: The model does not run but its PyTorch operations are traced for future development. This may indicate a temporary incompatibility with a compiler pass.
Batch: Batch size used for inference.
Compiled First Run (ms): Time until the first compiled run finishes (ms), including compilation time and warming caches.
Original Throughput (Inferences Per Second): Execution throughput (in inferences per second) of the model before conversion.
Compiled Throughput (Inferences Per Second): Execution throughput (in inferences per second) of the model after conversion, once caches are warm.
Accuracy (%): Model accuracy on a predefined test dataset after conversion.
Torch Ops Before (Unique Ops): The total number of operations used by the model in the original Torch implementation. The number in parentheses represents the total unique ops.
Torch Ops Remain (Unique Ops): The total number of Torch operations that remain (that is, still fall back to PyTorch) after conversion to TT-NN. The number in parentheses represents the total unique ops.
To/From Device Ops: The number of to/from_device operations (data transfer to/from the device).
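
As a rough illustration of how these columns relate, the sketch below parses an ops cell such as `1465 (22)`, derives the share of Torch ops that were converted, and computes the speedup from the two throughput columns. The helper names are hypothetical. The rule that maps remaining ops to a status icon is inferred from the status descriptions above, not taken from the project's reporting code.

```python
import re


def parse_ops(cell):
    """Split a cell such as '1465 (22)' into (total, unique); None for 'N/A'."""
    match = re.fullmatch(r"(\d+) \((\d+)\)", cell.strip())
    return (int(match.group(1)), int(match.group(2))) if match else None


def status_from_remaining(remain_cell):
    """Infer the status icon from the 'Torch Ops Remain' cell (assumed rule)."""
    remaining = parse_ops(remain_cell)
    if remaining is None:
        return "❌"  # traced only, the model did not run
    return "✅" if remaining[0] == 0 else "🚧"


def converted_fraction(before_cell, remain_cell):
    """Fraction of the original Torch ops that no longer run in PyTorch."""
    before, remain = parse_ops(before_cell), parse_ops(remain_cell)
    return 1.0 - remain[0] / before[0]


def speedup(original_ips, compiled_ips):
    """Ratio of compiled to original throughput (inferences per second)."""
    return compiled_ips / original_ips


# Values copied from the CLIP and ResNet50 (batch 4) rows of the table.
print(status_from_remaining("7 (6)"))
print(f"{converted_fraction('1397 (30)', '7 (6)'):.4f}")
print(f"{speedup(0.752556, 46.576618537494184):.1f}x")
```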

Contributing

Whether you are new to Tenstorrent hardware or an experienced developer, there are many ways to contribute.

Getting Started

Start with our high-level Contribution guide. You can find more information here:

We encourage contributions and offer 🤑 Bounties for some issues.

Development Environment

To get started with development, you'll need a Wormhole or Blackhole Tenstorrent accelerator card, which:

can be ordered on the Tenstorrent website
can be requested on Koyeb

Install the development dependencies:

pip install -r requirements-dev.txt
pip install -e .

You can build the wheel file with:

python -m build

Project Structure

torch_ttnn/: Main package directory containing the core implementation
tests/: Test files for the project including model suites. We use pytest as our testing framework.
tools/: Development and utility scripts
docs/: Project documentation and reports
demo/: Example code and usage demonstrations

Questions and Support

If you have questions or need help getting started, please:

Review the existing documentation
Ask PyTorch TT-NN DeepWiki or TT-Metal DeepWiki
Ask on Discord
Open an issue on GitHub
