# Validated Models

1. Validated Quantization Examples
   - 1.1. TensorFlow Models with TensorFlow 2.10.0
   - 1.2. PyTorch Models with Torch 1.12.1+cpu in PTQ Mode
   - 1.3. PyTorch Models with Torch 1.12.1+cpu in QAT Mode
   - 1.4. PyTorch Models with Torch and Intel® Extension for PyTorch* 1.11.0+cpu
   - 1.5. ONNX Models with ONNX Runtime 1.12.1
   - 1.6. MXNet Models with MXNet 1.7.0
2. Validated Pruning Examples
3. Validated Knowledge Distillation Examples
4. Validated ONNX QDQ INT8 Models on Multiple Hardware through ONNX Runtime

## Validated Quantization Examples

Performance results were tested on 09/24/2022 with an Intel® Xeon® Platinum 8380 Scalable processor, using 1 socket, 4 cores per instance, 8 instances, and batch size 1.

Performance varies by use, configuration, and other factors. See the platform configuration for details. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks.
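
The ratio columns in every table below are derived from the raw INT8 and FP32 measurements using the formulas given in the table headers. As a quick worked check in Python, using the EfficientNet row of the TensorFlow table:

```python
# Recompute the two ratio columns from the raw numbers of the EfficientNet row:
# INT8 accuracy 76.74%, FP32 accuracy 76.76%; INT8 91.43 samples/sec, FP32 69.41 samples/sec.
int8_acc, fp32_acc = 76.74, 76.76
int8_tput, fp32_tput = 91.43, 69.41

accuracy_ratio = (int8_acc - fp32_acc) / fp32_acc   # (INT8-FP32)/FP32
performance_ratio = int8_tput / fp32_tput           # INT8/FP32

print(f"Accuracy ratio:    {accuracy_ratio:+.2%}")     # -0.03%
print(f"Performance ratio: {performance_ratio:.2f}x")  # 1.32x
```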

### TensorFlow Models with TensorFlow 2.10.0

| Model | Example | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] |
|---|---|---|---|---|---|---|---|
| EfficientNet | pb | 76.74% | 76.76% | -0.03% | 91.43 | 69.41 | 1.32x |
| Faster R-CNN Inception ResNet V2 | pb | 37.65% | 38.33% | -1.77% | 2.53 | 1.62 | 1.57x |
| Faster R-CNN Inception ResNet V2 | SavedModel | 37.77% | 38.33% | -1.46% | 2.54 | 1.61 | 1.58x |
| Faster R-CNN ResNet101 | pb | 30.34% | 30.39% | -0.16% | 27.63 | 13.12 | 2.11x |
| Faster R-CNN ResNet101 | SavedModel | 30.33% | 30.39% | -0.20% | 27.71 | 11.06 | 2.51x |
| Faster R-CNN ResNet50 | pb | 26.65% | 26.59% | 0.23% | 33.64 | 16.33 | 2.06x |
| Inception ResNet V2 | pb | 80.34% | 80.40% | -0.07% | 29.25 | 23.43 | 1.25x |
| Inception V1 | pb | 70.44% | 69.74% | 1.00% | 163.14 | 133.44 | 1.22x |
| Inception V2 | pb | 74.34% | 73.97% | 0.50% | 133.49 | 111.5 | 1.20x |
| Inception V3 | pb | 76.71% | 76.75% | -0.05% | 91.67 | 64.02 | 1.43x |
| Inception V4 | pb | 80.18% | 80.27% | -0.11% | 56.87 | 37.09 | 1.53x |
| Mask R-CNN Inception V2 | pb | 28.50% | 28.73% | -0.80% | 36.06 | 27.15 | 1.33x |
| Mask R-CNN Inception V2 | CKPT | 28.50% | 28.73% | -0.80% | 36.1 | 25.06 | 1.44x |
| MobileNet V1 | pb | 71.85% | 70.96% | 1.25% | 374.38 | 226.03 | 1.66x |
| MobileNet V2 | pb | 71.85% | 70.96% | 1.25% | 374.38 | 226.03 | 1.66x |
| ResNet101 | pb | 77.50% | 76.45% | 1.37% | 92.47 | 65.56 | 1.41x |
| ResNet50 Fashion | pb | 78.04% | 78.12% | -0.10% | 359.18 | 244.38 | 1.47x |
| ResNet50 V1.0 | pb | 74.11% | 74.27% | -0.22% | 172.66 | 87.28 | 1.98x |
| ResNet50 V1.5 | pb | 76.23% | 76.46% | -0.30% | 153.37 | 87.24 | 1.76x |
| SSD MobileNet V1 | pb | 23.12% | 23.13% | -0.04% | 151.92 | 112.24 | 1.35x |
| SSD MobileNet V1 | CKPT | 23.11% | 23.13% | -0.09% | 153.18 | 67.79 | 2.26x |
| SSD ResNet34 | pb | 21.71% | 22.09% | -1.72% | 30.99 | 8.65 | 3.58x |
| SSD ResNet50 V1 | pb | 37.76% | 38.00% | -0.63% | 23.04 | 14.75 | 1.56x |
| SSD ResNet50 V1 | CKPT | 37.82% | 38.00% | -0.47% | 23 | 11.94 | 1.93x |
| VGG16 | pb | 72.64% | 70.89% | 2.47% | 178.99 | 83.67 | 2.14x |
| VGG19 | pb | 72.69% | 71.01% | 2.37% | 156.11 | 71.5 | 2.18x |
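
The INT8 results above come from the validated Intel® Neural Compressor TensorFlow examples; the exact recipe (configuration, dataset, evaluation) is example-specific. As a rough sketch only, assuming the neural_compressor Python API with its config-based entry point, a post-training static quantization run over a frozen pb graph can look like the following (the pb path is a placeholder and the dummy dataset stands in for real calibration data):

```python
from neural_compressor.config import PostTrainingQuantConfig
from neural_compressor.data import DataLoader, Datasets
from neural_compressor.quantization import fit

# Dummy calibration data stands in for the real calibration split used by the
# validated examples; the shape assumes an NHWC 224x224 RGB image classifier.
dataset = Datasets("tensorflow")["dummy"](shape=(1, 224, 224, 3))
calib_dataloader = DataLoader(framework="tensorflow", dataset=dataset)

q_model = fit(
    model="./resnet50_v1_5_fp32_frozen.pb",   # hypothetical FP32 frozen graph
    conf=PostTrainingQuantConfig(),           # default post-training static INT8
    calib_dataloader=calib_dataloader,
)
q_model.save("./resnet50_v1_5_int8")
```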

### PyTorch Models with Torch 1.12.1+cpu in PTQ Mode

| Model | Example | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] |
|---|---|---|---|---|---|---|---|
| ALBERT base MRPC | EAGER | 88.85% | 88.50% | 0.40% | 26 | 21.22 | 1.23x |
| Barthez MRPC | EAGER | 83.92% | 83.81% | 0.14% | 128.66 | 70.86 | 1.82x |
| BERT base MRPC | FX | 89.90% | 90.69% | -0.88% | 203.38 | 101.29 | 2.01x |
| BERT base RTE | FX | 69.31% | 69.68% | -0.53% | 216.22 | 102.72 | 2.10x |
| BERT base SST2 | FX | 91.06% | 91.86% | -0.88% | 218.2 | 101.86 | 2.14x |
| BERT base STSB | FX | 64.12% | 62.57% | 2.48% | 73.65 | 29.61 | 2.49x |
| BERT large COLA | FX | 92.79 | 93.16 | -0.39% | 36.54 | 9.89 | 3.70x |
| BERT large MRPC | FX | 89.50% | 90.38% | -0.97% | 74.11 | 29.69 | 2.50x |
| BERT large QNLI | FX | 90.90% | 91.82% | -1.00% | 72.45 | 29.66 | 2.44x |
| BERT large RTE | FX | 73.65% | 74.01% | -0.49% | 41.53 | 29.67 | 1.40x |
| BlendCNN | EAGER | 68.40% | 68.40% | 0.00% | 3878.48 | 3717.52 | 1.04x |
| CamemBERT base MRPC | EAGER | 86.70% | 86.82% | -0.14% | 188.97 | 98.9 | 1.91x |
| Ctrl MRPC | EAGER | 81.87% | 81.22% | 0.80% | 18.68 | 7.25 | 2.58x |
| Deberta MRPC | EAGER | 90.88% | 90.91% | -0.04% | 124.43 | 68.74 | 1.81x |
| DistilBERT base MRPC | EAGER | 88.23% | 89.16% | -1.05% | 347.47 | 200.76 | 1.73x |
| DistilBERT base MRPC | FX | 88.54% | 89.16% | -0.69% | 382.74 | 198.25 | 1.93x |
| FlauBERT MRPC | EAGER | 79.87% | 80.19% | -0.40% | 561.35 | 370.2 | 1.52x |
| HuBERT | FX | 97.69% | 97.84% | -0.15% | 9.82 | 7.2 | 1.36x |
| Inception V3 | EAGER | 69.43% | 69.52% | -0.13% | 409.34 | 181.95 | 2.25x |
| Longformer MRPC | EAGER | 91.01% | 91.46% | -0.49% | 18.73 | 14.66 | 1.28x |
| mBart WNLI | EAGER | 56.34% | 56.34% | 0.00% | 54.35 | 25.14 | 2.16x |
| MobileNet V2 | EAGER | 70.54% | 71.84% | -1.81% | 639.87 | 490.05 | 1.31x |
| lvwerra/pegasus-samsum | EAGER | 42.1 | 42.67 | -1.35% | 3.41 | 1.07 | 3.19x |
| PeleeNet | EAGER | 71.64% | 72.10% | -0.64% | 419.42 | 316.98 | 1.32x |
| ResNet18 | EAGER | 69.57% | 69.76% | -0.27% | 686.03 | 332.13 | 2.07x |
| ResNet18 | FX | 69.54% | 69.76% | -0.31% | 611.36 | 333.27 | 1.83x |
| ResNet50 | EAGER | 75.98% | 76.15% | -0.21% | 327.14 | 162.46 | 2.01x |
| ResNeXt101_32x8d | EAGER | 79.08% | 79.31% | -0.29% | 175.93 | 61.09 | 2.88x |
| Roberta Base MRPC | EAGER | 88.25% | 88.18% | 0.08% | 197.96 | 99.35 | 1.99x |
| Se_ResNeXt50_32x4d | EAGER | 78.98% | 79.08% | -0.13% | 308.19 | 144.6 | 2.13x |
| SqueezeBERT MRPC | EAGER | 86.87% | 87.65% | -0.89% | 186.26 | 155.67 | 1.20x |
| SSD ResNet34 | FX | 19.52 | 19.63 | -0.59% | 19.09 | 6.88 | 2.78x |
| Transfo-xl MRPC | EAGER | 81.97% | 81.20% | 0.94% | 9.65 | 7.06 | 1.37x |
| Wave2Vec2 | FX | 95.71% | 96.60% | -0.92% | 23.69 | 19.58 | 1.21x |
| Xlm Roberta base MRPC | EAGER | 88.03% | 88.62% | -0.67% | 114.31 | 99.34 | 1.15x |
| YOLO V3 | EAGER | 24.60% | 24.54% | 0.21% | 71.81 | 31.38 | 2.29x |

### PyTorch Models with Torch 1.12.1+cpu in QAT Mode

| Model | Example | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] |
|---|---|---|---|---|---|---|---|
| ResNet18 | EAGER | 69.84% | 69.76% | 0.11% | 690.73 | 330.85 | 2.09x |
| ResNet18 | FX | 69.74% | 69.76% | -0.03% | 614.83 | 334.35 | 1.84x |
| BERT base MRPC QAT | FX | 89.70% | 89.46% | 0.27% | 127.45 | 82.68 | 1.54x |
| ResNet50 | EAGER | 76.05% | 76.15% | -0.13% | 410.44 | 168.81 | 2.43x |

### PyTorch Models with Torch and Intel® Extension for PyTorch* 1.11.0+cpu

| Model | Example | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] |
|---|---|---|---|---|---|---|---|
| bert-large-uncased-whole-word-masking-finetuned-squad | IPEX | 92.9 | 93.16 | -0.28% | 31.35 | 9.97 | 3.14x |
| ResNeXt101_32x16d_wsl | IPEX | 69.48% | 69.76% | -0.40% | 1189.15 | 680 | 1.75x |
| ResNet50 | IPEX | 76.07% | 76.15% | -0.10% | 677.69 | 381.59 | 1.78x |
| SSD ResNet34 | IPEX | 19.95% | 20.00% | -0.25% | 24.07 | 6.71 | 3.59x |
| DistilBERT base MRPC | IPEX | 86 | 86.84 | -0.96% | 98.02 | 62.4 | 1.57x |

### ONNX Models with ONNX Runtime 1.12.1

| Model | Example | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] |
|---|---|---|---|---|---|---|---|
| AlexNet | QLinear | 54.73% | 54.79% | -0.11% | 960.18 | 469.17 | 2.05x |
| AlexNet | QDQ | 54.71% | 54.79% | -0.15% | 962.71 | 466.56 | 2.06x |
| ArcFace | QLinear | 99.80% | 99.80% | 0.00% | 235.14 | 130 | 1.81x |
| BERT base MRPC DYNAMIC | QLinear | 85.29% | 86.03% | -0.86% | 294.05 | 125.85 | 2.34x |
| BERT base MRPC STATIC | QLinear | 85.29% | 86.03% | -0.86% | 604.07 | 256.93 | 2.35x |
| BERT SQuAD | QLinear | 80.44 | 80.67 | -0.29% | 93.21 | 51.45 | 1.81x |
| BERT SQuAD | QDQ | 80.44 | 80.67 | -0.29% | 93.27 | 51.67 | 1.80x |
| CaffeNet | QLinear | 56.21% | 56.30% | -0.16% | 1501.21 | 536.1 | 2.80x |
| CaffeNet | QDQ | 56.25% | 56.30% | -0.09% | 1493.36 | 533.09 | 2.80x |
| DistilBERT base MRPC | QLinear | 84.80% | 84.56% | 0.28% | 1372.84 | 485.95 | 2.83x |
| DistilBERT base MRPC | QDQ | 84.56% | 84.56% | 0.00% | 541.43 | 480.25 | 1.13x |
| EfficientNet | QLinear | 77.57% | 77.70% | -0.17% | 1250.63 | 753.09 | 1.66x |
| EfficientNet | QDQ | 77.61% | 77.70% | -0.12% | 1130.67 | 748.12 | 1.51x |
| Emotion Ferplus | QLinear | 7.86% | 8.00% | -1.75% | 336.52 | 163.72 | 2.06x |
| Faster R-CNN | QLinear | 34.05% | 34.37% | -0.93% | 16.36 | 6.18 | 2.65x |
| Faster R-CNN | QDQ | 33.97% | 34.37% | -1.16% | 10.26 | 6.18 | 1.66x |
| FCN | QLinear | 64.54% | 64.98% | -0.67% | 40.05 | 12.08 | 3.31x |
| FCN | QDQ | 64.65% | 64.98% | -0.50% | 26.73 | 12.04 | 2.22x |
| GoogleNet | QLinear | 67.71% | 67.79% | -0.12% | 740.16 | 587.54 | 1.26x |
| GoogleNet | QDQ | 67.73% | 67.79% | -0.09% | 770.51 | 567.88 | 1.36x |
| Inception V1 | QLinear | 67.21% | 67.24% | -0.04% | 824.15 | 601.92 | 1.37x |
| Inception V1 | QDQ | 67.21% | 67.24% | -0.04% | 819.85 | 597.46 | 1.37x |
| Mask R-CNN | QLinear | 33.41% | 33.72% | -0.92% | 14.18 | 5.78 | 2.45x |
| Mask R-CNN | QDQ | 33.30% | 33.72% | -1.25% | 9.42 | 5.7 | 1.65x |
| Mobile bert MRPC | QLinear | 86.27% | 86.27% | 0.00% | 613.72 | 506.41 | 1.21x |
| MobileBERT SQuAD MLPerf | QLinear | 89.82 | 90.03 | -0.23% | 88.41 | 76.07 | 1.16x |
| MobileNet V2 | QLinear | 65.59% | 66.89% | -1.94% | 2454.53 | 1543.79 | 1.59x |
| MobileNet V2 | QDQ | 65.82% | 66.89% | -1.60% | 2164.97 | 1564.21 | 1.38x |
| MobileNet V3 MLPerf | QLinear | 75.58% | 75.74% | -0.21% | 2147.42 | 1046.69 | 2.05x |
| MobileNet V3 MLPerf | QDQ | 75.57% | 75.74% | -0.22% | 1877.1 | 1054.88 | 1.78x |
| MobileNetV2 (ONNX Model Zoo) | QLinear | 68.38% | 69.48% | -1.58% | 2751.7 | 1797.64 | 1.53x |
| MobileNetV2 (ONNX Model Zoo) | QDQ | 68.51% | 69.48% | -1.40% | 2656.23 | 1835.74 | 1.45x |
| ResNet50 v1.5 MLPerf | QLinear | 0.7615 | 0.7646 | -0.41% | 764.901 | 434.141 | 1.76x |
| ResNet50 v1.5 MLPerf | QDQ | 0.7614 | 0.7646 | -0.42% | 575.952 | 433.75 | 1.33x |
| ResNet50 V1.5 | QLinear | 0.7226 | 0.7229 | -0.04% | 761.12 | 432.615 | 1.76x |
| ResNet50 V1.5 | QDQ | 0.722 | 0.7229 | -0.12% | 575.032 | 432.894 | 1.33x |
| ResNet50 V1.5 (ONNX Model Zoo) | QLinear | 74.81% | 74.99% | -0.24% | 885.64 | 454.02 | 1.95x |
| ResNet50 V1.5 (ONNX Model Zoo) | QDQ | 74.76% | 74.99% | -0.31% | 603.72 | 455.86 | 1.32x |
| Roberta Base MRPC | QLinear | 89.71% | 89.95% | -0.27% | 644.636 | 254.791 | 2.53x |
| ShuffleNet V2 | QLinear | 66.13% | 66.36% | -0.35% | 2298.55 | 1480.87 | 1.55x |
| ShuffleNet V2 | QDQ | 66.12% | 66.36% | -0.36% | 1951.11 | 1490.78 | 1.31x |
| SqueezeNet | QLinear | 56.54% | 56.87% | -0.58% | 2588.97 | 1605.92 | 1.61x |
| SqueezeNet | QDQ | 56.54% | 56.87% | -0.58% | 2566.18 | 1936.79 | 1.32x |
| SSD MobileNet V1 | QLinear | 22.45% | 23.10% | -2.81% | 725.83 | 570.24 | 1.27x |
| SSD MobileNet V1 | QDQ | 22.45% | 23.10% | -2.81% | 666.01 | 539.77 | 1.23x |
| SSD MobileNet V1 (ONNX Model Zoo) | QLinear | 22.86% | 23.03% | -0.74% | 641.56 | 519.93 | 1.23x |
| SSD MobileNet V1 (ONNX Model Zoo) | QDQ | 22.86% | 23.03% | -0.74% | 633.61 | 492.5 | 1.29x |
| SSD MobileNet V2 | QLinear | 24.04% | 24.68% | -2.59% | 542.68 | 401.56 | 1.35x |
| SSD | QLinear | 18.84% | 18.98% | -0.74% | 31.33 | 8.87 | 3.53x |
| SSD | QDQ | 18.63% | 18.98% | -1.84% | 23.98 | 8.95 | 2.68x |
| Tiny YOLOv3 | QLinear | 12.08% | 12.43% | -2.82% | 648.62 | 518.97 | 1.25x |
| VGG16 | QLinear | 66.67% | 66.69% | -0.03% | 221.93 | 99.51 | 2.23x |
| VGG16 (ONNX Model Zoo) | QLinear | 72.32% | 72.40% | -0.11% | 319.54 | 99.9 | 3.20x |
| VGG16 (ONNX Model Zoo) | QDQ | 72.31% | 72.40% | -0.12% | 319.41 | 99.94 | 3.20x |
| VGG16 | QDQ | 66.69% | 66.69% | 0.00% | 307.52 | 99.24 | 3.10x |
| YOLOv3 | QLinear | 26.82% | 28.74% | -6.68% | 124.24 | 54.03 | 2.30x |
| YOLOv4 | QLinear | 33.25% | 33.71% | -1.36% | 49.76 | 32.99 | 1.51x |
| ZFNet | QLinear | 55.84% | 55.96% | -0.21% | 459.38 | 261.93 | 1.75x |
| ZFNet | QDQ | 55.86% | 55.96% | -0.18% | 460.66 | 264.34 | 1.74x |
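
In the Example column, QLinear and QDQ refer to the two INT8 operator representations in ONNX: QLinear (QOperator) graphs use fused quantized operators such as QLinearConv and QLinearMatMul, while QDQ graphs keep the original FP32 operators wrapped in QuantizeLinear/DequantizeLinear pairs. The table entries come from the validated examples, but the format distinction itself can be illustrated with ONNX Runtime's own static quantization tool; in the sketch below the model paths are placeholders and the calibration reader is a stub that a real run would feed with actual data:

```python
from onnxruntime.quantization import CalibrationDataReader, QuantFormat, quantize_static

class StubReader(CalibrationDataReader):
    """Placeholder calibration reader; a real one yields {input_name: ndarray} dicts."""
    def __init__(self, samples):
        self._iter = iter(samples)

    def get_next(self):
        return next(self._iter, None)  # None signals the end of calibration data

reader = StubReader(samples=[])  # hypothetical calibration samples go here

# QOperator ("QLinear") output: fused quantized ops such as QLinearConv.
quantize_static("model_fp32.onnx", "model_qlinear.onnx", reader,
                quant_format=QuantFormat.QOperator)

# QDQ output: FP32 ops wrapped in QuantizeLinear/DequantizeLinear pairs.
quantize_static("model_fp32.onnx", "model_qdq.onnx", reader,
                quant_format=QuantFormat.QDQ)
```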

### MXNet Models with MXNet 1.7.0

| Model | INT8 Accuracy | FP32 Accuracy | Accuracy Ratio [(INT8-FP32)/FP32] | INT8 Throughput (samples/sec) | FP32 Throughput (samples/sec) | Performance Ratio [INT8/FP32] |
|---|---|---|---|---|---|---|
| Inception V3 | 77.80% | 77.65% | 0.20% | 86.52 | 47.98 | 1.80x |
| MobileNet V1 | 71.60% | 72.23% | -0.86% | 441.59 | 337.52 | 1.31x |
| MobileNet V3 MLPerf | 70.80% | 70.87% | -0.10% | 272.87 | 211.51 | 1.29x |
| ResNet v1 152 | 78.28% | 78.54% | -0.33% | 65.2 | 37.05 | 1.76x |
| ResNet18 V1.0 | 70.01% | 70.14% | -0.19% | 423.98 | 235.98 | 1.80x |
| ResNet50 V1.0 | 75.91% | 76.33% | -0.55% | 180.69 | 100.49 | 1.80x |
| SqueezeNet | 56.80% | 56.97% | -0.28% | 311.23 | 198.61 | 1.57x |
| SSD MobileNet V1 | 74.94% | 75.54% | -0.79% | 43.5 | 25.77 | 1.69x |
| SSD ResNet50 V1.0 | 80.21% | 80.23% | -0.03% | 31.64 | 15.13 | 2.09x |

## Validated Pruning Examples

| Model | Task | Dataset | Dense Accuracy | Sparse Accuracy | Relative Drop | Sparsity Ratio | Sparsity Pattern | Comments | Balanced or Unbalanced Ratio |
|---|---|---|---|---|---|---|---|---|---|
| ResNet18 | image classification | ImageNet | top-1 acc = 69.76 | top-1 acc = 69.47 | -0.42% | 30% | magnitude | | |
| ResNet50 | image classification | ImageNet | top-1 acc = 76.13 | top-1 acc = 76.11 | -0.03% | 30% | magnitude | | |
| ResNet50 | image classification | ImageNet | top-1 acc = 76.13 | top-1 acc = 76.01 | -0.16% | 30% | magnitude | Post Training Quantization | |
| ResNet50 | image classification | ImageNet | top-1 acc = 76.13 | top-1 acc = 75.90 | -0.30% | 30% | magnitude | Quantization Aware Training | |
| Bert-Large | question answering | SQuAD-v1.1 | f1 = 91.34 | f1 = 90.7 | -0.07% | 80% | structured 2x1 | group lasso | unbalanced |
| Bert-Base | text classification | MNLI | [m, mm] = [84.57, 84.79] | [m, mm] = [82.45, 83.27] | [-2.51%, -1.80%] | 70% | unstructured | Prune once for all | balanced |
| Bert-Base | text classification | MNLI | [m, mm] = [84.57, 84.79] | [m, mm] = [83.20, 84.11] | [-1.62%, -0.80%] | 50% | structured 1:2 | Prune once for all | balanced |
| Bert-Base | text classification | SST-2 | accuracy = 92.32 | accuracy = 91.51 | -0.88% | 70% | unstructured | Prune once for all | balanced |
| Bert-Base | text classification | SST-2 | accuracy = 92.32 | accuracy = 92.20 | -0.13% | 50% | structured 1:2 | Prune once for all | balanced |
| Bert-Base | text classification | SST-2 | accuracy = 92.32 | accuracy = 91.97 | -0.38% | 20% | unstructured | gradient sensitivity | balanced |
| Bert-Base | text classification | QQP | [accuracy, f1] = [91.10, 88.05] | [accuracy, f1] = [90.48, 87.06] | [-0.68%, -1.12%] | 70% | unstructured | Prune once for all | balanced |
| Bert-Base | text classification | QQP | [accuracy, f1] = [91.10, 88.05] | [accuracy, f1] = [90.92, 87.78] | [-0.20%, -0.31%] | 50% | structured 1:2 | Prune once for all | balanced |
| Bert-Base | text classification | QNLI | accuracy = 91.54 | accuracy = 90.39 | -1.26% | 70% | unstructured | Prune once for all | balanced |
| Bert-Base | text classification | QNLI | accuracy = 91.54 | accuracy = 90.87 | -0.73% | 50% | structured 1:2 | Prune once for all | balanced |
| Bert-Base | question answering | | [em, f1] = [79.34, 87.10] | [em, f1] = [77.27, 85.75] | [-2.61%, -1.54%] | 70% | unstructured | Prune once for all | balanced |
| Bert-Base | question answering | | [em, f1] = [79.34, 87.10] | [em, f1] = [78.03, 86.50] | [-1.65%, -0.69%] | 50% | structured 1:2 | Prune once for all | balanced |
| Bert-Mini | question answering | SQuAD-v1.1 | f1 = 76.87 | f1 = 76.2 | -0.80% | 80% | structured 4x1 | snip momentum | unbalanced |
| Bert-Mini | question answering | SQuAD-v1.1 | f1 = 76.87 | f1 = 77.62 | +0.98% | 50% | structured 2:4 | snip momentum | balanced |
| Distilbert-base-uncased | question answering | SQuAD-v1.1 | f1 = 86.90 | f1 = 86.15 | -0.86% | 80% | structured 4x1 | snip momentum | unbalanced |
| Distilbert-base-uncased | question answering | SQuAD-v1.1 | f1 = 86.90 | f1 = 87.50 | +0.69% | 50% | structured 2:4 | snip momentum | balanced |
| Bert-base-uncased | question answering | SQuAD-v1.1 | f1 = 88.59 | f1 = 87.78 | -0.92% | 80% | structured 4x1 | snip momentum | unbalanced |
| Bert-base-uncased | question answering | SQuAD-v1.1 | f1 = 88.59 | f1 = 89.40 | +0.91% | 50% | structured 2:4 | snip momentum | balanced |
| Bert-large | question answering | SQuAD-v1.1 | f1 = 91.23 | f1 = 90.91 | -0.35% | 80% | structured 4x1 | snip momentum | unbalanced |
| Bert-large | question answering | SQuAD-v1.1 | f1 = 91.23 | f1 = 91.67 | +0.48% | 50% | structured 2:4 | snip momentum | balanced |
| Bert-Mini | text classification | MRPC | f1 = 87.52 | f1 = 87.22 | -0.34% | 90% | structured 4x1 | snip momentum | unbalanced |
| Bert-Mini | text classification | MRPC | f1 = 87.52 | f1 = 87.33 | -0.22% | 90% | structured 4x1 | snip momentum | balanced |
| Bert-Mini | text classification | MRPC | f1 = 87.52 | f1 = 86.89 | -0.72% | 50% | structured 2:4 | snip momentum | balanced |
| Bert-Mini | text classification | MRPC | f1 = 87.52 | f1 = 86.8 | -0.83% | 60% | structured per channel | snip momentum | unbalanced |
| Distilbert-base-uncased | text classification | MRPC | f1 = 90.26 | f1 = 89.85 | -0.46% | 90% | structured 4x1 | snip momentum | unbalanced |
| Distilbert-base-uncased | text classification | MRPC | f1 = 90.26 | f1 = 90.88 | +0.69% | 50% | structured 2:4 | snip momentum | balanced |
| Bert-Mini | text classification | SST-2 | accuracy = 87.61 | accuracy = 86.92 | -0.79% | 90% | structured 4x1 | snip momentum | unbalanced |
| Bert-Mini | text classification | SST-2 | accuracy = 87.61 | accuracy = 87.73 | +0.14% | 50% | structured 2:4 | snip momentum | balanced |
| Bert-Mini | text classification | SST-2 | accuracy = 87.61 | accuracy = 86.92 | -0.79% | 50% | structured per channel | snip momentum | unbalanced |
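
The Sparsity Pattern column describes the granularity at which weights are zeroed: "unstructured" prunes individual weights, "structured 4x1" (or 2x1) prunes contiguous blocks of weights, "structured per channel" prunes whole channels, and N:M patterns such as 2:4 keep at most N non-zero values in every group of M consecutive weights. The snippet below is only an illustration of the 2:4 pattern with NumPy, not the pruning code behind these results:

```python
import numpy as np

def satisfies_2_4(weights: np.ndarray) -> bool:
    """True if every group of 4 consecutive weights (along the flattened last
    axis) contains at most 2 non-zero entries, i.e. the 2:4 structured pattern."""
    groups = weights.reshape(-1, 4)               # assumes size divisible by 4
    return bool(np.all(np.count_nonzero(groups, axis=1) <= 2))

# Toy pruning pass: in each group of 4, zero out the 2 smallest-magnitude weights.
w = np.random.randn(8, 16)
groups = w.reshape(-1, 4)                          # view into w
smallest = np.argsort(np.abs(groups), axis=1)[:, :2]
np.put_along_axis(groups, smallest, 0.0, axis=1)

print(satisfies_2_4(w))   # True; exactly 50% of the weights are now zero
```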

## Validated Knowledge Distillation Examples

| Example Name | Dataset | Student (Metrics) | Teacher (Metrics) | Student With Distillation (Metrics Improvement) | Student With Distributed Distillation (Metrics Improvement) |
|---|---|---|---|---|---|
| MobileNet example | CIFAR-10 | MobileNetV2-0.35 (0.7965 ACC) | WideResNet40-2 (0.9522 ACC) | 0.8178 ACC (0.0213 ACC) | 0.8235 ACC (0.027 ACC) |
| CNN example | CIFAR-100 | CNN-2 (0.5494 ACC) | CNN-10 (0.7153 ACC) | 0.5540 ACC (0.0046 ACC) | 0.5523 ACC (0.0029 ACC) |
| VGG example | CIFAR-100 | VGG-8-BN (0.7022 ACC) | VGG-13-BN (0.7415 ACC) | 0.7025 ACC (0.0003 ACC) | WIP |
| ResNet example | ImageNet | ResNet18 (0.6739 ACC) | ResNet50 (0.7399 ACC) | 0.6845 ACC (0.0106 ACC) | WIP |
| BlendCnn example | MRPC | BlendCnn (0.7034 ACC) | BERT-Base (0.8382 ACC) | 0.7034 ACC (0 ACC) | WIP |
| BiLSTM example | SST-2 | BiLSTM (0.8314 ACC) | RoBERTa-Base (0.9403 ACC) | 0.9048 ACC (0.0734 ACC) | WIP |
| DistilBERT example | SQuAD | DistilBERT (0.7323/0.8256 EM/F1) | BERT-Base (0.8084/0.8814 EM/F1) | 0.7442/0.8371 EM/F1 (0.0119/0.0115 EM/F1) | WIP |
| TinyBERT example | MNLI | TinyBERT (0.8018/0.8044 m/mm) | BERT-Base (0.8363/0.8411 m/mm) | 0.8025/0.8074 m/mm (0.0007/0.0030 m/mm) | WIP |
| BERT-3 example | QQP | BERT-3 (0.8626/0.8213 EM/F1) | BERT-Base (0.9091/0.8782 EM/F1) | 0.8684/0.8259 EM/F1 (0.0058/0.0046 EM/F1) | WIP |
| DistilRoBERTa example | COLA | DistilRoBERTa (0.6057 ACC) | RoBERTa-Large (0.6455 ACC) | 0.6187 ACC (0.0130 ACC) | WIP |
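
These examples follow the standard knowledge-distillation recipe: the student is trained on a weighted combination of its ordinary task loss on the hard labels and a divergence between its temperature-softened logits and the teacher's. The sketch below shows that loss in PyTorch with placeholder temperature and weighting values; it is a generic illustration, not the exact settings used by the examples above:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of hard-label cross-entropy and soft-target KL divergence."""
    hard = F.cross_entropy(student_logits, labels)
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)           # rescale so gradients match the hard loss
    return alpha * hard + (1.0 - alpha) * soft

# Toy usage with random logits for an 8-sample, 10-class batch.
student_logits = torch.randn(8, 10, requires_grad=True)
teacher_logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
distillation_loss(student_logits, teacher_logits, labels).backward()
```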

## Validated ONNX QDQ INT8 Models on Multiple Hardware through ONNX Runtime

| Model (ONNX QDQ) | AWS c6i.2xlarge (Intel) CPU Execution Provider | AWS c6a.2xlarge (AMD) CPU Execution Provider | AWS c6g.2xlarge (ARM) CPU Execution Provider | NVIDIA A100 CUDA Execution Provider |
|---|---|---|---|---|
| ResNet50 | 74.76% | 68.95% | 74.76% | 74.75% |
| BERT-base | 85.54% | 84.56% | 85.54% | 84.31% |
| ResNet50 V1.5 | 72.20% | 67.70% | 72.20% | 72.29% |
| MobileNet V2 | 65.82% | 58.56% | 65.83% | 65.63% |
| SSD MobileNet V1 | 22.45% | 16.53% | 22.45% | 22.35% |
| DistilBERT base MRPC | 84.56% | 83.82% | 84.56% | 84.56% |
| SqueezeNet | 56.54% | 53.52% | 56.54% | 56.55% |
| SSD | 18.63% | 18.54% | 18.63% | 18.61% |
| AlexNet | 54.71% | 47.06% | 54.71% | 54.79% |
| CaffeNet | 56.25% | 52.35% | 56.27% | 56.24% |
| GoogleNet | 67.73% | 63.56% | 67.72% | 67.76% |
| ZFNet | 55.86% | 45.09% | 55.86% | 55.89% |
| Inception V1 | 67.21% | 63.03% | 67.20% | 67.21% |
| SSD MobileNet V1 (ONNX Model Zoo) | 22.86% | 16.94% | 22.80% | 22.87% |
| Mobile bert MRPC | 85.54% | 84.56% | 85.54% | 85.54% |
| Roberta base MRPC | 89.46% | 90.44% | 89.71% | 89.71% |
| ResNet50 V1.5 MLPerf | 76.14% | 72.80% | 76.14% | 76.17% |
| VGG16 | 66.69% | 64.25% | 66.69% | 66.64% |
| VGG16 (ONNX Model Zoo) | 72.31% | 69.35% | 72.32% | 72.34% |
| MobileNet V3 MLPerf | 75.57% | 70.78% | 75.56% | 75.52% |
| EfficientNet | 77.61% | 76.52% | 77.56% | 77.60% |
| MobileNet V2 (ONNX Model Zoo) | 68.51% | 62.48% | 68.58% | 68.48% |
| ShuffleNet V2 | 66.12% | 58.41% | 66.11% | 66.11% |
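
Each column above reports the same QDQ INT8 model executed under a different ONNX Runtime execution provider. Provider selection happens at session creation; a minimal sketch follows, where the model path and input batch are placeholders and the CUDA provider requires a GPU-enabled onnxruntime build:

```python
import onnxruntime as ort

model_path = "resnet50_v1_5_qdq.onnx"   # hypothetical QDQ INT8 model file

# CPU execution provider, as used on the Intel, AMD, and ARM instances above.
cpu_session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])

# CUDA execution provider for the NVIDIA A100 column, with CPU fallback.
gpu_session = ort.InferenceSession(
    model_path,
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)

input_name = cpu_session.get_inputs()[0].name
# outputs = cpu_session.run(None, {input_name: preprocessed_batch})
```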