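DeepSparkInference, the inference model zoo, is a core project of the DeepSpark open-source community and was officially open-sourced in March 2024. The first release selected 48 inference model examples covering computer vision, natural language processing, speech recognition, and other domains; more AI domains will be added over time. The models in DeepSparkInference come with inference examples and guides for running on the Chinese-developed inference engines IGIE or ixRT, and some models also include evaluation results on the Chinese-developed general-purpose GPU Zhikai 100 (智铠100).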
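IGIE (Iluvatar GPU Inference Engine) is a high-performance, highly general, end-to-end AI inference engine built on the TVM framework. It supports multi-framework model import, quantization, graph optimization, multiple operator libraries, multiple backends, and automatic operator tuning, giving inference workloads an easy-to-deploy, high-throughput, low-latency solution.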
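ixRT (Iluvatar CoreX RunTime) is a high-performance inference engine developed in-house by Iluvatar CoreX (天数智芯). It focuses on getting the most out of Iluvatar CoreX general-purpose GPUs and on delivering high-performance inference for models across domains. ixRT supports dynamic-shape inference, plugins, and INT8/FP16 inference.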
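To make the INT8 precision mentioned above concrete, the following is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. It is a generic illustration only: it does not use or reflect the IGIE or ixRT APIs, and the function names `quantize_int8` and `dequantize_int8` are hypothetical.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127].

    Generic illustration; not the quantization scheme of IGIE or ixRT.
    """
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


weights = [0.02, -1.27, 0.635, 0.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)

# The rounding error per element is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

The trade-off this shows is the usual one: INT8 stores each value in one byte at the cost of a bounded rounding error, which is why the tables in this README list FP16 and INT8 support separately per model.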
DeepSparkInference is updated quarterly. Later releases will gradually add model categories and expand large-model inference.
Model | Engine | Supported | IXUCA SDK |
---|---|---|---|
Baichuan2-7B | vLLM | ✅ | 4.3.0 |
ChatGLM-3-6B | vLLM | ✅ | 4.3.0 |
ChatGLM-3-6B-32K | vLLM | ✅ | 4.3.0 |
CosyVoice2-0.5B | PyTorch | ✅ | 4.3.0 |
DeepSeek-R1-Distill-Llama-8B | vLLM | ✅ | 4.3.0 |
DeepSeek-R1-Distill-Llama-70B | vLLM | ✅ | 4.3.0 |
DeepSeek-R1-Distill-Qwen-1.5B | vLLM | ✅ | 4.3.0 |
DeepSeek-R1-Distill-Qwen-7B | vLLM | ✅ | 4.3.0 |
DeepSeek-R1-Distill-Qwen-14B | vLLM | ✅ | 4.3.0 |
DeepSeek-R1-Distill-Qwen-32B | vLLM | ✅ | 4.3.0 |
ERNIE-4.5-21B-A3B | FastDeploy | ✅ | 4.3.0 |
ERNIE-4.5-300B-A47B | FastDeploy | ✅ | 4.3.0 |
GLM-4V | vLLM | ✅ | 4.3.0 |
InternLM3 | LMDeploy | ✅ | 4.3.0 |
Llama2-7B | vLLM | ✅ | 4.3.0 |
Llama2-7B | TRT-LLM | ✅ | 4.3.0 |
Llama2-13B | TRT-LLM | ✅ | 4.3.0 |
Llama2-70B | TRT-LLM | ✅ | 4.3.0 |
Llama3-70B | vLLM | ✅ | 4.3.0 |
E5-V | vLLM | ✅ | 4.3.0 |
MiniCPM-o | vLLM | ✅ | 4.3.0 |
MiniCPM-V | vLLM | ✅ | 4.3.0 |
Qwen-7B | vLLM | ✅ | 4.3.0 |
Qwen-VL | vLLM | ✅ | 4.3.0 |
Qwen2-VL | vLLM | ✅ | 4.3.0 |
Qwen2.5-VL | vLLM | ✅ | 4.3.0 |
Qwen1.5-7B | vLLM | ✅ | 4.3.0 |
Qwen1.5-7B | TGI | ✅ | 4.3.0 |
Qwen1.5-14B | vLLM | ✅ | 4.3.0 |
Qwen1.5-32B Chat | vLLM | ✅ | 4.3.0 |
Qwen1.5-72B | vLLM | ✅ | 4.3.0 |
Qwen2-7B Instruct | vLLM | ✅ | 4.3.0 |
Qwen2-72B Instruct | vLLM | ✅ | 4.3.0 |
StableLM2-1.6B | vLLM | ✅ | 4.3.0 |
Whisper | vLLM | ✅ | 4.3.0 |
Model | Prec. | IGIE | ixRT | IXUCA SDK |
---|---|---|---|---|
AlexNet | FP16 | ✅ | ✅ | 4.3.0 |
AlexNet | INT8 | ✅ | ✅ | 4.3.0 |
CLIP | FP16 | ✅ | ✅ | 4.3.0 |
Conformer-B | FP16 | ✅ | 4.3.0 | |
ConvNeXt-Base | FP16 | ✅ | ✅ | 4.3.0 |
ConvNext-S | FP16 | ✅ | 4.3.0 | |
ConvNeXt-Small | FP16 | ✅ | ✅ | 4.3.0 |
ConvNeXt-Tiny | FP16 | ✅ | 4.3.0 | |
CSPDarkNet53 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
CSPResNet50 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
CSPResNeXt50 | FP16 | ✅ | ✅ | 4.3.0 |
DeiT-tiny | FP16 | ✅ | ✅ | 4.3.0 |
DenseNet121 | FP16 | ✅ | ✅ | 4.3.0 |
DenseNet161 | FP16 | ✅ | ✅ | 4.3.0 |
DenseNet169 | FP16 | ✅ | ✅ | 4.3.0 |
DenseNet201 | FP16 | ✅ | ✅ | 4.3.0 |
EfficientNet-B0 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
EfficientNet-B1 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
EfficientNet-B2 | FP16 | ✅ | ✅ | 4.3.0 |
EfficientNet-B3 | FP16 | ✅ | ✅ | 4.3.0 |
EfficientNet-B4 | FP16 | ✅ | ✅ | 4.3.0 |
EfficientNet-B5 | FP16 | ✅ | ✅ | 4.3.0 |
EfficientNet-B6 | FP16 | ✅ | 4.3.0 | |
EfficientNetV2 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
EfficientNetv2_rw_t | FP16 | ✅ | ✅ | 4.3.0 |
EfficientNetv2_s | FP16 | ✅ | ✅ | 4.3.0 |
GoogLeNet | FP16 | ✅ | ✅ | 4.3.0 |
GoogLeNet | INT8 | ✅ | ✅ | 4.3.0 |
HRNet-W18 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
InceptionV3 | FP16 | ✅ | ✅ | 4.3.0 |
InceptionV3 | INT8 | ✅ | ✅ | 4.3.0 |
Inception-ResNet-V2 | FP16 | ✅ | 4.3.0 | |
INT8 | ✅ | 4.3.0 | ||
Mixer_B | FP16 | ✅ | 4.3.0 | |
MNASNet0_5 | FP16 | ✅ | 4.3.0 | |
MNASNet0_75 | FP16 | ✅ | 4.3.0 | |
MNASNet1_0 | FP16 | ✅ | 4.3.0 | |
MNASNet1_3 | FP16 | ✅ | 4.3.0 | |
MobileNetV2 | FP16 | ✅ | ✅ | 4.3.0 |
MobileNetV2 | INT8 | ✅ | ✅ | 4.3.0 |
MobileNetV3_Large | FP16 | ✅ | 4.3.0 | |
MobileNetV3_Small | FP16 | ✅ | ✅ | 4.3.0 |
MViTv2_base | FP16 | ✅ | 4.2.0 | |
RegNet_x_16gf | FP16 | ✅ | 4.3.0 | |
RegNet_x_1_6gf | FP16 | ✅ | 4.3.0 | |
RegNet_x_3_2gf | FP16 | ✅ | 4.3.0 | |
RegNet_x_32gf | FP16 | ✅ | 4.3.0 | |
RegNet_x_400mf | FP16 | ✅ | 4.3.0 | |
RegNet_y_1_6gf | FP16 | ✅ | 4.3.0 | |
RegNet_y_16gf | FP16 | ✅ | 4.3.0 | |
RegNet_y_3_2gf | FP16 | ✅ | 4.3.0 | |
RegNet_y_32gf | FP16 | ✅ | 4.3.0 | |
RegNet_y_400mf | FP16 | ✅ | 4.3.0 | |
RepVGG | FP16 | ✅ | ✅ | 4.3.0 |
Res2Net50 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
ResNeSt50 | FP16 | ✅ | 4.3.0 | |
ResNet101 | FP16 | ✅ | ✅ | 4.3.0 |
ResNet101 | INT8 | ✅ | ✅ | 4.3.0 |
ResNet152 | FP16 | ✅ | 4.3.0 | |
INT8 | ✅ | 4.3.0 | ||
ResNet18 | FP16 | ✅ | ✅ | 4.3.0 |
ResNet18 | INT8 | ✅ | ✅ | 4.3.0 |
ResNet34 | FP16 | ✅ | 4.3.0 | |
INT8 | ✅ | 4.3.0 | ||
ResNet50 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
ResNetV1D50 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
ResNeXt50_32x4d | FP16 | ✅ | ✅ | 4.3.0 |
ResNeXt101_64x4d | FP16 | ✅ | ✅ | 4.3.0 |
ResNeXt101_32x8d | FP16 | ✅ | ✅ | 4.3.0 |
SEResNet50 | FP16 | ✅ | 4.3.0 | |
ShuffleNetV1 | FP16 | ✅ | 4.3.0 | |
ShuffleNetV2_x0_5 | FP16 | ✅ | ✅ | 4.3.0 |
ShuffleNetV2_x1_0 | FP16 | ✅ | ✅ | 4.3.0 |
ShuffleNetV2_x1_5 | FP16 | ✅ | ✅ | 4.3.0 |
ShuffleNetV2_x2_0 | FP16 | ✅ | ✅ | 4.3.0 |
SqueezeNet 1.0 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
SqueezeNet 1.1 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
SVT Base | FP16 | ✅ | 4.3.0 | |
Swin Transformer | FP16 | ✅ | 4.3.0 | |
Swin Transformer Large | FP16 | ✅ | 4.3.0 | |
Twins_PCPVT | FP16 | ✅ | 4.3.0 | |
VAN_B0 | FP16 | ✅ | 4.3.0 | |
VGG11 | FP16 | ✅ | 4.3.0 | |
VGG13 | FP16 | ✅ | 4.3.0 | |
VGG13_BN | FP16 | ✅ | 4.3.0 | |
VGG16 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
VGG19 | FP16 | ✅ | 4.3.0 | |
VGG19_BN | FP16 | ✅ | 4.3.0 | |
ViT | FP16 | ✅ | 4.3.0 | |
Wide ResNet50 | FP16 | ✅ | ✅ | 4.3.0 |
Wide ResNet50 | INT8 | ✅ | ✅ | 4.3.0 |
Wide ResNet101 | FP16 | ✅ | 4.3.0 |
Model | Prec. | IGIE | ixRT | IXUCA SDK |
---|---|---|---|---|
ATSS | FP16 | ✅ | ✅ | 4.3.0 |
CenterNet | FP16 | ✅ | ✅ | 4.3.0 |
DETR | FP16 | ✅ | 4.3.0 | |
FCOS | FP16 | ✅ | ✅ | 4.3.0 |
FoveaBox | FP16 | ✅ | ✅ | 4.3.0 |
FSAF | FP16 | ✅ | ✅ | 4.3.0 |
GFL | FP16 | ✅ | 4.3.0 | |
HRNet | FP16 | ✅ | ✅ | 4.3.0 |
PAA | FP16 | ✅ | ✅ | 4.3.0 |
RetinaFace | FP16 | ✅ | ✅ | 4.3.0 |
RetinaNet | FP16 | ✅ | ✅ | 4.3.0 |
RTMDet | FP16 | ✅ | 4.3.0 | |
SABL | FP16 | ✅ | 4.3.0 | |
SSD | FP16 | ✅ | 4.3.0 | |
YOLOF | FP16 | ✅ | 4.3.0 | |
YOLOv3 | FP16 | ✅ | ✅ | 4.3.0 |
YOLOv3 | INT8 | ✅ | ✅ | 4.3.0 |
YOLOv4 | FP16 | ✅ | ✅ | 4.3.0 |
YOLOv4 | INT8 | ✅ | ✅ | 4.3.0 |
YOLOv5 | FP16 | ✅ | ✅ | 4.3.0 |
YOLOv5 | INT8 | ✅ | ✅ | 4.3.0 |
YOLOv5s | FP16 | ✅ | 4.3.0 | |
INT8 | ✅ | 4.3.0 | ||
YOLOv6 | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
YOLOv7 | FP16 | ✅ | ✅ | 4.3.0 |
YOLOv7 | INT8 | ✅ | ✅ | 4.3.0 |
YOLOv8 | FP16 | ✅ | ✅ | 4.3.0 |
YOLOv8 | INT8 | ✅ | ✅ | 4.3.0 |
YOLOv9 | FP16 | ✅ | ✅ | 4.3.0 |
YOLOv10 | FP16 | ✅ | ✅ | 4.3.0 |
YOLOv11 | FP16 | ✅ | ✅ | 4.3.0 |
YOLOv12 | FP16 | ✅ | 4.3.0 | |
YOLOv13 | FP16 | ✅ | 4.3.0 | |
YOLOX | FP16 | ✅ | ✅ | 4.3.0 |
YOLOX | INT8 | ✅ | ✅ | 4.3.0 |
Model | Prec. | IGIE | ixRT | IXUCA SDK |
---|---|---|---|---|
FaceNet | FP16 | ✅ | 4.3.0 | |
INT8 | ✅ | 4.3.0 |
Model | Prec. | IGIE | IXUCA SDK |
---|---|---|---|
Kie_layoutXLM | FP16 | ✅ | 4.3.0 |
SVTR | FP16 | ✅ | 4.3.0 |
Model | Prec. | IGIE | ixRT | IXUCA SDK |
---|---|---|---|---|
HRNetPose | FP16 | ✅ | 4.3.0 | |
Lightweight OpenPose | FP16 | ✅ | 4.3.0 | |
RTMPose | FP16 | ✅ | ✅ | 4.3.0 |
Model | Prec. | IGIE | ixRT | IXUCA SDK |
---|---|---|---|---|
Mask R-CNN | FP16 | ✅ | 4.2.0 | |
SOLOv1 | FP16 | ✅ | 4.3.0 |
Model | Prec. | IGIE | ixRT | IXUCA SDK |
---|---|---|---|---|
UNet | FP16 | ✅ | 4.3.0 |
Model | Prec. | IGIE | ixRT | IXUCA SDK |
---|---|---|---|---|
FastReID | FP16 | ✅ | 4.3.0 | |
DeepSort | FP16 | ✅ | 4.3.0 | |
INT8 | ✅ | 4.3.0 | ||
RepNet-Vehicle-ReID | FP16 | ✅ | 4.3.0 |
Model | vLLM | IxFormer | IXUCA SDK |
---|---|---|---|
Aria | ✅ | 4.3.0 | |
Chameleon-7B | ✅ | 4.3.0 | |
CLIP | ✅ | 4.3.0 | |
Fuyu-8B | ✅ | 4.3.0 | |
H2OVL Mississippi | ✅ | 4.3.0 | |
Idefics3 | ✅ | 4.3.0 | |
InternVL2-4B | ✅ | 4.3.0 | |
LLaVA | ✅ | 4.3.0 | |
LLaVA-Next-Video-7B | ✅ | 4.3.0 | |
Llama-3.2 | ✅ | 4.3.0 | |
MiniCPM-V 2 | ✅ | 4.3.0 | |
Pixtral | ✅ | 4.3.0 |
Model | Prec. | IGIE | ixRT | IXUCA SDK |
---|---|---|---|---|
ALBERT | FP16 | ✅ | 4.3.0 | |
BERT Base NER | INT8 | ✅ | 4.3.0 | |
BERT Base SQuAD | FP16 | ✅ | ✅ | 4.3.0 |
INT8 | ✅ | 4.3.0 | ||
BERT Large SQuAD | FP16 | ✅ | ✅ | 4.3.0 |
BERT Large SQuAD | INT8 | ✅ | ✅ | 4.3.0 |
DeBERTa | FP16 | ✅ | 4.3.0 | |
RoBERTa | FP16 | ✅ | 4.3.0 | |
RoFormer | FP16 | ✅ | 4.3.0 | |
VideoBERT | FP16 | ✅ | 4.2.0 |
Model | Prec. | IGIE | ixRT | IXUCA SDK |
---|---|---|---|---|
Conformer | FP16 | ✅ | ✅ | 4.3.0 |
Transformer ASR | FP16 | ✅ | 4.2.0 |
Model | Prec. | IGIE | ixRT | IXUCA SDK |
---|---|---|---|---|
Wide & Deep | FP16 | ✅ | 4.3.0 |
Docker Installer | IXUCA SDK | Introduction |
---|---|---|
corex-docker-installer-4.3.0-*-py3.10-x86_64.run | 4.3.0 | For small-model inference |
corex-docker-installer-4.3.0-*-llm-py3.10-x86_64.run | 4.3.0 | For large-model (LLM) inference |
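The two installer names above differ only by the `-llm` segment, and the small-model pattern, read as a glob, also matches the LLM file name. The sketch below shows one way to tell them apart by checking the more specific pattern first. The helper `classify_installer` and the concrete file names in the usage lines are hypothetical; only the two patterns come from the table, and the `*` segment is left unspecified there.

```python
from fnmatch import fnmatchcase

# Patterns copied from the table; order matters because the small-model
# pattern is a superset of the LLM one when interpreted as a glob.
INSTALLER_PATTERNS = [
    ("llm", "corex-docker-installer-4.3.0-*-llm-py3.10-x86_64.run"),
    ("small", "corex-docker-installer-4.3.0-*-py3.10-x86_64.run"),
]


def classify_installer(filename: str) -> str:
    """Return 'llm' or 'small' for an installer file name (hypothetical helper)."""
    for workload, pattern in INSTALLER_PATTERNS:
        if fnmatchcase(filename, pattern):
            return workload
    raise ValueError(f"not a recognized IXUCA SDK 4.3.0 installer: {filename!r}")


# Hypothetical file names: the segment replacing '*' is made up for illustration.
print(classify_installer("corex-docker-installer-4.3.0-10.2-llm-py3.10-x86_64.run"))
print(classify_installer("corex-docker-installer-4.3.0-10.2-py3.10-x86_64.run"))
```

Checking the `-llm` pattern first avoids misclassifying the large-model installer as the small-model one.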
See the DeepSpark Code of Conduct on Gitee or on GitHub.
See the DeepSparkInference Contributing Guidelines.
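DeepSparkInference only provides download and preprocessing scripts for public datasets. These datasets do not belong to DeepSparkInference, and DeepSparkInference is not responsible for their quality or maintenance. Make sure you have a license to use these datasets; models trained on them may be used only for non-commercial research and education.
To dataset owners:
If you do not want your dataset published on DeepSparkInference, or you want to update a dataset of yours in DeepSparkInference, please submit an issue on Gitee or GitHub, and we will remove or update it as requested in your issue. Thank you sincerely for your support of and contributions to our community.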
This project is licensed under Apache-2.0.