A collection of pre-trained, state-of-the-art AI models.
ailia SDK is a self-contained, cross-platform, high-speed inference SDK for AI. It provides a consistent C++ API across Windows, Mac, Linux, iOS, Android, Jetson, and Raspberry Pi, and also offers bindings for Unity (C#), Python, Rust, Flutter (Dart), and JNI. The ailia SDK uses the GPU through Vulkan and Metal to accelerate inference.
ailia MODELS tutorial (Japanese version)
357 models as of October 2nd, 2024
- 2024.10.02 Add florence2
- 2024.09.15 Add bert-vits2, pytorch_wavenet
- 2024.09.12 Add gpt-sovits-v2
- 2024.09.10 Add segment-anything-2 (video mode)
- 2024.08.27 Add segment-anything-2 (image mode)
- 2024.08.20 Add bert_ner_japanese
- 2024.08.16 Add latent-consistency-model-txt2img, fbcnn
- 2024.08.15 Add volo, elegant, depth_anything, drbn_skf, codeformer, dtln
- 2024.08.10 Add TripoSR, japanese-reranker-cross-encoder
- 2024.08.09 Add mahalanobis-ad, t5_base_japanese_ner
- 2024.08.08 Add sdxl-turbo, sd-turbo
- 2024.08.05 Migrate to ailia Tokenizer 1.3 from Transformers
- More information in our Wiki
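
The tables that follow give each model's minimum SDK release in the form "1.2.16 and later". Comparing such strings directly is error-prone, because "1.2.16" sorts before "1.2.9" as text. The snippet below is a minimal sketch of a numeric comparison; `parse_version` and `is_supported` are hypothetical helper names used for illustration and are not part of the ailia SDK.

```python
import re


def parse_version(text):
    """Return the first dotted version number in `text` as a tuple of ints."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_supported(installed, requirement):
    """Check an installed version against a requirement such as '1.2.9 and later'."""
    # Tuples compare element by element, so (1, 2, 16) > (1, 2, 9).
    return parse_version(installed) >= parse_version(requirement)
```

For example, `is_supported("1.2.16", "1.2.9 and later")` evaluates to `True`, whereas a plain string comparison of the two version numbers would give the opposite answer.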
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
mars | MARS: Motion-Augmented RGB Stream for Action Recognition | Pytorch | 1.2.4 and later | EN JP | |
st-gcn | ST-GCN | Pytorch | 1.2.5 and later | EN JP | |
ax_action_recognition | Realtime-Action-Recognition | Pytorch | 1.2.7 and later | ||
va-cnn | View Adaptive Neural Networks (VA) for Skeleton-based Human Action Recognition | Pytorch | 1.2.7 and later | ||
driver-action-recognition-adas | driver-action-recognition-adas-0002 | OpenVINO | 1.2.5 and later | ||
action_clip | ActionCLIP | Pytorch | 1.2.7 and later |
Model | Reference | Exported From | Supported Ailia Version | Date | Blog | |
---|---|---|---|---|---|---|
mahalanobisad | MahalanobisAD-pytorch | Pytorch | 1.2.9 and later | May 2020 | ||
spade-pytorch | Sub-Image Anomaly Detection with Deep Pyramid Correspondences | Pytorch | 1.2.6 and later | May 2020 | ||
padim | PaDiM-Anomaly-Detection-Localization-master | Pytorch | 1.2.6 and later | Nov 2020 | EN JP | |
patchcore | PatchCore_anomaly_detection | Pytorch | 1.2.6 and later | Jun 2021 |
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
crnn_audio_classification | crnn-audio-classification | Pytorch | 1.2.5 and later | EN JP |
transformer-cnn-emotion-recognition | Combining Spatial and Temporal Feature Representions of Speech Emotion by Parallelizing CNNs and Transformer-Encoders | Pytorch | 1.2.5 and later | |
audioset_tagging_cnn | PANNs: Large-Scale Pretrained Audio Neural Networks for Audio Pattern Recognition | Pytorch | 1.2.9 and later | |
clap | CLAP | Pytorch | 1.2.6 and later | |
microsoft clap | CLAP | Pytorch | 1.2.11 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
hifigan | HiFi-GAN | Pytorch | 1.2.9 and later | |
deep music enhancer | On Filter Generalization for Music Bandwidth Extension Using Deep Neural Networks | Pytorch | 1.2.6 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
pytorch_wavenet | pytorch_wavenet | Pytorch | 1.2.14 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
unet_source_separation | source_separation | Pytorch | 1.2.6 and later | EN JP |
voicefilter | VoiceFilter | Pytorch | 1.2.7 and later | EN JP |
rnnoise | rnnoise | Keras | 1.2.15 and later | |
dtln | Dual-signal Transformation LSTM Network | TensorFlow | 1.3.0 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
narabas | narabas: Japanese phoneme forced alignment tool | Pytorch | 1.2.11 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
crepe | torchcrepe | Pytorch | 1.2.10 and later | JP |
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
auto_speech | AutoSpeech: Neural Architecture Search for Speaker Recognition | Pytorch | 1.2.5 and later | EN JP |
wespeaker | WeSpeaker | ONNXRuntime | 1.2.9 and later | |
pyannote-audio | Pyannote-audio | Pytorch | 1.2.15 and later | JP |
Model | Reference | Exported From | Supported Ailia Version | Date | Blog |
---|---|---|---|---|---|
deepspeech2 | deepspeech.pytorch | Pytorch | 1.2.2 and later | Oct 2017 | EN JP |
whisper | Whisper | Pytorch | 1.2.10 and later | Dec 2022 | JP |
reazon_speech | ReazonSpeech | Pytorch | 1.4.0 and later | Jan 2023 | |
distil-whisper | Hugging Face - Distil-Whisper | Pytorch | 1.2.16 and later | Nov 2023 | |
reazon_speech2 | ReazonSpeech2 | Pytorch | 1.4.0 and later | Feb 2024 | |
kotoba-whisper | kotoba-whisper | Pytorch | 1.2.16 and later | Apr 2024 |
Model | Reference | Exported From | Supported Ailia Version | Date | Blog |
---|---|---|---|---|---|
pytorch-dc-tts | Efficiently Trainable Text-to-Speech System Based on Deep Convolutional Networks with Guided Attention | Pytorch | 1.2.6 and later | Oct 2017 | EN JP |
tacotron2 | Tacotron2 | Pytorch | 1.2.15 and later | Feb 2018 | JP |
vall-e-x | VALL-E-X | Pytorch | 1.2.15 and later | Mar 2023 | JP |
Bert-VITS2 | Bert-VITS2 | Pytorch | 1.2.16 and later | Aug 2023 | |
gpt-sovits | GPT-SoVITS | Pytorch | 1.4.0 and later | Feb 2024 | JP |
gpt-sovits-v2 | GPT-SoVITS | Pytorch | 1.4.0 and later | Aug 2024 |
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
silero-vad | Silero VAD | Pytorch | 1.2.15 and later | JP |
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
rvc | Retrieval-based-Voice-Conversion-WebUI | Pytorch | 1.2.12 and later | JP |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
U-2-Net | U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection | Pytorch | 1.2.2 and later | EN JP | |
u2net-portrait-matting | U^2-Net - Portrait matting | Pytorch | 1.2.7 and later | ||
u2net-human-seg | U^2-Net - human segmentation | Pytorch | 1.2.4 and later | ||
deep-image-matting | Deep Image Matting | Keras | 1.2.3 and later | EN JP | |
indexnet | Indices Matter: Learning to Index for Deep Image Matting | Pytorch | 1.2.7 and later | ||
modnet | MODNet: Trimap-Free Portrait Matting in Real Time | Pytorch | 1.2.7 and later | ||
background_matting_v2 | Real-Time High-Resolution Background Matting | Pytorch | 1.2.9 and later | ||
cascade_psp | CascadePSP | Pytorch | 1.2.9 and later | ||
rembg | Rembg | Pytorch | 1.2.4 and later | ||
dis_seg | Highly Accurate Dichotomous Image Segmentation | Pytorch | 1.2.10 and later | ||
gfm | Bridging Composite and Real: Towards End-to-end Deep Image Matting | Pytorch | 1.2.10 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
crowdcount-cascaded-mtl | CNN-based Cascaded Multi-task Learning of High-level Prior and Density Estimation for Crowd Counting (Single Image Crowd Counting) | Pytorch | 1.2.1 and later | EN JP | |
c-3-framework | Crowd Counting Code Framework(C^3-Framework) | Pytorch | 1.2.5 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
clothing-detection | Clothing-Detection | Pytorch | 1.2.1 and later | EN JP | |
mmfashion | MMFashion | Pytorch | 1.2.5 and later | EN JP | |
mmfashion_tryon | MMFashion virtual try-on | Pytorch | 1.2.8 and later | ||
mmfashion_retrieval | MMFashion In-Shop Clothes Retrieval | Pytorch | 1.2.5 and later | ||
fashionai-key-points-detection | A Pytorch Implementation of Cascaded Pyramid Network for FashionAI Key Points Detection | Pytorch | 1.2.5 and later | ||
person-attributes-recognition-crossroad | person-attributes-recognition-crossroad-0230 | Pytorch | 1.2.10 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
monodepth2 | Monocular depth estimation from a single image | Pytorch | 1.2.2 and later | ||
midas | Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer | Pytorch | 1.2.4 and later | EN JP | |
fcrn-depthprediction | Deeper Depth Prediction with Fully Convolutional Residual Networks | TensorFlow | 1.2.6 and later | ||
fast-depth | ICRA 2019 "FastDepth: Fast Monocular Depth Estimation on Embedded Systems" | Pytorch | 1.2.5 and later | ||
lap-depth | LapDepth-release | Pytorch | 1.2.9 and later | ||
hitnet | ONNX-HITNET-Stereo-Depth-estimation | Pytorch | 1.2.9 and later | ||
crestereo | ONNX-CREStereo-Depth-Estimation | Pytorch | 1.2.13 and later | ||
mobilestereonet | MobileStereoNet | Pytorch | 1.2.13 and later | ||
zoe_depth | ZoeDepth | Pytorch | 1.3.0 and later | ||
DepthAnything | DepthAnything | Pytorch | 1.2.9 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
latent-diffusion-txt2img | Latent Diffusion - txt2img | Pytorch | 1.2.10 and later | ||
stable-diffusion-txt2img | Stable Diffusion | Pytorch | 1.2.14 and later | JP | |
control_net | ControlNet | Pytorch | 1.2.15 and later | ||
sd-turbo | Hugging Face - SD-Turbo | Pytorch | 1.2.16 and later | ||
sdxl-turbo | Hugging Face - SDXL-Turbo | Pytorch | 1.2.16 and later | ||
latent-consistency-models | latent-consistency-models | Pytorch | 1.2.16 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
riffusion | Riffusion | Pytorch | 1.2.16 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
latent-diffusion-inpainting | Latent Diffusion - inpainting | Pytorch | 1.2.10 and later | ||
latent-diffusion-superresolution | Latent Diffusion - Super-resolution | Pytorch | 1.2.10 and later | ||
DA-CLIP | DA-CLIP | Pytorch | 1.2.16 and later | ||
marigold | Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation | Pytorch | 1.2.16 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
yolov1-face | YOLO-Face-detection | Darknet | 1.1.0 and later | ||
yolov3-face | Face detection using keras-yolov3 | Keras | 1.2.1 and later | ||
blazeface | BlazeFace-PyTorch | Pytorch | 1.2.1 and later | EN JP | |
face-mask-detection | Face detection using keras-yolov3 | Keras | 1.2.1 and later | EN JP | |
dbface | DBFace : real-time, single-stage detector for face detection, with faster speed and higher accuracy | Pytorch | 1.2.2 and later | ||
retinaface | RetinaFace: Single-stage Dense Face Localisation in the Wild. | Pytorch | 1.2.5 and later | JP | |
anime-face-detector | Anime Face Detector | Pytorch | 1.2.6 and later | ||
face-detection-adas | face-detection-adas-0001 | OpenVINO | 1.2.5 and later | ||
mtcnn | mtcnn | Keras | 1.2.10 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
vggface2 | VGGFace2 Dataset for Face Recognition | Caffe | 1.1.0 and later | ||
arcface | pytorch implement of arcface | Pytorch | 1.2.1 and later | EN JP | |
insightface | InsightFace: 2D and 3D Face Analysis Project | Pytorch | 1.2.5 and later | ||
cosface | Pytorch implementation of CosFace | Pytorch | 1.2.10 and later | ||
facenet_pytorch | Face Recognition Using Pytorch | Pytorch | 1.2.6 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
face_classification | Real-time face detection and emotion/gender classification | Keras | 1.1.0 and later | ||
age-gender-recognition-retail | age-gender-recognition-retail-0013 | OpenVINO | 1.2.5 and later | EN JP | |
mivolo | MiVOLO: Multi-input Transformer for Age and Gender Estimation | Pytorch | 1.2.13 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
ferplus | FER+ | CNTK | 1.2.2 and later | ||
hsemotion | HSEmotion (High-Speed face Emotion recognition) library | Pytorch | 1.2.5 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
gazeml | A deep learning framework based on Tensorflow for the training of high performance gaze estimation | TensorFlow | 1.2.0 and later | ||
mediapipe_iris | irislandmarks.pytorch | Pytorch | 1.2.2 and later | EN JP | |
ax_gaze_estimation | ax Gaze Estimation | Pytorch | 1.2.2 and later | EN JP |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
hopenet | deep-head-pose | Pytorch | 1.2.2 and later | EN JP | |
6d_repnet | 6D Rotation Representation for Unconstrained Head Pose Estimation (Pytorch) | Pytorch | 1.2.6 and later | ||
L2CS_Net | L2CS_Net | Pytorch | 1.2.9 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
facial_feature | kaggle-facial-keypoints | Pytorch | 1.2.0 and later | ||
face_alignment | 2D and 3D Face alignment library build using pytorch | Pytorch | 1.2.1 and later | EN JP | |
prnet | Joint 3D Face Reconstruction and Dense Alignment with Position Map Regression Network | TensorFlow | 1.2.2 and later | ||
facemesh | facemesh.pytorch | Pytorch | 1.2.2 and later | EN JP | |
facemesh_v2 | MediaPipe Face landmark detection | Pytorch | 1.2.9 and later | JP | |
3ddfa | Towards Fast, Accurate and Stable 3D Dense Face Alignment | Pytorch | 1.2.10 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
ax_facial_features | ax Facial Features | Pytorch | 1.2.5 and later | EN | |
face-anti-spoofing | Lightweight Face Anti Spoofing | Pytorch | 1.2.5 and later | EN JP |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
gfpgan | GFP-GAN: Towards Real-World Blind Face Restoration with Generative Facial Prior | Pytorch | 1.2.10 and later | JP | |
codeformer | CodeFormer: Towards Robust Blind Face Restoration with Codebook Lookup Transformer | Pytorch | 1.2.9 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
sber-swap | SberSwap | Pytorch | 1.2.12 and later | JP | |
facefusion | FaceFusion | ONNXRuntime | 1.2.10 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
flavr | FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation | Pytorch | 1.2.7 and later | EN JP | |
cain | Channel Attention Is All You Need for Video Frame Interpolation | Pytorch | 1.2.5 and later | ||
film | FILM: Frame Interpolation for Large Motion | TensorFlow | 1.2.10 and later | ||
rife | Real-Time Intermediate Flow Estimation for Video Frame Interpolation | Pytorch | 1.2.13 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
pytorch-gan | Code repo for the Pytorch GAN Zoo project (used to train this model) | Pytorch | 1.2.4 and later | ||
council-gan | Council-GAN | Pytorch | 1.2.4 and later | ||
restyle-encoder | ReStyle | Pytorch | 1.2.9 and later | ||
sam | Age Transformation Using a Style-Based Regression Model | Pytorch | 1.2.9 and later | ||
encoder4editing | Designing an Encoder for StyleGAN Image Manipulation | Pytorch | 1.2.10 and later | ||
lipgan | LipGAN | Keras | 1.2.15 and later | JP |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
yolov3-hand | Hand detection branch of Face detection using keras-yolov3 | Keras | 1.2.1 and later | ||
hand_detection_pytorch | hand-detection.PyTorch | Pytorch | 1.2.2 and later | ||
blazepalm | MediaPipePyTorch | Pytorch | 1.2.5 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
blazehand | MediaPipePyTorch | Pytorch | 1.2.5 and later | EN JP | |
hand3d | ColorHandPose3D network | TensorFlow | 1.2.5 and later | ||
minimal-hand | Minimal Hand | TensorFlow | 1.2.8 and later | ||
v2v-posenet | V2V-PoseNet | Pytorch | 1.2.6 and later | ||
hands_segmentation_pytorch | hands-segmentation-pytorch | Pytorch | 1.2.10 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
illustration2vec | Illustration2Vec | Caffe | 1.2.2 and later | ||
image_captioning_pytorch | Image Captioning pytorch | Pytorch | 1.2.5 and later | EN JP | |
blip2 | Hugging Face - BLIP-2 | Pytorch | 1.2.16 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
vit | Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale) | Pytorch | 1.2.7 and later | EN JP | |
swin-transformer | Swin Transformer | Pytorch | 1.2.6 and later | ||
clip | CLIP | Pytorch | 1.2.9 and later | EN JP | |
japanese-clip | Japanese-CLIP | Pytorch | 1.2.15 and later | ||
japanese-stable-clip-vit-l-16 | japanese-stable-clip-vit-l-16 | Pytorch | 1.2.11 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
partialconv | Partial Convolution Layer for Padding and Image Inpainting | Pytorch | 1.2.0 and later | ||
weather-prediction-from-image | Weather Prediction From Image - (Warmth Of Image) | Keras | 1.2.5 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
inpainting-with-partial-conv | pytorch-inpainting-with-partial-conv | PyTorch | 1.2.6 and later | EN JP | |
inpainting_gmcnn | Image Inpainting via Generative Multi-column Convolutional Neural Networks | TensorFlow | 1.2.6 and later | ||
3d-photo-inpainting | 3D Photography using Context-aware Layered Depth Inpainting | Pytorch | 1.2.7 and later | ||
deepfillv2 | Free-Form Image Inpainting with Gated Convolution | Pytorch | 1.2.9 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
nafnet | NAFNet: Nonlinear Activation Free Network for Image Restoration | Pytorch | 1.2.10 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
llava | LLaVA | Pytorch | 1.2.16 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
landmarks_classifier_asia | Landmarks classifier_asia_V1.1 | TensorFlow Hub | 1.2.4 and later | EN JP | |
places365 | Release of Places365-CNNs | Pytorch | 1.2.5 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
mlsd | M-LSD: Towards Light-weight and Real-time Line Segment Detection | TensorFlow | 1.2.8 and later | EN JP | |
dexined | DexiNed: Dense Extreme Inception Network for Edge Detection | Pytorch | 1.2.5 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog | |
---|---|---|---|---|---|
agllnet | AGLLNet: Attention Guided Low-light Image Enhancement (IJCV 2021) | Pytorch | 1.2.9 and later | EN JP | |
drbn_skf | DRBN SKF | Pytorch | 1.2.14 and later |
Model | Reference | Exported From | Supported Ailia Version | Blog |
---|---|---|---|---|
bert | pytorch-pretrained-bert | Pytorch | 1.2.2 and later | EN JP |
bert_maskedlm | huggingface/transformers | Pytorch | 1.2.5 and later | |
bert_question_answering | huggingface/transformers | Pytorch | 1.2.5 and later | |
bert_zero_shot_classification | huggingface/transformers | Pytorch | 1.2.5 and later |