At Ikomia, we deeply believe that sharing scientific knowledge is the key to success. That's why we make research-based algorithms ready to use for developers.
The main goal of Ikomia is to take existing Python code and wrap it as a ready-to-use algorithm for Ikomia API (our Python library) and Ikomia STUDIO (our desktop software). With this approach, we can easily integrate individual repos from researchers or labs, as well as awesome frameworks like OpenCV, Detectron2, OpenMMLab or Hugging Face, so that developers can benefit from the best state-of-the-art algorithms in a single framework.
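As a rough illustration of what "ready-to-use" can look like from the Python side, here is a minimal sketch. It assumes the `ikomia` package is installed and that its `Workflow` class exposes `add_task(name=..., auto_connect=True)` and `run_on(path=...)`; these signatures may differ between versions. The helper names `algorithm_kind` and `run_inference` are hypothetical and only serve this example. The prefix check relies on the naming pattern visible in the tables of this README (`infer_`, `train_`, `dataset_`).

```python
def algorithm_kind(name: str) -> str:
    """Classify an algorithm by the name prefix used in this README's tables."""
    prefix, _, rest = name.partition("_")
    if not rest or prefix not in {"infer", "train", "dataset"}:
        return "other"
    return prefix


def run_inference(name: str, image_path: str):
    """Run one inference algorithm on a single image and return the task object.

    Assumption: the Workflow calls below match the installed Ikomia API version.
    """
    if algorithm_kind(name) != "infer":
        raise ValueError(f"{name!r} is not an inference algorithm")

    # Imported lazily so that algorithm_kind() works without Ikomia installed.
    from ikomia.dataprocess.workflow import Workflow

    wf = Workflow()
    # auto_connect=True links the new task to the workflow input.
    task = wf.add_task(name=name, auto_connect=True)
    wf.run_on(path=image_path)
    return task
```

For example, `run_inference("infer_yolo_v8", "my_image.jpg")` would run the YOLOv8 detection algorithm on a local image (the file name is a placeholder). Training algorithms are rejected by this helper because they need a dataset loader upstream rather than a single image.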
Name | Description | Original repository |
---|---|---|
auto_annotate | Auto-annotate images with GroundingDINO and SAM models | Made by Ikomia | |
dataset_classification | Load classification dataset | Made by Ikomia | |
dataset_coco | Load COCO 2017 dataset | Made by Ikomia | |
dataset_cwfid | Load Crop/Weed Field Image Dataset (CWFID) for semantic segmentation | Link | |
dataset_pascal_voc | Load PascalVOC dataset | Made by Ikomia | |
dataset_via | Load VGG Image Annotator dataset | Made by Ikomia | |
dataset_wgisd | Load Wine Grape Instance Segmentation Dataset (WGISD) | Link | |
dataset_wildreceipt | Load Wildreceipt dataset | Made by Ikomia | |
dataset_yolo | Load YOLO dataset | Made by Ikomia |
Name | Description | Original repository |
---|---|---|
infer_covidnet | A tailored Deep Convolutional Neural Network Design for detection of COVID-19 cases from chest radiography images. | Link | |
infer_emotion_fer_plus | Facial emotion recognition using DNN trained from crowd-sourced label distribution. | Link | |
infer_resnet_action_recognition | Human action recognition with spatio-temporal 3D CNNs. | Link | |
infer_timm_image_classification | Infer timm image classification models | Link | |
infer_torchvision_mnasnet | MnasNet inference model for image classification. | Link | |
infer_torchvision_resnet | ResNet inference model for image classification. | Link | |
infer_torchvision_resnext | ResNeXt inference model for image classification. | Link | |
infer_yolo_v8_classification | Inference with YOLOv8 image classification models | Made by Ikomia | |
train_timm_image_classification | Train timm image classification models | Link | |
train_torchvision_mnasnet | Training process for MnasNet convolutional network. | Link | |
train_torchvision_resnet | Training process for ResNet convolutional network. | Link | |
train_torchvision_resnext | Training process for ResNeXt convolutional network. | Link | |
train_yolo_v8_classification | Train YOLOv8 classification models. | Made by Ikomia |
Name | Description | Original repository |
---|---|---|
infer_colorful_image_colorization | Automatic colorization of grayscale image based on neural network. | Link |
Name | Description | Original repository |
---|---|---|
infer_hf_stable_diffusion | Stable diffusion models from Hugging Face. | Link | |
infer_kandinsky_2 | Kandinsky 2.2 text2image diffusion model. | Link | |
infer_kandinsky_2_controlnet_depth | Kandinsky 2.2 controlnet depth diffusion model. | Link | |
infer_kandinsky_2_image_mixing | Kandinsky 2.2 image mixing diffusion model. | Link | |
infer_kandinsky_2_img2img | Kandinsky 2.2 image-to-image diffusion model. | Link | |
infer_neural_style_transfer | Neural network method to paint given image in the style of the reference image. | Link | |
infer_pulid | Pure and Lightning ID customization (PuLID) is a novel tuning-free ID customization method for text-to-image generation. | Link | |
infer_stable_cascade | Stable Cascade is a diffusion model trained to generate images given a text prompt. | Link |
Name | Description | Original repository |
---|---|---|
infer_background_matting | Real-Time High-Resolution Background Matting | Link | |
infer_modnet_portrait_matting | Inference of MODNet Portrait Matting. | Link | |
infer_p3m_portrait_matting | Inference of Privacy-Preserving Portrait Matting (P3M) | Link |
Name | Description | Original repository |
---|---|---|
infer_face_inpainting | Face inpainting using Segformer for segmentation and RealVisXL for inpainting. | Made by Ikomia | |
infer_hf_stable_diffusion_inpaint | Stable diffusion inpainting models from Hugging Face. | Link | |
infer_kandinsky_2_inpaint | Kandinsky 2.2 inpainting diffusion model. | Link |
Name | Description | Original repository |
---|---|---|
infer_detectron2_instance_segmentation | Infer Detectron2 instance segmentation models | Link | |
infer_detectron2_pointrend | PointRend inference model of Detectron2 for instance segmentation. | Link | |
infer_hf_instance_seg | Instance segmentation using models from Hugging Face. | Link | |
infer_sparseinst | Infer Sparseinst instance segmentation models | Link | |
infer_torchvision_mask_rcnn | Mask R-CNN inference model for object detection and segmentation. | Link | |
infer_yolact | A simple, fully convolutional model for real-time instance segmentation. | Link | |
infer_yolo_v7_instance_segmentation | Inference for YOLO v7 instance segmentation models | Link | |
infer_yolo_v8_seg | Inference with YOLOv8 segmentation models | Link | |
infer_yolop_v2 | Panoptic driving Perception using YoloPv2 | Link | |
train_detectron2_instance_segmentation | Train Detectron2 instance segmentation models | Link | |
train_mmlab_segmentation | Train for MMLAB segmentation models | Link | |
train_sparseinst | Train Sparseinst instance segmentation models | Link | |
train_torchvision_mask_rcnn | Training process for Mask R-CNN convolutional network. | Link | |
train_yolo_v7_instance_segmentation | Train for YOLO v7 instance segmentation models | Link | |
train_yolo_v8_seg | Train YOLOv8 instance segmentation models. | Link |
Name | Description | Original repository |
---|---|---|
infer_detectron2_densepose | Detectron2 inference model for human pose detection. | Link | |
infer_detectron2_keypoints | Inference for Detectron2 keypoint models | Link | |
infer_mmlab_pose_estimation | Inference for pose estimation models from mmpose | Link | |
infer_yolo_v7_keypoints | YOLOv7 pose estimation models. | Link | |
infer_yolo_v8_pose_estimation | Inference with YOLOv8 pose estimation models | Link |
Name | Description | Original repository |
---|---|---|
infer_detectron2_detection | Inference for Detectron2 detection models | Link | |
infer_detectron2_retinanet | RetinaNet inference model of Detectron2 for object detection. | Link | |
infer_detectron2_tridentnet | TridentNet inference model of Detectron2 for object detection. | Link | |
infer_face_detection_kornia | Face detection using the Kornia API | Link | |
infer_google_vision_face_detection | Face detection using Google cloud vision API. | Link | |
infer_google_vision_landmark_detection | Landmark Detection detects popular natural and human-made structures within an image. | Link | |
infer_google_vision_logo_detection | Logo Detection detects popular product logos within an image using the Google cloud vision API. | Link | |
infer_google_vision_object_localization | The Vision API can detect and extract multiple objects in an image with Object Localization. | Link | |
infer_grounding_dino | Inference of the Grounding DINO model | Link | |
infer_mmlab_detection | Inference for MMDET from MMLAB detection models | Link | |
infer_torchvision_faster_rcnn | Faster R-CNN inference model for object detection. | Link | |
infer_yolo_v5 | Ultralytics YoloV5 object detection models. | Made by Ikomia | |
infer_yolo_v7 | YOLOv7 object detection models. | Link | |
infer_yolo_v8 | Inference with YOLOv8 models | Link | |
infer_yolo_v9 | Object detection with YOLOv9 models | Link | |
infer_yolo_v10 | Run inference with YOLOv10 models | Link | |
infer_yolo_world | YOLO-World is a real-time zero-shot object detection model that leverages the power of open-vocabulary learning to recognize and localize a wide range of objects in images. | Link | |
infer_yolop_v2 | Panoptic driving Perception using YoloPv2 | Link | |
infer_yolor | Inference for YoloR object detection models | Link | |
train_detectron2_detection | Train for Detectron2 detection models | Link | |
train_mmlab_detection | Train for MMLAB detection models | Link | |
train_torchvision_faster_rcnn | Training process for Faster R-CNN convolutional network. | Link | |
train_yolo_v5 | Train Ultralytics YoloV5 object detection models. | Link | |
train_yolo_v7 | Train YOLOv7 object detection models. | Link | |
train_yolo_v8 | Train YOLOv8 object detection models. | Link | |
train_yolo_v9 | Train YOLOv9 models | Link | |
train_yolo_v10 | Train YOLOv10 object detection models. | Link | |
train_yolor | Train YoloR object detection models | Link |
Name | Description | Original repository |
---|---|---|
infer_bytetrack | Infer ByteTrack for object tracking | Link | |
infer_deepsort | Multiple Object Tracking algorithm (MOT) combining a deep association metric with the well-known SORT algorithm for better performance. | Link |
Name | Description | Original repository |
---|---|---|
infer_google_vision_ocr | Detects and extracts text from any image. | Link | |
infer_mmlab_text_detection | Inference for MMOCR from MMLAB text detection models | Link | |
infer_mmlab_text_recognition | Inference for MMOCR from MMLAB text recognition models | Link | |
train_mmlab_kie | Train for MMOCR from MMLAB KIE models | Link | |
train_mmlab_text_detection | Training process for MMOCR from MMLAB in text detection | Link | |
train_mmlab_text_recognition | Training process for MMOCR from MMLAB in text recognition | Link |
Name | Description | Original repository |
---|---|---|
infer_raft_optical_flow | Estimate the optical flow from a video using a RAFT model. | Link |
Name | Description | Original repository |
---|---|---|
infer_depth_anything | Depth Anything is a highly practical solution for robust monocular depth estimation | Link | |
infer_google_vision_image_properties | Image Properties feature detects general attributes of the image, such as dominant color. | Link | |
infer_google_vision_label_detection | Detect and extract information about entities in an image, across a broad group of categories. | Link | |
infer_google_vision_safe_search | Safe Search detects explicit content such as adult content or violent content within an image. | Link | |
infer_google_vision_web_detection | Web Detection detects Web references to an image. | Link |
Name | Description | Original repository |
---|---|---|
infer_detectron2_panoptic_segmentation | Infer Detectron2 panoptic segmentation models | Made by Ikomia | |
infer_hf_image_seg | Panoptic segmentation using models from Hugging Face. | Made by Ikomia | |
infer_mmlab_segmentation | Inference for MMLAB segmentation models | Link | |
train_mmlab_segmentation | Train for MMLAB segmentation models | Link |
Name | Description | Original repository |
---|---|---|
infer_detectron2_deeplabv3plus | DeepLabv3+ inference model of Detectron2 for semantic segmentation. | Link | |
infer_hf_semantic_seg | Semantic segmentation using models from Hugging Face. | Link | |
infer_mmlab_segmentation | Inference for MMLAB segmentation models | Link | |
infer_mobile_segment_anything | Inference for Mobile Segment Anything Model (SAM). | Link | |
infer_segment_anything | Inference for Segment Anything Model (SAM). | Link | |
infer_transunet | TransUNet inference for semantic segmentation | Link | |
infer_unet | Multi-class semantic segmentation using Unet. The default model was trained on Kaggle's Carvana Images dataset. | Link | |
infer_yolop_v2 | Panoptic driving Perception using YoloPv2 | Link | |
train_detectron2_deeplabv3plus | Training process for DeepLabv3+ model of Detectron2. | Link | |
train_hf_semantic_seg | Train models for semantic segmentation with transformers from Hugging Face. | Link | |
train_mmlab_segmentation | Train for MMLAB segmentation models | Link | |
train_transunet | Training process for TransUNet model. | Link | |
train_unet | Training process for multi-class semantic segmentation using Unet. | Link |
Name | Description | Original repository |
---|---|---|
infer_swinir_super_resolution | Image restoration algorithms with Swin Transformer | Made by Ikomia |