
JersonGB22/ImageSegmentation-TensorFlow-PyTorch


Image Segmentation

This repository implements image segmentation models. Image segmentation is a computer vision task that classifies each pixel in an image into a category or a specific instance of a category. It can be divided into three types:

  • Semantic segmentation: Assigns a class label to each pixel in an image without distinguishing between different instances of the same class.

  • Instance segmentation: Goes beyond Object Detection by labeling each pixel that belongs to a detected object with a specific class and instance. In this way, the models not only provide the coordinates of the bounding box, along with class labels and confidence scores, but also generate binary masks for each detected instance in an image.

  • Panoptic segmentation: Combines semantic segmentation and instance segmentation by assigning each pixel in an image both a class and an instance label. This allows for a detailed segmentation of complex scenes.
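To make the difference between the three types concrete, here is a minimal, dependency-free sketch that encodes a toy 2x4 scene under each scheme. The class IDs, the `to_panoptic` helper, and the `class_id * 1000 + instance_id` encoding are illustrative assumptions, not the data format used by the notebooks in this repository.

```python
# Toy 2x4 scene: sky (class 0, background "stuff") and two cars (class 1).
SKY, CAR = 0, 1

# Semantic segmentation: one class ID per pixel; the two cars are indistinguishable.
semantic = [
    [SKY, SKY, SKY, SKY],
    [CAR, CAR, SKY, CAR],
]

# Instance segmentation: one binary mask per detected object, plus its class.
instances = [
    {"class_id": CAR, "mask": [[0, 0, 0, 0], [1, 1, 0, 0]]},  # left car
    {"class_id": CAR, "mask": [[0, 0, 0, 0], [0, 0, 0, 1]]},  # right car
]


def to_panoptic(semantic, instances, offset=1000):
    """Merge both views: each pixel gets class_id * offset + instance_id.

    Background pixels keep instance_id 0; object pixels get 1, 2, ... in
    list order. The offset encoding is a hypothetical convention chosen
    for this example.
    """
    panoptic = [[cls * offset for cls in row] for row in semantic]
    for instance_id, inst in enumerate(instances, start=1):
        for r, row in enumerate(inst["mask"]):
            for c, inside in enumerate(row):
                if inside:
                    panoptic[r][c] = inst["class_id"] * offset + instance_id
    return panoptic


panoptic = to_panoptic(semantic, instances)
for row in panoptic:
    print(row)
```

In the panoptic map, both cars share the class part of the label but differ in the instance part, while the sky pixels carry only a class.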

Image segmentation is used across many fields: in medicine, to identify and analyze tissues and tumors in diagnostic images; in autonomous driving, to detect and classify roads, pedestrians, and obstacles; in environmental monitoring, to classify terrain types and detect changes in satellite images; in robotics, to localize and manipulate objects precisely; and in augmented reality and video editing, to integrate digital elements into real-world scenes.

Implemented Models:

Some of the models in this repository are built and trained from scratch using Convolutional Neural Networks (CNNs). Others are fine-tuned through transfer learning from high-performing models pretrained on large datasets, such as Transformer-based architectures (Mask2Former, SegFormer) and YOLO11-seg. These projects use TensorFlow, PyTorch, Hugging Face, and Ultralytics.

In addition, training and fine-tuning are carried out using hardware resources such as TPUs or GPUs available in Google Colab, depending on the project's requirements.

Most notebooks in this repository apply data augmentation to the training set to improve generalization. The augmentations are either defined explicitly with libraries such as Albumentations or applied automatically by the training framework (e.g., with YOLO11). Callbacks and learning rate schedulers are also used to reduce overfitting and improve final performance.
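One detail that matters for segmentation is that any geometric augmentation must be applied identically to the image and its mask, otherwise the labels no longer line up with the pixels. The sketch below illustrates this with a random horizontal flip in plain Python. It is not the Albumentations API and not the notebooks' pipeline; `hflip` and `augment_pair` are hypothetical helper names.

```python
import random


def hflip(grid):
    """Mirror a 2D grid (list of rows) left to right."""
    return [row[::-1] for row in grid]


def augment_pair(image, mask, p_flip=0.5, rng=random):
    """Apply the same random horizontal flip to an image and its mask.

    The flip decision is drawn once and shared, so the image and mask
    always stay aligned.
    """
    if rng.random() < p_flip:
        return hflip(image), hflip(mask)
    return image, mask


image = [[10, 20, 30], [40, 50, 60]]
mask = [[0, 0, 1], [0, 1, 1]]

# p_flip=1.0 forces the flip so the effect is visible.
aug_image, aug_mask = augment_pair(image, mask, p_flip=1.0)
print(aug_image)
print(aug_mask)
```

Photometric changes such as brightness or contrast would be applied to the image only, since they do not move pixels.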

Below are the evaluation results of the models implemented to date. In cases where the validation or test set is unavailable or not publicly accessible, the evaluation was performed exclusively on the available split.

📊 Panoptic Segmentation

| Dataset | Domain | Model | $\text{PQ}$ | $\text{SQ}$ | $\text{RQ}$ | Eval. Set |
|---|---|---|---|---|---|---|
| LaRS | Maritime obstacle detection | Mask2Former-Swin-Tiny | 0.564 | 0.791 | 0.686 | Validation |
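For readers unfamiliar with these metrics, the sketch below computes Panoptic Quality (PQ), Segmentation Quality (SQ), and Recognition Quality (RQ) for a single class using the standard definitions: SQ is the mean IoU of matched segments, RQ is an F1-style detection score, and PQ is their product. The function name and the example numbers are hypothetical; this is not the evaluation code used for the table.

```python
def panoptic_quality(matched_ious, num_fp, num_fn):
    """Return (PQ, SQ, RQ) for one class.

    matched_ious: IoU of each predicted/ground-truth segment pair whose
                  IoU exceeds 0.5 (these pairs are the true positives).
    num_fp: predicted segments left unmatched.
    num_fn: ground-truth segments left unmatched.
    """
    num_tp = len(matched_ious)
    denom = num_tp + 0.5 * num_fp + 0.5 * num_fn
    if denom == 0:
        # Class absent from both prediction and ground truth.
        return 0.0, 0.0, 0.0
    sq = sum(matched_ious) / num_tp if num_tp else 0.0
    rq = num_tp / denom
    return sq * rq, sq, rq


# Three matched segments, one spurious prediction, one missed segment.
pq, sq, rq = panoptic_quality([0.9, 0.8, 0.7], num_fp=1, num_fn=1)
print(f"PQ={pq:.3f} SQ={sq:.3f} RQ={rq:.3f}")
```

The identity PQ = SQ x RQ holds per class. When the three values are each averaged over classes, as is common for dataset-level reporting, the averaged PQ generally differs from the product of the averaged SQ and RQ.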

📊 Instance Segmentation

| Dataset | Domain | Model | $\text{mAP}^{\text{mask}}_{50}$ | $\text{mAP}^{\text{mask}}_{50-95}$ | Eval. Set |
|---|---|---|---|---|---|
| SBD | General object segmentation | YOLO11l-seg | 0.895 | 0.719 | Validation |
| PanNuke | Histopathology (nucleus segmentation) | YOLO11s-seg | 0.700 / 0.692 | 0.464 / 0.455 | Validation / Test |
| USIS10K | Underwater scene analysis | YOLO11l-seg | 0.634 / 0.635 | 0.490 / 0.495 | Validation / Test |
| BDD100K | Autonomous driving | YOLO11m-seg | 0.464 | 0.266 | Validation |
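The gap between the two mAP columns comes from the IoU threshold used to decide whether a predicted mask matches a ground-truth mask: 0.50 for the first column, and the average over thresholds 0.50 to 0.95 in steps of 0.05 (the COCO convention) for the second. The sketch below shows only this matching criterion on toy masks, not the full precision-recall computation; `mask_iou` and `matched_thresholds` are hypothetical helper names.

```python
def mask_iou(mask_a, mask_b):
    """IoU of two binary masks given as equally sized 2D lists of 0/1."""
    inter = union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            inter += 1 if (a and b) else 0
            union += 1 if (a or b) else 0
    return inter / union if union else 0.0


# COCO-style thresholds: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [t / 100 for t in range(50, 100, 5)]


def matched_thresholds(iou):
    """Thresholds at which a prediction with this IoU counts as a match."""
    return [t for t in IOU_THRESHOLDS if iou >= t]


pred = [[1, 1, 1, 1], [1, 1, 1, 0]]  # 7 pixels predicted
gt = [[1, 1, 1, 1], [1, 1, 1, 1]]    # 8 pixels in the ground truth

iou = mask_iou(pred, gt)
print(iou, matched_thresholds(iou))
```

A prediction like this one counts as correct at the 0.50 threshold but is rejected at the strictest thresholds, which is why the 50-95 score is always the lower of the two.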

📊 Semantic Segmentation

| Dataset | Domain | Model | $\text{mIoU}$ | $\text{Dice Score}$ | Eval. Set |
|---|---|---|---|---|---|
| UW-Madison GI Tract | Medical imaging (gastrointestinal tract) | SegFormer-B3 | 0.900 | 0.946 | Validation |
| LandCover.ai | Aerial land-cover classification | SegFormer-B3 | 0.870 / 0.872 | 0.928 / 0.929 | Validation / Test |
| CamVid | Autonomous driving | SegFormer-B2 | 0.869 | 0.927 | Validation |
| Carvana | Binary segmentation of cars | U-Net | 0.995 | 0.997 | Validation |
| CUB-200-2011 | Binary segmentation of 200 bird species | ConvNeXt-Base U-Net | 0.955 | 0.977 | Test |
| Caltech-101 | Binary segmentation of 101 object classes | ConvNeXt-Base U-Net | 0.932 | 0.965 | Test |
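As a reference for the two metrics in this table, the following sketch computes IoU and the Dice score for one binary mask from pixel counts. It is a plain-Python illustration with a hypothetical function name, not the evaluation code from the notebooks; mIoU would be the mean of the per-class IoU values.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks given as 2D lists of 0/1."""
    tp = fp = fn = 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            if p and t:
                tp += 1      # predicted and present
            elif p:
                fp += 1      # predicted but absent
            elif t:
                fn += 1      # present but missed
    if tp + fp + fn == 0:
        # Both masks empty: treated as a perfect match here (a convention).
        return 1.0, 1.0
    iou = tp / (tp + fp + fn)
    dice = 2 * tp / (2 * tp + fp + fn)
    return iou, dice


pred = [[1, 1, 0], [1, 0, 0]]
target = [[1, 1, 0], [0, 0, 1]]

iou, dice = iou_and_dice(pred, target)
print(f"IoU={iou:.3f} Dice={dice:.3f}")
```

For a single mask the two are tied by Dice = 2 * IoU / (1 + IoU), so Dice is never lower than IoU. Scores averaged over several classes or images need not satisfy the relation exactly.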

Visual Results on Multiple Datasets

LaRS


BDD100K


USIS10K


SBD


PanNuke


LandCover.ai


CamVid


UW-Madison GI Tract


Caltech-101


CUB-200-2011


Carvana

More results can be found in the respective notebooks.

Technological Stack

Python TensorFlow PyTorch Hugging Face Ultralytics

Scikit-learn OpenCV Pandas Plotly

Contact

Gmail LinkedIn GitHub