<a href="https://colab.research.google.com/github/AndreessaLopes/exemplo-colaboratory/blob/main/Ia.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

https://youtu.be/cEgF0YknpZw

## Train custom instance segmentation model using Detectron2 - on your own dataset

Create your own dataset by annotating your images with any annotation tool that can export annotations in COCO JSON format. I used https://www.makesense.ai/ for this tutorial: I annotated the objects with the polygon tool and exported the annotations as "Single file in COCO JSON format".
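
For orientation, here is a minimal sketch of what a single-file COCO JSON export roughly looks like and how it could be sanity-checked before training. The annotation values, the class names `ripe` and `unripe`, and the helper `summarize_coco` are made up for illustration; a real export will usually carry more fields.

```python
import json
from collections import Counter

# A tiny, hand-written COCO-style annotation file (illustrative values only).
coco_text = json.dumps({
    "images": [{"id": 1, "file_name": "11.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "ripe"}, {"id": 2, "name": "unripe"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [10, 20, 30, 40],                             # [x, y, width, height]
         "segmentation": [[10, 20, 40, 20, 40, 60, 10, 60]],   # polygon: x1, y1, x2, y2, ...
         "area": 1200, "iscrowd": 0},
        {"id": 2, "image_id": 1, "category_id": 2,
         "bbox": [100, 100, 20, 20],
         "segmentation": [[100, 100, 120, 100, 120, 120, 100, 120]],
         "area": 400, "iscrowd": 0},
    ],
})


def summarize_coco(text):
    """Return (number of images, annotation count per class name).

    Raises ValueError if an annotation points at an unknown image or category.
    """
    data = json.loads(text)
    image_ids = {img["id"] for img in data["images"]}
    names = {cat["id"]: cat["name"] for cat in data["categories"]}
    counts = Counter()
    for ann in data["annotations"]:
        if ann["image_id"] not in image_ids:
            raise ValueError(f"annotation {ann['id']} references an unknown image")
        if ann["category_id"] not in names:
            raise ValueError(f"annotation {ann['id']} references an unknown category")
        counts[names[ann["category_id"]]] += 1
    return len(image_ids), dict(counts)


n_images, per_class = summarize_coco(coco_text)
print(n_images, per_class)
```

To check a real export, read the file contents into a string and pass that string to `summarize_coco` instead of `coco_text`.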

**Install Detectron2**

In [None]:
!nvidia-smi


In [None]:
!python -m pip install pyyaml==5.1
import sys, os, distutils.core
# Note: This is a faster way to install detectron2 in Colab, but it does not include all functionalities (e.g. compiled operators).
# See https://detectron2.readthedocs.io/tutorials/install.html for full installation instructions
!git clone 'https://github.com/facebookresearch/detectron2'
dist = distutils.core.run_setup("./detectron2/setup.py")
!python -m pip install {' '.join([f"'{x}'" for x in dist.install_requires])}
sys.path.insert(0, os.path.abspath('./detectron2'))

# Upgrade detectron2
#!python -m pip install --upgrade 'git+https://github.com/facebookresearch/detectron2.git'

In [None]:
import torch, detectron2
!nvcc --version
TORCH_VERSION = ".".join(torch.__version__.split(".")[:2])
CUDA_VERSION = torch.__version__.split("+")[-1]
print("torch: ", TORCH_VERSION, "; cuda: ", CUDA_VERSION)
print("detectron2:", detectron2.__version__)

In [None]:
# Some basic setup:
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import numpy as np
import os, json, cv2, random
from google.colab.patches import cv2_imshow

# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog, DatasetCatalog

The default models are trained on natural images so let us go ahead and load a natural image to see if detectron is working. **We will run a pre-trained model on this image.**

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
im = cv2.imread("/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/train/11.jpg")
cv2_imshow(im)

We create a detectron2 config and a detectron2 DefaultPredictor to run inference on this image.

In [None]:
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set threshold for this model
# Find a model from detectron2's model zoo.  https://github.com/facebookresearch/detectron2/blob/main/MODEL_ZOO.md
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg)
outputs = predictor(im)

In [None]:
# look at the outputs - tensors and bounding boxes.
print(outputs["instances"].pred_classes)
print(outputs["instances"].pred_boxes)

In [None]:
# We can use `Visualizer` to draw the predictions on the image.
v = Visualizer(im[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=0.8)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2_imshow(out.get_image()[:, :, ::-1])

Now that we know the model is working on a natural image, let us test on a scientific image.

Let us make some predictions. Remember that the model mask_rcnn_R_50_FPN_3x has no idea about our sample and did not get trained on cells or mitochondria or on any scientific images.

Of course, our cells are mislabeled as Tie, Pizza, etc. Let us train on a custom dataset.

# Train on a custom dataset

Import the function needed to register datasets in the COCO format, then register both the training and validation datasets. Please note that the training (and validation) data is in the COCO format, where a single JSON file describes all the annotations for all the images in that split.

Here, we are naming our training data as 'my_dataset_train' and the validation data as 'my_dataset_val'.



In [None]:
from detectron2.data.datasets import register_coco_instances
register_coco_instances("my_dataset_train", {}, "/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/train/1807-1732.json", "/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/train")
register_coco_instances("my_dataset_val", {}, "/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/val/1580-1730.json", "/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/val")

Let us extract the metadata and dataset dictionaries for both training and validation datasets. These can be used later for other purposes, like visualization, model training, evaluation, etc. We will see a visualization example right away.

In [None]:
train_metadata = MetadataCatalog.get("my_dataset_train")
train_dataset_dicts = DatasetCatalog.get("my_dataset_train")

In [None]:
val_metadata = MetadataCatalog.get("my_dataset_val")
val_dataset_dicts = DatasetCatalog.get("my_dataset_val")
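
As a rough illustration of how these dataset dictionaries can be used, the sketch below counts annotated instances per class. The two toy records and the class names are stand-ins I made up; real records returned by `DatasetCatalog.get` hold more keys (for example a box mode and the polygon segmentation), and `category_id` is an index into `thing_classes`.

```python
from collections import Counter

# Stand-ins for DatasetCatalog.get(...) and MetadataCatalog.get(...).thing_classes.
toy_dataset_dicts = [
    {"file_name": "train/11.jpg", "image_id": 1, "height": 480, "width": 640,
     "annotations": [{"category_id": 0, "bbox": [10, 20, 30, 40]},
                     {"category_id": 1, "bbox": [50, 60, 20, 20]}]},
    {"file_name": "train/12.jpg", "image_id": 2, "height": 480, "width": 640,
     "annotations": [{"category_id": 0, "bbox": [5, 5, 15, 15]}]},
]
toy_thing_classes = ["ripe", "unripe"]


def instances_per_class(dataset_dicts, thing_classes):
    """Count annotated instances per class name across all records."""
    counts = Counter()
    for record in dataset_dicts:
        for ann in record["annotations"]:
            counts[thing_classes[ann["category_id"]]] += 1
    return dict(counts)


class_counts = instances_per_class(toy_dataset_dicts, toy_thing_classes)
print(class_counts)
```

With the real objects, the call would be `instances_per_class(train_dataset_dicts, train_metadata.thing_classes)`. A heavily unbalanced result is a hint that some classes may need more annotated examples.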


In [None]:
from matplotlib import pyplot as plt

In [None]:
# Visualize some random samples
for d in random.sample(train_dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    visualizer = Visualizer(img[:, :, ::-1], metadata=train_metadata, scale=0.5)
    out = visualizer.draw_dataset_dict(d)
    cv2_imshow(out.get_image()[:, :, ::-1])

# Train

Now we are ready to train a Mask R-CNN model using the Detectron2 library. We start by setting up a configuration object (`cfg`) for the model. The configuration contains many details including the output directory path, training dataset information, pre-trained weights, base learning rate, maximum number of iterations, etc.

In [None]:
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.OUTPUT_DIR = "/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/models"
cfg.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ()
cfg.DATALOADER.NUM_WORKERS = 2
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")  # Let training initialize from model zoo
cfg.SOLVER.IMS_PER_BATCH = 2  # This is the real "batch size" commonly known to deep learning people
cfg.SOLVER.BASE_LR = 0.00025  # pick a good LR
cfg.SOLVER.MAX_ITER = 1000    # number of training iterations; a larger practical dataset may need more
cfg.SOLVER.STEPS = []        # do not decay learning rate
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 256   # The "RoIHead batch size" (default: 512); smaller values train faster
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4  # this dataset has 4 classes (see https://detectron2.readthedocs.io/tutorials/datasets.html#update-the-config-for-new-datasets)
# NOTE: this config means the number of classes, but a few popular unofficial tutorials incorrectly use num_classes+1 here.

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
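
The training length above is given in iterations rather than epochs. A back-of-the-envelope conversion, assuming each iteration processes `IMS_PER_BATCH` images, is sketched below. The dataset size of 100 images is an assumed number for illustration, not the size of the dataset used here.

```python
def approx_epochs(max_iter, ims_per_batch, num_images):
    """Approximate number of passes over the training set."""
    return max_iter * ims_per_batch / num_images


# MAX_ITER and IMS_PER_BATCH as in the training cell; 100 images is an assumption.
epochs = approx_epochs(max_iter=1000, ims_per_batch=2, num_images=100)
print(epochs)  # 20.0
```

With the real dataset, `len(train_dataset_dicts)` would supply `num_images`.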

In [None]:
from PIL import Image
import os

# Path to the folder that contains the images
pasta_com_imagens = '/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/train'

# Desired size
desired_size = (4624, 4624)

# List all files in the folder
arquivos_de_imagem = os.listdir(pasta_com_imagens)

# Iterate over the image files
for arquivo in arquivos_de_imagem:
    # Check whether the file is an image (adjust the extension check as needed)
    if arquivo.endswith(('.jpg', '.jpeg', '.png', '.bmp', '.gif')):
        # Build the full file path
        caminho_completo = os.path.join(pasta_com_imagens, arquivo)

        # Open the image
        img = Image.open(caminho_completo)

        # Resize the image to the desired size
        img = img.resize(desired_size)

        # Save the resized image back to the same location
        img.save(caminho_completo)

        # Close the image
        img.close()


In [None]:
from PIL import Image
import os

# Path to the folder that contains the images
pasta_com_imagens = '/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/val'

# Desired size
desired_size = (4624, 4624)

# List all files in the folder
arquivos_de_imagem = os.listdir(pasta_com_imagens)

# Iterate over the image files
for arquivo in arquivos_de_imagem:
    # Check whether the file is an image (adjust the extension check as needed)
    if arquivo.endswith(('.jpg', '.jpeg', '.png', '.bmp', '.gif')):
        # Build the full file path
        caminho_completo = os.path.join(pasta_com_imagens, arquivo)

        # Open the image
        img = Image.open(caminho_completo)

        # Resize the image to the desired size
        img = img.resize(desired_size)

        # Save the resized image back to the same location
        img.save(caminho_completo)

        # Close the image
        img.close()
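
One thing to keep in mind when resizing images in place: COCO annotations are stored in pixel coordinates, so if they were drawn on the original-size images they would need to be scaled by the same factors. The sketch below shows one way this might be done for polygon annotations. The helper `scale_annotation` and the example sizes are hypothetical, and it assumes polygon (not RLE) segmentations.

```python
def scale_annotation(ann, sx, sy):
    """Return a copy of a COCO annotation scaled by sx (width) and sy (height)."""
    x, y, w, h = ann["bbox"]
    scaled = dict(ann)
    scaled["bbox"] = [x * sx, y * sy, w * sx, h * sy]
    # Polygons are flat lists [x1, y1, x2, y2, ...]: even indices are x, odd are y.
    scaled["segmentation"] = [
        [v * (sx if i % 2 == 0 else sy) for i, v in enumerate(poly)]
        for poly in ann["segmentation"]
    ]
    scaled["area"] = ann["area"] * sx * sy
    return scaled


ann = {"bbox": [10, 20, 30, 40],
       "segmentation": [[10, 20, 40, 20, 40, 60, 10, 60]],
       "area": 1200}

# Assumed example: a 640x480 image resized to 1280x1440, so sx = 2.0 and sy = 3.0.
scaled = scale_annotation(ann, sx=1280 / 640, sy=1440 / 480)
print(scaled["bbox"])
```

The `width` and `height` entries under `images` in the JSON file would need the same update. Note also that resizing every image to a square changes the aspect ratio unless the originals are already square.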


Save the config file, for potential future use

In [None]:
# Look at training curves in tensorboard (the logs are written to cfg.OUTPUT_DIR):
%load_ext tensorboard
%tensorboard --logdir "{cfg.OUTPUT_DIR}"

In [None]:
# Save the configuration to a config.yaml file
config_yaml_path = "/content/drive/MyDrive/ColabNotebooks/models/Detectron2_Models/config.yaml"
os.makedirs(os.path.dirname(config_yaml_path), exist_ok=True)  # make sure the target folder exists
with open(config_yaml_path, 'w') as file:
    file.write(cfg.dump())  # CfgNode.dump() returns the config as a YAML string

# Inference & evaluation using the trained model

In [None]:
# Inference should use the config with parameters that are used in training
# cfg now already contains everything we've set previously. We changed it a little bit for inference:
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")  # path to the model we just trained
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7   # set a custom testing threshold
predictor = DefaultPredictor(cfg)
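
The effect of `SCORE_THRESH_TEST` can be pictured as a simple filter on detection scores: raising it from 0.5 to 0.7 trades some recall for fewer low-confidence detections. The detections and class names below are made up for illustration, and the helper is only a conceptual sketch, not Detectron2's own code.

```python
def keep_confident(detections, score_thresh):
    """Keep detections whose score is strictly above the threshold."""
    return [d for d in detections if d["score"] > score_thresh]


# Made-up detections; class names are placeholders.
detections = [
    {"class": "ripe", "score": 0.95},
    {"class": "unripe", "score": 0.72},
    {"class": "ripe", "score": 0.55},
]

kept_07 = keep_confident(detections, 0.7)
kept_05 = keep_confident(detections, 0.5)
print(len(kept_07), len(kept_05))  # 2 3
```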

Verify segmentation on random validation images

In [None]:
from detectron2.utils.visualizer import ColorMode

for d in random.sample(val_dataset_dicts, 3):    #select number of images for display
    im = cv2.imread(d["file_name"])
    outputs = predictor(im)
    v = Visualizer(im[:, :, ::-1],
                   metadata=val_metadata,
                   scale=0.5,
                   instance_mode=ColorMode.IMAGE_BW   # remove the colors of unsegmented pixels. This option is only available for segmentation models
    )
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])

Check average precision and recall. (This needs more validation data than just 2 images with a handful of annotations.)

In [None]:
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader
evaluator = COCOEvaluator("my_dataset_val", output_dir="./output")
val_loader = build_detection_test_loader(cfg, "my_dataset_val")
print(inference_on_dataset(predictor.model, val_loader, evaluator))
# another equivalent way to evaluate the model is to use `trainer.test`
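
COCO-style average precision is built on the intersection over union (IoU) between predictions and ground truth, averaged over IoU thresholds from 0.50 to 0.95. The sketch below computes IoU for two boxes in plain Python to show the idea; segmentation AP uses the same ratio on masks instead of boxes. The function name and the example boxes are my own.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


iou = box_iou((0, 0, 10, 10), (5, 0, 15, 10))
print(round(iou, 3))  # 0.333
```

A prediction with this IoU would not count as a match at the 0.50 threshold, which is one reason small localization errors can lower AP noticeably.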

**Load a new image and segment it.**

In [None]:
new_im = cv2.imread("/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/test/2030.jpeg")
if new_im is None:
    print("Image file does not exist or path is incorrect.")
else:
    outputs  = predictor(new_im)

    # We can use `Visualizer` to draw the predictions on the image.
    v = Visualizer(new_im[:, :, ::-1], metadata=train_metadata)
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))

    cv2_imshow(out.get_image()[:, :, ::-1])

**Process multiple images in a directory and save the results in an output directory**

In [None]:
from PIL import Image
import os

# Path to the folder that contains the images
pasta_com_imagens = '/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/test'

# Desired size
desired_size = (4624, 4624)

# List all files in the folder
arquivos_de_imagem = os.listdir(pasta_com_imagens)

# Iterate over the image files
for arquivo in arquivos_de_imagem:
    # Check whether the file is an image (adjust the extension check as needed)
    if arquivo.endswith(('.jpg', '.jpeg', '.png', '.bmp', '.gif')):
        # Build the full file path
        caminho_completo = os.path.join(pasta_com_imagens, arquivo)

        # Open the image
        img = Image.open(caminho_completo)

        # Resize the image to the desired size
        img = img.resize(desired_size)

        # Save the resized image back to the same location
        img.save(caminho_completo)

        # Close the image
        img.close()

In [None]:
import os
import cv2
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog

# Directory path to the input images folder
input_images_directory = "/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/test"

# Output directory where the segmented images will be saved
output_directory = "/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/test_results"
os.makedirs(output_directory, exist_ok=True)  # cv2.imwrite does not create missing folders

# Assuming 'cfg' is defined before this point
# Load the model
predictor = DefaultPredictor(cfg)

# Load metadata for visualization
train_metadata = MetadataCatalog.get(cfg.DATASETS.TRAIN[0])

# Loop over the images in the input folder
for image_filename in os.listdir(input_images_directory):
    image_path = os.path.join(input_images_directory, image_filename)

    # Skip anything that is not a file with an image extension
    if not (os.path.isfile(image_path) and image_filename.lower().endswith(('.png', '.jpg', '.jpeg'))):
        continue

    new_im = cv2.imread(image_path, cv2.IMREAD_COLOR)
    if new_im is None:  # unreadable or corrupt file
        print(f"Error loading image: {image_filename}")
        continue

    # Perform prediction on the new image
    outputs = predictor(new_im)

    # We can use `Visualizer` to draw the predictions on the image.
    v = Visualizer(new_im[:, :, ::-1], metadata=train_metadata)
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))

    # Create the output filename with "_result" suffix
    result_filename = os.path.splitext(image_filename)[0] + "_result.png"
    output_path = os.path.join(output_directory, result_filename)

    # Save the segmented image
    cv2.imwrite(output_path, out.get_image()[:, :, ::-1])

print("Segmentation of all images completed.")



**Segment images and save object level information into a csv file.**




In [None]:
import csv
import os
import cv2
from skimage.measure import regionprops, label

# Assuming you have already defined the 'predictor' object and loaded the model.
# Also, make sure 'metadata' is defined appropriately.

# Directory path to the input images folder
input_images_directory = "/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/test"

# Output directory where the CSV file will be saved
output_csv_path = "/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/test_results/output_objects.csv"  # Add '.csv' extension to the CSV file

# Check if the output_csv_path already exists
if os.path.exists(output_csv_path):
    # Delete the output_csv_path if it already exists
    os.remove(output_csv_path)

# Create the output_csv_path as a file
with open(output_csv_path, 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)

    # Write the header row in the CSV file
    csvwriter.writerow(["File Name", "Class Name", "Object Number", "Area", "Centroid", "BoundingBox"])  # Add more columns as needed for other properties

    # Loop over the images in the input folder
    for image_filename in os.listdir(input_images_directory):
        image_path = os.path.join(input_images_directory, image_filename)
        new_im = cv2.imread(image_path)

        # Perform prediction on the new image
        if new_im is not None:
            outputs = predictor(new_im)  # Format is documented at https://detectron2.read
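
The header row above lists an area, a centroid and a bounding box per object. As a library-free illustration of what those three quantities are for one binary mask, here is a small sketch. The helper `mask_properties`, the toy mask and the row values are made up, and this is not the notebook's own implementation (which imports `regionprops` from scikit-image).

```python
import csv
import io


def mask_properties(mask):
    """Area, centroid (row, col) and bounding box of a binary mask given as rows of 0/1."""
    coords = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not coords:
        return None
    rows = [r for r, _ in coords]
    cols = [c for _, c in coords]
    area = len(coords)
    centroid = (sum(rows) / area, sum(cols) / area)
    # (min_row, min_col, max_row, max_col) with an exclusive maximum,
    # which is the convention scikit-image's regionprops uses for bbox.
    bbox = (min(rows), min(cols), max(rows) + 1, max(cols) + 1)
    return area, centroid, bbox


toy_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
area, centroid, bbox = mask_properties(toy_mask)

# Write one example row to an in-memory CSV (placeholder file and class names).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["File Name", "Class Name", "Object Number", "Area", "Centroid", "BoundingBox"])
writer.writerow(["example.jpg", "ripe", 1, area, centroid, bbox])
print(buffer.getvalue())
```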

**Generate plots to understand the objects**

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Path to the CSV file containing the object-level information
csv_file_path = "/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/test_results/output_objects.csv"   # Update with your CSV file path

# Load the CSV file into a pandas DataFrame
df = pd.read_csv(csv_file_path)

# Get class names from train_metadata.thing_classes
class_names = train_metadata.thing_classes

# Average number of objects per image for each class, computed in two steps:
# first group the data by both "File Name" and "Class Name" and count the number of objects within each group,
# then group the data by "Class Name" only and take the mean of those counts.
avg_objects_per_class = df.groupby(["File Name", "Class Name"])["Object Number"].count().reset_index()
avg_objects_per_class = avg_objects_per_class.groupby("Class Name")["Object Number"].mean().reset_index()

# Plot: Average number of objects per image for each class
plt.figure(figsize=(10, 6))
sns.barplot(x="Class Name", y="Object Number", data=avg_objects_per_class, ci=None, order=class_names)
plt.xticks(rotation=45)
plt.xlabel("Class Name")
plt.ylabel("Average Number of Objects per Image")
plt.title("Average Number of Objects per Image for Each Class")
plt.tight_layout()
plt.show()


# Group the data by class and calculate the average area of objects for each class
avg_area_per_class = df.groupby("Class Name")["Area"].mean().reset_index()

# Plot: Average area of objects for each class
plt.figure(figsize=(10, 6))
sns.barplot(x="Class Name", y="Area", data=avg_area_per_class, ci=None, order=class_names)
plt.xticks(rotation=45)
plt.xlabel("Class Name")
plt.ylabel("Average Area of Objects")
plt.title("Average Area of Objects for Each Class")
plt.tight_layout()
plt.show()
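
The two-step grouping used for the first plot can be illustrated without pandas. The rows below are made up. One detail the toy data makes visible: an image with no objects of a class contributes nothing to that class's mean, rather than a zero.

```python
from collections import Counter, defaultdict

# (file name, class name) for each detected object; made-up rows.
rows = [
    ("a.jpg", "ripe"), ("a.jpg", "ripe"), ("a.jpg", "unripe"),
    ("b.jpg", "ripe"), ("b.jpg", "ripe"), ("b.jpg", "ripe"), ("b.jpg", "ripe"),
]

# Step 1: count objects per (file, class) pair.
per_image_class = Counter(rows)

# Step 2: average those counts per class.
counts_by_class = defaultdict(list)
for (_, class_name), n in per_image_class.items():
    counts_by_class[class_name].append(n)
avg = {name: sum(ns) / len(ns) for name, ns in counts_by_class.items()}

print(avg)  # {'ripe': 3.0, 'unripe': 1.0}
```

Here `unripe` averages 1.0 even though `b.jpg` has none, because only images where the class appears enter the mean.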




**Saving binary (actually multinary) images for each class for further processing.** Here, for each input image we will save n images, one per class. In our example, that is 4 images per input image, corresponding to the 4 classes. Each of these images contains objects numbered 1, 2, 3, etc., much like an instance segmentation label image. These images can be used for further downstream processing.

In [None]:
import os
import cv2
import numpy as np
import torch
from detectron2.utils.visualizer import Visualizer

# Directory path to the input images folder
input_images_directory = "/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/test"

# Output directory where the segmented images will be saved
output_directory = "/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/test_results_instance"  # Update this with your desired output directory

# Create the output directory if it doesn't exist
os.makedirs(output_directory, exist_ok=True)

# Loop over the images in the input folder
for image_filename in os.listdir(input_images_directory):
    image_path = os.path.join(input_images_directory, image_filename)

    # Check if the image is loaded successfully
    new_im = cv2.imread(image_path)
    if new_im is None:
        print(f"Error loading image: {image_filename}")
        continue

    # Perform prediction on the new image
    outputs = predictor(new_im)  # Format is documented at https://detectron2.readthedocs.io/tutorials/models.html#model-output-format

    # Create a dictionary to store the mask for each class with unique integer labels
    instances = outputs["instances"]
    if len(instances) == 0:
        class_masks = {}
    else:
        # zeros_like allocates on the same device as the predictions, so no device is hard-coded
        class_masks = {class_name: torch.zeros_like(instances.pred_masks[0], dtype=torch.uint8)
                       for class_name in train_metadata.thing_classes}

    # Assign a unique integer label to each object in the mask
    # (uint8 limits this to 255 objects per image)
    for i, pred_class in enumerate(instances.pred_classes):
        class_name = train_metadata.thing_classes[int(pred_class)]
        class_masks[class_name] = torch.where(instances.pred_masks[i],
                                              i + 1,
                                              class_masks[class_name])

    # Save the masks for each class with unique integer labels
    for class_name, class_mask in class_masks.items():
        # Convert the tensor to a NumPy array and then to a regular (CPU) array
        class_mask_np = class_mask.cpu().numpy()

        # Create the output filename with _class_name_result.png extension
        class_filename = os.path.splitext(image_filename)[0] + f"_{class_name}_result.png"
        class_output_path = os.path.join(output_directory, class_filename)

        # Save the image with unique integer labels
        cv2.imwrite(class_output_path, class_mask_np.astype(np.uint8))

print("Segmentation of all images completed.")
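
To make the labelling scheme above concrete, here is a toy version on nested lists instead of tensors. It shows two consequences of using the detection index plus one as the label: labels within one class image are not necessarily consecutive, and where two masks of the same class overlap, the later detection overwrites the earlier one. The helper name, masks and class names are made up.

```python
def build_class_label_maps(masks, classes, class_names):
    """masks: list of 2D 0/1 lists; classes: class index for each mask."""
    height, width = len(masks[0]), len(masks[0][0])
    label_maps = {name: [[0] * width for _ in range(height)] for name in class_names}
    for i, (mask, cls) in enumerate(zip(masks, classes)):
        target = label_maps[class_names[cls]]
        for r in range(height):
            for c in range(width):
                if mask[r][c]:
                    target[r][c] = i + 1  # label = detection index + 1
    return label_maps


masks = [
    [[1, 1, 0], [0, 0, 0]],  # detection 0
    [[0, 0, 0], [0, 1, 1]],  # detection 1
    [[0, 1, 1], [0, 0, 0]],  # detection 2, overlaps detection 0 in one pixel
]
classes = [0, 1, 0]
maps = build_class_label_maps(masks, classes, ["ripe", "unripe"])
print(maps["ripe"])    # [[1, 3, 3], [0, 0, 0]]
print(maps["unripe"])  # [[0, 0, 0], [0, 2, 2]]
```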



# END

**Interested in panoptic segmentation?**

In [None]:
my_new_image = cv2.imread("/content/drive/MyDrive/Fotos - Café Maduro/training/dataset/val/2020.jpeg")
cv2_imshow(my_new_image)

In [None]:
# Inference with instance segmentation
cfg_inst = get_cfg()
cfg_inst.merge_from_file(model_zoo.get_config_file("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg_inst.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set threshold for this model
# Find a model from detectron2's model zoo.  https://github.com/facebookresearch/detectron2/blob/main/MODEL_ZOO.md
cfg_inst.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
predictor = DefaultPredictor(cfg_inst)
outputs = predictor(my_new_image)

v = Visualizer(my_new_image[:, :, ::-1], MetadataCatalog.get(cfg_inst.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2_imshow(out.get_image()[:, :, ::-1])

**Panoptic segmentation = Instance segmentation + Semantic Segmentation**
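
A conceptual sketch of that combination on a toy image follows: instance masks ("things") each get their own segment id, and the remaining pixels are grouped by semantic class ("stuff"). This is only an illustration of the idea with made-up inputs; it is not how Panoptic FPN merges its outputs internally. The `id`, `isthing` and `category_id` keys mirror the fields Detectron2 reports in `segments_info`.

```python
def merge_panoptic(semantic, instance_masks):
    """semantic: 2D list of stuff class ids.
    instance_masks: list of (category_id, 2D 0/1 mask) pairs.
    """
    height, width = len(semantic), len(semantic[0])
    panoptic = [[0] * width for _ in range(height)]
    segments_info = []
    next_id = 1

    # Things first: one segment per instance; earlier instances keep contested pixels.
    for category_id, mask in instance_masks:
        for r in range(height):
            for c in range(width):
                if mask[r][c] and panoptic[r][c] == 0:
                    panoptic[r][c] = next_id
        segments_info.append({"id": next_id, "isthing": True, "category_id": category_id})
        next_id += 1

    # Stuff: one segment per semantic class on the pixels not claimed by any instance.
    stuff_ids = {}
    for r in range(height):
        for c in range(width):
            if panoptic[r][c] == 0:
                cls = semantic[r][c]
                if cls not in stuff_ids:
                    stuff_ids[cls] = next_id
                    segments_info.append({"id": next_id, "isthing": False, "category_id": cls})
                    next_id += 1
                panoptic[r][c] = stuff_ids[cls]
    return panoptic, segments_info


semantic = [[7, 7, 7], [8, 8, 8]]             # two stuff classes
instance_masks = [(0, [[1, 1, 0], [0, 0, 0]])]  # one thing instance
panoptic, info = merge_panoptic(semantic, instance_masks)
print(panoptic)  # [[1, 1, 2], [3, 3, 3]]
```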





In [None]:
# Inference with a panoptic segmentation model
cfg_pan = get_cfg()
cfg_pan.merge_from_file(model_zoo.get_config_file("COCO-PanopticSegmentation/panoptic_fpn_R_101_3x.yaml"))
cfg_pan.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-PanopticSegmentation/panoptic_fpn_R_101_3x.yaml")
predictor = DefaultPredictor(cfg_pan)
panoptic_seg, segments_info = predictor(my_new_image)["panoptic_seg"]
v = Visualizer(my_new_image[:, :, ::-1], MetadataCatalog.get(cfg_pan.DATASETS.TRAIN[0]), scale=1.2)
out = v.draw_panoptic_seg_predictions(panoptic_seg.to("cpu"), segments_info)
cv2_imshow(out.get_image()[:, :, ::-1])