<a href="https://colab.research.google.com/github/JorgeAnsotegui/TFM/blob/main/YoloV8/Entrenamiento_YoloV8.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Import the required libraries and modules:**

In [1]:
from ultralytics import YOLO
from matplotlib import pyplot as plt
from PIL import Image

**Import a model and populate it with pre-trained weights.**

Here, we import an instance segmentation model with pre-trained weights. For a list of pre-trained models, check out: https://docs.ultralytics.com/models/yolov8/#key-features

In [2]:
# Instance segmentation
model = YOLO('yolov8n-seg.yaml')  # build a new, untrained model from the YAML definition (overwritten on the next line)
model = YOLO('yolov8n-seg.pt')  # load a pretrained model (recommended for training)

Downloading https://github.com/ultralytics/assets/releases/download/v8.2.0/yolov8n-seg.pt to 'yolov8n-seg.pt'...


100%|██████████| 6.74M/6.74M [00:00<00:00, 15.7MB/s]


On Colab, you may run into encoding errors while working with certain libraries (e.g., when installing roboflow). If that happens, uncomment and run the following cell.

In [16]:
# Without this, Colab raises an error when installing Roboflow
# import locale
# import os
# locale.getpreferredencoding = lambda: "UTF-8"
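
The workaround above can also be applied conditionally, so the override only happens when the runtime does not already report UTF-8. This is a minimal sketch under that assumption; the helper name `ensure_utf8_locale` is hypothetical and not part of any library.

```python
import locale


def ensure_utf8_locale() -> str:
    """Override locale.getpreferredencoding only if it does not report UTF-8."""
    current = locale.getpreferredencoding()
    if current.upper().replace("-", "") != "UTF8":
        # Same monkeypatch as in the cell above, but it also accepts the
        # optional argument that some callers pass to getpreferredencoding.
        locale.getpreferredencoding = lambda do_setlocale=True: "UTF-8"
    return locale.getpreferredencoding()


encoding_in_use = ensure_utf8_locale()
print(encoding_in_use)
```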

Let us load the YAML file that contains the names of our classes, number of classes and the directories for train, valid, and test datasets, respectively.

In [5]:
# Raw string so the backslashes in the Windows-style path are not read as escape sequences
yaml = r"Dataset\Polipos265_Detectron2_YoloV8\data.yaml"
# This is the YAML file Roboflow wrote for us; it describes the dataset we load into this notebook
%cat Dataset\Polipos265_Detectron2_YoloV8\data.yaml

train: ../train/images
val: ../val/images
test: ../test/images

nc: 1
names: ['Polipo']

roboflow:
  workspace: master-c9yad
  project: polipos
  version: 1
  license: CC BY 4.0
  url: https://universe.roboflow.com/master-c9yad/polipos/dataset/1

In [7]:
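# Define the number of classes based on the YAML file
# Import PyYAML under an alias so it does not overwrite the `yaml` path variable defined above
import yaml as pyyaml
with open(yaml, 'r') as stream:
    num_classes = str(pyyaml.safe_load(stream)['nc'])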
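
Once the file has been parsed into a dictionary, it can be worth checking that `nc` agrees with the number of entries in `names` before training. The following is a small sketch of such a check; `validate_data_config` is a hypothetical helper, and the dictionary is a hand-copied version of the fields printed above rather than something read from disk.

```python
def validate_data_config(cfg: dict) -> int:
    """Return the class count after checking that 'nc' matches 'names'."""
    for key in ("train", "val", "nc", "names"):
        if key not in cfg:
            raise KeyError(f"missing key in data config: {key}")
    nc = int(cfg["nc"])
    if nc != len(cfg["names"]):
        raise ValueError(f"nc={nc} but {len(cfg['names'])} class names were given")
    return nc


# Hand-copied from the data.yaml output shown above (assumption: same fields)
example_cfg = {
    "train": "../train/images",
    "val": "../val/images",
    "test": "../test/images",
    "nc": 1,
    "names": ["Polipo"],
}

num_classes_checked = validate_data_config(example_cfg)
print(num_classes_checked)
```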

**Train the model**

In [49]:
ruta_raiz = "models/YoloV8_Models/results/"

In [50]:
import os

# Define a project --> destination directory for all results
project = ruta_raiz
# Define a subdirectory for this specific training run
name = "250_epochs-"  # note that if you run the training again, it creates a new directory: 250_epochs-2
ruta_modelo = os.path.join(ruta_raiz, name)

In [None]:
# Train the model
results = model.train(data=yaml,
                      project=project,
                      name=name,
                      epochs=250,
                      patience=0, # I am setting patience=0 to disable early stopping.
                      batch=4,
                      imgsz=512)
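
To make the role of `patience` concrete, here is a generic sketch of patience-based early stopping. It illustrates the general idea only and is not Ultralytics' implementation; the function name and the fitness values are made up, and `patience=0` is treated as "never stop early" to mirror the comment in the training cell.

```python
def epochs_until_stop(fitness_per_epoch, patience):
    """Return how many epochs run before patience-based early stopping triggers.

    Assumption: patience=0 means early stopping is disabled.
    """
    best = float("-inf")
    best_epoch = 0
    for epoch, fitness in enumerate(fitness_per_epoch, start=1):
        if fitness > best:
            best, best_epoch = fitness, epoch
        # Stop once `patience` epochs have passed without a new best value
        if patience > 0 and epoch - best_epoch >= patience:
            return epoch
    return len(fitness_per_epoch)


# Made-up validation fitness values, one per epoch
history = [0.10, 0.30, 0.35, 0.34, 0.33, 0.32, 0.36]

stopped_at = epochs_until_stop(history, patience=3)
ran_all = epochs_until_stop(history, patience=0)
print(stopped_at, ran_all)  # 6 7
```

With `patience=3` the run stops before reaching the better value at epoch 7, which is the trade-off that disabling early stopping avoids.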

In [None]:
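# Start TensorBoard
# Launch after you have started training
# If the extension is already loaded, use %reload_ext tensorboard instead
%load_ext tensorboard
# Curly braces make IPython substitute the value of the Python variable ruta_modelo
%tensorboard --logdir {ruta_modelo} --port 6007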

All training curves, metrics, and other results are stored as images in the run directory (`ruta_modelo` here; Ultralytics uses a `runs` directory when no project is given). Let us open a couple of these images.

Please note that from here on I will be working with a model that I have already trained for 50 epochs.
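
Before opening individual plots, it can help to list which image files a run directory actually contains. This is a sketch using only the standard library; `list_run_images` is a hypothetical helper, and the demo creates empty placeholder files in a temporary directory instead of reading a real run.

```python
import tempfile
from pathlib import Path


def list_run_images(run_dir):
    """Return the sorted names of PNG/JPEG files found directly in run_dir."""
    image_suffixes = {".png", ".jpg", ".jpeg"}
    return sorted(
        p.name
        for p in Path(run_dir).iterdir()
        if p.is_file() and p.suffix.lower() in image_suffixes
    )


# Demo on a temporary directory with empty placeholder files
with tempfile.TemporaryDirectory() as tmp:
    for fname in ("results.png", "train_batch2.jpg", "args.yaml"):
        (Path(tmp) / fname).touch()
    found_images = list_run_images(tmp)

print(found_images)  # ['results.png', 'train_batch2.jpg']
```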

In [None]:
from IPython.display import Image

In [None]:
ruta_imagen = os.path.join(ruta_modelo, "results.png")
Image(filename=ruta_imagen)


In [None]:
ruta_train_batch = os.path.join(ruta_modelo, "train_batch2.jpg")
Image(filename=ruta_train_batch, width=900)

**Run inference**

Now that our model is trained, we can use it for inference.

In [42]:
import glob

# List the saved weights of this training run. They live in the 'weights' subdirectory of the run directory.
# Note that each repeated training run gets its own numbered directory (e.g. 250_epochs-2), so point ruta_modelo at the run you want.
ruta_weights = os.path.join(ruta_modelo, "weights")

# Use glob to get the list of files in that directory
archivos_en_ruta = glob.glob(ruta_weights + "/*")

# Print the files found
for archivo in archivos_en_ruta:
    print(archivo)

/content/drive/MyDrive/TFM/models/YoloV8_Models/results/3_epochs-/weights/last.pt
/content/drive/MyDrive/TFM/models/YoloV8_Models/results/3_epochs-/weights/best.pt


You can load the best model or the latest. I am picking the latest.
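
The choice between the two checkpoints can be wrapped in a small helper that falls back to the other file when the preferred one is missing. This is a sketch under stated assumptions: `pick_weights` is a hypothetical name, and the demo uses an empty placeholder file in a temporary directory rather than real weights.

```python
import tempfile
from pathlib import Path


def pick_weights(weights_dir, prefer="best"):
    """Return the path to best.pt or last.pt, trying the preferred one first."""
    weights_dir = Path(weights_dir)
    order = ["best.pt", "last.pt"] if prefer == "best" else ["last.pt", "best.pt"]
    for fname in order:
        candidate = weights_dir / fname
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"no best.pt or last.pt found in {weights_dir}")


# Demo: only last.pt exists, so asking for 'best' falls back to it
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "last.pt").touch()
    chosen_name = pick_weights(tmp, prefer="best").name

print(chosen_name)  # last.pt
```

The returned path could then be passed to `YOLO(...)` in the same way as `ruta_last_weight` in the next cell.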

In [None]:
ruta_last_weight = os.path.join(ruta_weights, "last.pt")

my_new_model = YOLO(ruta_last_weight)

Load an image and perform inference (segmentation).

In [None]:
import random
# Directory that contains the test images
ruta_imagenes = "/content/drive/MyDrive/TFM/dataset_Detectron2_V3/test/images"

# Get the list of file names in the directory
archivos = os.listdir(ruta_imagenes)

# Pick a file name at random
nombre_archivo = random.choice(archivos)

# Full path of the selected image
ruta_imagen = os.path.join(ruta_imagenes, nombre_archivo)

# Run inference on the selected image
new_results = my_new_model.predict(ruta_imagen, conf=0.5)  # Adjust the confidence threshold as needed
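
If the test directory may also hold non-image files, the random pick can be restricted to image extensions, and a seeded generator makes the choice reproducible. This is a sketch only; `pick_random_image` and the file names in the demo are hypothetical.

```python
import os
import random
import tempfile

VALID_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp")


def pick_random_image(directory, rng=random):
    """Return the full path of a randomly chosen image file in directory."""
    candidates = sorted(
        f for f in os.listdir(directory) if f.lower().endswith(VALID_EXTENSIONS)
    )
    if not candidates:
        raise FileNotFoundError(f"no image files found in {directory}")
    return os.path.join(directory, rng.choice(candidates))


# Demo with empty placeholder files; the JSON file must never be selected
with tempfile.TemporaryDirectory() as tmp:
    for fname in ("a.png", "b.JPG", "labels.json"):
        open(os.path.join(tmp, fname), "w").close()
    picked_name = os.path.basename(pick_random_image(tmp, rng=random.Random(0)))

print(picked_name)
```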

In [None]:
new_image = '/content/drive/MyDrive/ColabNotebooks/data/NuInsSeg_Nuclei_dataset/yolo_dataset/test/images/human_liver_22.png'
new_results = my_new_model.predict(new_image, conf=0.5)  #Adjust conf threshold


The results are stored in a variable 'new_results'. Since we only have one image for segmentation, we will only have one set of results. Therefore, let us work with that one result.

In [None]:
# new_results[0] is the result for our single image; its plot() method returns the annotated image as an array
new_result_array = new_results[0].plot()

# plot() returns the image in BGR channel order, while matplotlib expects RGB,
# so reverse the order of the color channels (BGR to RGB) before displaying it
new_result_array_rgb = new_result_array[:, :, ::-1]

# Create a matplotlib figure and show the image
plt.figure(figsize=(9, 9))
plt.imshow(new_result_array_rgb)
plt.axis('off')  # Hide the axes, which are not needed here

# Show the figure
plt.show()


# Segmenting and analyzing multiple images

Now, let us run the model on all our test images, compute bounding-box measurements (area, centroid, and box corners) for every detected object, and write them to a CSV file.
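
The per-object measurements are simple arithmetic on the box corners `(x1, y1, x2, y2)`. As a worked sketch with made-up coordinates (`box_measurements` is a hypothetical helper, not part of Ultralytics):

```python
def box_measurements(x1, y1, x2, y2):
    """Return (area, centroid) of an axis-aligned box given its corner coordinates."""
    width = x2 - x1
    height = y2 - y1
    area = width * height
    centroid = ((x1 + x2) / 2, (y1 + y2) / 2)
    return area, centroid


# Made-up box: 40 px wide, 60 px tall
area, centroid = box_measurements(10.0, 20.0, 50.0, 80.0)
print(area, centroid)  # 2400.0 (30.0, 50.0)
```

Note that this is the area of the bounding box, which is generally larger than the area of the segmented object itself.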

In [None]:
import os
import csv
import cv2

# Directory path to the input images folder
input_images_directory = "/content/drive/MyDrive/ColabNotebooks/data/NuInsSeg_Nuclei_dataset/yolo_dataset/test/images"

# Output directory where the CSV file will be saved
output_csv_path = "/content/drive/MyDrive/ColabNotebooks/data/NuInsSeg_Nuclei_dataset/yolo_dataset/test_results/output_objects_yolo.csv"

# Extract the directory name from the full path
output_dir_name = os.path.dirname(output_csv_path)

# Create the output directory if it does not exist yet
os.makedirs(output_dir_name, exist_ok=True)

# List of valid image extensions. This ensures that the code doesn't throw
# errors if your directory has non-images, like .json or other text files.
valid_extensions = ['.jpg', '.jpeg', '.png', '.bmp', '.tiff', '.gif']

# Open the CSV file for writing
with open(output_csv_path, 'w', newline='') as csvfile:
    csvwriter = csv.writer(csvfile)

    # Write the header row in the CSV file
    csvwriter.writerow(["File Name", "Class Name", "Object Number", "Area", "Centroid", "BoundingBox"])

    # Loop over the images in the input folder
    for image_filename in os.listdir(input_images_directory):
        # Check if the file has a valid image extension
        if not any(image_filename.lower().endswith(ext) for ext in valid_extensions):
            continue

        image_path = os.path.join(input_images_directory, image_filename)
        new_im = cv2.imread(image_path)

        # Perform prediction on the new image
        new_results = my_new_model.predict(new_im, conf=0.2)  # Adjust conf threshold

        # Access the bounding boxes and class ids from new_results
        boxes = new_results[0].boxes
        bounding_boxes = boxes.data.cpu().numpy()  # Move to CPU and convert to NumPy array
        class_ids = boxes.cls.cpu().numpy().astype(int)  # Predicted class index for each box
        class_names = new_results[0].names  # Mapping from class index to class name

        # Write the object-level information to the CSV file
        for i, bbox in enumerate(bounding_boxes):
            object_number = i + 1
            x1, y1, x2, y2 = bbox[:4]  # Only take the first 4 values (box corners)
            area = (x2 - x1) * (y2 - y1)  # Bounding-box area in pixels, not the mask area
            centroid = ((x1 + x2) / 2, (y1 + y2) / 2)
            bounding_box = (x1, y1, x2, y2)

            # Look up the class name predicted by the model instead of hard-coding it
            class_name = class_names[int(class_ids[i])]

            csvwriter.writerow([image_filename, class_name, object_number, area, centroid, bounding_box])

print("Object-level information saved to CSV file.")



0: 512x512 58 Nucleis, 15.4ms
Speed: 3.7ms preprocess, 15.4ms inference, 4.0ms postprocess per image at shape (1, 3, 512, 512)

0: 512x512 38 Nucleis, 10.4ms
Speed: 2.0ms preprocess, 10.4ms inference, 3.0ms postprocess per image at shape (1, 3, 512, 512)

0: 512x512 112 Nucleis, 10.3ms
Speed: 2.3ms preprocess, 10.3ms inference, 4.8ms postprocess per image at shape (1, 3, 512, 512)

0: 512x512 47 Nucleis, 10.1ms
Speed: 2.0ms preprocess, 10.1ms inference, 3.4ms postprocess per image at shape (1, 3, 512, 512)

0: 512x512 7 Nucleis, 8.1ms
Speed: 3.2ms preprocess, 8.1ms inference, 2.5ms postprocess per image at shape (1, 3, 512, 512)

0: 512x512 15 Nucleis, 9.4ms
Speed: 2.1ms preprocess, 9.4ms inference, 2.8ms postprocess per image at shape (1, 3, 512, 512)

0: 512x512 51 Nucleis, 11.7ms
Speed: 1.5ms preprocess, 11.7ms inference, 3.3ms postprocess per image at shape (1, 3, 512, 512)

0: 512x512 39 Nucleis, 8.9ms
Speed: 2.2ms preprocess, 8.9ms inference, 4.4ms postprocess per image at shape

Object-level information saved to CSV file.
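
Once the CSV exists, a quick per-image summary can be computed with the standard library alone. The sketch below assumes the header written above and runs on a few made-up rows held in memory instead of the real output file; `summarize_objects` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Made-up rows using the same header as the CSV written above
sample_csv = '''File Name,Class Name,Object Number,Area,Centroid,BoundingBox
img_a.png,Nuclei,1,120.0,"(10.0, 12.0)","(4.0, 7.0, 16.0, 17.0)"
img_a.png,Nuclei,2,80.0,"(30.0, 40.0)","(26.0, 35.0, 34.0, 45.0)"
img_b.png,Nuclei,1,400.0,"(50.0, 60.0)","(40.0, 50.0, 60.0, 70.0)"
'''


def summarize_objects(csv_file):
    """Return {file name: (object count, mean box area)} from an open CSV file."""
    areas = defaultdict(list)
    for row in csv.DictReader(csv_file):
        areas[row["File Name"]].append(float(row["Area"]))
    return {name: (len(vals), sum(vals) / len(vals)) for name, vals in areas.items()}


summary = summarize_objects(io.StringIO(sample_csv))
for name, (count, mean_area) in summary.items():
    print(f"{name}: {count} objects, mean box area {mean_area:.1f}")
```

For the real file, the same function could be called on `open(output_csv_path, newline='')`.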
