<a href="https://colab.research.google.com/github/bnsreenu/python_for_microscopists/blob/master/334_training_YOLO_V8_EM_platelets_converted_labels.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

https://youtu.be/ytlhMAF6ok0

# **Custom instance segmentation model training using YOLOv8**
<p>
This notebook walks you through training a custom YOLOv8 instance segmentation model on your own data. Here, I am using a public dataset with multiple segmentation classes. It is the same dataset used in tutorial 330 (Detectron2): https://youtu.be/cEgF0YknpZw

<p>
Dataset from: https://leapmanlab.github.io/dense-cell/
<br>
Direct link to the dataset: https://www.dropbox.com/s/68yclbraqq1diza/platelet_data_1219.zip

**Data courtesy of:**
Guay, M.D., Emam, Z.A.S., Anderson, A.B. et al.
Dense cellular segmentation for EM using 2D–3D neural network ensembles. Sci Rep 11, 2561 (2021).
<p>
To prepare this dataset for YOLO, the binary masks were converted to the YOLO annotation format. The following tutorial covers the conversion process: <br>
(https://youtu.be/NYeJvxe5nYw)
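
For reference, a YOLO segmentation label file holds one line per object: the class index followed by the polygon's x y pairs, each normalized by the image width and height. The following is a minimal sketch of that final formatting step only. `polygon_to_yolo_line` is a hypothetical helper, not code from the linked tutorial, and it assumes the polygon has already been extracted from the binary mask as pixel coordinates.

```python
def polygon_to_yolo_line(class_id, points, img_width, img_height):
    """Format one polygon as a YOLO segmentation label line.

    points: list of (x, y) pixel coordinates along the object outline.
    """
    coords = []
    for x, y in points:
        # Normalize to the 0-1 range expected in YOLO label files
        coords.append(f"{x / img_width:.6f}")
        coords.append(f"{y / img_height:.6f}")
    return f"{class_id} " + " ".join(coords)


# Example: a triangle in a 512x512 image, labeled as class 0
line = polygon_to_yolo_line(0, [(0, 0), (256, 0), (256, 512)], 512, 512)
print(line)  # 0 0.000000 0.000000 0.500000 0.000000 0.500000 1.000000
```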

<p>

If you already have annotations in a COCO-format JSON file, for example from annotating with makesense (https://www.makesense.ai/), you can import them into Roboflow (https://roboflow.com/) for conversion to YOLO format: upload your images along with the JSON file and Roboflow will convert the annotations to other formats, in our case YOLOv8. If you are starting from scratch, you can annotate your images directly on Roboflow. <p>

For information about YOLO models: <p>
https://docs.ultralytics.com/models/yolov8/#key-features
<p>


**Install the required libraries:**

Let us start by installing the ultralytics library. All other libraries should be pre-installed on Colab. If you are working on a local system, please make sure you install matplotlib, Pillow, numpy, seaborn, and roboflow. You may also want to install pandas and other libraries depending on the task.
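
On a local system, a quick check like the one below can tell you which of these libraries still need to be installed. This is only a sketch: `missing_packages` and the `REQUIRED` list are illustrative names, and the list uses import names rather than pip names (Pillow is imported as `PIL`).

```python
import importlib.util

# Import names (not pip names): Pillow is imported as PIL
REQUIRED = ["ultralytics", "matplotlib", "PIL", "numpy", "seaborn", "roboflow"]


def missing_packages(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("Missing:", missing_packages(REQUIRED))
```

Anything reported as missing can then be installed with pip before continuing.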

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [None]:
cd drive/My Drive/hojung/yolo_1023

In [None]:
# Install the ultralytics package using pip
!pip install ultralytics

In [None]:
from ultralytics import YOLO
from matplotlib import pyplot as plt
from PIL import Image

**Import a model and populate it with pre-trained weights.**
<p>
Here, we are importing an instance segmentation model with pre-trained weights. For a list of pre-trained models, check out: https://docs.ultralytics.com/models/yolov8/#key-features

In [None]:
# Instance segmentation model
# model = YOLO('yolov8n-seg.yaml')  # alternative: build a new, untrained model from YAML
model = YOLO('yolov8n-seg.pt')  # load a pretrained model (recommended for training)

**Install Roboflow**
<p>
Roboflow lets us read our training data directly. On Colab, we first need a workaround for an encoding issue on the platform. Other tasks in Colab can run into the same issue, so let's go ahead and run the following cell.

In [None]:
# Without this, Colab gives an error when installing Roboflow
import locale
locale.getpreferredencoding = lambda: "UTF-8"

If your annotations are on Roboflow, you can import the training data directly using your API key.
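
One way to keep the key out of the notebook is to read it from an environment variable and only prompt when it is not set. The sketch below assumes a variable named `ROBOFLOW_API_KEY`; that name and the helper `get_roboflow_token` are illustrative choices, not something Roboflow requires.

```python
import os
from getpass import getpass


def get_roboflow_token(env_var="ROBOFLOW_API_KEY"):
    """Return the API key from an environment variable, or prompt for it."""
    token = os.environ.get(env_var)
    if not token:
        # Prompt without echoing the key into the notebook output
        token = getpass("Enter Roboflow API key: ")
    return token
```

Calling `token = get_roboflow_token()` then gives a value that can be passed as `api_key`.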

In [None]:
#!pip install roboflow --quiet
#%cd /content/

In [None]:
# To hide your API key from others, you can use getpass
#from getpass import getpass
#token = getpass('Enter Token Here') #I stored my token in a file on my Google Drive. I will enter it when prompted here.

In [None]:
# Import your data from Roboflow
"""
from roboflow import Roboflow
rf = Roboflow(api_key=token)
project = rf.workspace("python-for-microscopists-nceyk").project("3d-em-platelet")
dataset = project.version(2).download("yolov8")
"""
#Change the working directory to the downloaded data directory and check the yaml file.
#%cd /content/your_dataset

Let us load the YAML file that contains the class names, the number of classes, and the directories for the train, valid, and test datasets.
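
If you are not exporting from Roboflow, you can write this file yourself. The sketch below builds the text of a dataset YAML with the keys described above (`train`, `val`, `test`, `nc`, `names`). The helper `make_data_yaml`, the paths, and the class name are placeholders for illustration, so adjust them to your own directory layout.

```python
def make_data_yaml(train_dir, val_dir, test_dir, class_names):
    """Build the text of a YOLO dataset YAML file."""
    names = ", ".join(f"'{name}'" for name in class_names)
    lines = [
        f"train: {train_dir}",
        f"val: {val_dir}",
        f"test: {test_dir}",
        f"nc: {len(class_names)}",  # number of classes
        f"names: [{names}]",
    ]
    return "\n".join(lines) + "\n"


# Placeholder paths and class name, for illustration only
text = make_data_yaml("train/images", "valid/images", "test/images", ["speed_hump"])
print(text)
```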

In [None]:
# this is the YAML file Roboflow wrote for us that we're loading into this notebook with our data
#%cat /content/drive/MyDrive/ColabNotebooks/data/3D-EM-Platelet/YOLOV8-data/data.yaml

In [None]:
pwd

In [None]:
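obj_name = "speed_hump"  # object class; used to build file and directory names
yaml_name = "data_" + obj_name + ".yaml"  # dataset YAML file
proj_fname = obj_name + "-"  # name of the subdirectory for this training run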

In [None]:
ls

In [None]:
# Read the number of classes from the dataset YAML
import yaml
#with open("/content/drive/MyDrive/ColabNotebooks/data/3D-EM-Platelet/YOLOV8-data/data.yaml", 'r') as stream:
with open(yaml_name, 'r') as stream:
    num_classes = str(yaml.safe_load(stream)['nc'])
    print(num_classes)

**Train the model**

In [None]:
# Define a project --> destination directory for all results
project = "hojung_results"
# Define a subdirectory for this specific training run
name = proj_fname  # note: if you run the training again, a new directory with a number appended to this name is created

In [None]:
# Train the model
results = model.train(data=yaml_name,
                      project=project,
                      name=name,
                      epochs=20,
                      patience=0, #I am setting patience=0 to disable early stopping.
                      batch=8,
                      imgsz=512)

All training curves, metrics, and other results are stored inside the `hojung_results` project directory, in the subdirectory named after this run. Many of them are saved as images. Let us open a couple of these images.
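
Besides the images, Ultralytics also writes a `results.csv` file with per-epoch values into the same run directory. The sketch below reads the final row with the standard `csv` module. `last_epoch_metrics` is a hypothetical helper, and it makes no assumption about which columns are present; it only strips any padding around column names and converts the values to floats.

```python
import csv


def last_epoch_metrics(csv_path):
    """Return the final row of a results CSV as a {column: float} dict."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        return {}
    # Strip padding that may surround column names; float() ignores padding in values
    return {key.strip(): float(value) for key, value in rows[-1].items()}


# Usage with the paths from this notebook:
# last_epoch_metrics("hojung_results/" + proj_fname + "/results.csv")
```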

In [None]:
from IPython.display import Image  # note: this shadows PIL's Image imported earlier; PIL's Image is re-imported before inference

In [None]:
Image("hojung_results/"+proj_fname+"/results.png")

In [None]:
Image(filename="hojung_results/"+proj_fname+"/train_batch1.jpg", width=900)

**Run inference**

Now that our model is trained, we can use it for inference.

You can load the best model (`best.pt`) or the latest (`last.pt`). I am picking the best.
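
The evaluation loop further down compares each predicted mask with its ground-truth mask using pixel counts: precision is overlap / predicted pixels, recall is overlap / ground-truth pixels, F1 is their harmonic mean, and IoU is overlap / union. As a small worked example, here is a pure-Python sketch of these formulas on two tiny masks. `binary_mask_metrics` is an illustrative helper, and returning 0.0 for an empty denominator is an assumption made here for simplicity.

```python
def binary_mask_metrics(gt_rows, pred_rows):
    """Precision, recall, F1, and IoU for two binary masks given as rows of 0/1."""
    gt = [bool(v) for row in gt_rows for v in row]
    pred = [bool(v) for row in pred_rows for v in row]
    overlap = sum(g and p for g, p in zip(gt, pred))  # pixels set in both masks
    union = sum(g or p for g, p in zip(gt, pred))  # pixels set in either mask
    precision = overlap / sum(pred) if any(pred) else 0.0
    recall = overlap / sum(gt) if any(gt) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = overlap / union if union else 0.0
    return precision, recall, f1, iou


# Ground truth and prediction share 2 of their 4 foreground pixels each
gt = [[1, 1, 0, 0],
      [1, 1, 0, 0]]
pred = [[0, 1, 1, 0],
        [0, 1, 1, 0]]
print(binary_mask_metrics(gt, pred))  # precision 0.5, recall 0.5, F1 0.5, IoU 2/6
```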

In [None]:
"hojung_results/"+proj_fname+"/weights/best.pt"

In [None]:
my_new_model = YOLO("hojung_results/"+proj_fname+"/weights/best.pt")

In [None]:
import os

import numpy as np
from PIL import Image

# Test images and their ground-truth masks.
# Mask files are assumed to sort in the same order as the test images.
test_dir2 = "datasets/test/" + obj_name
test_dir_mask = test_dir2 + "/mask/" + obj_name
test_dir = test_dir2 + "/images"
conf_D = {}  # confidence threshold -> mean [Precision, Recall, F1, IoU, Pixel accuracy]

LL = sorted(os.listdir(test_dir_mask))
tot_pixel = 512 * 512  # assumes 512x512 masks and predictions

for confidence_value in range(1, 9):
    confidence_value = confidence_value / 10  # sweep the threshold from 0.1 to 0.8
    print(confidence_value)

    new_results = my_new_model.predict(test_dir, imgsz=512, conf=confidence_value)  # adjust conf threshold

    # Collect predicted instance masks per image; use an empty mask when nothing is detected
    L_A = []
    for result in new_results:
        if result.masks is not None:
            pred_A = result.masks.data.cpu().numpy()  # shape: (n_instances, H, W)
        else:
            pred_A = np.zeros((512, 512))
        L_A.append(pred_A)

    TOT = np.zeros(5)
    for i in range(len(L_A)):
        # Merge all instance masks into a single binary foreground mask
        if L_A[i].ndim == 3:
            mask_pred_A = L_A[i].sum(axis=0)
        else:
            mask_pred_A = L_A[i]
        mask_pred_A = (mask_pred_A > 0.5).astype(np.uint8)

        # Load the ground-truth mask (first channel only) and binarize it
        mask_A = np.array(Image.open(test_dir_mask + "/" + LL[i]))
        mask_A = (mask_A[:, :, 0] > 0).astype(np.uint8)

        overlap = np.sum(mask_A * mask_pred_A)  # logical AND
        union = np.sum((mask_A + mask_pred_A) > 0)  # logical OR

        # Metrics with an empty denominator are set to 0.0 to avoid division by zero
        pred_sum = mask_pred_A.sum()
        gt_sum = mask_A.sum()
        Precision = overlap / pred_sum if pred_sum > 0 else 0.0
        Recall = overlap / gt_sum if gt_sum > 0 else 0.0
        if (Precision + Recall) == 0:
            F1 = 0.0
        else:
            F1 = (2 * Precision * Recall) / (Precision + Recall)
        IOU = overlap / union if union > 0 else 0.0
        Pixel_Accuracy = np.sum(mask_pred_A == mask_A) / tot_pixel

        TOT = TOT + np.array([Precision, Recall, F1, IOU, Pixel_Accuracy])

    # Average each metric over all test images for this threshold
    mean_metrics = TOT / len(L_A)
    print("Mean [Precision, Recall, F1, IoU, Pixel accuracy]:", mean_metrics)
    conf_D[confidence_value] = mean_metrics
conf_D
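
Once `conf_D` is filled, a natural next step is to pick the confidence threshold that scores best on one of the metrics. The sketch below selects by F1 (index 2 in the metric order used above). `best_confidence` is a hypothetical helper, and the numbers in `example` are made up purely to show the call; they are not results from this model.

```python
def best_confidence(conf_scores, metric_index=2):
    """Return the confidence threshold whose metric (default: F1, index 2) is highest."""
    return max(conf_scores, key=lambda conf: conf_scores[conf][metric_index])


# Made-up scores in the order [Precision, Recall, F1, IoU, Pixel accuracy]
example = {
    0.1: [0.60, 0.90, 0.72, 0.56, 0.95],
    0.3: [0.75, 0.85, 0.80, 0.66, 0.96],
    0.5: [0.85, 0.65, 0.74, 0.58, 0.95],
}
print(best_confidence(example))  # 0.3
```

With the dictionary from this notebook, the call would be `best_confidence(conf_D)`.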
