# Initial analysis

## Executive summary

- the training set was restructured for this study so that it no longer contains the validation images
- not all images appear to be correctly or completely annotated
- on the test set, the best model reaches mAP50 = 0.939 (precision 0.868, recall 0.891)

## More information

**1 & 2 - restructuring the validation and training datasets & creating labels**

As the dataset repository and the paper explain, the validation set is a subset of the training set. Validating on images the model was trained on gives an optimistic, uninformative estimate, so the split is rebuilt here: the original training set is treated as train+val, and the new training set (train_fixed) is the original training set without the validation images. This fixes the methodological mistake in the original dataset (validating on training data).

**3 - img labels**

The images show that the annotations are not perfect: some boxes are imprecise and some cells are not annotated at all. Because a model can only be as good as its input data, this likely sets an upper limit on its performance.

**4 - training & results**

**(extra) possible improvements**

- increase the size and improve the quality of the dataset
- test other models: larger YOLO variants and different architectures
- improve the pre-processing: verify the dataset and refine the other preparation steps
- improve the post-processing: tune and evaluate the _conf_ and _iou_ thresholds so that all objects are selected while avoiding false positives
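The _iou_ value in the last item is the intersection-over-union overlap between two boxes. A minimal sketch of that computation, assuming boxes in the `(xmin, ymin, xmax, ymax)` corner format used by the annotations; the function name `box_iou` is illustrative and not part of any library:

```python
def box_iou(box_a, box_b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    # corners of the intersection rectangle
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # width and height are clamped at zero when the boxes do not overlap
    inter = max(0, ix_max - ix_min) * max(0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# two made-up 10x10 boxes that share a 5x5 corner
print(box_iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175, about 0.143
```

In non-maximum suppression, a lower-scoring box whose IoU with a higher-scoring box exceeds the _iou_ threshold is discarded, so this threshold trades duplicate detections against missed neighbouring cells.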


## imports & configs


In [1]:
#### default imports ####
import numpy as np
import os
import sys
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

### specific imports ###
import xml.etree.ElementTree as ET
from matplotlib.patches import Ellipse

# forces local code to be reloaded to avoid problems
%load_ext autoreload
%autoreload 2

#### important configs ####
# uses seaborn configs for prettier graphs
sns.set_theme()
# shows thousand separator for values
pd.options.display.float_format = '{:,.2f}'.format
# enable import from src/
sys.path.append('..')  

#### paths ####
# change path to base folder
project_path = "/mnt/c/Users/nicol/My Drive/personal/coding projects/2024/blood-cell-detection"

## auxiliary functions


In [2]:
def get_annotations(xml_path):
    """Parse a Pascal VOC style XML file into [label, xmin, ymin, xmax, ymax] rows."""
    tree = ET.parse(xml_path)
    root = tree.getroot()
    sample_annotations = []

    # every <object> element describes one annotated cell
    for obj in root.iter("object"):
        label = obj.find("name").text
        bndbox = obj.find("bndbox")
        xmin = int(bndbox.find("xmin").text)
        ymin = int(bndbox.find("ymin").text)
        xmax = int(bndbox.find("xmax").text)
        ymax = int(bndbox.find("ymax").text)

        sample_annotations.append([label, xmin, ymin, xmax, ymax])

    return sample_annotations
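To make the structure that `get_annotations` returns concrete, here is a small sketch that applies the same parsing logic to an in-memory string. The XML snippet and the helper name `parse_annotations` are made up for illustration; only the tag names (`object`, `name`, `bndbox`, `xmin`, ...) come from the function above.

```python
import xml.etree.ElementTree as ET

# made-up annotation snippet, only for illustration
SAMPLE_XML = """
<annotation>
  <object>
    <name>RBC</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>140</ymax></bndbox>
  </object>
  <object>
    <name>WBC</name>
    <bndbox><xmin>200</xmin><ymin>50</ymin><xmax>380</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>
"""


def parse_annotations(xml_text):
    """Same logic as get_annotations, but reading from a string instead of a file."""
    root = ET.fromstring(xml_text.strip())
    annotations = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        coords = [int(box.find(tag).text) for tag in ("xmin", "ymin", "xmax", "ymax")]
        annotations.append([obj.find("name").text, *coords])
    return annotations


sample = parse_annotations(SAMPLE_XML)
print(sample)  # [['RBC', 10, 20, 110, 140], ['WBC', 200, 50, 380, 260]]
```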

# 0 - data prep


In [3]:
%cd {project_path}

# # clone dataset in the data/raw folder
# import shutil
# if os.path.exists('data/'):
#     shutil.rmtree('data/')  # os.removedirs only removes empty directories
# os.makedirs('data/processed')
# os.makedirs('data/raw')

# %cd "data/raw"

# !git clone git@github.com:MahmudulAlam/Complete-Blood-Cell-Count-Dataset.git

# %cd {project_path}

/mnt/c/Users/nicol/My Drive/personal/coding projects/2024/blood-cell-detection



# 1 - validation dataset problem


In [4]:
# # where the images & labels are
# raw_path = "data/raw/Complete-Blood-Cell-Count-Dataset"

# # gets the path for all files
# df_all = pd.DataFrame()
# for dirname, _, filenames in os.walk(raw_path):
#     paths = [dirname + "/" + filename for filename in filenames]
#     folder_name = os.path.split(dirname)[-1]
#     df_all = pd.concat([df_all, pd.DataFrame({"path": paths})], ignore_index=True)

# # transforms to df
# df_all = pd.DataFrame(df_all)

# # also gets the filename
# df_all["filename"] = df_all["path"].apply(lambda s: s.split("/")[-1])

# # and finally check possible extensions
# extensions = df_all["path"].apply(lambda s: s.split(".")[-1])
# extensions.value_counts()

In [5]:
# # creates a reference for the dataset (which folder it is from)
# df_all["dataset"] = df_all["path"].apply(lambda s: s.split("/")[-3])
# df_all["dataset"].value_counts()

In [6]:
# # check if all the files in validation dataset are also in the training one
# for filename in df_all[df_all["dataset"] == "Validation"]["filename"]:
#     if filename not in df_all[df_all["dataset"] == "Training"]["filename"].values:
#         print(filename)

**Conclusion:** All files in the validation folder are duplicates of files in the training folder, as the paper and the GitHub repository explain. Using this split as-is would mean validating on training data, which is a methodological problem, so the split is not used unchanged in this study.
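The same check can be expressed with set operations, which also makes the size of the overlap easy to report. This is a minimal sketch with made-up filenames; `split_overlap` is a hypothetical helper, not something defined elsewhere in this notebook.

```python
def split_overlap(train_files, val_files):
    """Return the filenames that are present in both splits, sorted."""
    return sorted(set(train_files) & set(val_files))


# made-up filenames, only for illustration
train = ["a.jpg", "b.jpg", "c.jpg"]
val = ["b.jpg", "c.jpg"]

overlap = split_overlap(train, val)
print(f"{len(overlap)} of {len(val)} validation files also appear in training")
```

With the real data, the two lists would be the `filename` columns of `df_all` filtered by `dataset`.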


# 2 - fixing datasets


## .1 - adapting train, val & test


In [7]:
# # copying all the datasets to the processed folder
# !cp -r data/raw/Complete-Blood-Cell-Count-Dataset/Training data/processed/Training
# !cp -r data/raw/Complete-Blood-Cell-Count-Dataset/Validation data/processed/Validation
# !cp -r data/raw/Complete-Blood-Cell-Count-Dataset/Testing data/processed/Testing
# print('Done!')

In [8]:
# # removing the duplicated images from the training dataset
# removed_files = 0
# for validation_file_path in df_all[df_all["dataset"] == "Validation"]["path"]:
#     validation_file_path_processed_folder = validation_file_path.replace(
#         "raw", "processed"
#     ).replace("Complete-Blood-Cell-Count-Dataset/", "")

#     if os.path.exists(
#         validation_file_path_processed_folder.replace("Validation", "Training")
#     ):
#         os.remove(
#             validation_file_path_processed_folder.replace("Validation", "Training")
#         )
#         removed_files += 1

# print(f"Removed {removed_files} files")
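The chained `replace` calls in the cell above operate on the whole path string, so they would also rewrite `raw` or `Validation` if those words appeared in a filename. A possible alternative, sketched here under the assumption that the folder layout is exactly the one used above, swaps path components instead. The name `training_twin` and the example filename are hypothetical.

```python
from pathlib import PurePosixPath

RAW_ROOT = PurePosixPath("data/raw/Complete-Blood-Cell-Count-Dataset")
PROCESSED_ROOT = PurePosixPath("data/processed")


def training_twin(validation_path):
    """Map a raw Validation file to its copy in the processed Training folder."""
    relative = PurePosixPath(validation_path).relative_to(RAW_ROOT)
    if relative.parts[0] != "Validation":
        raise ValueError(f"not a validation file: {validation_path}")
    # relative looks like Validation/<subfolder>/<file>; swap only the first part
    parts = ("Training",) + relative.parts[1:]
    return str(PROCESSED_ROOT.joinpath(*parts))


print(training_twin(
    "data/raw/Complete-Blood-Cell-Count-Dataset/Validation/Images/example.jpg"
))
```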

## .2 - verify and recreate df


In [9]:
# # where the images & labels are
# processed_path = "data/processed"

# # gets the path for all files
# df_processed = pd.DataFrame()
# for dirname, _, filenames in os.walk(processed_path):
#     paths = [dirname + "/" + filename for filename in filenames]
#     folder_name = os.path.split(dirname)[-1]
#     df_processed = pd.concat(
#         [df_processed, pd.DataFrame({"path": paths})], ignore_index=True
#     )

# # transforms to df
# df_processed = pd.DataFrame(df_processed)

# # also gets the filename
# df_processed["filename"] = df_processed["path"].apply(lambda s: s.split("/")[-1])

# # and finally check possible extensions
# extensions = df_processed["path"].apply(lambda s: s.split(".")[-1])
# extensions.value_counts()

In [10]:
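# # reference for the processed dataset (which folder each file is from)
# df_processed["dataset"] = df_processed["path"].apply(lambda s: s.split("/")[-3])
# processed_images = df_processed[
#     df_processed["filename"].apply(lambda s: s.split(".")[-1] in ["jpg"])
# ]
# # check that no file from the validation dataset is left in the training one
# processed_train = df_processed[df_processed["dataset"] == "Training"]["filename"].values
# for filename in df_processed[df_processed["dataset"] == "Validation"]["filename"]:
#     if filename in processed_train:
#         print(filename)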

## .3 - creating labels


In [11]:
# # definitions for the dataset
# WIDTH = 640
# HEIGHT = 480
# cells_id = {"RBC": 0, "WBC": 1, "Platelets": 2}

# cells_classes = list(cells_id.keys())
# cells_classes
# # saves the dataset into the yolo format
# for i, (index, row) in enumerate(processed_images.iterrows()):
#     # get annotations
#     annotations = get_annotations(
#         row["path"].replace("Images", "Annotations").replace("jpg", "xml")
#     )

#     # get label path
#     label_path = row["path"].replace("Images", "labels").replace("jpg", "txt")

#     # create folders
#     os.makedirs(os.path.split(label_path)[0], exist_ok=True)

#     # save annotations
#     with open(label_path, "w") as file:
#         for label, xmin, ymin, xmax, ymax in annotations:
#             # get the center of the rectangle
#             x_center = (xmin + xmax) / 2
#             y_center = (ymin + ymax) / 2

#             # normalize the values
#             x_center /= WIDTH
#             y_center /= HEIGHT
#             width = (xmax - xmin) / WIDTH
#             height = (ymax - ymin) / HEIGHT

#             # save the values
#             file.write(f"{cells_id[label]} {x_center} {y_center} {width} {height}\n")

# print("done!")
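The conversion inside the loop above can be isolated into a small function, which makes it easy to check by hand. This is a sketch of the same arithmetic; the function name `voc_to_yolo` is illustrative, and the default image size is the 640x480 assumed in the cell above.

```python
def voc_to_yolo(xmin, ymin, xmax, ymax, img_width=640, img_height=480):
    """Convert pixel corner coordinates to normalized YOLO (x_center, y_center, w, h)."""
    x_center = (xmin + xmax) / 2 / img_width
    y_center = (ymin + ymax) / 2 / img_height
    width = (xmax - xmin) / img_width
    height = (ymax - ymin) / img_height
    return x_center, y_center, width, height


# a made-up box, 200 px wide and 240 px tall
print(voc_to_yolo(100, 120, 300, 360))  # (0.3125, 0.5, 0.3125, 0.5)
```

All four values fall in the range 0 to 1 as long as the box lies inside the image, which is what the YOLO label format expects.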

## .4 - saving yaml


In [12]:
# yaml_file = "data/processed/blood_cell_dataset.yaml"
# full_path = "/mnt/c/Users/nicol/My Drive/personal/coding projects/2024/blood-cell-detection/data/processed"
# train_images_dir = "Training/images/"
# val_images_dir = "Validation/images/"
# test_images_dir = "Testing/images/"

# names_str = ""
# for item in cells_classes:
#     names_str = names_str + ", '%s'" % item
# names_str = "names: [" + names_str[1:] + "]"

# with open(yaml_file, "w") as wobj:
#     wobj.write("path: %s\n" % full_path)
#     wobj.write("train: %s\n" % train_images_dir)
#     wobj.write("val: %s\n" % val_images_dir)
#     # wobj.write("test: %s\n" % test_images_dir)
#     wobj.write("nc: %d\n" % len(cells_classes))
#     wobj.write(names_str + "\n")
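For reference, the content written by the cell above can also be built as a single string, which is easier to inspect before saving. A minimal sketch follows; `build_dataset_yaml` is a hypothetical helper, and the optional `test` entry is included on the assumption that a later `model.val(split="test")` call looks up that key in the file.

```python
def build_dataset_yaml(root, train_dir, val_dir, class_names, test_dir=None):
    """Return the text of a dataset description file with the keys used above."""
    lines = [f"path: {root}", f"train: {train_dir}", f"val: {val_dir}"]
    if test_dir is not None:
        lines.append(f"test: {test_dir}")
    lines.append(f"nc: {len(class_names)}")
    quoted = ", ".join(f"'{name}'" for name in class_names)
    lines.append(f"names: [{quoted}]")
    return "\n".join(lines) + "\n"


yaml_text = build_dataset_yaml(
    "data/processed",
    "Training/images/",
    "Validation/images/",
    ["RBC", "WBC", "Platelets"],
    test_dir="Testing/images/",
)
print(yaml_text)
```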

# 3 - img visualization


In [13]:
# # select only images
# images = df_all[df_all["filename"].apply(lambda s: s.split(".")[-1] in ["jpg"])]

# label_colors = {"RBC": "red", "WBC": "white", "Platelets": "purple"}

# # show 3 images
# fig, ax = plt.subplots(1, 3, figsize=(15, 5))
# for i, (index, row) in enumerate(images.sample(3, random_state=42).iterrows()):
#     # img show
#     img = plt.imread(row["path"])
#     ax[i].imshow(img)
#     ax[i].axis("off")
#     ax[i].set_title(f"{row['dataset']} - {row['filename']}")

#     # get annotations
#     annotations = get_annotations(
#         row["path"].replace("Images", "Annotations").replace("jpg", "xml")
#     )
#     print(annotations)

#     # show annotations
#     for label, xmin, ymin, xmax, ymax in annotations:
#         ax[i].add_patch(
#             # plt.Rectangle(
#             #     (xmin, ymin),
#             #     xmax - xmin,
#             #     ymax - ymin,
#             #     linewidth=2,
#             #     edgecolor=label_colors[label],
#             #     facecolor="none",
#             # )
#             Ellipse(
#                 ((xmin + xmax) / 2, (ymin + ymax) / 2),
#                 xmax - xmin,
#                 ymax - ymin,
#                 linewidth=2,
#                 edgecolor=label_colors[label],
#                 facecolor="none",
#             )
#         )
#         ax[i].text(xmin, ymin, label, fontsize=12, color="k")

**Info:** The images show that the annotations are not perfect: some boxes are imprecise and some cells are not annotated at all. Because a model can only be as good as its input data, this likely sets an upper limit on its performance.
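Part of the annotation quality can be checked automatically. The sketch below flags boxes that are empty, inverted, or that extend beyond the image; it cannot detect cells that were never annotated, which still requires visual inspection. The helper name `find_suspicious_boxes` and the example rows are made up; the row layout matches what `get_annotations` returns, and the 640x480 image size is an assumption.

```python
def find_suspicious_boxes(annotations, img_width=640, img_height=480):
    """Return the annotations whose box is empty, inverted, or outside the image."""
    suspicious = []
    for label, xmin, ymin, xmax, ymax in annotations:
        inverted = xmin >= xmax or ymin >= ymax
        outside = xmin < 0 or ymin < 0 or xmax > img_width or ymax > img_height
        if inverted or outside:
            suspicious.append([label, xmin, ymin, xmax, ymax])
    return suspicious


# made-up rows: one valid box, one inverted box, one box past the right edge
rows = [
    ["RBC", 10, 20, 110, 140],
    ["WBC", 300, 50, 280, 260],
    ["Platelets", 600, 400, 660, 470],
]
flagged = find_suspicious_boxes(rows)
print(flagged)
```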


# 4 - training


In [14]:
from ultralytics import YOLO, settings

# # remove and recreate folders
# if os.path.exists("yolo_data/"):
#     %rm -r yolo_data/
# os.makedirs("yolo_data/datasets", exist_ok=True)
# os.makedirs("yolo_data/weights", exist_ok=True)
# os.makedirs("yolo_data/runs", exist_ok=True)

# Update a setting
settings.update(
    {
        "datasets_dir": "yolo_data/datasets",
        "weights_dir": "yolo_data/weights",
        "runs_dir": "yolo_data/runs",
    }
)

## .1 - initial baseline


In [15]:
# from ultralytics import YOLO

# # Load a model
# model = YOLO("models/yolov8n.pt")  # load a pretrained model (recommended for training)

# results = model.train(
#     data="data/processed/blood_cell_dataset.yaml",
#     epochs=3,
#     imgsz=640,
#     batch=4,
# )

# # show results
# for dataset in ["train", "val"]:
#     print("-" * 30)
#     print(dataset)
#     print("-" * 30)

#     results = model.val(split=dataset)
#     print(results.results_dict)

## .2 - hyperparam tuning


In [16]:
# from ultralytics import YOLO

# # Load a model
# model = YOLO("models/yolov8n.pt")  # load a pretrained model (recommended for training)

# results = model.tune(data="data/processed/blood_cell_dataset.yaml", use_ray=True)

# # show results
# for dataset in ["train", "val"]:
#     print("-" * 30)
#     print(dataset)
#     print("-" * 30)

#     results = model.val(split=dataset)
#     print(results.results_dict)

Current time:,2024-03-26 19:06:44
Running for:,00:00:24.92
Memory:,4.8/7.6 GiB

Trial name,# failures,error file
_tune_1319d_00000,1,"/home/branco/ray_results/_tune_2024-03-26_19-06-19/_tune_1319d_00000_0_bgr=0.3711,box=0.0389,cls=2.1656,copy_paste=0.4410,degrees=26.3565,fliplr=0.0968,flipud=0.4812,hsv_h=0.0271,hs_2024-03-26_19-06-19/error.txt"
_tune_1319d_00001,1,"/home/branco/ray_results/_tune_2024-03-26_19-06-19/_tune_1319d_00001_1_bgr=0.4649,box=0.1236,cls=3.2402,copy_paste=0.7064,degrees=10.2755,fliplr=0.0474,flipud=0.0078,hsv_h=0.0840,hs_2024-03-26_19-06-19/error.txt"
_tune_1319d_00002,1,"/home/branco/ray_results/_tune_2024-03-26_19-06-19/_tune_1319d_00002_2_bgr=0.7950,box=0.1308,cls=1.4118,copy_paste=0.7779,degrees=23.0410,fliplr=0.7148,flipud=0.0759,hsv_h=0.0531,hs_2024-03-26_19-06-19/error.txt"
_tune_1319d_00003,1,"/home/branco/ray_results/_tune_2024-03-26_19-06-19/_tune_1319d_00003_3_bgr=0.0617,box=0.1246,cls=2.6225,copy_paste=0.6028,degrees=24.5582,fliplr=0.2839,flipud=0.5856,hsv_h=0.0615,hs_2024-03-26_19-06-19/error.txt"
_tune_1319d_00004,1,"/home/branco/ray_results/_tune_2024-03-26_19-06-19/_tune_1319d_00004_4_bgr=0.6845,box=0.0529,cls=0.8542,copy_paste=0.1495,degrees=33.1614,fliplr=0.6759,flipud=0.0466,hsv_h=0.0499,hs_2024-03-26_19-06-19/error.txt"
_tune_1319d_00005,1,"/home/branco/ray_results/_tune_2024-03-26_19-06-19/_tune_1319d_00005_5_bgr=0.4120,box=0.1782,cls=2.9638,copy_paste=0.5097,degrees=23.4233,fliplr=0.4923,flipud=0.8136,hsv_h=0.0815,hs_2024-03-26_19-06-19/error.txt"
_tune_1319d_00006,1,"/home/branco/ray_results/_tune_2024-03-26_19-06-19/_tune_1319d_00006_6_bgr=0.0659,box=0.1087,cls=2.5421,copy_paste=0.5807,degrees=1.3995,fliplr=0.5301,flipud=0.5630,hsv_h=0.0864,hsv_2024-03-26_19-06-19/error.txt"
_tune_1319d_00007,1,"/home/branco/ray_results/_tune_2024-03-26_19-06-19/_tune_1319d_00007_7_bgr=0.4715,box=0.0316,cls=1.3059,copy_paste=0.2251,degrees=2.1704,fliplr=0.4388,flipud=0.2609,hsv_h=0.0860,hsv_2024-03-26_19-06-19/error.txt"
_tune_1319d_00008,1,"/home/branco/ray_results/_tune_2024-03-26_19-06-19/_tune_1319d_00008_8_bgr=0.7082,box=0.0527,cls=0.9937,copy_paste=0.1330,degrees=8.1114,fliplr=0.0937,flipud=0.9348,hsv_h=0.0479,hsv_2024-03-26_19-06-19/error.txt"
_tune_1319d_00009,1,"/home/branco/ray_results/_tune_2024-03-26_19-06-19/_tune_1319d_00009_9_bgr=0.6641,box=0.1380,cls=2.0261,copy_paste=0.9615,degrees=3.5635,fliplr=0.7646,flipud=0.4024,hsv_h=0.0661,hsv_2024-03-26_19-06-19/error.txt"

Trial name,status,loc,bgr,box,cls,copy_paste,degrees,fliplr,flipud,hsv_h,hsv_s,hsv_v,lr0,lrf,mixup,momentum,mosaic,perspective,scale,shear,translate,warmup_epochs,warmup_momentum,weight_decay
_tune_1319d_00000,ERROR,172.24.125.125:36357,0.371059,0.0389146,2.16556,0.441026,26.3565,0.0967778,0.481155,0.0270671,0.278034,0.343725,0.00379769,0.28605,0.853943,0.751195,0.0538385,0.000763472,0.213569,5.03471,0.74082,2.73976,0.582952,0.000473064
_tune_1319d_00001,ERROR,172.24.125.125:36358,0.464892,0.123556,3.24024,0.706442,10.2755,0.0474072,0.00776726,0.0839548,0.803034,0.29888,0.0710477,0.947558,0.435411,0.742416,0.711516,0.000242365,0.0555368,0.155871,0.733807,3.44085,0.293628,0.000667698
_tune_1319d_00002,ERROR,172.24.125.125:36700,0.795023,0.130813,1.41185,0.777924,23.041,0.714808,0.0759179,0.0531031,0.692856,0.207286,0.0359456,0.509895,0.218391,0.938708,0.533576,0.000615737,0.0856417,6.64948,0.433069,2.07758,0.948279,1.41616e-05
_tune_1319d_00003,ERROR,172.24.125.125:36705,0.0616779,0.124621,2.62248,0.602784,24.5582,0.28393,0.585594,0.0614872,0.286684,0.162391,0.0871631,0.254564,0.86555,0.963587,0.354937,0.000264511,0.666278,9.34885,0.634754,1.64331,0.832179,0.00063924
_tune_1319d_00004,ERROR,172.24.125.125:36969,0.68446,0.0528901,0.854161,0.149548,33.1614,0.67587,0.046551,0.0498766,0.0495777,0.680249,0.0384447,0.101571,0.00575697,0.87431,0.912389,0.000314133,0.168062,6.09032,0.792766,3.79447,0.27435,0.000186349
_tune_1319d_00005,ERROR,172.24.125.125:36970,0.412045,0.178247,2.96378,0.50971,23.4233,0.49226,0.813564,0.08151,0.256977,0.225689,0.0624636,0.13051,0.641903,0.654399,0.962866,0.00034602,0.745396,1.25829,0.624238,0.10077,0.406291,0.000567144
_tune_1319d_00006,ERROR,172.24.125.125:37229,0.0659276,0.10872,2.54214,0.580727,1.39949,0.53007,0.562975,0.0864006,0.692018,0.254461,0.0154045,0.57784,0.679385,0.640411,0.457263,0.000773378,0.732638,7.17354,0.250165,0.824727,0.349073,0.000489934
_tune_1319d_00007,ERROR,172.24.125.125:37230,0.471512,0.0316324,1.30594,0.22514,2.17041,0.438808,0.260917,0.0860186,0.130999,0.428727,0.0656304,0.189287,0.163994,0.628169,0.690567,0.000947846,0.347038,6.73174,0.131516,1.78414,0.644688,0.000366854
_tune_1319d_00008,ERROR,172.24.125.125:37495,0.708248,0.052658,0.993741,0.133045,8.11144,0.0937266,0.934751,0.04785,0.63378,0.791338,0.00659714,0.959226,0.815453,0.709723,0.894115,6.74434e-05,0.0681818,0.144801,0.5694,3.79008,0.585897,0.000608719
_tune_1319d_00009,ERROR,172.24.125.125:37496,0.664109,0.137975,2.02606,0.961544,3.56353,0.764644,0.402358,0.0660559,0.129158,0.639024,0.0989235,0.445234,0.934814,0.61084,0.483845,0.000622262,0.864144,5.81172,0.354958,0.23922,0.199366,0.000304534


(_tune pid=36358) New https://pypi.org/project/ultralytics/8.1.34 available 😃 Update with 'pip install -U ultralytics'


2024-03-26 19:06:24,597	ERROR tune_controller.py:1374 -- Trial task failed for trial _tune_1319d_00000
Traceback (most recent call last):
  File "/home/branco/miniconda3/envs/bc_detec/lib/python3.11/site-packages/ray/air/execution/_internal/event_manager.py", line 110, in resolve_future
    result = ray.get(future)
             ^^^^^^^^^^^^^^^
  File "/home/branco/miniconda3/envs/bc_detec/lib/python3.11/site-packages/ray/_private/auto_init_hook.py", line 22, in auto_init_wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/branco/miniconda3/envs/bc_detec/lib/python3.11/site-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/branco/miniconda3/envs/bc_detec/lib/python3.11/site-packages/ray/_private/worker.py", line 2624, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(RuntimeError): ray::ImplicitFunc.train() (pid=36357, ip=172.24.

(_tune pid=36358) Ultralytics YOLOv8.1.33 🚀 Python-3.11.8 torch-2.2.1+cu121 CPU (13th Gen Intel Core(TM) i5-13450HX)
(_tune pid=36358) engine/trainer: task=detect, mode=train, model=models/yolov8n.pt, data=data/processed/blood_cell_dataset.yaml, epochs=100, time=None, patience=100, batch=16, imgsz=640, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=train, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=None, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_fram


2024-03-26 19:06:44,656	ERROR tune.py:1038 -- Trials did not complete: [_tune_1319d_00000, _tune_1319d_00001, _tune_1319d_00002, _tune_1319d_00003, _tune_1319d_00004, _tune_1319d_00005, _tune_1319d_00006, _tune_1319d_00007, _tune_1319d_00008, _tune_1319d_00009]
2024-03-26 19:06:44,657	INFO tune.py:1042 -- Total run time: 24.99 seconds (24.80 seconds for the tuning loop).


------------------------------
train
------------------------------
Ultralytics YOLOv8.1.33 🚀 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 3050 6GB Laptop GPU, 6144MiB)
YOLOv8n summary (fused): 168 layers, 3151904 parameters, 0 gradients, 8.7 GFLOPs

Dataset 'coco.yaml' images not found ⚠️, missing path '/mnt/c/Users/nicol/My Drive/personal/coding projects/2024/blood-cell-detection/yolo_data/datasets/coco/val2017.txt'
Downloading https://github.com/ultralytics/yolov5/releases/download/v1.0/coco2017labels-segments.zip to '/mnt/c/Users/nicol/My Drive/personal/coding projects/2024/blood-cell-detection/yolo_data/datasets/coco2017labels-segments.zip'...


100%|██████████| 169M/169M [00:07<00:00, 22.6MB/s]
Unzipping /mnt/c/Users/nicol/My Drive/personal/coding projects/2024/blood-cell-detection/yolo_data/datasets/coco2017labels-segments.zip to /mnt/c/Users/nicol/My Drive/personal/coding projects/2024/blood-cell-detection/yolo_data/datasets/coco...:   1%|          | 1019/122232 [00:08<22:57, 88.02file/s]

## .3 - best model on test


In [18]:
# # Load model
# model = YOLO("yolo_data/runs/detect/train/weights/best.pt")


# # show results
# for dataset in ["train", "val", "test"]:
#     print("-" * 30)
#     print(dataset)
#     print("-" * 30)

#     results = model.val(split=dataset)
#     print(results.results_dict)

------------------------------
train
------------------------------
Ultralytics YOLOv8.1.33 🚀 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 3050 6GB Laptop GPU, 6144MiB)
Model summary (fused): 168 layers, 3006233 parameters, 0 gradients, 8.1 GFLOPs


val: Scanning /mnt/c/Users/nicol/My Drive/personal/coding projects/2024/blood-cell-detection/data/processed/Training/labels.cache... 240 images, 0 backgrounds, 0 corrupt: 100%|██████████| 240/240 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 15/15 [00:04<00:00,  3.47it/s]


                   all        240       3140      0.833      0.863      0.895        0.6
                   RBC        240       2641      0.642      0.875      0.816      0.554
                   WBC        240        244      0.961          1      0.992        0.8
             Platelets        240        255      0.898      0.714      0.878      0.445
Speed: 0.5ms preprocess, 4.1ms inference, 0.0ms loss, 6.8ms postprocess per image
Results saved to yolo_data/runs/detect/val3
{'metrics/precision(B)': 0.833433060250894, 'metrics/recall(B)': 0.8630504884018482, 'metrics/mAP50(B)': 0.8951037886622125, 'metrics/mAP50-95(B)': 0.5997314773492299, 'fitness': 0.6292687084805282}
------------------------------
val
------------------------------
Ultralytics YOLOv8.1.33 🚀 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 3050 6GB Laptop GPU, 6144MiB)


val: Scanning /mnt/c/Users/nicol/My Drive/personal/coding projects/2024/blood-cell-detection/data/processed/Validation/labels.cache... 60 images, 0 backgrounds, 0 corrupt: 100%|██████████| 60/60 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 4/4 [00:02<00:00,  1.70it/s]


                   all         60        812      0.774      0.885      0.879      0.576
                   RBC         60        700      0.576      0.874      0.795      0.529
                   WBC         60         63      0.957      0.984      0.985      0.749
             Platelets         60         49       0.79      0.796      0.858       0.45
Speed: 1.4ms preprocess, 12.3ms inference, 0.0ms loss, 12.6ms postprocess per image
Results saved to yolo_data/runs/detect/val4
{'metrics/precision(B)': 0.7739581050726904, 'metrics/recall(B)': 0.884777021919879, 'metrics/mAP50(B)': 0.8793331271631004, 'metrics/mAP50-95(B)': 0.5760130940472524, 'fitness': 0.6063450973588373}
------------------------------
test
------------------------------
Ultralytics YOLOv8.1.33 🚀 Python-3.11.8 torch-2.2.1+cu121 CUDA:0 (NVIDIA GeForce RTX 3050 6GB Laptop GPU, 6144MiB)


val: Scanning /mnt/c/Users/nicol/My Drive/personal/coding projects/2024/blood-cell-detection/data/processed/Testing/labels.cache... 60 images, 0 backgrounds, 0 corrupt: 100%|██████████| 60/60 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 4/4 [00:02<00:00,  1.57it/s]


                   all         60        908      0.868      0.891      0.939      0.659
                   RBC         60        792      0.775      0.965       0.95      0.701
                   WBC         60         61      0.964          1      0.995      0.804
             Platelets         60         55      0.867       0.71      0.873      0.472
Speed: 1.8ms preprocess, 7.2ms inference, 0.1ms loss, 18.8ms postprocess per image
Results saved to yolo_data/runs/detect/val5
{'metrics/precision(B)': 0.8683767326536994, 'metrics/recall(B)': 0.8914900080496357, 'metrics/mAP50(B)': 0.9392578897444391, 'metrics/mAP50-95(B)': 0.659138041862484, 'fitness': 0.6871500266506796}
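The `fitness` value in the printed dictionaries appears to be a weighted sum of the two mAP metrics: weights of 0.1 for mAP50 and 0.9 for mAP50-95 reproduce the reported numbers. The sketch below checks this on the test-set dictionary and also derives an F1 score from precision and recall. The weights are inferred from the outputs above rather than stated in this notebook, and both function names are illustrative.

```python
def detection_fitness(results_dict):
    """Weighted score: 10% mAP50 plus 90% mAP50-95 (weights inferred from the outputs)."""
    return (
        0.1 * results_dict["metrics/mAP50(B)"]
        + 0.9 * results_dict["metrics/mAP50-95(B)"]
    )


def f1_score(results_dict):
    """Harmonic mean of the mean precision and mean recall."""
    p = results_dict["metrics/precision(B)"]
    r = results_dict["metrics/recall(B)"]
    return 2 * p * r / (p + r) if (p + r) > 0 else 0.0


# values copied from the test-set output above
test_results = {
    "metrics/precision(B)": 0.8683767326536994,
    "metrics/recall(B)": 0.8914900080496357,
    "metrics/mAP50(B)": 0.9392578897444391,
    "metrics/mAP50-95(B)": 0.659138041862484,
    "fitness": 0.6871500266506796,
}

print(round(detection_fitness(test_results), 4))  # 0.6872, matching the reported fitness
print(round(f1_score(test_results), 4))
```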
