# THE FOLLOWING NOTEBOOK HAS BEEN RUN ON KAGGLE

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

## Creating and Copying Directories in Kaggle Environment

This code is designed to manage directories and copy data within the Kaggle environment.

### Libraries Imported:

- **os**: Provides a way of using operating system dependent functionality, such as reading or writing to the file system.
- **shutil**: Offers a higher-level file operation interface which includes functions to copy and remove directories.

### Directory Management:

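- The cell contains a check for whether the directory `/kaggle/working/fish-tracking-dataset` exists, followed by an `os.makedirs()` call that would create it. *(Note: Both lines are commented out, so no check is performed when the cell runs.)*
- No separate creation step is needed, because `shutil.copytree()` creates the destination directory itself.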

### Data Copying:

- The `shutil.copytree()` function is used to copy the entire directory tree from `/kaggle/input/fish-tracking-dataset` to `/kaggle/working/fish-tracking-dataset`. This means all files and sub-directories within the source directory will be copied to the destination directory.
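
`shutil.copytree()` raises `FileExistsError` when the destination already exists, which happens if the copy cell is run twice in one session. The following is a minimal sketch of a re-runnable copy, not the notebook's own code: the helper name `copy_dataset` is hypothetical, and temporary directories stand in for the Kaggle input and working paths.

```python
import shutil
import tempfile
from pathlib import Path


def copy_dataset(src, dst):
    """Copy the tree at src into dst and return the copied files (relative paths)."""
    # dirs_exist_ok (Python 3.8+) lets copytree merge into an existing directory
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(
        p.relative_to(dst).as_posix() for p in Path(dst).rglob("*") if p.is_file()
    )


# Temporary stand-ins for /kaggle/input/... and /kaggle/working/...
_root = Path(tempfile.mkdtemp())
_src = _root / "input" / "dataset"
(_src / "images").mkdir(parents=True)
(_src / "images" / "a.jpg").write_bytes(b"fake image bytes")

copied_once = copy_dataset(_src, _root / "working" / "dataset")
copied_twice = copy_dataset(_src, _root / "working" / "dataset")  # re-run does not raise
print(copied_once)
```

Without `dirs_exist_ok=True`, the second call would fail, so the alternative is to delete the destination with `shutil.rmtree()` before copying.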


In [None]:
import os
import shutil

# Left disabled: copytree() creates the destination itself and raises
# FileExistsError if the directory already exists.
#if not os.path.exists('/kaggle/working/fish-tracking-dataset'):
#    os.makedirs('/kaggle/working/fish-tracking-dataset')
shutil.copytree('/kaggle/input/fish-tracking-dataset', '/kaggle/working/fish-tracking-dataset')

In [None]:
!pip install ultralytics

In [None]:
!pip install pyyaml

In [None]:
import os
import shutil

# Left disabled: copytree() creates the destination itself and raises
# FileExistsError if the directory already exists.
#if not os.path.exists('/kaggle/working/fish-tracking-dataset'):
#    os.makedirs('/kaggle/working/fish-tracking-dataset')
shutil.copytree('/kaggle/input/deepfish/Prepared_Deepfish', '/kaggle/working/deepfish')

## Loading YOLO Models with Ultralytics

This code snippet demonstrates how to load YOLO (You Only Look Once) models using the Ultralytics library.

### Libraries Imported:

- **ultralytics**: The Ultralytics library provides functionalities for the YOLO object detection framework.

### Loading Models:

There are multiple ways to load a YOLO model using Ultralytics:

1. **Building a New Model from YAML**:
   - `model = YOLO('yolov8n.yaml')`: This line of code builds a new YOLO model based on the architecture specified in the 'yolov8n.yaml' file. *(Note: This line is commented out in the provided code.)*

2. **Loading a Pretrained Model**:
   - `model = YOLO('yolov8s.pt')`: This line loads a pretrained YOLO model from the 'yolov8s.pt' file. Using pretrained models is recommended when training on a new dataset as it can leverage the knowledge from the pretrained weights.

3. **Building from YAML and Transferring Weights**:
   - `model = YOLO('yolov8n.yaml').load('yolov8n.pt')`: This line first builds a YOLO model from the 'yolov8n.yaml' file and then transfers the weights from the 'yolov8n.pt' file. *(Note: This line is commented out in the provided code.)*



In [None]:
from ultralytics import YOLO
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
# Load a model
#model = YOLO('yolov8n.yaml')  # build a new model from YAML
model = YOLO('yolov8s.pt')  # load a pretrained model (recommended for training)
#model = YOLO('yolov8n.yaml').load('yolov8n.pt')  # build from YAML and transfer weights

## Creating and Saving a YAML Configuration File

This code snippet demonstrates how to define a configuration in Python, and then save it as a YAML (YAML Ain't Markup Language) file using the `yaml` library.

### Libraries Imported:

- **yaml**: The `yaml` library provides functionalities for YAML parsing, which is a human-readable data serialization format.

### Configuration Data:

- A Python dictionary named `config` is defined, which contains various configuration settings:
  - `path`: Working directory path.
  - `train`: Relative path to the training images.
  - `val`: Relative path to the validation images.
  - `nc`: Number of classes.
  - `names`: List of class names.
  - `fl_gamma`: (Commented out) Can be used to specify a focal loss gamma value if needed.

### Saving to YAML File:

- The desired file path for the YAML file is specified as `yaml_file_path`.
- The `with` statement is used to open this file in write mode.
- The `yaml.dump()` function is then used to write the `config` dictionary to the file in YAML format.
- Finally, a confirmation message is printed to indicate the successful saving of the configuration to the YAML file.
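
Because `train` and `val` are resolved relative to `path`, a wrong `path` only surfaces once training starts. As a rough sketch (the helper `missing_dataset_dirs` is hypothetical and not part of Ultralytics or PyYAML), the layout can be checked before the file is written. A temporary directory stands in for `/kaggle/working/deepfish` here.

```python
import tempfile
from pathlib import Path


def missing_dataset_dirs(config):
    """Return the 'train'/'val' directories from config that do not exist on disk."""
    root = Path(config["path"])
    missing = []
    for split in ("train", "val"):
        split_dir = root / config[split]  # split paths are relative to 'path'
        if not split_dir.is_dir():
            missing.append(str(split_dir))
    return missing


# Temporary stand-in for the dataset root, with only the train split present
_root = Path(tempfile.mkdtemp())
(_root / "train" / "images").mkdir(parents=True)

demo_config = {
    "path": str(_root),
    "train": "train/images",
    "val": "val/images",
    "nc": 1,
    "names": ["Fish"],
}
missing = missing_dataset_dirs(demo_config)
print(missing)  # lists the val directory, which was never created
```

An empty list means both image directories were found under `path`.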



In [None]:
import yaml

# Define your configuration data as a Python dictionary
config = {
    #'path': '/kaggle/working/fishtrach-yolo',
    #'path': '/kaggle/working',
    'path': '/kaggle/working/deepfish',
    'train': 'train/images',  # train images (relative to 'path') 4 images
    'val': 'val/images',      # val images (relative to 'path') 4 images
    'nc': 1,                   # Number of classes
    'names': ['Fish'],         # Class names
    # 'fl_gamma': 2.0          # Uncomment this line if needed
}

# Define the file path where you want to save the YAML file
yaml_file_path = '/kaggle/working/config.yaml'  # Update with your desired path and file name

# Write the YAML data to the file
with open(yaml_file_path, 'w') as yaml_file:
    yaml.dump(config, yaml_file)

print(f"YAML configuration file saved at: {yaml_file_path}")


## Training the YOLO Model with Ultralytics

This code snippet demonstrates how to train a YOLO model using the Ultralytics library based on a specified configuration.

### Model Training:

The `train()` method of the YOLO model is called with the following parameters:

- **data**: Path to the YAML configuration file that contains dataset paths, class names, and other related information.
- **epochs**: Number of training epochs. An epoch is one complete forward and backward pass of all the training examples.
- **imgsz**: Image size for training. All training images will be resized to this size.
- **pretrained**: If set to `True`, the model will use pretrained weights. This can help in achieving better accuracy faster.
- **name**: Name of the training run. Useful for distinguishing between different training sessions.
- **patience**: Number of epochs with no improvement after which training is stopped early, which helps prevent overfitting. In this run `patience` (35) is larger than `epochs` (20), so early stopping cannot trigger.
- **flipud**: Probability of flipping an image vertically during data augmentation.
- **batch**: Batch size for training. Determines the number of samples that will be used in each iteration to update the model's weights.
- **optimizer**: Optimization algorithm to be used. In this case, Stochastic Gradient Descent (SGD) is used.
- **augment**: If set to `True`, data augmentation techniques will be applied to the training images. This can help in improving the model's generalization.

The result of the training process is stored in the `results` variable.
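
To make the interaction between `epochs` and `patience` concrete, here is a simplified sketch of patience-based early stopping. It is an illustration only, not Ultralytics' actual implementation: the function `stopping_epoch` is hypothetical and the fitness values are made up.

```python
def stopping_epoch(fitness_per_epoch, patience):
    """Return the 1-based epoch at which training would stop early, or None.

    Simplified rule: stop once `patience` epochs have passed
    without a new best fitness value.
    """
    best = float("-inf")
    best_epoch = 0
    for epoch, fitness in enumerate(fitness_per_epoch, start=1):
        if fitness > best:
            best, best_epoch = fitness, epoch
        elif epoch - best_epoch >= patience:
            return epoch
    return None


# Made-up fitness values: improvement stalls after epoch 3
history = [0.10, 0.20, 0.30] + [0.25] * 17  # 20 epochs in total

print(stopping_epoch(history, patience=5))   # stops at epoch 8
print(stopping_epoch(history, patience=35))  # None: patience exceeds the 20 epochs
```

Under this rule, a patience value above the epoch count never ends a run early, so all 20 epochs are trained.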



In [None]:
results = model.train(data='/kaggle/working/config.yaml',
                      epochs=20,
                      imgsz=640,
                      pretrained = True ,
                      name= "training_deepfish",
                      patience = 35,
                      flipud=0.5,
                      batch = 32,
                      optimizer = 'SGD',
                      augment = True
                      )

## Validating the YOLO Model with Ultralytics

This code snippet demonstrates how to validate a trained YOLO model using the Ultralytics library.

### Model Loading:

- **model**: The YOLO model is loaded using the `YOLO` class from Ultralytics. The path to the best weights from a previous training run is provided to initialize the model.

### Model Validation:

- **results**: The `val()` method of the YOLO model is called without any parameters to validate the model using the default settings.

### Validation with Different Confidence Thresholds:

The model is further validated using different confidence thresholds to understand its performance at various levels of confidence:

- The `for` loop iterates over a list of confidence thresholds: `0.25`, `0.15`, and `0.05`.
- For each confidence threshold, the `val()` method is called with the following parameters:
  - **name**: A custom name for the validation run, indicating the confidence threshold used.
  - **conf**: The confidence threshold. Only detections with a confidence score above this threshold will be considered.
  - **iou**: Intersection over Union (IoU) threshold set to `0.8`. In Ultralytics validation this threshold is used for non-maximum suppression: predicted boxes that overlap a higher-scoring box by more than this value are discarded as duplicates.

This validation process helps in understanding the model's performance at different confidence levels and can guide decisions on the optimal confidence threshold to use for detections.
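
The two thresholds can be illustrated without the model. The sketch below computes IoU for boxes in `(x1, y1, x2, y2)` format and shows how many detections survive each confidence threshold. The function `box_iou` and the detections are made up for illustration and are not taken from the validation run.

```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Made-up detections: (box, confidence)
detections = [
    ((0, 0, 10, 10), 0.90),
    ((5, 0, 15, 10), 0.20),
    ((50, 50, 60, 60), 0.08),
]

kept_counts = {}
for conf in [0.25, 0.15, 0.05]:
    kept = [d for d in detections if d[1] >= conf]
    kept_counts[conf] = len(kept)
    print(f"conf={conf}: {len(kept)} detection(s) kept")

# The first two boxes share half their width, giving an IoU of 1/3
overlap = box_iou(detections[0][0], detections[1][0])
print(round(overlap, 3))
```

Lowering `conf` admits more low-scoring boxes, which tends to raise recall at the cost of precision.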



In [None]:
model = YOLO('/kaggle/working/runs/detect/training_doubleDataset/weights/best.pt')
results = model.val()
for i in [0.25, 0.15, 0.05]:
    results = model.val(name=f'confidence: {i}', conf=i, iou=0.8)

In [None]:
model = YOLO('/kaggle/working/runs/detect/fish_small/weights/best.pt')

## Importing Essential Libraries for Image Processing and Visualization

This code snippet imports various Python libraries that are commonly used for image processing, data manipulation, and visualization.

### Libraries Imported:

- **pandas**: A powerful library for data manipulation and analysis, particularly with structured data.
- **numpy**: Fundamental package for numerical computations in Python.
- **PIL**: The import name of Pillow, the maintained fork of the Python Imaging Library, used for opening, manipulating, and saving image files.
- **Image**: A module from PIL to specifically handle image operations.
- **IPython.display**: Provides functionalities to display objects within Jupyter.
- **matplotlib.pyplot**: Provides a MATLAB-like plotting framework in Python.
- **glob**: Useful for retrieving files/pathnames matching a specified pattern.
- **random**: Provides functions to generate random numbers.
- **cv2**: OpenCV library, a powerful tool for computer vision tasks.
- **warnings**: Used to control the behavior of warning messages in Python.

In [None]:
import pandas as pd
import numpy as np
import PIL 
from PIL import Image
from IPython.display import display
import matplotlib.pyplot as plt
from glob import glob
import random
import cv2
import warnings
warnings.simplefilter('ignore')

## Displaying Random Sample Images from Validation Set

This code snippet is designed to randomly select and display images from the validation set.

### Setting Up Paths and Parameters:

- **root_path**: Specifies the path where validation images are located. The `*` at the end of the path is a wildcard that matches all files in the directory.
- **num_samples**: Number of random sample images to display, set to 4.

### Retrieving and Sampling Images:

- **images_data**: Uses `glob(root_path)` to retrieve all file paths that match the specified pattern, effectively getting paths to all images in the directory.
- **random_image**: Randomly selects `num_samples` image paths from `images_data` using `random.sample()`.

### Displaying Images:

- A matplotlib figure is created with a specified size.
- A `for` loop iterates over the range of `num_samples`:
  - For each iteration, a subplot is created.
  - The image is read using `cv2.imread()` and displayed using `plt.imshow()`.
  - The axis labels are turned off for a cleaner display.

### Visual Output:

- The output will be a 2x2 grid displaying four randomly selected images from the validation set.
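
`random.sample()` raises `ValueError` when asked for more items than the list contains, for example when the glob pattern matches fewer than four files. A small sketch of a guarded, reproducible selection follows. The helper `pick_samples` and the file names are hypothetical.

```python
import random


def pick_samples(paths, k, seed=None):
    """Pick up to k distinct paths; never asks for more than are available."""
    rng = random.Random(seed)  # a seed makes the selection reproducible
    return rng.sample(paths, min(k, len(paths)))


# Hypothetical file names standing in for the glob() result
demo_paths = [f"img_{i}.jpg" for i in range(3)]
picked = pick_samples(demo_paths, k=4, seed=0)
print(len(picked))  # 3, because only three paths exist
```

An empty result is a useful signal that the pattern in `root_path` does not match any files.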



In [None]:
root_path = '/kaggle/working/val/images/*'
num_samples = 4
images_data = glob(root_path)
random_image = random.sample(images_data, num_samples)

plt.figure(figsize=(12,10))
for i in range(num_samples):
    plt.subplot(2,2,i+1)
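    # cv2.imread() returns BGR; convert to RGB so matplotlib shows true colors
    plt.imshow(cv2.cvtColor(cv2.imread(random_image[i]), cv2.COLOR_BGR2RGB))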
    plt.axis('off')

## Predicting and Displaying Object Detections on Sample Images

This code snippet is designed to predict object detections on a set of randomly selected images using the trained YOLO model and then display the results.

### Predicting Detections:

- An empty list `images` is initialized to store the processed images with bounding boxes.
- A `for` loop iterates over the range of `num_samples`:
  - For each image, the YOLO model's `predict()` method is called to get the detection results.
  - The results include bounding boxes (`box`), class names (`names`), and other attributes.
  - A series of print statements display information about each detected object:
    - Total number of detected fish in the image.
    - Label of the detected object (e.g., "Fish").
    - Coordinates of the bounding box.
    - Confidence score of the detection.

### Storing Processed Images:

- The `output.plot()` method is used to generate an image with bounding boxes drawn around detected objects.
- The color channels of the image are reversed (`[:, :, ::-1]`) to convert from BGR to RGB format (as OpenCV reads images in BGR format by default).
- The processed image is then appended to the `images` list.

### Visual Output:

- The `images` list will contain the randomly selected sample images with bounding boxes drawn around detected objects, ready for visualization.



In [None]:
images = []
for i in range(num_samples):
    yolo_outputs = model.predict(random_image[i])
    output = yolo_outputs[0]
    box = output.boxes
    names = output.names
    print('**********************')
    # Report the total once per image rather than once per detection
    print(f'In this image {len(box)} fish have been detected.')
    for j in range(len(box)):
        labels = names[int(box.cls[j].item())]  # class ids are stored as floats
        coordinates = box.xyxy[j].tolist()      # [x1, y1, x2, y2] in pixels
        confidence = np.round(box.conf[j].item(), 2)
        print(f'Fish {j + 1} is: {labels}')
        print(f'Coordinates are: {coordinates}')
        print(f'Confidence is: {confidence}')
        print('-------')
        
    # Store the image in the 'images' list
    images.append(output.plot()[:, :, ::-1])

In [None]:
plt.figure(figsize=(12,10))
for i, img in enumerate(images):
    plt.subplot(2, 2, i + 1)
    plt.imshow(img)
    plt.axis('off') 

## Visualizing Training and Validation Losses

This code snippet is designed to visualize the training and validation losses over epochs for both box and class losses.

### Data Preprocessing:

- **Removing Spaces**: The leading and trailing spaces from the column names of the `result` dataframe are removed using the `str.strip()` method.

### Extracting Relevant Data:

- **epoch_column**: Extracts the epoch numbers.
- **box_train_losses** and **box_val_losses**: Extracts the training and validation box losses, respectively.
- **cls_train_losses** and **cls_val_losses**: Extracts the training and validation class losses, respectively.

### Plotting Losses:

- A figure with two subplots is created using `plt.figure()` and `plt.subplot()`.
- The `ggplot` style is applied to the plots for better visualization.
  
1. **Box Losses**:
   - The first subplot displays the training and validation box losses over epochs.
   - The `plt.plot()` function is used to plot the box losses against the epochs.
   - A grid is added for better readability.
   - Labels, title, and legend are added for clarity.

2. **Class Losses**:
   - The second subplot displays the training and validation class losses over epochs.
   - The `plt.plot()` function is used to plot the class losses against the epochs.
   - A grid is added for better readability.
   - Labels, title, and legend are added for clarity.

Finally, `plt.show()` is called to display the plots.

### Visual Output:

- The output will be two line plots side by side:
  - The left plot shows the training and validation box losses over epochs.
  - The right plot shows the training and validation class losses over epochs.
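
The stripping step matters because a padded header such as `'  train/box_loss'` does not match the key `'train/box_loss'`. The sketch below reproduces the idea with the standard `csv` module. It assumes the headers in `results.csv` are padded with spaces, as the stripping step in this notebook suggests, and the numbers are made up.

```python
import csv
import io

# Made-up rows in the shape the plotting cell expects: headers padded with spaces
raw = (
    "   epoch,  train/box_loss,  val/box_loss\n"
    "       1,         1.50,         1.60\n"
    "       2,         1.20,         1.40\n"
)

reader = csv.reader(io.StringIO(raw))
header = [name.strip() for name in next(reader)]  # same effect as columns.str.strip()
rows = [dict(zip(header, (float(v) for v in row))) for row in reader]

print(header)
print(rows[-1]["val/box_loss"])
```

With pandas, `pd.read_csv(..., skipinitialspace=True)` is another way to drop the spaces that follow each delimiter.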



In [None]:
result = pd.read_csv('/kaggle/working/runs/detect/fish_small/results.csv')
result.head()

In [None]:
# Remove leading and trailing spaces from column names
result.columns = result.columns.str.strip()

epoch_column = result['epoch']
box_train_losses = result['train/box_loss']
box_val_losses = result['val/box_loss']
cls_train_losses = result['train/cls_loss']
cls_val_losses = result['val/cls_loss']

plt.figure(figsize=(12,5))
plt.style.use('ggplot')  # You can choose a style you prefer
plt.subplot(1,2,1)
plt.plot(epoch_column, box_train_losses, label='train_losses')
plt.plot(epoch_column, box_val_losses, label='val_losses')
plt.grid(True, linestyle='--', linewidth=0.5, color='gray')# Add a grid
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Train and Validation Box Losses')
plt.legend()

plt.subplot(1,2,2)
plt.plot(epoch_column, cls_train_losses, label='train_losses')
plt.plot(epoch_column, cls_val_losses, label='val_losses')
plt.grid(True, linestyle='--', linewidth=0.5, color='gray')# Add a grid
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('Train and Validation Class Losses')
plt.legend()
plt.show()

In [None]:
plt.figure(figsize=(12,8))
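# cv2.imread() returns BGR; convert to RGB so the plot colors are not swapped
plt.imshow(cv2.cvtColor(cv2.imread('/kaggle/working/runs/detect/fish_small/results.png'), cv2.COLOR_BGR2RGB))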
plt.axis('off')