# MSOE AI-Club: Fall 2024 Innovation Lab Example Code
**Innovation Labs Sponsor:** [Brady Corporation](https://www.bradyid.com/?s_kwcid=AL!10720!3!324787930987!e!!g!!brady%20corp&cid=ppc&camp=ppc-us-brand-google.com-search-trademark_exact-core-brady%20corp&gad_source=1&gbraid=0AAAAAD_q4Jq7FA-4zFdepNvtZEYWfDQG8&gclid=Cj0KCQjw05i4BhDiARIsAB_2wfDH78Ihm-CDyEXsvNKF8NpniM-tntM0M9wxCpCJmQ8ThIRQoFzUGhsaAmDdEALw_wcB)<br/>
**Innovation Labs Rubric:** [Link to Official Rubric](https://msoe365-my.sharepoint.com/:w:/g/personal/paulsonb_msoe_edu/ERxEIGv5rMZDl78QqILSCLMBul3Q-D74qAjXg4iYWX4fiQ?e=BhT5lC)<br/>
**Additional Computer Vision Learning Resources:** [Link to Learning Tree](https://msoe-maic.com/learning-tree?node=58)

![IL Banner](https://raw.githubusercontent.com/24-25-Fall-Innovation-Lab/template/refs/heads/main/IL_Banner.png?token=GHSAT0AAAAAACYSMZ2Z6JVZBZYSQCFVG2L2ZYIASNQ)

## Problem Statement
Your challenge is to design an AI-based solution that can predict the amount of liquid in a container from a single image. While AI-Club has provided you with a baseline solution utilizing the **YOLOv8 segmentation algorithm**, your goal is to enhance and expand on this model, creating a more innovative and adaptable solution that can be applied across multiple use cases.

<span style="color: #3A8DFF; font-weight: bold;">New to AI?</span> No problem! Check out our 🌳 **Learning Tree** 🌳 for resources on [how to get started with programming](https://msoe-maic.com/learning-tree?node=2), [development environments](https://msoe-maic.com/learning-tree?node=10), [AI Basics](https://msoe-maic.com/learning-tree?node=19), and [Computer Vision](https://msoe-maic.com/learning-tree?node=58)!

## What is YOLOv8?
[YOLOv8](https://yolov8.com) is what we call a "pre-trained" model, meaning that it has already been trained on a large dataset of images. This allows the model to recognize patterns and objects in images that it has never seen before, similar to how [ChatGPT](https://chatgpt.com) can answer prompts it has never seen before. This matters in industry applications of AI because companies can solve problems without spending the time and resources to train a model from scratch, which saves both time and money -- just like in this hackathon!

Since the model is pre-trained, we can simply download it from the web, as you'll see later in this example notebook. The application that this model was pre-trained on was object detection, meaning that it can recognize and locate objects in images. For this simplified example, we fine-tune the model to recognize both the liquid and its container, count the pixels in the image that belong to each, and then divide the two to get the percentage of the container that is filled with liquid.

**YOLOv8 stands out in its capabilities for:**
1. **Segmenting objects:** It creates pixel-by-pixel masks rather than only the bounding boxes produced by plain object detection. Models such as SAM (Segment Anything) offer similar features.
2. **Real-time inference:** YOLO is known for its fast inference speeds.
3. **High accuracy:** YOLO achieves high accuracy, and fine-tuning the model lets us greatly increase its performance on a given dataset.

## How to Use This Notebook
This file (`.ipynb`) is what's called a **Jupyter Notebook**. These types of files are commonly used by data scientists and AI engineers to write and run code in a way that is easy to read and understand -- often to determine the feasibility of a model or to test a model on a small dataset. You can run code cells with either `Shift+Enter` or `Ctrl+Enter` to see the output of the code. You can also edit the code in the cells to see how it affects the output.

**Jupyter Notebooks** have two main kinds of cells: a `Code Block` and a `Markdown Block`. This allows you to seamlessly mix markdown (such as this cell) with runnable code. Variables are stored in memory between cells, allowing you to use data across the entire file. Additionally, a `Code Block` can run terminal commands when you prepend the line with `!`, such as the `!pip install` line below. There are even more capabilities: check out [this graphical cheatsheet](https://www.edureka.co/blog/wp-content/uploads/2018/10/Jupyter_Notebook_CheatSheet_Edureka.pdf) for Jupyter or [this guide](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet) for markdown to get really in depth!

This notebook can run on your local machine (laptop) without much issue, but the training process will likely take a long time to complete. <span style="color: red; font-weight: bold;">Therefore, we propose that groups use either ROSIE (MSOE Students only) or Google Colab to train their models using NVIDIA T4 GPUs.</span>
* [How to Run Jobs on ROSIE Supercomputer](https://msoe-maic.com/learning-tree?node=5)
* [How to Use Google Colab](https://colab.research.google.com)

---

## Part 1: Import Statements And File Structure Setup
Import statements are used to import libraries that are required to run the code. Libraries are collections of functions and methods that allow you to perform actions without having to write the code yourself. In this case, we are importing libraries that will allow us to work with images and data, as well as the YOLOv8 model.

**Please Note:** If you are running locally on a Mac, you may need to use the command `pip3` instead of `pip`.

In [None]:
# Install all the required dependencies (libraries) for the project from the requirements.txt file
!pip install -r requirements.txt

In [27]:
import os
import cv2
import numpy as np

# `display` and `Image` are used later to show the saved result plots
from IPython.display import display, Image

from roboflow import Roboflow
import torch
import yaml

In [None]:
import os
from IPython.display import clear_output

# Set HOME to get your current working directory
HOME = os.getcwd()
print("HOME directory is set to:", HOME)

# Download the YOLO model
%cd {HOME}
!git clone https://github.com/ultralytics/ultralytics.git
%cd {HOME}/ultralytics
!pip install -e .

# Clear the output to keep it clean
clear_output()

# Run checks for the YOLO model
import ultralytics
ultralytics.checks()

___

## Part 2: Data Preprocessing
One of the most important parts of dealing with AI algorithms is <span style="color: #3A8DFF; font-weight: bold;">ensuring you have clean, well-structured, and A LOT of data</span>. In this case, we have a dataset of images that we will use to train our model. The data preprocessing step is where we will load in the data, clean it up, and prepare it for training.

For this application, we're doing "segmentation", which is a part of [Computer Vision](https://msoe-maic.com/learning-tree?node=58) where we're trying to determine the boundaries of objects in an image. In this case, we're trying to determine the boundaries of the liquid in a container. Therefore, the data that we're using to train the model has specific outlines for where the boundaries of both the liquid and container are -- this way, the model can actually learn the difference between the two!
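
To make those "outlines" concrete: in the YOLO segmentation label format, each line of a label file holds a class index followed by the polygon's x, y vertices, normalized to the 0-1 range. The sketch below converts one such line into pixel coordinates. The sample line, the image size, and the helper name `parse_seg_label` are made up for illustration; open one of the downloaded label files to see real values.

```python
import numpy as np

def parse_seg_label(line, img_width, img_height):
    """Parse one YOLO-style segmentation label line into (class_id, pixel polygon)."""
    parts = line.split()
    class_id = int(parts[0])
    # The remaining values are x, y pairs normalized to the 0-1 range
    coords = np.array([float(p) for p in parts[1:]]).reshape(-1, 2)
    # Scale the normalized coordinates up to pixel coordinates
    polygon = coords * np.array([img_width, img_height])
    return class_id, polygon

# Hypothetical label line: class 0 with a four-point outline
sample_line = "0 0.25 0.25 0.75 0.25 0.75 0.75 0.25 0.75"
class_id, polygon = parse_seg_label(sample_line, img_width=640, img_height=640)
print(class_id)
print(polygon)
```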

In [None]:
# Loading yolov8s-seg model and ensuring it is downloaded correctly
from ultralytics import YOLO
model = YOLO(f'{HOME}/yolov8s-seg.pt')
results = model.predict(source='https://media.roboflow.com/notebooks/examples/dog.jpeg', conf=0.25)

This cell will download all the data required to fine-tune this YOLOv8 model. Don't worry, we've already done the hard work of labeling the data for you!

One parameter you may notice is `api_key`. We've set up this key so you can use Roboflow for free. If you'd like to use Roboflow for your own projects, you can sign up for a free account at [Roboflow](https://roboflow.com) -- it is best practice to use your own API keys for your own projects, and to keep them out of your code rather than writing them as raw strings. 👀
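
One common way to keep a key out of your code is to read it from an environment variable. This is only a minimal sketch: the variable name `ROBOFLOW_API_KEY` and the helper `get_api_key` are conventions invented for this example, not something Roboflow requires.

```python
import os

def get_api_key(env_var="ROBOFLOW_API_KEY"):
    """Read an API key from the environment instead of hard-coding it."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable before running this cell.")
    return key

# Possible usage in the download cell (assumes you exported the variable first):
# rf = Roboflow(api_key=get_api_key())
```

With this approach the key never appears in the notebook itself, so it is not shared by accident when you commit or send the file.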

In [None]:
# Making and moving into 'datasets' directory
os.makedirs(f'{HOME}/datasets', exist_ok=True)
%cd {HOME}/datasets

# Downloading `graduated-flask-segmentation` dataset from Roboflow
rf = Roboflow(api_key="ORjo88oQ6A7AwApZTU5e")
project = rf.workspace("university-msm2s").project("graduated-flask-segmentation")
dataset = project.version(2).download("yolov8")

When you're dealing with data for a machine learning model, you are often dealing with a `train`, `validation`, and `test` dataset. The `train` dataset is used to train the model, the `validation` dataset is used to validate the model's performance during training, and the `test` dataset is used to test the model's performance after training. This is important because it allows you to see how well the model is performing on data that it has never seen before.
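
The Roboflow download already comes split into these three parts, so you do not need to do this yourself. If you ever bring your own images, though, a split might look roughly like the sketch below. The 70/20/10 ratio, the fixed seed, and the helper name `split_dataset` are illustrative choices, not values taken from this dataset.

```python
import random

def split_dataset(items, train_frac=0.7, val_frac=0.2, seed=42):
    """Shuffle the items and split them into train, validation, and test lists."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed so the split is repeatable
    n_train = round(len(items) * train_frac)
    n_val = round(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # whatever is left over
    return train, val, test

# Hypothetical file names standing in for real images
image_names = [f"image_{i}.jpg" for i in range(100)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))
```

Shuffling before splitting matters: if the images are sorted (for example, by fill level), an unshuffled split could leave the test set with examples unlike anything in the training set.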

In [31]:
# Work out the names of the next run folders (e.g., 'train', 'train2', 'train3', ...)
def calculate_next_folder(base_dir, base_name):
    # List all directories in the base_dir
    existing_dirs = [d for d in os.listdir(base_dir) if os.path.isdir(os.path.join(base_dir, d))]

    # Keep only the directories that start with the base name (e.g., 'train')
    dirs = [d for d in existing_dirs if d.startswith(base_name)]

    # Collect the numeric suffixes of existing runs (e.g., 2 for 'train2')
    nums = [int(d[len(base_name):]) for d in dirs if d[len(base_name):].isdigit()]

    # If the bare name (e.g., 'train') is unused, it is the next folder
    if base_name not in dirs:
        return base_name
    # The bare name counts as run number 1, so the next run is at least 2
    nums.append(1)
    next_num = max(nums) + 1

    # Return the new folder name
    return f"{base_name}{next_num}"

def get_next_folder(base_name):
    base_dir_str =f'{HOME}/ultralytics/runs/segment/'
    os.makedirs(base_dir_str, exist_ok=True)
    base_dir = os.path.join(base_dir_str)
    return calculate_next_folder(base_dir, base_name)

train_folder_name = get_next_folder('train')
predict_folder_name = get_next_folder('predict')
val_folder_name = get_next_folder('val')

In [None]:
# Check if CUDA is available (for NVIDIA GPUs) or MPS (for Apple Silicon Macs), and select the appropriate device
if torch.cuda.is_available():
    device = torch.device("cuda")  # use CUDA (NVIDIA GPU)
    print("CUDA is available. Using GPU.")
elif torch.backends.mps.is_available():
    device = torch.device("mps")   # use MPS (Apple Silicon)
    print("MPS is available. Using Apple Silicon GPU.")
else:
    device = torch.device("cpu")   # fallback to CPU
    print("Neither CUDA nor MPS is available. Using CPU.")

train_folder_name = get_next_folder('train')

In [None]:
# Update the data.yaml file to have the appropriate information about our dataset download
# (absolute path, so this cell works regardless of the current working directory)
file_path = f'{HOME}/datasets/Graduated-Flask-segmentation-2/data.yaml'

# Open and load the YAML file
with open(file_path, 'r') as file:
    data = yaml.safe_load(file)

# Modify the 'train' and 'val' paths
data['train'] = 'train/images'
data['val'] = 'valid/images'

# Save the updated YAML file
with open(file_path, 'w') as file:
    yaml.dump(data, file)

print("YAML file updated successfully!")

----

## Part 3: Model Training and Evaluation

<span style="color: gold; font-weight: bold;">Now let's actually TRAIN our model! This is where the magic happens</span> 🎉 <br/>
This cell will train the model on the data that we just loaded in. The model will learn the patterns in the data and use that to make predictions on new data. This process is called "fine-tuning" because we are taking a model that has already been trained on a large dataset and training it on a smaller dataset to make it more accurate for a specific use case (in this case, determining the amount of liquid in a container).

Please note, depending on the compute that you are using (laptop versus supercomputer), you should change the `epochs` accordingly. An `epoch` is one full pass through the training dataset, and a pass is MUCH faster on a supercomputer than on a laptop -- so typically, we might train for something like `1000 epochs` on a supercomputer, but only `10 epochs` on a laptop.

In [None]:
model.train(
    data=f'{HOME}/datasets/Graduated-Flask-segmentation-2/data.yaml',  # Path to data.yaml
    epochs=2,  # Number of epochs (kept tiny for a quick test run; increase for real training)
    imgsz=640,   # Image size
    batch=16,    # Batch size
    name=f'{train_folder_name}',  # Name for this training run
    project=f'{HOME}/ultralytics/runs/segment',  # Save directory
    workers=8,   # Number of workers for data loading
    device=device  # Set the correct device (MPS, CUDA, or CPU)
)

<span style="color: gold; font-weight: bold;">If you've never trained an AI model before, you just made your first AI!</span> 🦾🤖 <br/>
This is a big deal! You've just trained a model to recognize the amount of liquid in a container from an image. This is a big step towards solving the problem that we've presented to you. Now, let's see how well it performs!

<span style="color: red; font-weight: bold;">Reminder:</span> The performance of this model will **greatly vary** depending on the `hyperparameters` you defined for the function `model.train`. Hyperparameters are the settings that you define for the model, such as the learning rate, batch size, and number of epochs. These settings can greatly affect the performance of the model, so it's important to experiment with different settings to see what works best for your specific use case! For getting decent results, I would recommend at least `30 epochs`.
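
One simple way to organize that experimentation is to write the settings down as a small grid and loop over every combination. The sketch below only builds the list of combinations; the values are arbitrary starting points rather than recommendations, and `lr0` is assumed here to be the Ultralytics name for the initial learning rate, so check the documentation for your installed version.

```python
from itertools import product

# Hypothetical search space; these values are starting points, not recommendations
search_space = {
    "epochs": [30, 100],
    "batch": [8, 16],
    "lr0": [0.01, 0.001],
}

def build_configs(space):
    """Return one dict per combination of the values in the search space."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in product(*(space[k] for k in keys))]

configs = build_configs(search_space)
for cfg in configs:
    # In the notebook, each cfg could be unpacked into the training call,
    # e.g. model.train(..., **cfg), alongside the data path and run name
    print(cfg)
```

Each combination is a full training run, so keep the grid small when you are on a laptop.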

We're going to be evaluating how well the model did on the `training data` portion of the dataset (the data it saw during `model.train()`). This shows us how well the model is learning the patterns in the data: if it performs well on the training data, it is likely learning those patterns, and if it performs poorly, it likely is not.

Understanding this, if your model is not performing well on the training data, that is a sign that you may need to change the hyperparameters or the model architecture to better fit the data. The main goal of AI is to learn patterns in data and generalize to new data, so a model that cannot learn the patterns in its own training data will not be able to generalize to new data. <span style="color: #3A8DFF; font-weight: bold;">Think of it like being given a study guide that's the same as the exam and still not passing the exam -- it's a sign that you need to study differently!</span> 📚

In [None]:
"""
This cell is going to showcase a confusion matrix for the trained model on the training dataset.
A confusion matrix is a table that is often used to describe the performance of a classification model 
on a set of data for which the true values are known -- therefore, we're looking for diagonal values to be high.
"""
display(Image(filename=f"{HOME}/ultralytics/runs/segment/{train_folder_name}/confusion_matrix.png", width=600))
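
If the confusion matrix plot is hard to read, it can help to build a tiny one by hand. The sketch below uses made-up labels and puts the true class on the rows and the predicted class on the columns; that orientation is a choice made for this example, so check the axis labels on the plot above before comparing.

```python
import numpy as np

def confusion_matrix(true_labels, predicted_labels, num_classes):
    """Count how often each true class (rows) was predicted as each class (columns)."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(true_labels, predicted_labels):
        matrix[t, p] += 1
    return matrix

# Toy labels for illustration: 0 = flask, 1 = liquid, 2 = background
true_labels = [0, 0, 1, 1, 1, 2]
predicted_labels = [0, 1, 1, 1, 2, 2]
matrix = confusion_matrix(true_labels, predicted_labels, num_classes=3)
print(matrix)

# Correct predictions sit on the diagonal
print("Correct:", matrix.trace(), "out of", matrix.sum())
```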

Another important concept in machine learning is a situation called `overfitting`. Overfitting is when the model learns the patterns in the training data **too well** and is not able to generalize to new data. This is a common problem in machine learning and can be solved by using techniques such as [regularization](https://www.ibm.com/topics/regularization#:~:text=Regularization%20is%20a%20set%20of,for%20an%20increase%20in%20generalizability.), [dropout](https://towardsdatascience.com/dropout-in-neural-networks-47a162d621d9), and [early stopping](https://cyborgcodes.medium.com/what-is-early-stopping-in-deep-learning-eeb1e710a3cf). If you see that your model is performing well on the training data but not on the validation data, then it is likely overfitting to the training data. If the model is great at both, then it's likely a good model!
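
Of the techniques above, early stopping is the easiest to picture in code: stop training once the validation loss has failed to improve for a set number of epochs (the "patience"). The function below is a simplified sketch with made-up loss values, not how any particular library implements it. Ultralytics exposes a `patience` training argument for this purpose; check the documentation for your installed version.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 0-based epoch at which training would stop,
    or None if the loss keeps improving often enough."""
    best_loss = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New best validation loss: reset the counter
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Made-up validation losses: improving at first, then getting worse
val_losses = [0.90, 0.70, 0.60, 0.62, 0.65, 0.64, 0.70]
stop_epoch = early_stopping_epoch(val_losses, patience=3)
print("Training would stop at epoch index:", stop_epoch)
```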

<span style="color: red; font-weight: bold;">Note:</span> Looking more specifically at the stats generated below, `loss` is a measure of how well the model is performing -- `loss` is the difference between the predicted value and the actual value. The lower the loss, the better the model is performing. Consequently, when training, you should see the loss decrease over time as the model learns the patterns in the data. If the loss is not decreasing, then the model is not learning the patterns in the data well.

In [None]:
# Visualize the training process such as loss, precision, time, etc...
display(Image(filename=f"{HOME}/ultralytics/runs/segment/{train_folder_name}/results.png", width=600))

📸 **Now that we've seen quantitatively how well (or badly) the model is performing, let's actually see what the model is predicting!**

Here is an example of what you could expect to output from the below `code cell`:

<img src="https://raw.githubusercontent.com/24-25-Fall-Innovation-Lab/template/refs/heads/main/segmentation_example.png?token=GHSAT0AAAAAACYEXL3EW3KROJDOOHDBWRN2ZYID2LA" alt="Segmentation Example" width="200"/>

In [None]:
# Display example predictions saved by the training run
# (note the file name: Ultralytics draws this image from a validation batch)
display(Image(filename=f"{HOME}/ultralytics/runs/segment/{train_folder_name}/val_batch0_pred.jpg", width=600))

In [None]:
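# Load the best weights from training and evaluate them on the validation split
model = YOLO(f'{HOME}/ultralytics/runs/segment/{train_folder_name}/weights/best.pt')
# Pick the output folder name first, then pass it via `project` and `name` (as in model.train above)
val_folder_name = get_next_folder('val')
model.val(
    data=f'{dataset.location}/data.yaml',
    project=f'{HOME}/ultralytics/runs/segment',  # Save directory
    name=val_folder_name,  # Name for this validation run
)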

**Now let's do the same as we've done but just with the validation data!**<br/>
`Validation Data` is data that the model is never trained on directly, so it's a good check of how well the model generalizes to new data. If the model performs well on the validation data, it is likely generalizing well; if it performs poorly, it is likely not generalizing well. This is different from `testing data`: the validation data is used during training to monitor and tune the model, while the testing data is set aside until training is finished and is only used for a final check.

In [None]:
# Display the confusion matrix for the validation data
display(Image(filename=f"{HOME}/ultralytics/runs/segment/{val_folder_name}/confusion_matrix.png", width=600))

In [None]:
# Display example predictions for the validation data
display(Image(filename=f"{HOME}/ultralytics/runs/segment/{val_folder_name}/val_batch0_pred.jpg", width=600))

___

## Part 4: Predicting the Percentage Filled on the Final Containers
Now that we're satisfied with the model's performance on the `validation` and `training` datasets, let's predict on the test images and save the mask results along with each container's fill percentage.

Because this is a simple model, the **percentage filled** is simply calculated by the number of pixels in the liquid mask divided by the number of pixels in the overall container mask. This is a simple way to determine the percentage filled in the container, but it may not be the most accurate. In a real-world application, you may want to use a more complex model to determine the percentage filled in the container.
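
Here is a small sketch of that calculation on synthetic masks, so you can check the arithmetic by hand before running it on real predictions. The helper name `fill_percentage` and the 10x10 masks are made up for illustration. As in the cells further down, the container area is taken as the union of the container and liquid masks.

```python
import numpy as np

def fill_percentage(container_mask, liquid_mask):
    """Percentage of container pixels (union of both masks) that are liquid."""
    container_pixels = np.count_nonzero(np.logical_or(container_mask, liquid_mask))
    liquid_pixels = np.count_nonzero(liquid_mask)
    if container_pixels == 0:
        return 0.0  # nothing detected, so avoid dividing by zero
    return 100.0 * liquid_pixels / container_pixels

# Synthetic 10x10 image: the container spans columns 2-7 in every row (60 pixels)
container = np.zeros((10, 10), dtype=np.uint8)
container[:, 2:8] = 255

# The liquid fills the bottom 4 rows of the container (24 pixels)
liquid = np.zeros((10, 10), dtype=np.uint8)
liquid[6:, 2:8] = 255

print(f"Percentage filled: {fill_percentage(container, liquid):.2f}%")
```

Keep in mind that this is a ratio of areas in a 2D image, so it only approximates the true volume for containers whose width changes with height.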

In [None]:
# Run predictions on the data you've reserved for testing
predict_folder_name = get_next_folder('predict')

# Inferencing with the model
model_path = os.path.join(HOME, f"ultralytics/runs/segment/{train_folder_name}/weights/best.pt")
source_path = os.path.join(dataset.location, "test/images")

# Load the YOLO model from the trained weights
model = YOLO(model_path)
# Run predictions on the test dataset
results = model.predict(
    source=source_path,  # Path to test images
    conf=0.5,  # Confidence threshold
    save=True,  # Save the predictions (output images)
    project=f'{HOME}/ultralytics/runs/segment',  # Save directory
    name=predict_folder_name  # Name for this prediction run
)

In [26]:
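# Filter out images where YOLO couldn't find the desired masks, then combine masks for future calculations
mask_dict = {}             # image name -> [flask_mask, liquid_mask, combined_mask]
class_names = model.names  # class id -> class name, as stored in the trained model

for idx, result in enumerate(results):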
    if result.masks is None:
        continue
    
    masks = result.masks.data
    class_ids = result.boxes.cls
    
    flask_mask = None
    liquid_mask = None

    for mask_idx, mask in enumerate(masks):
        mask_img = (mask.cpu().numpy() * 255).astype(np.uint8)
        
        class_id = int(class_ids[mask_idx].item())
        class_name = class_names[class_id]

        if class_name == 'graduated-flask':
            flask_mask = mask_img
        elif class_name == 'liquid-level-length':
            liquid_mask = mask_img

    # Ensure both masks are present.
    if flask_mask is not None and liquid_mask is not None:
        combined_mask = np.logical_or(flask_mask > 0, liquid_mask > 0).astype(np.uint8) * 255
        
        mask_dict[f'image_{idx + 1}'] = [flask_mask, liquid_mask, combined_mask]



In [27]:
# OPTIONAL: If you would like to save the masks for viewing. This cell is NOT needed to run the rest of the pipeline.
temp_output_dir = os.path.join(HOME, "mask_images")
os.makedirs(temp_output_dir, exist_ok=True)

for image_name, masks in mask_dict.items():
    flask_mask, liquid_mask, combined_mask = masks
    
    # flask mask
    flask_mask_filename = os.path.join(temp_output_dir, f"{image_name}_flask_mask.png")
    cv2.imwrite(flask_mask_filename, flask_mask)
    print(f"Flask mask saved to: {flask_mask_filename}")
    
    # liquid-level-length mask
    liquid_mask_filename = os.path.join(temp_output_dir, f"{image_name}_liquid_mask.png")
    cv2.imwrite(liquid_mask_filename, liquid_mask)
    print(f"Liquid mask saved to: {liquid_mask_filename}")
    
    # combined mask
    combined_mask_filename = os.path.join(temp_output_dir, f"{image_name}_combined_mask.png")
    cv2.imwrite(combined_mask_filename, combined_mask)
    print(f"Combined mask saved to: {combined_mask_filename}")


In [30]:
# Get the pixels and percentage filled on the final masks
for image_name, masks in mask_dict.items():
    flask_mask, liquid_mask, _ = masks  # Unpack the masks from the list
    combined_mask = np.logical_or(flask_mask > 0, liquid_mask > 0).astype(np.uint8) * 255
    
    flask_area = np.sum(combined_mask == 255)  # Total white pixels in the combined mask
    liquid_area = np.sum(liquid_mask == 255)   # Total white pixels in the liquid mask

    if flask_area > 0:
        percentage_filled = (liquid_area / flask_area) * 100
    else:
        percentage_filled = 0
    
    print(f"{image_name}:")
    print(f"  Combined mask (flask) white pixels: {flask_area}")
    print(f"  Liquid mask white pixels: {liquid_area}")
    print(f"  Percentage of flask filled with liquid: {percentage_filled:.2f}%\n")

____

## Part 5: Make This Model Yours!
<span style="color: gold; font-weight: bold;">Congratulations!</span> <span style="color: #3A8DFF; font-weight: bold;">Now that you've seen how to train a model to determine the amount of liquid in a container, it's time to make this model your own!</span> Here are some ideas to get you started:
1. **Improve the Model:** Try different hyperparameters, model architectures, or training techniques to improve the performance of the model.
2. **Expand the Model:** Try training the model on a larger dataset with more diverse images to see how well it generalizes to new data.
3. **Create a New Model:** Try creating a new model from scratch to determine the amount of liquid in a container. You can use the [Learning Tree](https://msoe-maic.com/learning-tree?node=58) to learn more about how to create a new model.
4. **Apply the Model:** Try applying the model to a new use case, such as determining the amount of liquid in a different type of container or determining the amount of liquid in a video stream.
5. **And More!:** Don't limit your creativity to the ideas above! There are endless possibilities for how you can use AI to solve problems in the world!

<span style="color: #3A8DFF; font-weight: bold;">Happy coding!</span> 🚀