# Inventory Monitoring Using Machine Learning and Computer Vision
## Introduction
This notebook serves as a comprehensive guide to developing a machine learning pipeline that addresses inventory monitoring challenges in distribution centers. The project aims to automate object detection and counting in bins using computer vision techniques, leveraging the Amazon Bin Image Dataset. The approach involves training a model with AWS SageMaker and deploying it for real-time inference.

The workflow for this project includes the following key steps:

1. **Data Acquisition:** Accessing the Amazon Bin Image Dataset, containing over 500,000 labeled images.
2. **Data Preprocessing:** Cleaning, normalizing, and augmenting the dataset to enhance model robustness and generalization.
3. **Exploratory Data Analysis (EDA):** Understanding data distribution, variability, and anomalies through visualization.
4. **Model Selection and Training:** Using a pre-trained ResNet-50 as a baseline model, fine-tuned on the bin images, with experiments on other architectures.
5. **Model Evaluation:** Measuring performance with metrics like accuracy, precision, recall, and F1-score.
6. **Deployment:** Deploying the best-performing model as an endpoint in AWS SageMaker for real-time predictions.
7. **Monitoring:** Continuously tracking the model’s performance to ensure reliability over time.

The proposed pipeline demonstrates an efficient and scalable solution for automating inventory monitoring, ensuring accuracy, and optimizing operations in supply chain distribution centers.

**Note:** This notebook contains code and markdown cells with TODOs that you have to complete. These are meant to be helpful guidelines for you to finish your project while meeting the requirements in the project rubrics. Feel free to change the order of the TODOs and/or use more than one cell to complete all the tasks.

In [None]:
!pip install smdebug

In [None]:
# Import the packages used throughout this notebook

import os
import json
import boto3
from tqdm import tqdm
import sagemaker
from src.data_splitter import DataSplitter

In [None]:
# TODO: Import any packages that you might need

## Data Preparation
**TODO:** Run the cell below to download the data.

The cell below creates a folder called `train_data`, downloads the training data, and arranges it in subfolders. Each subfolder contains images in which the number of objects equals the name of the folder. For instance, every image in folder `1` contains exactly one object. The images are not divided into training, testing, or validation sets. If you feel that the number of samples is not enough, you can always download more data (instructions for that can be found [here](https://registry.opendata.aws/amazon-bin-imagery/)). However, we are not assessing you on the accuracy of your final trained model, but on how you create your machine learning engineering pipeline.

In [None]:
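def download_and_arrange_data():
    """Download the images listed in file_list.json into train_data/<object count>/."""
    s3_client = boto3.client('s3')

    # file_list.json maps an object count (as a string) to a list of file paths
    with open('file_list.json', 'r') as f:
        d = json.load(f)

    for k, v in d.items():
        print(f"Downloading Images with {k} objects")
        directory = os.path.join('train_data', k)
        os.makedirs(directory, exist_ok=True)
        for file_path in tqdm(v):
            # Swap the listed file's extension for .jpg to get the image name
            file_name = os.path.basename(file_path).split('.')[0] + '.jpg'
            # S3 keys always use forward slashes, so build the key explicitly
            s3_client.download_file('aft-vbi-pds', f'bin-images/{file_name}',
                                    os.path.join(directory, file_name))

download_and_arrange_data()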

## Dataset
The **Amazon Bin Image Dataset** is a specialized dataset designed for machine learning applications in inventory monitoring and object detection. It comprises images captured from distribution center bins, each annotated with metadata about the number and type of objects present.

1. The dataset contains **536,434** images of bins holding varying numbers of objects (e.g., 1, 2, 3, 4, or 5 objects per image).
2. The dataset is organized into five classes, based on the number of objects in the bin:

   * **Class 1:** Images with one object in the bin.
   * **Class 2:** Images with two objects in the bin.
   * **Class 3:** Images with three objects in the bin.
   * **Class 4:** Images with four objects in the bin.
   * **Class 5:** Images with five objects in the bin.

In [None]:
input_dir = "train_data"
output_dir = "data"
splitter = DataSplitter(input_dir, output_dir)

In [None]:
# Split the dataset into train, validation, and test sets.
splitter.execute_split()
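
The internals of `DataSplitter` live in `src/data_splitter.py` and are not shown in this notebook. As a rough sketch of what a per-class split could look like, the hypothetical helper below shuffles the file names of one class folder and cuts them into train, validation, and test portions. The function name, the 80/10/10 ratios, and the fixed seed are assumptions for illustration, not the project's actual settings.

```python
import random

def split_file_names(file_names, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle file names reproducibly and split them into train/val/test lists.

    Hypothetical helper: the fractions and seed are illustrative assumptions.
    """
    names = sorted(file_names)          # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(names)  # local RNG avoids touching global state

    n_train = int(len(names) * train_frac)
    n_val = int(len(names) * val_frac)

    return {
        "train": names[:n_train],
        "val": names[n_train:n_train + n_val],
        "test": names[n_train + n_val:],  # the remainder goes to the test split
    }

# Example with made-up file names for a single class folder
example = split_file_names([f"{i:05d}.jpg" for i in range(100)])
print({split: len(files) for split, files in example.items()})
```

Splitting each class folder separately, as sketched here, keeps the class proportions roughly equal across the three splits.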

### Upload the output directory to S3

In [None]:
def upload_directory_to_s3(local_directory, s3_bucket, s3_prefix=''):
    for root, dirs, files in os.walk(local_directory):
        for file in tqdm(files):
            local_path = os.path.join(root, file)
            s3_path = os.path.join(s3_prefix, local_path).replace("\\", "/")
            s3_client.upload_file(local_path, s3_bucket, s3_path)
    print('upload complete')

In [None]:
s3_client = boto3.client('s3')
bucket_name = 'haont1-bucket'
s3_ds_directory = 'data'
role = sagemaker.get_execution_role()
session = boto3.session.Session()
region = session.region_name

In [None]:
upload_directory_to_s3(output_dir, bucket_name)

## Exploratory Data Analysis (EDA)

In [None]:
from src.data_set_eda import DatasetEDA

### Analyze the distribution of objects in bins

In [None]:
dataset_dir = "data/train"
eda = DatasetEDA(dataset_dir)

In [None]:
# Analyze the distribution of objects in bins in the training set
eda.analyze_distribution()

* Class **3** has the highest number of images (1866), while class **1** has the lowest (859). This points to a potential class imbalance, where some classes have significantly fewer examples than others. Class **3** has approximately **2.17 times** as many images as class **1**, which may bias the model during training.
* Classes **2** and **4** have a relatively balanced number of images (1609 and 1661), closer to the overall average.
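
The imbalance ratio quoted above can be reproduced from the reported counts. The snippet below is a minimal sketch that also derives inverse-frequency class weights, one common way to counter imbalance in a loss function (its use in this project is only an assumption). The count for class 5 is not stated in this analysis, so it is left out and the weights are illustrative only.

```python
# Training-set image counts reported above (class 5 is not reported here)
class_counts = {1: 859, 2: 1609, 3: 1866, 4: 1661}

largest = max(class_counts.values())
smallest = min(class_counts.values())
imbalance_ratio = largest / smallest
print(f"Imbalance ratio (largest / smallest): {imbalance_ratio:.2f}")

# Inverse-frequency weights: total / (num_classes * count), so rarer classes weigh more
total = sum(class_counts.values())
num_classes = len(class_counts)
class_weights = {
    label: total / (num_classes * count) for label, count in class_counts.items()
}
for label, weight in class_weights.items():
    print(f"class {label}: weight {weight:.3f}")
```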

In [None]:
eda.visualize_samples(num_samples=5)

From the sample images we can see:

1. Variability:

   * The dataset contains a diverse set of items stored in bins.
   * Items differ in size, shape, texture, and packaging materials.
   * Some items are boxed, others are wrapped in plastic, while some appear loosely packed.
2. Complexity:

   * The images include multiple types of objects, which makes classification challenging.

In [None]:
# Detect anomalies
anomalies = eda.detect_anomalies()
if anomalies:
    print(f"Total anomalies found: {len(anomalies)}")

The **7307 anomalies** found in the training set are data points that deviate significantly from the patterns or expected behavior of the majority of the dataset. Possible causes include:

* **Potential errors in data labeling:** Some images may be mislabeled, leading to incorrect class assignments. For example, an image belonging to class **3** might be incorrectly labeled as class **1**.
* **Corrupt or noisy data:** The anomalies could result from poor image quality, occlusions, or distortions.

## Data Preprocessing
* Resize images to standardize input dimensions.
* Normalize pixel values to improve model convergence.
* Apply data augmentation (random horizontal flipping and rotation) to increase dataset diversity and reduce overfitting.

In [None]:

# Directories containing the train, validation, and test splits
train_dir = "data/train"
val_dir = "data/val"
test_dir = "data/test"

In [None]:
from src.data_loader import DataLoaderCreator

# Initialize DataLoaderCreator
loader_creator = DataLoaderCreator(train_dir, val_dir, test_dir)

# Create DataLoaders
data_loaders = loader_creator.create_data_loaders()

# Example: Accessing the DataLoader for training
train_loader = data_loaders["train"]
print(f"Number of batches in training DataLoader: {len(train_loader)}")

### Resize image

* Resizes all images to a standard dimension (224x224 in this case) to standardize the input size for the model.

In [None]:
# Show images after resizing to the standard input dimensions
import matplotlib.pyplot as plt
import torchvision.transforms.functional as F

def display_resized_images(data_loader, num_images=5):
    """
    Display a few resized images from the DataLoader.
    """
    # Get a batch of data
    images, labels = next(iter(data_loader))

    # Create a grid for displaying images
    plt.figure(figsize=(15, 5))
    for i in range(min(num_images, len(images))):
        # Note: no denormalization happens here; if the loader normalizes its
        # output, the displayed colors will look distorted
        img = images[i]  # Shape: [C, H, W]
        img = F.to_pil_image(img)  # Convert tensor to PIL Image

        # Display the image
        plt.subplot(1, num_images, i + 1)
        plt.imshow(img)
        plt.title(f"Label: {labels[i]}")

    plt.show()

# Display the resized images
display_resized_images(train_loader, num_images=5)

### Normalized Images

* Normalizes pixel values using ImageNet's mean and standard deviation to standardize the input pixel intensity range.
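
As a small worked example of this step: normalization maps each channel value `x` (already scaled to [0, 1]) to `(x - mean) / std`, and the inverse `x * std + mean` recovers the original value for display. The sketch below uses the commonly cited ImageNet statistics and operates on a single RGB pixel in plain Python; the actual pipeline presumably applies the same arithmetic to whole tensors.

```python
# Commonly used ImageNet channel statistics (RGB order)
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)

def normalize_pixel(rgb):
    """Apply (x - mean) / std per channel to a pixel with values in [0, 1]."""
    return tuple((x - m) / s for x, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))

def denormalize_pixel(rgb):
    """Invert the normalization: x * std + mean per channel."""
    return tuple(x * s + m for x, m, s in zip(rgb, IMAGENET_MEAN, IMAGENET_STD))

pixel = (0.5, 0.5, 0.5)  # mid-gray pixel
normalized = normalize_pixel(pixel)
restored = denormalize_pixel(normalized)

print("normalized:", [round(v, 3) for v in normalized])
print("restored:  ", [round(v, 3) for v in restored])
```

Because normalized values fall outside [0, 1], an image is usually denormalized before plotting so that its colors look natural.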

In [None]:
import matplotlib.pyplot as plt
import torchvision.transforms.functional as F

def display_normalized_images(data_loader, num_images=5):
    """
    Display a few normalized images from the DataLoader.
    """
    # Get a batch of data
    images, labels = next(iter(data_loader))
    
    
    # Create a grid for displaying images
    plt.figure(figsize=(15, 10))
    for i in range(min(num_images, len(images))):
        # Original normalized image
        normalized_img = images[i]  # Shape: [C, H, W]
             
        # Display normalized images
        plt.subplot(2, num_images, i + 1)
        plt.imshow(F.to_pil_image(normalized_img))  # Normalized image
        plt.title(f"Normalized\nLabel: {labels[i]}")
        plt.axis('off')
        
    plt.show()

    
# Display the normalized images
display_normalized_images(train_loader, num_images=5)


### Data Augmentation

1. **Random Horizontal Flip:** Flips images horizontally with a 50% probability, providing variety in image orientation.
2. **Random Rotation:** Rotates images randomly within a ±15° range, introducing positional diversity.

In [None]:
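import matplotlib.pyplot as plt
import torchvision.transforms as T
import torchvision.transforms.functional as F

def display_augmentations(data_loader, num_images=5, num_augmentations=3):
    """
    Display augmented images to demonstrate the effect of data augmentation techniques.
    """
    # Get a batch of data
    images, labels = next(iter(data_loader))

    # Assumption: re-create the flip and ±15° rotation described above so that
    # each extra column shows a fresh random augmentation of the loaded image
    augment = T.Compose([T.RandomHorizontalFlip(p=0.5), T.RandomRotation(degrees=15)])

    # Create a grid for displaying images
    plt.figure(figsize=(15, num_augmentations * 5))
    for i in range(min(num_images, len(images))):
        for j in range(num_augmentations):
            # Column 0 shows the image as loaded; other columns re-apply the augmentation
            augmented_img = images[i] if j == 0 else augment(images[i])
            augmented_img = F.to_pil_image(augmented_img)  # Convert tensor to PIL Image for display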
            
            # Display the image
            plt.subplot(num_images, num_augmentations, i * num_augmentations + j + 1)
            plt.imshow(augmented_img)
            if j == 0:
                plt.title(f"Original\nLabel: {labels[i]}")
            else:
                plt.title(f"Augmented #{j}\nLabel: {labels[i]}")
            plt.axis('off')

    plt.show()

# Display augmented images
display_augmentations(train_loader, num_images=5, num_augmentations=3)

## Model Training
**TODO:** This is the part where you can train a model. The type or architecture of the model you use is not important. 

**Note:** You will need to use the `train.py` script to train your model.

In [None]:
os.environ['SM_CHANNEL_TRAINING']=f"s3://{bucket_name}/{s3_ds_directory}/train/"
os.environ['SM_CHANNEL_VALIDATION']=f"s3://{bucket_name}/{s3_ds_directory}/val/"
os.environ['SM_CHANNEL_TEST']=f"s3://{bucket_name}/{s3_ds_directory}/test/"
os.environ["SM_MODEL_DIR"] = f"s3://{bucket_name}/model/model.tar.gz"

In [None]:
#TODO: Declare your model training hyperparameter.
#NOTE: You do not need to do hyperparameter tuning. You can use fixed hyperparameter values

In [None]:
#TODO: Create your training estimator

In [None]:
# TODO: Fit your estimator

## Standout Suggestions
You do not need to perform the tasks below to finish your project. However, you can attempt these tasks to turn your project into a more advanced portfolio piece.

### Hyperparameter Tuning
**TODO:** Here you can perform hyperparameter tuning to increase the performance of your model. You are encouraged to 
- tune as many hyperparameters as you can to get the best performance from your model
- explain why you chose to tune those particular hyperparameters and the ranges.


In [None]:
from sagemaker.tuner import (
    IntegerParameter,
    CategoricalParameter,
    ContinuousParameter,
    HyperparameterTuner,
)

#Declare your HP ranges, metrics etc.
hyperparameter_ranges = {
    'batch_size': IntegerParameter(32, 128),
    'learning_rate': ContinuousParameter(1e-4, 5e-2),
    'num_epochs': IntegerParameter(5, 15)
}

objective_metric_name = 'Average loss'
objective_type = 'Minimize'

metric_definitions = [{"Name": "Average loss", "Regex": "Average loss: ([0-9\\.]+)"}]
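
SageMaker applies each `Regex` in `metric_definitions` to the training job's log output and records the captured group as the metric value. The sketch below checks the pattern from this notebook against a made-up log line. The exact line that `hpo.py` prints is an assumption, so the script's real output should be verified against the pattern.

```python
import re

# Same pattern as in metric_definitions above
pattern = re.compile(r"Average loss: ([0-9\.]+)")

# Hypothetical log line; the real format depends on what hpo.py prints
log_line = "Test set: Average loss: 1.4382, Accuracy: 237/1000"

match = pattern.search(log_line)
if match:
    print("Captured metric value:", float(match.group(1)))
else:
    print("Pattern did not match; the tuner would see no objective metric")
```

If the pattern never matches, the tuning job has no objective value to compare, so a quick local check like this can save a failed run.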

In [None]:
from sagemaker.pytorch import PyTorch

# Create your training estimator
estimator = PyTorch(
    entry_point="hpo.py",
    source_dir="./src",
    role=role,
    framework_version='1.12',
    py_version='py38',
    instance_count=1,
    instance_type='ml.g4dn.xlarge', # Use GPU-enabled instance
)

# Define hyperparameter tuner
tuner = HyperparameterTuner(
    estimator=estimator,
    objective_metric_name=objective_metric_name,
    objective_type=objective_type,
    hyperparameter_ranges=hyperparameter_ranges,
    metric_definitions=metric_definitions,
    max_jobs=5,  # Number of total jobs
    max_parallel_jobs=2  # Number of jobs to run in parallel
)

print("Estimator and tuner defined with S3 model directory.")

In [None]:
# Fit your estimator
tuner.fit({
    "train": os.environ['SM_CHANNEL_TRAINING'],
    "validation": os.environ['SM_CHANNEL_VALIDATION'],
    "test": os.environ['SM_CHANNEL_TEST']
}, wait=True)

In [None]:
# TODO: Find the best hyperparameters
best_estimator = tuner.best_estimator()

#Get the hyperparameters of the best trained model
best_hyperparameters = best_estimator.hyperparameters()
print(f"Best hyperparameters: {best_hyperparameters}")

In [None]:
best_batch_size = int(best_hyperparameters["batch_size"])
best_learning_rate = float(best_hyperparameters["learning_rate"])
best_epochs = int(best_hyperparameters["num_epochs"])

print("Best batch size: ", best_batch_size)
print("Best learning rate: ", best_learning_rate)
print("Best epochs: ", best_epochs)

### Describe the tuning results

The best model used a batch size of 108 and a learning rate of 0.001369, and achieved a test score of 23.7%.

In [None]:
from sagemaker.analytics import HyperparameterTuningJobAnalytics

In [None]:
exp = HyperparameterTuningJobAnalytics(
  hyperparameter_tuning_job_name='pytorch-training-241202-0346')

jobs = exp.dataframe()

# The objective (average loss) is minimized, so list the lowest values first
jobs.sort_values('FinalObjectiveValue', ascending=True)

### Prepare to perform Training on Best Estimator

In [None]:
best_estimator=tuner.best_estimator()

In [None]:
best_estimator.hyperparameters()

**If the kernel dies, continue from a completed training job**

In [None]:
BetterTrainingJobName='pytorch-training-241202-0346-005-5fef11df'
my_estimator = sagemaker.estimator.Estimator.attach(BetterTrainingJobName)
my_estimator.hyperparameters()
best_estimator=my_estimator

In [None]:
# SageMaker returns hyperparameter values as strings (sometimes JSON-quoted),
# so strip any quotes and cast each value to its numeric type
best_hp = best_estimator.hyperparameters()
hyperparameters = {
    "batch_size": int(best_hp['batch_size'].replace('"', '')),
    "learning_rate": float(best_hp['learning_rate'].replace('"', '')),
    "num_epochs": int(best_hp['num_epochs'].replace('"', '')),
}
hyperparameters

### Model Profiling and Debugging
**TODO:** Use model debugging and profiling to better monitor and debug your model training job.

In [None]:
from sagemaker.debugger import (
    Rule,
    ProfilerRule,
    rule_configs,
    DebuggerHookConfig,
    ProfilerConfig,
    FrameworkProfile,
    CollectionConfig
)

In [None]:
# Set up debugging and profiling rules and hooks

rules = [
    Rule.sagemaker(rule_configs.loss_not_decreasing()),
    Rule.sagemaker(rule_configs.overfit()),
    Rule.sagemaker(rule_configs.overtraining()),
    Rule.sagemaker(rule_configs.poor_weight_initialization()),
    ProfilerRule.sagemaker(rule_configs.LowGPUUtilization()),
    ProfilerRule.sagemaker(rule_configs.ProfilerReport()),
]

profiler_config = ProfilerConfig(
    system_monitor_interval_millis=500, framework_profile_params=FrameworkProfile(num_steps=10)
)

collection_config_list = [
    CollectionConfig(
        name="CrossEntropyLoss_output_0",
        parameters={
            "include_regex": "CrossEntropyLoss_output_0", 
            "train.save_interval": "50",
            "eval.save_interval": "1"
        }
    )
]


debugger_hook_config = DebuggerHookConfig(
    hook_parameters={"train.save_interval": "500", "eval.save_interval": "50"},
    collection_configs=collection_config_list
)

In [None]:
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",
    source_dir="./src",
    role=role,
    framework_version='1.12',
    py_version='py38',
    instance_count=1,
    instance_type='ml.g4dn.xlarge', # Use GPU-enabled instance          
    hyperparameters=hyperparameters,
    debugger_hook_config=debugger_hook_config,
    profiler_config=profiler_config,
    rules=rules
)

In [None]:
# Fit the estimator
estimator.fit({"train": os.environ['SM_CHANNEL_TRAINING'], "test": os.environ['SM_CHANNEL_TEST']})

In [None]:
# Plot a debugging output.
from smdebug.trials import create_trial
from smdebug.core.modes import ModeKeys
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import host_subplot

trial = create_trial(estimator.latest_job_debugger_artifacts_path())

def get_data(trial, tname, mode):
    tensor = trial.tensor(tname)
    steps = tensor.steps(mode=mode)
    vals = []
    for s in steps:
        vals.append(tensor.value(s, mode=mode))
    return steps, vals


def plot_tensor(trial, tensor_name):

    steps_train, vals_train = get_data(trial, tensor_name, mode=ModeKeys.TRAIN)
    print("loaded TRAIN data")
    steps_eval, vals_eval = get_data(trial, tensor_name, mode=ModeKeys.EVAL)
    print("loaded EVAL data")

    fig = plt.figure(figsize=(10, 7))
    host = host_subplot(111)

    par = host.twiny()

    host.set_xlabel("Steps (TRAIN)")
    par.set_xlabel("Steps (EVAL)")
    host.set_ylabel(tensor_name)

    (p1,) = host.plot(steps_train, vals_train, label=tensor_name)
    print("completed TRAIN plot")
    (p2,) = par.plot(steps_eval, vals_eval, label="val_" + tensor_name)
    print("completed EVAL plot")
    leg = plt.legend()

    host.xaxis.get_label().set_color(p1.get_color())
    leg.texts[0].set_color(p1.get_color())

    par.xaxis.get_label().set_color(p2.get_color())
    leg.texts[1].set_color(p2.get_color())

    plt.ylabel(tensor_name)

    plt.show()

plot_tensor(trial, "CrossEntropyLoss_output_0")

**TODO**: Is there some anomalous behaviour in your debugging output? If so, what is the error and how will you fix it?  
**TODO**: If not, suppose there was an error. What would that error look like and how would you have fixed it?
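
As an illustration of what anomalous behaviour might look like, one simple symptom is a loss curve that stops improving. The hypothetical helper below flags that case on a plain list of loss values. It is a simplified stand-in for reasoning about the plot, not the implementation of the built-in `loss_not_decreasing` rule, and the window size and threshold are arbitrary assumptions.

```python
def loss_stalled(losses, window=3, min_rel_improvement=0.01):
    """Return True if the mean of the last `window` losses improved by less than
    `min_rel_improvement` (relative) over the mean of the preceding `window`.

    Hypothetical helper with arbitrary defaults, for illustration only.
    """
    if len(losses) < 2 * window:
        return False  # not enough history to judge

    previous = sum(losses[-2 * window:-window]) / window
    recent = sum(losses[-window:]) / window
    if previous <= 0:
        return False  # loss is already at zero; nothing left to improve

    return (previous - recent) / previous < min_rel_improvement

# Made-up loss curves for demonstration
healthy_curve = [2.0, 1.8, 1.6, 1.4, 1.2, 1.0]
flat_curve = [1.6, 1.6, 1.6, 1.6, 1.6, 1.6]

print("healthy curve stalled?", loss_stalled(healthy_curve))
print("flat curve stalled?   ", loss_stalled(flat_curve))
```

If a plateau like the second curve showed up, plausible remedies to try would include adjusting the learning rate or checking the data pipeline and labels; which one applies depends on the actual debugging output.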

In [None]:
# TODO: Display the profiler output

### Model Deploying and Querying
**TODO:** Can you deploy your model to an endpoint and then query that endpoint to get a result?

In [None]:
# TODO: Deploy your model to an endpoint

In [None]:
# TODO: Run a prediction on the endpoint

In [None]:
# TODO: Remember to shutdown/delete your endpoint once your work is done

### Cheaper Training and Cost Analysis
**TODO:** Can you perform a cost analysis of your system and then use spot instances to lessen your model training cost?

In [None]:
# TODO: Cost Analysis
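
One way to approach the cost analysis is to multiply billable instance hours by an hourly rate and compare on-demand with spot pricing. The sketch below does that arithmetic. The hourly rate, the spot discount, and the job durations are placeholder numbers, not actual AWS prices or measured timings, and should be replaced with current figures for the chosen instance type and region.

```python
def training_cost(hours, hourly_rate, instance_count=1, spot_discount=0.0):
    """Estimate training cost in dollars.

    spot_discount is the fraction saved versus on-demand (0.0 means on-demand).
    """
    if not 0.0 <= spot_discount < 1.0:
        raise ValueError("spot_discount must be in [0, 1)")
    return hours * hourly_rate * instance_count * (1.0 - spot_discount)

# Placeholder inputs (assumptions, not real prices or measured durations)
PLACEHOLDER_RATE = 1.00   # dollars per instance-hour
TUNING_JOBS = 5           # matches max_jobs in the tuner defined earlier
HOURS_PER_JOB = 0.5

on_demand = training_cost(TUNING_JOBS * HOURS_PER_JOB, PLACEHOLDER_RATE)
spot = training_cost(TUNING_JOBS * HOURS_PER_JOB, PLACEHOLDER_RATE, spot_discount=0.6)

print(f"On-demand estimate: ${on_demand:.2f}")
print(f"Spot estimate:      ${spot:.2f}")
print(f"Estimated savings:  ${on_demand - spot:.2f}")
```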

In [None]:
# TODO: Train your model using a spot instance
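
Managed spot training is switched on through estimator arguments. The sketch below builds those keyword arguments as a plain dictionary so they can be inspected before being passed to the `PyTorch` estimator (for example `PyTorch(..., **spot_kwargs)`). `use_spot_instances`, `max_run`, `max_wait`, and `checkpoint_s3_uri` are SageMaker estimator parameters; the helper name, the time limits, and the checkpoint prefix are assumptions.

```python
def build_spot_kwargs(bucket, max_run_seconds=3600, max_wait_seconds=7200,
                      checkpoint_prefix="checkpoints"):
    """Build estimator keyword arguments for managed spot training.

    Hypothetical helper; max_wait must be at least max_run, because it covers
    both the training time and the time spent waiting for spot capacity.
    """
    if max_wait_seconds < max_run_seconds:
        raise ValueError("max_wait_seconds must be >= max_run_seconds")
    return {
        "use_spot_instances": True,
        "max_run": max_run_seconds,
        "max_wait": max_wait_seconds,
        # Checkpoints let an interrupted spot job resume instead of restarting
        "checkpoint_s3_uri": f"s3://{bucket}/{checkpoint_prefix}/",
    }

spot_kwargs = build_spot_kwargs("haont1-bucket")
print(spot_kwargs)
```

Resuming after an interruption only works if the training script itself saves and reloads checkpoints; whether `train.py` does so is not shown in this notebook.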

### Multi-Instance Training
**TODO:** Can you train your model on multiple instances?

In [None]:
# TODO: Train your model on Multiple Instances