# Analyzing Amazon Inventory Stock in Image Bins
This notebook contains code for training a PyTorch model on the Amazon inventory dataset. The dataset contains images of bins of items that Amazon sells through its e-commerce business.

The script starts by installing and loading the packages needed for the project. The project mainly uses the `torch` and `torchvision` packages to train and test a deep learning model.
We download the data from the Amazon bin imagery website.

After the download, we upload the data to a specified AWS S3 bucket. We then create an estimator that, given the necessary authentication, fetches the data from this bucket and fits on it using an already prepared `train.py` script.

We observe the output of the model fitting to judge the model's performance. All of this is accomplished on the AWS cloud platform using `sagemaker`.

In [215]:
%%capture
#  Install any packages that you might need
!pip install --no-cache-dir smdebug torch torchvision tqdm split-folders

In [216]:
#  Import any packages that you might need
import splitfolders
from tqdm import tqdm

import sagemaker

from sagemaker.tuner import CategoricalParameter, ContinuousParameter, HyperparameterTuner
from sagemaker.pytorch import PyTorch
from sagemaker import get_execution_role
from sagemaker.debugger import Rule, DebuggerHookConfig, TensorBoardOutputConfig, CollectionConfig, ProfilerRule, rule_configs
from sagemaker.debugger import ProfilerConfig, FrameworkProfile
import os


## Data Preparation
**TODO:** Run the cell below to download the data.

The cell below creates a folder called `train_data`, downloads the training data, and arranges it in subfolders. Each subfolder contains images in which the number of objects equals the name of the folder. For instance, every image in folder `1` contains exactly 1 object. The images are not divided into training, testing, or validation sets. If you feel the number of samples is not enough, you can always download more data (instructions can be found [here](https://registry.opendata.aws/amazon-bin-imagery/)). However, we are not assessing you on the accuracy of your final trained model, but on how you create your machine learning engineering pipeline.
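
As a quick sanity check on the folder layout described above, the number of images per class folder can be counted once the download has finished. This is a minimal sketch using only the standard library; the helper name `count_images_per_class` is hypothetical and not part of the original notebook, and it assumes the images carry a `.jpg` extension as in the download cell.

```python
from pathlib import Path


def count_images_per_class(root="train_data", pattern="*.jpg"):
    """Return {folder_name: image_count} for every class subfolder of `root`.

    Hypothetical helper: assumes one subfolder per object count, as
    produced by the download cell, and returns {} if `root` is missing.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        return {}
    counts = {}
    for class_dir in sorted(p for p in root_path.iterdir() if p.is_dir()):
        # Count only files matching the pattern directly inside the folder.
        counts[class_dir.name] = sum(1 for _ in class_dir.glob(pattern))
    return counts
```

Calling `count_images_per_class()` after the download returns a dictionary keyed by folder name, which makes class imbalance easy to spot before training.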

In [217]:
import os
import json
import boto3

def download_and_arrange_data():
    s3_client = boto3.client('s3')

    with open('file_list.json', 'r') as f:
        d=json.load(f)

    for k, v in d.items():
        print(f"Downloading Images with {k} objects")
        directory=os.path.join('train_data', k)
        os.makedirs(directory, exist_ok=True)
        for file_path in tqdm(v):
            file_name=os.path.basename(file_path).split('.')[0]+'.jpg'
            s3_client.download_file('aft-vbi-pds', os.path.join('bin-images', file_name),
                             os.path.join(directory, file_name))

# download_and_arrange_data()

## Dataset

Our project uses the [Amazon Bin Image Dataset](https://registry.opendata.aws/amazon-bin-imagery/). The dataset contains images of inventory bins holding items that are traded on the Amazon e-commerce platform.

The official description of the data on the website is as follows:

  *The Amazon Bin Image Dataset contains over 500,000 images and metadata from bins of a pod in an operating Amazon Fulfillment Center. The bin images in this dataset are captured as robot units carry pods as part of normal Amazon Fulfillment Center operations.*
  
However, in our project we use only a subset of the data, about 10,000 images.

The dataset images are grouped into folders named with the numbers 1 to 5, where each number is the number of items in every image of that folder. So folder `1` holds bin images with 1 item, folder `2` holds bin images with 2 items, and so on.

The downloaded `train_data` is split into `train`, `test` and `val` folders, containing the data for training, testing and validation respectively, using the [splitfolders package](https://pypi.org/project/split-folders/).
  
You can find more information about the data [here](https://registry.opendata.aws/amazon-bin-imagery/).
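
To make the split concrete, here is a standard-library sketch of a per-class train/val/test split. It is a stand-in written for illustration, not the `splitfolders` implementation: the function name `split_class_folders` is hypothetical, and the default ratio `(0.6, 0.2, 0.2)` and seed `1337` are taken from the commented-out call in the preprocessing cell.

```python
import random
import shutil
from pathlib import Path


def split_class_folders(src, dst, ratios=(0.6, 0.2, 0.2), seed=1337):
    """Copy src/<class>/* into dst/{train,val,test}/<class>/.

    Hypothetical stand-in for a folder-splitting tool. Returns the
    number of files copied into each split.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    summary = {"train": 0, "val": 0, "test": 0}
    class_dirs = sorted(p for p in Path(src).iterdir() if p.is_dir())
    for class_dir in class_dirs:
        files = sorted(f for f in class_dir.iterdir() if f.is_file())
        rng.shuffle(files)
        n_train = int(len(files) * ratios[0])
        n_val = int(len(files) * ratios[1])
        parts = {
            "train": files[:n_train],
            "val": files[n_train:n_train + n_val],
            "test": files[n_train + n_val:],  # remainder goes to test
        }
        for split_name, part in parts.items():
            out_dir = Path(dst) / split_name / class_dir.name
            out_dir.mkdir(parents=True, exist_ok=True)
            for f in part:
                shutil.copy2(f, out_dir / f.name)
            summary[split_name] += len(part)
    return summary
```

Writing to a destination outside the source folder (for example `split_class_folders("train_data", "train_data_split")`) keeps the split folders from being mistaken for class folders on a second run.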

In [218]:
# Perform any data cleaning or data preprocessing

from sklearn.model_selection import train_test_split

#splitfolders.ratio('train_data', output="train_data", seed=1337, ratio=(.6, 0.2,0.2)) 

In [219]:


%%capture 
!aws s3 sync train_data s3://amzn-buckett/


## Model Training

For model training we use the `resnet34` model, which our benchmark model suggested would perform best. The model configuration is specified in the `train.py` script.

The architecture is initialized as follows:

```
net = models.__dict__[args.arch]()
        
in_features = net.fc.in_features
new_fc = nn.Linear(in_features,6)
net.fc = new_fc
model=net

```
The model name is specified in the `args.arch` argument, with `resnet34` set as the default. See below:

```parser.add_argument('--arch', '-a', metavar='ARCH', default='resnet34', choices=model_names, help='model architecture: ' + ' | '.join(model_names) + ' (default: resnet34)')```

You can choose a different architecture by passing its name through the `--arch` argument. The valid names are listed in `model_names`, and the constructor is looked up by name in the `models.__dict__` dictionary.
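
The name-based lookup can be illustrated without PyTorch. In the sketch below, `MODEL_REGISTRY` and its placeholder constructors are assumptions that stand in for `models.__dict__`; only the `--arch` argument definition mirrors the line quoted from `train.py`.

```python
import argparse

# Hypothetical registry standing in for torchvision's `models.__dict__`.
# Each value is a zero-argument callable, like a model constructor.
MODEL_REGISTRY = {
    "resnet18": lambda: "resnet18 (placeholder)",
    "resnet34": lambda: "resnet34 (placeholder)",
    "resnet50": lambda: "resnet50 (placeholder)",
}
model_names = sorted(MODEL_REGISTRY)


def parse_args(argv=None):
    """Parse the architecture name; argparse rejects names not in `choices`."""
    parser = argparse.ArgumentParser(description="Architecture selection sketch")
    parser.add_argument('--arch', '-a', metavar='ARCH', default='resnet34',
                        choices=model_names,
                        help='model architecture: ' + ' | '.join(model_names)
                             + ' (default: resnet34)')
    return parser.parse_args(argv)


def build_model(argv=None):
    """Look the constructor up by name and call it."""
    args = parse_args(argv)
    return MODEL_REGISTRY[args.arch]()
```

With no arguments, `build_model([])` falls back to the `resnet34` entry; an unknown name makes `argparse` exit with an error instead of failing later during the dictionary lookup.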


**Note:** You will need to use the `train.py` script to train your model.


In [220]:
#TODO: Declare your model training hyperparameter.
#NOTE: You do not need to do hyperparameter tuning. You can use fixed hyperparameter values
hyperparameter_ranges = {
    "learning_rate": ContinuousParameter(0.001, 0.1),
    "batch_size": CategoricalParameter([32, 64, 128, 256, 512]),
}

role = sagemaker.get_execution_role()



In [248]:
# Create your training estimator
estimator = PyTorch(
    entry_point="train.py",
    base_job_name='job-amazon-bins',
    role=role,
    framework_version="1.4.0",
    instance_count=1,
    instance_type="ml.g4dn.xlarge",
    py_version='py3'
)


In [249]:
os.environ['SM_CHANNEL_TRAINING']='s3://amzn-buckett/'
os.environ['SM_MODEL_DIR']='s3://amzn-buckett/model/'
os.environ['SM_OUTPUT_DATA_DIR']='s3://amzn-buckett/output/'
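
For context, a training script typically reads these `SM_*` variables when it starts. Inside a SageMaker training container they normally hold local paths (for example `/opt/ml/model` for `SM_MODEL_DIR`). The sketch below shows one way a script such as `train.py` might pick them up; the actual `train.py` is not shown in this notebook, so the function name, the argument names, and the local fallback paths are all assumptions.

```python
import argparse
import os


def parse_sagemaker_paths(argv=None):
    """Read data/model/output locations from SM_* variables, with fallbacks.

    Hypothetical helper: the fallback paths are only for running the
    script outside a SageMaker container.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--data-dir",
                        default=os.environ.get("SM_CHANNEL_TRAINING", "./train_data"))
    parser.add_argument("--model-dir",
                        default=os.environ.get("SM_MODEL_DIR", "./model"))
    parser.add_argument("--output-dir",
                        default=os.environ.get("SM_OUTPUT_DATA_DIR", "./output"))
    # parse_known_args ignores extra arguments such as hyperparameters.
    args, _unknown = parser.parse_known_args(argv)
    return args
```

Reading the defaults inside the function (rather than at import time) means the values reflect the environment at the moment the script parses its arguments.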

In [250]:
estimator.fit({"training": "s3://amzn-buckett/"}, wait=True)

2022-01-16 19:29:47 Starting - Starting the training job...
2022-01-16 19:30:12 Starting - Launching requested ML instancesProfilerReport-1642361387: InProgress
......
2022-01-16 19:31:13 Starting - Preparing the instances for training.........
2022-01-16 19:32:33 Downloading - Downloading input data..................
2022-01-16 19:35:35 Training - Downloading the training image..
bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell
2022-01-16 19:35:53,338 sagemaker-containers INFO     Imported framework sagemaker_pytorch_container.training
2022-01-16 19:35:53,360 sagemaker_pytorch_container.training INFO     Block until all host DNS lookups succeed.
2022-01-16 19:35:53,363 sagemaker_pytorch_container.training INFO     Invoking user training script.
2022-01-16 19:35:53,769 sagemaker-containers INFO     Module default_user_module_name does not provide a setup.py.
Generat

## Standout Suggestions
You do not need to perform the tasks below to finish your project. However, you can attempt these tasks to turn your project into a more advanced portfolio piece.

### Hyperparameter Tuning
**TODO:** Here you can perform hyperparameter tuning to increase the performance of your model. You are encouraged to 
- tune as many hyperparameters as you can to get the best performance from your model
- explain why you chose to tune those particular hyperparameters and the ranges.


In [251]:
#TODO: Create your hyperparameter search space

In [252]:
#TODO: Create your training estimator

In [253]:
# TODO: Fit your estimator

In [254]:
# TODO: Find the best hyperparameters

### Model Profiling and Debugging
**TODO:** Use model debugging and profiling to better monitor and debug your model training job.

In [255]:
# TODO: Set up debugging and profiling rules and hooks

In [256]:
# TODO: Create and fit an estimator

In [257]:
# TODO: Plot a debugging output.

**TODO**: Is there some anomalous behaviour in your debugging output? If so, what is the error and how will you fix it?  
**TODO**: If not, suppose there was an error. What would that error look like and how would you have fixed it?

In [258]:
# TODO: Display the profiler output

### Model Deploying and Querying
**TODO:** Can you deploy your model to an endpoint and then query that endpoint to get a result?

In [259]:
# TODO: Deploy your model to an endpoint

In [260]:
# TODO: Run a prediction on the endpoint

In [261]:
# TODO: Remember to shutdown/delete your endpoint once your work is done

### Cheaper Training and Cost Analysis
**TODO:** Can you perform a cost analysis of your system and then use spot instances to lessen your model training cost?
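
One way to approach the cost comparison is plain arithmetic on billed hours and hourly rates. The sketch below is an illustration only: the function names are hypothetical, any rates passed in must come from the current AWS pricing page (none are assumed here), and the bucket in the example URI is a placeholder. The keyword names `use_spot_instances`, `max_run`, `max_wait` and `checkpoint_s3_uri` are the spot-training arguments accepted by SageMaker estimators.

```python
def training_cost(hours, hourly_rate_usd, instance_count=1):
    """Cost of a training job billed per instance-hour."""
    return hours * hourly_rate_usd * instance_count


def spot_savings_percent(on_demand_cost, spot_cost):
    """Percentage saved by the spot run relative to the on-demand run."""
    if on_demand_cost <= 0:
        raise ValueError("on_demand_cost must be positive")
    return 100.0 * (on_demand_cost - spot_cost) / on_demand_cost


def spot_estimator_kwargs(max_run_seconds, extra_wait_seconds, checkpoint_s3_uri):
    """Keyword arguments that could be added to the estimator for spot training.

    max_wait covers training time plus time spent waiting for spot
    capacity, so it is built here to be at least max_run.
    """
    return {
        "use_spot_instances": True,
        "max_run": max_run_seconds,
        "max_wait": max_run_seconds + extra_wait_seconds,
        "checkpoint_s3_uri": checkpoint_s3_uri,  # placeholder URI in the example
    }
```

For example, `spot_estimator_kwargs(3600, 1800, "s3://<your-bucket>/checkpoints/")` yields a dictionary that could be unpacked into the `PyTorch(...)` constructor alongside the existing arguments.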

In [262]:
# TODO: Cost Analysis

In [263]:
# TODO: Train your model using a spot instance

### Multi-Instance Training
**TODO:** Can you train your model on multiple instances?

In [264]:
# TODO: Train your model on Multiple Instances