# Amazon Lookout for Vision Python SDK

In this notebook we walk you through the Amazon Lookout for Vision Python SDK. It gives you a programmatic way of interacting with the service and adds helper functions that complement it, such as:

* create a manifest file
* push the manifest file to S3
* check whether image sizes comply with the service limits
* check image shapes to see whether you need to rescale images
* rescale images to an optimal shape
* upload images to S3 in the appropriate structure

**Requirements**

Your images must be available on this local instance. Store the bad images in a folder named *bad* and the good images in a folder named *good*. The only allowed formats are jpeg, jpg and png. The quotas and limits for training and validation images are described at https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/limits.html
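
Before using the SDK it can help to confirm that the local layout matches these requirements. The following is a minimal sketch using only the standard library; the function name `find_invalid_files` and the `root` argument are hypothetical and not part of the SDK.

```python
from pathlib import Path

# Formats named in the requirements above
ALLOWED_EXTENSIONS = {".jpeg", ".jpg", ".png"}


def find_invalid_files(root):
    """Return files under root/good and root/bad whose extension is not allowed.

    Raises FileNotFoundError if either expected folder is missing.
    """
    invalid = []
    for label in ("good", "bad"):
        folder = Path(root) / label
        if not folder.is_dir():
            raise FileNotFoundError(f"expected folder is missing: {folder}")
        for path in sorted(folder.iterdir()):
            # Compare case-insensitively so that "IMG.PNG" is accepted
            if path.is_file() and path.suffix.lower() not in ALLOWED_EXTENSIONS:
                invalid.append(path)
    return invalid
```

Calling `find_invalid_files(".")` from the directory that contains *good* and *bad* returns an empty list when every file has an allowed extension.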

## Training a Model

First let's set some general variables that you need:

* input_bucket: the S3 bucket that contains your images for training a model
* project_name: the unique name of the Amazon Lookout for Vision project
* model_version: the model version you want to deploy (note: when starting fresh "1" is the default)
* output_bucket: a bucket where your model and inference results are stored (can be same as input_bucket)
* input_prefix: if you run inference out of S3 this is the key of the image(s) you want to predict
* output_prefix: this is the S3 key where your prediction(s) will be saved to

In [None]:
# Training & Inference
input_bucket = "YOUR_S3_BUCKET_FOR_TRAINING"
project_name = "YOUR_PROJECT_NAME"
model_version = "1" # leave this as one if you start right at the beginning
# Inference
output_bucket = "YOUR_S3_BUCKET_FOR_INFERENCE" # can be same as input_bucket
input_prefix = "YOUR_KEY_TO_FILES_TO_PREDICT/" # used in batch_predict
output_prefix = "YOUR_KEY_TO_SAVE_FILES_AFTER_PREDICTION/" # used in batch_predict

In [None]:
# Install the SDK using pip
# !pip install lookoutvision

In [None]:
# Import all the libraries needed to get started:
from lookoutvision.image import Image
from lookoutvision.manifest import Manifest
from lookoutvision.lookoutvision import LookoutForVision
from lookoutvision.metrics import Metrics

Instantiate the necessary classes:

* Image to interact with your local images
* Manifest to generate and push manifest files
* Metrics to view and compare Model metrics
* LookoutForVision as the main class to interact with the service

In [None]:
img = Image()

In [None]:
mft = Manifest(
    bucket=input_bucket,
    datasets=["training", "validation"])

In [None]:
l4v = LookoutForVision(project_name=project_name)

In [None]:
met = Metrics(project_name=project_name)

In [None]:
# If project does not exist: create it
p = l4v.create_project()
print(p)

In [None]:
# Check if your local images comply with the service
sizes = img.check_image_sizes(verbose=False)
print(sizes)
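
To make the idea of a size check concrete, here is a rough standalone sketch of what such a check might look like. It is not the SDK's implementation, and the byte limit is an assumed value used for illustration only; take the current limit from the limits page linked in the requirements.

```python
from pathlib import Path

# Assumed limit for illustration only; confirm the current value on the limits page.
MAX_IMAGE_BYTES = 8 * 1024 * 1024


def oversized_files(folder, max_bytes=MAX_IMAGE_BYTES):
    """Return (file name, size in bytes) for every file in folder larger than max_bytes."""
    return [
        (path.name, path.stat().st_size)
        for path in sorted(Path(folder).iterdir())
        if path.is_file() and path.stat().st_size > max_bytes
    ]
```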

In [None]:
# Check if all image shapes are the same
shapes = img.check_image_shapes(verbose=True)
print(shapes)

In [None]:
# If not: rescale them
# Note: the prefix is optional. If you set one, new folders named rescaled_good and
# rescaled_bad are created for you. Without a prefix your original images are overwritten.
resc = img.rescale(prefix="rescaled_")
print(resc)

In [None]:
# Check again in rescaled folder (if you created it)
sizes = img.check_image_sizes(prefix="rescaled_", verbose=False)
print(sizes)

In [None]:
# Check again in rescaled folder (if you created it)
shapes = img.check_image_shapes(prefix="rescaled_", verbose=True)
print(shapes)

Once your images are prepared, all have the same shape and comply with the service's rules, you can upload them to your S3 bucket. The Image() class uploads them in the appropriate structure, so you don't need to take care of it yourself.

In [None]:
img.upload_from_local(
    bucket=input_bucket,
    train_and_test=True,
    test_split=0.2,
    prefix="rescaled_")
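
The `test_split=0.2` argument suggests that roughly 20% of the images are held out for testing. How the SDK selects them is not shown in this notebook, so the following is only a sketch of one common approach, a reproducible shuffled holdout. The function name and the `seed` parameter are assumptions.

```python
import random


def split_train_test(items, test_split=0.2, seed=0):
    """Shuffle items reproducibly and return (train, test) lists."""
    if not 0 <= test_split <= 1:
        raise ValueError("test_split must be between 0 and 1")
    shuffled = list(items)
    # A dedicated Random instance keeps the split reproducible for a given seed
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_split)
    return shuffled[n_test:], shuffled[:n_test]
```

With ten file names and `test_split=0.2`, this returns eight training items and two test items.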

Now that your images are saved to S3, you can use the Manifest() class to generate the manifest files and push them to the same S3 location as your image folders. Lookout for Vision picks up these manifest files and creates the datasets accordingly:

In [None]:
mft_resp = mft.push_manifests()
print(mft_resp)
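
For orientation, a manifest file is a JSON Lines file with one labeled image per line. The sketch below builds one such line. The field names follow the SageMaker Ground Truth image classification layout as an assumption, and the label attribute name, class name and numeric label value are supplied by the caller rather than taken from the SDK, so check the manifest documentation in the developer guide for the values the service expects.

```python
import json
from datetime import datetime, timezone


def manifest_line(bucket, key, class_name, label_value,
                  label_attribute="anomaly-label", creation_date=None):
    """Build one JSON Lines entry for an image (assumed Ground Truth style layout)."""
    if creation_date is None:
        creation_date = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")
    entry = {
        "source-ref": f"s3://{bucket}/{key}",
        label_attribute: label_value,
        f"{label_attribute}-metadata": {
            "confidence": 1,
            "job-name": f"labeling-job/{label_attribute}",
            "class-name": class_name,
            "human-annotated": "yes",
            "creation-date": creation_date,
            "type": "groundtruth/image-classification",
        },
    }
    # json.dumps emits a single line, as the JSON Lines format requires
    return json.dumps(entry)
```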

Based on the manifest files in S3, create your Lookout for Vision datasets:

In [None]:
dsets = l4v.create_datasets(mft_resp, wait=True)
print(dsets)

We are ready to train the model:

In [None]:
l4v.fit(
    output_bucket=output_bucket,
    model_prefix="mymodel_",
    wait=True)

And finally deploy it:

In [None]:
l4v.deploy(
    model_version=model_version,
    wait=True)
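
Both `fit` and `deploy` are called with `wait=True`, which presumably blocks until the operation has finished. If you prefer to start an operation and poll for completion yourself, a generic loop along the following lines could be used. This is a sketch: `get_status` is a hypothetical callable that you supply, and the status strings are whatever your status source returns.

```python
import time


def wait_for_status(get_status, done_states, failed_states=(),
                    interval=30.0, timeout=3600.0):
    """Poll get_status() until it returns a state in done_states.

    Raises RuntimeError on a failed state and TimeoutError when timeout is exceeded.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in done_states:
            return status
        if status in failed_states:
            raise RuntimeError(f"operation failed with status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status} after {timeout} seconds")
        time.sleep(interval)
```

One possible `get_status` reads the model status through the boto3 `lookoutvision` client's `describe_model` call; the interval and timeout defaults here are arbitrary.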

## Display Model Metrics

If you want to check the metrics of your model(s), you can use the *Metrics* class in two ways:

* display the metrics for one model
* display the metrics for all models of the same project

In [None]:
# One model
met.describe_model(model_version=model_version)

In [None]:
# All models of the same project
met.describe_models()

## Inference

### The Batch Transform feature enables you to run predictions on images stored in Amazon S3 or locally.
A batch transform job runs inference on your batch of images and stores the results in S3 or locally, accordingly.

For batch prediction on images stored in S3, pass the following arguments to the function:
  1. model_version: the model version to use; if omitted, version 1 is used by default.
  2. input_bucket: the bucket that holds the input images to be classified as normal or anomalous.
  3. input_prefix: the folder or key prefix (if applicable) of the S3 path where the input images are stored. If you set it, make sure it ends with a forward slash ("/"), as in the example.
  4. output_bucket: the bucket where the prediction results are stored as JSON files. Each output file is named image_name.json.
  5. output_prefix: the folder or key prefix (if applicable) of the S3 path where the prediction files are written. If you set it, make sure it ends with a forward slash ("/"), as in the example.
  6. content_type: the content type of the images, for example "image/jpeg".
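
The trailing-slash requirement and the output naming can be illustrated with two small helpers. They are hypothetical and not part of the SDK, and they assume that image_name.json means the image file name with its extension replaced by `.json`.

```python
import posixpath


def normalize_prefix(prefix):
    """Return prefix without leading slashes and with one trailing slash ('' if unset)."""
    prefix = (prefix or "").strip("/")
    return f"{prefix}/" if prefix else ""


def output_key_for(image_key, output_prefix):
    """Map an input image key to the key of its JSON result under output_prefix."""
    # S3 keys always use forward slashes, hence posixpath rather than os.path
    stem = posixpath.splitext(posixpath.basename(image_key))[0]
    return f"{normalize_prefix(output_prefix)}{stem}.json"
```

For example, `output_key_for("incoming/img_01.jpeg", "results")` yields `results/img_01.json` under this assumption.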


In [None]:
l4v.batch_predict(
    model_version=model_version,
    input_bucket=input_bucket,
    input_prefix=input_prefix,
    output_bucket=output_bucket,
    output_prefix=output_prefix,
    content_type="image/jpeg")

For batch prediction on images stored locally, pass the following arguments to the function:

1. model_version: the model version to use; if omitted, version '1' is used by default.
2. local_path: the local path that holds the input images to be classified as normal or anomalous.
3. content_type: the content type of the images, for example "image/jpeg".

In [None]:
l4v._batch_predict_local(
    local_path='/your/local/path',
    model_version=model_version,
    content_type="image/jpeg")
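
Because `content_type` is passed once for the whole folder, it can be useful to check which content types the files in `local_path` actually have before starting a batch. The sketch below uses the standard `mimetypes` module; the helper name is hypothetical, and the set of supported types is derived from the formats listed in the requirements.

```python
import mimetypes
from pathlib import Path

# Derived from the allowed formats (jpeg, jpg, png)
SUPPORTED_TYPES = {"image/jpeg", "image/png"}


def images_with_content_type(local_path):
    """Return (path, content type) pairs for supported images directly inside local_path."""
    pairs = []
    for path in sorted(Path(local_path).iterdir()):
        # guess_type infers the type from the file extension only
        content_type, _ = mimetypes.guess_type(path.name)
        if path.is_file() and content_type in SUPPORTED_TYPES:
            pairs.append((path, content_type))
    return pairs
```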

### To predict in real time, call the predict method with the inputs below. You can predict either from an S3 object or from a local image.
 1. model_version: the model version to use; if omitted, version '1' is used by default.
 2. local_file: the local path of the input image to be classified as normal or anomalous.
 3. bucket: the bucket that holds the input image to be classified as normal or anomalous.
 4. key: the key of the image (it must contain the exact file name, as in the example below).
 5. content_type: the content type of the image, for example "image/jpeg".

In [None]:
# Predict on a local image. Replace the path with your own directory and file name.
l4v.predict(local_file="your/local/bad/file.jpeg")

In [None]:
# Predict on another local image, here one from the good folder.
l4v.predict(local_file="your/local/good/file.jpeg")

In [None]:
# Predict on an image stored in S3. Replace the bucket and key with your own.
l4v.predict(
    bucket=input_bucket,
    key='my/key/to/the/file.jpeg')
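
The `predict` call takes the bucket and the key as separate arguments. If you only have a full `s3://` URI, a small helper like the following can split it. This is a standard-library sketch and the function name is hypothetical.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key.jpeg' into ('bucket', 'some/key.jpeg')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc or not parsed.path.strip("/"):
        raise ValueError(f"not a valid S3 object URI: {uri}")
    # The key is the path without its leading slash
    return parsed.netloc, parsed.path.lstrip("/")
```

The two returned values can then be passed as `bucket` and `key`.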

### To retrain the model of the same project, follow these steps.

1. Create a new or updated manifest file with the new images

2. Update the existing datasets (both train and test)

3. Train a new model version with the updated datasets

In [None]:
## Define the bucket that holds the latest training images and initialize Manifest with it.
## If your new/updated images are in the same bucket, you can skip this step.
mft_retrain = Manifest(
    bucket=input_bucket,
    datasets=["training", "validation"])

In [None]:
# If your new/updated images for retraining are stored locally, upload them to S3 with the method below
img.upload_from_local(
    bucket=input_bucket,
    train_and_test=True)

In [None]:
# Now create the manifest files for the new dataset
mft_resp_new = mft_retrain.push_manifests()

In [None]:
# Update the datasets with the new manifest files
l4v.update_datasets(mft_resp_new)

In [None]:
# Start training a new model version. This time the updated datasets are used.
l4v.fit(output_bucket=input_bucket)

### Stop the model after you are done.
If you don't provide a model version, model version 1 is stopped by default.

In [None]:
# Without specifying a model version
l4v.stop_model()

In [None]:
# With a specific model version
new_model_version = "2"
l4v.stop_model(model_version=new_model_version)