# Alternative to Amazon Lookout for Vision with SageMaker Algorithm: Computer Vision Defect Detection Model from AWS Marketplace

Amazon Lookout for Vision, the AWS service designed to create customized artificial intelligence and machine learning (AI/ML) computer vision models for automated quality inspection, will be discontinued on October 31, 2025. As part of this transition, the Lookout for Vision (LFV) team has published their algorithm for use within Amazon SageMaker, ensuring continuity and expanded possibilities for users.

This notebook guides you through:

1. Subscribing to the LFV-published algorithm in Amazon SageMaker.
1. Training an image classification model using this algorithm, which maintains the same training logic as the existing LFV service.
1. Training an image segmentation model using this algorithm.

By following this guide, you'll be able to seamlessly incorporate LFV's proven computer vision capabilities into your SageMaker workflows. Whether you're transitioning existing LFV projects or starting new ones, this notebook will help ensure your automated quality inspection workflows remain uninterrupted beyond the LFV service discontinuation date. 
This sample notebook shows you how to train a custom ML model using [Computer Vision Defect Detection Model](https://aws.amazon.com/marketplace/pp/prodview-j72hhmlt6avp6) from AWS Marketplace.

-------------

### Pre-requisites

1. This notebook contains elements that render correctly in the Jupyter interface. Open it from an Amazon SageMaker Notebook Instance or Amazon SageMaker Studio.
1. Ensure that the IAM role you use has the **AmazonSageMakerFullAccess** policy attached.
1. Have some hands-on experience using **Amazon SageMaker**.
1. To use this algorithm successfully, ensure that either:
   
   A. Your IAM role has these three permissions and you have the authority to make AWS Marketplace subscriptions in the AWS account used:
   
        a. aws-marketplace:ViewSubscriptions
        b. aws-marketplace:Unsubscribe
        c. aws-marketplace:Subscribe
   
   B. Or your AWS account already has a subscription to: [Computer Vision Defect Detection Model](https://aws.amazon.com/marketplace/pp/prodview-j72hhmlt6avp6).
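
For option A, the three permissions could be granted through a small IAM policy document. The following is a minimal sketch, not an official policy: the helper name `build_marketplace_policy` is hypothetical, and `"Resource": "*"` is an assumption that the actions are granted account-wide.

```python
import json

# The three AWS Marketplace actions listed in the prerequisites above.
MARKETPLACE_ACTIONS = [
    "aws-marketplace:ViewSubscriptions",
    "aws-marketplace:Unsubscribe",
    "aws-marketplace:Subscribe",
]


def build_marketplace_policy(actions=MARKETPLACE_ACTIONS):
    """Return an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Assumption: the actions are granted for all resources.
                "Resource": "*",
            }
        ],
    }


marketplace_policy = build_marketplace_policy()
print(json.dumps(marketplace_policy, indent=2))
```

The printed JSON can be reviewed by an account administrator before it is attached to the role you use for this notebook.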

### Subscribe to the algorithm

To subscribe to the algorithm:

1. Open the algorithm listing page: [Computer Vision Defect Detection Model](https://aws.amazon.com/marketplace/pp/prodview-j72hhmlt6avp6).
1. On the AWS Marketplace listing, click the **Continue to subscribe** button.
1. On the **Subscribe to this software** page, review the EULA, pricing, and support terms, and click **Accept Offer** if you agree with them.
1. After you click **Continue to configuration** and choose a Region, a Product ARN is displayed. This is the algorithm ARN that you need to specify when training a custom ML model. Copy it and assign it to `algorithm_name` in the following cell.

In [None]:
# TODO: change this to use subscribed SageMaker algorithm
algorithm_name = "<Customer to specify the algorithm name after subscription>"
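
A quick sanity check can catch a placeholder that was never replaced. The sketch below assumes that the Product ARN follows the general SageMaker algorithm ARN shape (`arn:<partition>:sagemaker:<region>:<account-id>:algorithm/<name>`); the helper name and the example ARN are hypothetical.

```python
import re

# Assumed shape of a SageMaker algorithm ARN:
#   arn:<partition>:sagemaker:<region>:<12-digit account id>:algorithm/<name>
ALGORITHM_ARN_PATTERN = re.compile(
    r"^arn:aws[a-z-]*:sagemaker:[a-z0-9-]+:\d{12}:algorithm/\S+$"
)


def looks_like_algorithm_arn(value):
    """Return True if value has the assumed algorithm ARN shape."""
    return bool(ALGORITHM_ARN_PATTERN.match(value))


# Hypothetical ARN, for illustration only.
example_arn = "arn:aws:sagemaker:us-east-1:123456789012:algorithm/example-algorithm"
placeholder = "<Customer to specify the algorithm name after subscription>"

print(looks_like_algorithm_arn(example_arn))
print(looks_like_algorithm_arn(placeholder))
```

Running `looks_like_algorithm_arn(algorithm_name)` before creating a training job is one way to fail early with a clear message.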

### Initial Set Up 

Set up your SageMaker environment: First, we'll import necessary libraries, set up our SageMaker session, and define key variables.

In [None]:
import boto3
import sagemaker
import json

In [None]:
session = sagemaker.Session()
region = session.boto_region_name
bucket = session.default_bucket()
# Project name would be used as part of s3 output path
project = "Computer-Vision-Defect-Detection"

### Create IAM Role with SageMaker Permission

Next, we create an IAM role with full SageMaker and S3 access. If a role with this name already exists, the cell looks it up and reuses it.

In [None]:
iam_client = boto3.client('iam')
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "sagemaker.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}

# Create the IAM role
role_name = "SageMakerExecutionRole"
response = None
try:
    response = iam_client.create_role(
        RoleName=role_name,
        AssumeRolePolicyDocument=json.dumps(trust_policy),
        Description="IAM role with full S3 and SageMaker access"
    )

    sm_role_arn = response['Role']['Arn']
    print(f"Role created with ARN: {sm_role_arn}")

    # Attach policies for full S3 and SageMaker access
    iam_client.attach_role_policy(
        RoleName=role_name,
        PolicyArn="arn:aws:iam::aws:policy/AmazonS3FullAccess"
    )

    iam_client.attach_role_policy(
        RoleName=role_name,
        PolicyArn="arn:aws:iam::aws:policy/AmazonSageMakerFullAccess"
    )
    print("Attached S3 full access and SageMaker full access")
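except iam_client.exceptions.EntityAlreadyExistsException:
    # The role already exists, so look it up and reuse its ARN
    print("Role already exists, trying to get existing role")
    response = iam_client.get_role(
        RoleName=role_name
    )
    # sm_role_arn is needed later by the training job and model cells
    sm_role_arn = response['Role']['Arn']
    print(f"Got existing role with ARN: {sm_role_arn}")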



In [None]:
iam_client = boto3.client('iam')
role_name = "SageMakerExecutionRole"
iam_client.get_role(RoleName=role_name)

----------------------------------
We will go through two examples: one for an image classification model and one for an image segmentation model.

## Classification Model

**Prepare your classification data:**
For this step, we'll follow the data preparation guidelines outlined in the [Amazon Lookout for Vision Developer Guide](https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/getting-started.html). This guide uses the cookie dataset.

a. Organize your images:

* Place your normal (non-defective) images in an S3 path named "normal".
* Place your anomalous (defective) images in an S3 path named "anomaly".

b. Create a manifest file: The manifest file is a JSON Lines file that lists your images and their classifications. Each line in the file represents one image and contains a JSON object with the following structure:

* "source-ref" is the S3 URI of the image
* "anomaly-label" is 0 for normal images and 1 for anomalous images
* "anomaly-label-metadata" contains metadata about the classification label
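
The manifest lines can be generated rather than written by hand. The following is a minimal sketch that uses the attribute names the training job below passes in `AttributeNames` (`source-ref`, `anomaly-label-metadata`, `anomaly-label`). The metadata fields mirror the example shown in the segmentation section of this notebook; the helper name, the bucket name, and the "normal" class name are assumptions for illustration.

```python
import json
from datetime import datetime, timezone


def classification_manifest_line(image_s3_uri, is_anomaly,
                                 job_name="labeling-job/classification-job"):
    """Build one JSON Lines record for a classification manifest."""
    record = {
        "source-ref": image_s3_uri,
        "anomaly-label": 1 if is_anomaly else 0,
        "anomaly-label-metadata": {
            # Assumption: "normal" is the class name used for non-defective images.
            "class-name": "anomaly" if is_anomaly else "normal",
            "creation-date": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f"),
            "human-annotated": "yes",
            "job-name": job_name,
            "type": "groundtruth/image-classification",
            "confidence": 1,
        },
    }
    # json.dumps emits a single line, which is what JSON Lines requires.
    return json.dumps(record)


# Hypothetical image locations, for illustration only.
lines = [
    classification_manifest_line("s3://example-bucket/normal/normal-1.jpg", False),
    classification_manifest_line("s3://example-bucket/anomaly/anomaly-1.jpg", True),
]
manifest_text = "\n".join(lines) + "\n"
print(manifest_text)
```

Writing `manifest_text` to `train_class.manifest` would produce a file in the format described above.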

In [None]:
!cat train_class.manifest

Upload the manifest file to your preferred S3 path. Change `bucket_name` and `object_key` to the location where you would like to store your manifest file.

In [None]:
bucket_name = "<Specify S3 bucket name>"
object_key = "<Specify S3 object key>/train_class.manifest"
s3 = boto3.client('s3')
s3.upload_file('train_class.manifest', bucket_name, object_key)
classification_s3_path = f"s3://{bucket_name}/{object_key}" 
print(classification_s3_path)
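
A malformed manifest line tends to surface only after the training job has started, so a local check before running the upload cell can save time. The sketch below is a hypothetical helper, not part of the algorithm: it verifies that every non-blank line parses as a JSON object and carries the attribute names that the classification job expects.

```python
import json

# Attribute names that the classification training job reads from each line.
REQUIRED_KEYS = ("source-ref", "anomaly-label-metadata", "anomaly-label")


def find_manifest_problems(manifest_text, required_keys=REQUIRED_KEYS):
    """Return (line_number, message) tuples; an empty list means no problems were found."""
    problems = []
    for line_number, line in enumerate(manifest_text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are ignored by this check
        try:
            record = json.loads(line)
        except json.JSONDecodeError as error:
            problems.append((line_number, f"invalid JSON: {error.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((line_number, "not a JSON object"))
            continue
        missing = [key for key in required_keys if key not in record]
        if missing:
            problems.append((line_number, "missing keys: " + ", ".join(missing)))
    return problems


# Hypothetical manifest content: the second line lacks the label attributes.
sample_manifest = (
    '{"source-ref": "s3://example-bucket/normal/normal-1.jpg", '
    '"anomaly-label": 0, "anomaly-label-metadata": {}}\n'
    '{"source-ref": "s3://example-bucket/anomaly/anomaly-1.jpg"}\n'
)
problems = find_manifest_problems(sample_manifest)
print(problems)
```

For the real file, the text could be read with `open('train_class.manifest').read()` and passed to the same function.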

**Create SageMaker training job:**
Now that we have our data prepared and uploaded to S3, we can create and start the training job using the LFV published SageMaker algorithm.

In [None]:
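import datetime
# Note: from here on, the name `sagemaker` refers to a boto3 SageMaker client,
# not to the SageMaker Python SDK module imported earlier
sagemaker = boto3.Session(region_name=region).client("sagemaker")
classification_training_job_name = 'defect-detection-classification-'+datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S')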

Create the SageMaker training job using the subscribed algorithm.

In [None]:
response = sagemaker.create_training_job(
    TrainingJobName=classification_training_job_name,
    HyperParameters={
        'ModelType': 'classification',
        'TestInputDataAttributeNames': 'source-ref,anomaly-label-metadata,anomaly-label',
        'TrainingInputDataAttributeNames': 'source-ref,anomaly-label-metadata,anomaly-label'
    },
    AlgorithmSpecification={
        'AlgorithmName': algorithm_name,
        'TrainingInputMode': 'File',
        'EnableSageMakerMetricsTimeSeries': False
    },
    RoleArn=sm_role_arn,
    InputDataConfig=[
        {
            'ChannelName': 'training',
            'DataSource': {
                'S3DataSource': {
                    'S3DataType': 'AugmentedManifestFile',
                    'S3Uri': classification_s3_path,
                    'S3DataDistributionType': 'ShardedByS3Key',
                    'AttributeNames': [
                        'source-ref',
                        'anomaly-label-metadata',
                        'anomaly-label'
                    ],
                }
            },
            'CompressionType': 'None',
            'RecordWrapperType': 'RecordIO',
            'InputMode': 'Pipe'
        },
    ],
    OutputDataConfig={'S3OutputPath': 's3://'+bucket+'/'+project+'/output'},
    ResourceConfig={
        'InstanceType': 'ml.g4dn.2xlarge',
        'InstanceCount': 1,
        'VolumeSizeInGB': 20
    },
    EnableNetworkIsolation=True,
    StoppingCondition={
        'MaxRuntimeInSeconds': 7200
    }
)
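
The request above spells out the same attribute names three times: twice as comma-separated hyperparameter strings and once as the channel's `AttributeNames` list. One way to keep them in sync is to derive the strings from a single list. This is a minimal sketch; the helper name `attribute_hyperparameters` is hypothetical.

```python
# Single source of truth for the manifest attributes used by the classification job.
CLASSIFICATION_ATTRIBUTES = ["source-ref", "anomaly-label-metadata", "anomaly-label"]


def attribute_hyperparameters(model_type, attribute_names):
    """Build the HyperParameters dict from one list of attribute names."""
    joined = ",".join(attribute_names)
    return {
        "ModelType": model_type,
        "TestInputDataAttributeNames": joined,
        "TrainingInputDataAttributeNames": joined,
    }


hyperparameters = attribute_hyperparameters("classification", CLASSIFICATION_ATTRIBUTES)
print(hyperparameters)
```

The same list can then be passed as `AttributeNames` in `InputDataConfig`, and a second list with the two mask attributes appended would cover the segmentation job.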

Wait for the training job to complete.

In [None]:
import time
while True:
    training_response = sagemaker.describe_training_job(
        TrainingJobName=classification_training_job_name
    )
    status = training_response['TrainingJobStatus']
    if status == 'InProgress':
        print(".", end='')
    elif status == 'Completed':
        print("Completed")
        break
    elif status in ('Failed', 'Stopped'):
        # Both are terminal states; without this check a stopped job would loop forever
        print(status, training_response.get('FailureReason', ''))
        break
    else:
        print("?", end='')
    time.sleep(60)
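
This notebook polls in the same way for the classification job, the segmentation job, and the batch transform job, so the loop could be factored into one helper. The following is a sketch under stated assumptions: `wait_for_job` is a hypothetical function, and the `describe` callable and `sleep` function are injected so that the logic can be exercised without calling AWS.

```python
import time

# Statuses after which the job will not change any further.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_job(describe, status_key, poll_seconds=60, max_polls=240, sleep=time.sleep):
    """Call describe() until it reports a terminal status, then return that status."""
    for _ in range(max_polls):
        status = describe()[status_key]
        if status in TERMINAL_STATUSES:
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"Job did not finish within {max_polls} polls")


# Stand-in for describe_training_job, so the sketch runs offline.
fake_responses = iter([
    {"TrainingJobStatus": "InProgress"},
    {"TrainingJobStatus": "InProgress"},
    {"TrainingJobStatus": "Completed"},
])
final_status = wait_for_job(
    lambda: next(fake_responses),
    "TrainingJobStatus",
    sleep=lambda seconds: None,  # skip the real delay in this demonstration
)
print(final_status)
```

With the client from this notebook, `describe` could be `lambda: sagemaker.describe_training_job(TrainingJobName=classification_training_job_name)`. boto3 also provides a `training_job_completed_or_stopped` waiter for the SageMaker client, which may be an alternative to hand-written polling.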

******************

## Segmentation Model

**Prepare your segmentation data:**
For this step, we'll follow the data preparation guidelines outlined in the [Amazon Lookout for Vision Developer Guide](https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/getting-started.html). This guide uses the cookie dataset.

For image segmentation tasks, the data preparation process is slightly different from classification. Here's how to prepare your data for segmentation:

  1. Place your original images in an S3 path, e.g. "images".
  1. Place the corresponding segmentation masks in an S3 path, e.g. "masks".
  1. Create a manifest file: The manifest file is a JSON Lines file that lists your images and their classifications. Each line in the file represents one image and contains a JSON object with the following structure:
  
   * "source-ref" is the S3 URI of the image
   * "anomaly-label" is 0 for normal images and 1 for anomalous images
   * "anomaly-mask-ref" is the S3 URI of the corresponding segmentation mask (if applicable)

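The following JSON shows an image with segmentation and classification information, as described in the [Lookout for Vision segmentation manifest documentation](https://docs.aws.amazon.com/lookout-for-vision/latest/developer-guide/manifest-file-segmentation.html). `anomaly-label-metadata` contains classification information. `anomaly-mask-ref` and `anomaly-mask-ref-metadata` contain segmentation information. The example is formatted across multiple lines for readability; in the manifest file, each JSON object must occupy a single line.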

```
{
    "source-ref": "s3://path-to-image",
    "anomaly-label": 1,
    "anomaly-label-metadata": {
        "class-name": "anomaly",
        "creation-date": "2021-10-12T14:16:45.668",
        "human-annotated": "yes",
        "job-name": "labeling-job/classification-job",
        "type": "groundtruth/image-classification",
        "confidence": 1
    },
    "anomaly-mask-ref": "s3://path-to-image",
    "anomaly-mask-ref-metadata": {
        "internal-color-map": {
            "0": {
                "class-name": "BACKGROUND",
                "hex-color": "#ffffff",
                "confidence": 0.0
            },
            "1": {
                "class-name": "scratch",
                "hex-color": "#2ca02c",
                "confidence": 0.0
            },
            "2": {
                "class-name": "dent",
                "hex-color": "#1f77b4",
                "confidence": 0.0
            }
        },
        "type": "groundtruth/semantic-segmentation",
        "human-annotated": "yes",
        "creation-date": "2021-11-23T20:31:57.758889",
        "job-name": "labeling-job/segmentation-job"
    }
}
```
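
A record with this structure can be assembled in code and serialized to the single line that the manifest needs. The sketch below is illustrative only: the helper name, the S3 URIs, and the fixed timestamp are hypothetical, while the field names and the color map follow the example above.

```python
import json


def segmentation_manifest_line(image_s3_uri, mask_s3_uri, color_map, creation_date):
    """Build one JSON Lines record for an anomalous image with a segmentation mask.

    color_map is an ordered list of (class_name, hex_color) pairs; index 0 is the background.
    """
    record = {
        "source-ref": image_s3_uri,
        "anomaly-label": 1,
        "anomaly-label-metadata": {
            "class-name": "anomaly",
            "creation-date": creation_date,
            "human-annotated": "yes",
            "job-name": "labeling-job/classification-job",
            "type": "groundtruth/image-classification",
            "confidence": 1,
        },
        "anomaly-mask-ref": mask_s3_uri,
        "anomaly-mask-ref-metadata": {
            "internal-color-map": {
                str(index): {"class-name": name, "hex-color": hex_color, "confidence": 0.0}
                for index, (name, hex_color) in enumerate(color_map)
            },
            "type": "groundtruth/semantic-segmentation",
            "human-annotated": "yes",
            "creation-date": creation_date,
            "job-name": "labeling-job/segmentation-job",
        },
    }
    return json.dumps(record)


# Hypothetical locations; the color map matches the example above.
segmentation_line = segmentation_manifest_line(
    "s3://example-bucket/images/anomaly-1.jpg",
    "s3://example-bucket/masks/anomaly-1.png",
    [("BACKGROUND", "#ffffff"), ("scratch", "#2ca02c"), ("dent", "#1f77b4")],
    creation_date="2021-11-23T20:31:57.758889",
)
print(segmentation_line)
```

Each call yields one line of `train_segmentation.manifest`; joining the lines with newline characters gives the complete file.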

In [None]:
!cat train_segmentation.manifest

Upload the manifest file to your preferred S3 path. Change `bucket_name` and `seg_manifest_object_key` to the location where you would like to store your manifest file.

In [None]:
bucket_name = "<Specify S3 bucket name>"
seg_manifest_object_key = "<Specify S3 object key>/train_segmentation.manifest"
s3 = boto3.client('s3')
s3.upload_file('train_segmentation.manifest', bucket_name, seg_manifest_object_key)
segmentation_s3_path = f"s3://{bucket_name}/{seg_manifest_object_key}" 
print(segmentation_s3_path)

**Create SageMaker training job:**

Start the training job for the segmentation model.

In [None]:
segmentation_training_job_name = 'defect-detection-segmentation-'+datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S')

In [None]:
sagemaker = boto3.Session(region_name=region).client("sagemaker")
response = sagemaker.create_training_job(
    TrainingJobName=segmentation_training_job_name,
    HyperParameters={
        'ModelType': 'segmentation',
        'TestInputDataAttributeNames': 'source-ref,anomaly-label-metadata,anomaly-label,anomaly-mask-ref-metadata,anomaly-mask-ref',
        'TrainingInputDataAttributeNames': 'source-ref,anomaly-label-metadata,anomaly-label,anomaly-mask-ref-metadata,anomaly-mask-ref'
    },
    AlgorithmSpecification={
        'AlgorithmName': algorithm_name,
        'TrainingInputMode': 'File',
        'EnableSageMakerMetricsTimeSeries': False
    },
    RoleArn=sm_role_arn,
    InputDataConfig=[
        {
            'ChannelName': 'training',
            'DataSource': {
                'S3DataSource': {
                    'S3DataType': 'AugmentedManifestFile',
                    'S3Uri': segmentation_s3_path,
                    'S3DataDistributionType': 'ShardedByS3Key',
                    'AttributeNames': [
                        'source-ref',
                        'anomaly-label-metadata',
                        'anomaly-label',
                        'anomaly-mask-ref-metadata',
                        'anomaly-mask-ref'
                    ],
                }
            },
            'CompressionType': 'None',
            'RecordWrapperType': 'RecordIO',
            'InputMode': 'Pipe'
        },
    ],
    OutputDataConfig={'S3OutputPath': 's3://'+bucket+'/'+project+'/output'},
    EnableNetworkIsolation=True,
    ResourceConfig={
        'InstanceType': 'ml.g4dn.2xlarge',
        'InstanceCount': 1,
        'VolumeSizeInGB': 20
    },
    StoppingCondition={
        'MaxRuntimeInSeconds': 7200
    }
)
print(response)

Wait for the training job to complete.

In [None]:
while True:
    training_response = sagemaker.describe_training_job(
        TrainingJobName=segmentation_training_job_name
    )
    status = training_response['TrainingJobStatus']
    if status == 'InProgress':
        print(".", end='')
    elif status == 'Completed':
        print("Completed")
        break
    elif status in ('Failed', 'Stopped'):
        # Both are terminal states; without this check a stopped job would loop forever
        print(status, training_response.get('FailureReason', ''))
        break
    else:
        print("?", end='')
    time.sleep(60)

***********

### (Optional) Run Inference on SageMaker Batch Transform Job

We will use the classification model trained above to run inference with a SageMaker batch transform job. First, we create a SageMaker Model Package from the completed training job.

In [None]:
training_job_info = sagemaker.describe_training_job(TrainingJobName=classification_training_job_name)
model_artifact = training_job_info['ModelArtifacts']['S3ModelArtifacts']
algorithm_name = training_job_info['AlgorithmSpecification']['AlgorithmName']
print(model_artifact)
print(algorithm_name)

In [None]:
# Step 1: Create SageMaker Model Package
create_model_pkg_response = sagemaker.create_model_package(
    ModelPackageName=f"{classification_training_job_name}-package",  # You can customize this name
    SourceAlgorithmSpecification={
        'SourceAlgorithms': [
            {
                'AlgorithmName': algorithm_name,
                'ModelDataUrl': model_artifact
            },
        ]
    }
)

print(f"SageMaker Model package created: {create_model_pkg_response['ModelPackageArn']}")

Now that we have a model package, we create a SageMaker Model. You can use it to run inference with a batch transform job or host it on an endpoint, depending on your needs.

Batch Transform Jobs and SageMaker Endpoints are both used for inference in SageMaker but serve different purposes. Batch Transform is designed for offline inference, making predictions on large datasets stored in S3, and is ideal for bulk processing where low latency is not critical. Once the job is completed, the resources are released, making it cost-effective for sporadic workloads. SageMaker Endpoints are used for real-time inference, providing low-latency predictions suitable for applications requiring immediate responses. Endpoints remain active while deployed, making them better suited for continuous and steady traffic but potentially more costly due to ongoing resource usage.

We will use a batch transform job as the example in this notebook.

In [None]:
# Step 2: Now you can use this model package ARN to create your model
model_package_arn = create_model_pkg_response['ModelPackageArn']
create_model_response = sagemaker.create_model(
    ModelName=f"{classification_training_job_name}-model",  # You can customize this name
    ExecutionRoleArn=sm_role_arn,
    PrimaryContainer={
        'ModelPackageName': model_package_arn
    },
    EnableNetworkIsolation=True, # EnableNetworkIsolation must be true for using product from AWS Marketplace.
)

print(f"Created model: {create_model_response['ModelArn']}")

In [None]:
# Step 3: Create SageMaker batch transform job
batch_job_name = "defect-detection-class-"+datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S')

#############################################
# Change to your input/output data S3 path  #
#############################################
s3_input_data = "s3://<Specify-s3-path-to-test-images>"
s3_output_path = "s3://<Specify-s3-path-to-store-transform-output>"

batch_transform_response = sagemaker.create_transform_job(
    TransformJobName=batch_job_name,
    ModelName=create_model_response['ModelArn'].split("/")[-1],
    MaxConcurrentTransforms=1,  # Adjust based on your workload
    TransformInput={
        'DataSource': {
            'S3DataSource': {
                'S3DataType': 'S3Prefix',
                'S3Uri': s3_input_data
            }
        },
        'ContentType': 'image/jpeg',
    },
    TransformOutput={
        'S3OutputPath': s3_output_path
    },
    TransformResources={
        'InstanceType': 'ml.c5.2xlarge',
        'InstanceCount': 1
    }
)

In [None]:
# Waiting for batch transform job to complete
while True:
    batch_response = sagemaker.describe_transform_job(
        TransformJobName=batch_job_name
    )
    status = batch_response['TransformJobStatus']
    if status == 'InProgress':
        print(".", end='')
    elif status == 'Completed':
        print("Completed")
        break
    elif status in ('Failed', 'Stopped'):
        # Both are terminal states; without this check a stopped job would loop forever
        print(status, batch_response.get('FailureReason', ''))
        break
    else:
        print("?", end='')
    time.sleep(60)

After the batch transform job completes successfully, go to the S3 output path specified in the job. It contains one output file per input image, named `{image-file-name}.out`. For example, we used `anomaly-1.jpg`, so `s3_output_path` contains a file called `anomaly-1.jpg.out`. Below is the content of that file:

```
{"Source": {"Type": "direct"}, "IsAnomalous": true, "Confidence": 0.9378743361326908}
```

The image is therefore classified as anomalous, with a confidence score of 0.9378743361326908.
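
Downstream code usually needs this result as a label rather than raw JSON. The following is a minimal sketch that parses the output shown above; the helper name and the `min_confidence` review threshold are hypothetical and should be tuned to your own quality requirements.

```python
import json


def summarize_prediction(out_file_text, min_confidence=0.5):
    """Turn the content of one .out file into a small summary dict."""
    result = json.loads(out_file_text)
    confidence = result["Confidence"]
    return {
        "label": "anomaly" if result["IsAnomalous"] else "normal",
        "confidence": confidence,
        # Assumption: predictions below the threshold are flagged for manual review.
        "needs_review": confidence < min_confidence,
    }


# The output content shown above.
sample_output = '{"Source": {"Type": "direct"}, "IsAnomalous": true, "Confidence": 0.9378743361326908}'
summary = summarize_prediction(sample_output)
print(summary)
```

The file content itself can be fetched from `s3_output_path` with the boto3 S3 client before it is passed to this function.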