# Anomaly Detection for sensor data


## Usage instructions
You can run this notebook one cell at a time by pressing Shift+Enter in each cell.

### A. Subscribe to the algorithm
To subscribe to the algorithm:

1. Open the algorithm listing page.
2. On the AWS Marketplace listing, click the **Continue to subscribe** button.
3. On the **Subscribe to this software** page, review the EULA, pricing, and support terms, and click **Accept Offer** if you agree with them.
4. Click the **Continue to configuration** button and choose a region. You will then see a **Product ARN**. This is the **algorithm ARN** that you need to specify when training a custom ML model. Copy the ARN for your region and assign it to `algo_arn` in the training cell in section C.

### B. Import libraries and set up the environment

In [1]:
import base64
import boto3
import docker
import json
import pandas as pd
import requests
import sagemaker
from sagemaker import get_execution_role
import socket
import time
from urllib.parse import urlparse
from joblib import dump, load
import matplotlib.pyplot as plt
import numpy as np


session = sagemaker.Session()
region = session.boto_region_name
account_id = boto3.client("sts").get_caller_identity().get("Account")
role = get_execution_role()

sagemaker_client = boto3.client("sagemaker")
ecr = boto3.client("ecr")
# Client used below to invoke the deployed endpoint.
runtime_client = boto3.client("sagemaker-runtime")
# Client used below to read the test payload and to save the results.
s3 = boto3.client("s3")


sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/sagemaker-user/.config/sagemaker/config.yaml


### C. Train a machine learning algorithm

Now that the dataset is available in an accessible Amazon S3 bucket, we are ready to train a machine learning model.

In [2]:
training_data = "s3://<bucket-name>/<key>/<filename.json>" ## to be modified by the user

#### 1. Train a model or create a training job

In [3]:
training_job_name = 'training-job-name' ## to be modified by the user

algo_arn = 'arn:aws:sagemaker:<region>:...' ## to be modified by the user based on the product ARN

estimator = sagemaker.algorithm.AlgorithmEstimator(
    algorithm_arn=algo_arn,
    base_job_name=training_job_name,
    role=role,
    instance_count=1,
    instance_type="ml.m5.large",
    input_mode="File",
    sagemaker_session=session
)

estimator.fit({"train": training_data})

INFO:sagemaker:Creating training-job with name: test-trainjob-2024-08-30-04-25-45-466


2024-08-30 04:25:45 Starting - Starting the training job...
2024-08-30 04:26:05 Starting - Preparing the instances for training...
2024-08-30 04:26:30 Downloading - Downloading input data...
2024-08-30 04:26:50 Downloading - Downloading the training image.....
sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/jovyan/.config/sagemaker/config.yaml
['/opt/ml/input/data/train/NONANOAMLY_FEATURE_train.json']
Mode: train
Running training...

2024-08-30 04:28:13 Training - Training image download completed. Training in progress.
2024-08-30 04:28:13 Uploading - Uploading generated training model
2024-08-30 04:28:13 Completed - Training job completed
Training seconds: 102
Billable seconds: 102


### D. Deploy model and verify results

You can now deploy the trained model to perform real-time inference.

In [4]:
# The endpoint exchanges JSON, so use the JSON serializer and deserializer
# (both default to the "application/json" content type).
serializer = sagemaker.serializers.JSONSerializer()
deserializer = sagemaker.deserializers.JSONDeserializer()
instance_type = "<instance-type>" ## to be modified by the user based on the available instance types

#### 1. Deploy trained model

This code block creates the model package, then the model, and finally the endpoint, all based on the training job created above.

In [5]:
predictor = estimator.deploy(
    initial_instance_count=1,
    instance_type=instance_type,
    serializer=serializer,
    deserializer=deserializer,
    model_name="test-model",
    endpoint_name="test-endpoint",
)

INFO:sagemaker:Creating model package with name: test-model


.........

INFO:sagemaker:Creating model with name: test-model





INFO:sagemaker:Creating endpoint-config with name test-endpoint
INFO:sagemaker:Creating endpoint with name test-endpoint


-------!

### E. Create input payload
The inference algorithm takes a JSON file containing the sensor data as input. Each column of the JSON file represents one sensor, and each row represents one time step.
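The column and row layout described above can be illustrated with a small sketch. The sensor names, the readings, and the choice of a list of per-time-step records are assumptions made for illustration only; the exact JSON layout the algorithm expects is not specified here, so check it against your own training file before relying on this.

```python
import json

# Hypothetical sensor names and readings, for illustration only.
readings = {
    "sensor_1": [0.12, 0.15, 0.11],
    "sensor_2": [20.1, 20.4, 20.3],
}


def to_rows(columns):
    """Convert {sensor: [value per time step]} into one record per time step."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all sensors must have the same number of time steps")
    names = list(columns)
    # zip(*...) walks the sensors in parallel, yielding one tuple per time step.
    return [dict(zip(names, step)) for step in zip(*columns.values())]


rows = to_rows(readings)      # one dict per time step, one key per sensor
payload = json.dumps(rows)    # JSON string in the assumed row-oriented layout
```

Checking that every sensor has the same number of time steps before serializing helps catch misaligned data before it reaches the endpoint.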


In [None]:
s3_uri = "s3://<bucket-name>/<key>/<test-filename.json>" ## to be modified by the user
bucket, key = s3_uri[len("s3://"):].split('/', 1)

obj = s3.get_object(Bucket=bucket, Key=key)
data = obj['Body'].read().decode('utf-8')

input_data = json.loads(data)
input_data_json = json.dumps(input_data)
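
The cell above splits the S3 URI by slicing the string. As an alternative, the split could be done with `urlparse`, which the notebook already imports, plus a basic validity check. This is a minimal sketch; `split_s3_uri` is a hypothetical helper name, not part of boto3 or the SageMaker SDK.

```python
from urllib.parse import urlparse


def split_s3_uri(s3_uri):
    """Split 's3://bucket/key' into (bucket, key), raising on malformed input.

    Caveat: urlparse treats '?' and '#' as query and fragment markers, so
    object keys containing those characters are not handled by this sketch.
    """
    parsed = urlparse(s3_uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {s3_uri!r}")
    key = parsed.path.lstrip("/")
    if not key:
        raise ValueError(f"S3 URI has no object key: {s3_uri!r}")
    return parsed.netloc, key


# Hypothetical URI, for illustration only.
bucket, key = split_s3_uri("s3://my-bucket/data/test.json")
```

Failing early on a malformed URI gives a clearer error than the `get_object` call would.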

#### 1. Perform real-time inference

In [6]:
endpoint_name = "<endpoint-name>" ## to be modified by the user

response = runtime_client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType='application/json',
    Body=input_data_json
)
result = json.loads(response['Body'].read().decode())
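
Before saving anything, it can help to look at what the endpoint returned. The sketch below assumes that `result` is a dictionary keyed by model name, as the file names saved in the next step suggest; the structure of each value is not documented here, so the helper only reports its type and size. `summarize_result` and the example data are hypothetical.

```python
def summarize_result(result):
    """Return {name: (type name, size or None)} for each entry in the response."""
    summary = {}
    for name, value in result.items():
        # Only sized containers get a length; numbers, booleans, and None do not.
        size = len(value) if isinstance(value, (list, dict, str)) else None
        summary[name] = (type(value).__name__, size)
    return summary


# Hypothetical response, for illustration only.
example = {"model_1": [0, 1, 0], "model_2": {"scores": [0.1, 0.9]}, "model_3": 0.5}
overview = summarize_result(example)
```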

#### 2. Save results to S3
Each entry of the results is saved to the S3 bucket under its model name, e.g., `model_1.json`, `model_2.json`, and so on.

In [7]:
def save_results_to_s3(results, bucket_name, base_key):
    """Upload each entry of `results` to S3 as `<base_key>/<entry name>.json`."""
    for key, value in results.items():
        json_data = json.dumps(value)
        s3_key = f"{base_key}/{key}.json"
        try:
            s3.put_object(Bucket=bucket_name, Key=s3_key, Body=json_data, ContentType='application/json')
            print(f"Successfully uploaded {s3_key} to S3.")
        except Exception as e:
            # Report the failure and continue with the remaining entries.
            print(f"Error uploading {s3_key} to S3: {e}")


bucket_name = "<bucket-name>" ## to be modified by the user
base_key = "<key>" ## to be modified by the user
save_results_to_s3(result, bucket_name, base_key)

Successfully uploaded output/model_1.json to S3.
Successfully uploaded output/model_2.json to S3.
Successfully uploaded output/model_3.json to S3.
Successfully uploaded output/model_4.json to S3.
Successfully uploaded output/model_5.json to S3.
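
To preview the object keys before uploading, the key construction can be separated from the upload itself. This is a minimal sketch; `planned_s3_keys` is a hypothetical helper that mirrors the `f"{base_key}/{key}.json"` pattern used above and additionally strips stray slashes from the prefix, which the upload function does not do.

```python
def planned_s3_keys(results, base_key):
    """List the S3 keys that would be written for `results` under `base_key`."""
    prefix = base_key.strip("/")  # avoid keys such as 'output//model_1.json'
    return [f"{prefix}/{name}.json" if prefix else f"{name}.json" for name in results]


# Hypothetical results, for illustration only.
keys = planned_s3_keys({"model_1": [], "model_2": []}, "output/")
```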
