## 4. ML Implementation & Operations

### Part-One Build ML solutions for performance, availability, scalability, resiliency, and fault tolerance

**Important Topics**
* AWS Deep Learning Containers
* AWS Deep Learning AMI (Amazon Machine Image)
* AWS Auto Scaling
* AWS GPU (P2 and P3) and CPU instances
* Amazon CloudWatch
* AWS CloudTrail

**High availability and fault tolerance**<br>
In a highly available solution, the system will continue to function even when any component of the architecture stops working. A key aspect of high availability is fault tolerance, which, when built into an architecture, ensures that applications will continue to function without degradation in performance, despite the complete failure of any component of the architecture.

**Key AWS services that can be used to maintain high availability:**<br>
1. AWS Glue and Amazon EMR 
You should decouple your ETL process from the ML pipeline, because the compute that ML training needs is very different from what an ETL job needs. 
With a decoupled architecture, an ETL service such as AWS Glue or Amazon EMR (both of which run Apache Spark) handles your ETL jobs, while Amazon SageMaker trains, tests, and deploys your models.

2. Amazon SageMaker endpoints
To ensure a highly available ML serving endpoint, deploy Amazon SageMaker endpoints backed by multiple instances across Availability Zones. 

### a. Amazon CloudWatch
**Amazon CloudWatch** helps you monitor your system while storing all the logs and operational metrics separately from the actual implementation and code for training and testing your ML models.

**Amazon CloudWatch Events** delivers a near-real-time stream of system events that describe changes in AWS resources.

**Amazon CloudWatch alarms** let you watch CloudWatch metrics and receive notifications when the metrics fall outside of the levels (high or low thresholds) that you configure.
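
As a rough illustration of the alarm idea, the sketch below builds the arguments for a CloudWatch alarm on a SageMaker endpoint's `ModelLatency` metric. The endpoint name, variant name, threshold, and SNS topic ARN are hypothetical placeholders, and the evaluation window (three one-minute periods) is only an assumed starting point.

```python
def build_latency_alarm(endpoint_name, variant_name, threshold_ms, topic_arn):
    """Return keyword arguments for CloudWatch put_metric_alarm."""
    return {
        "AlarmName": f"{endpoint_name}-{variant_name}-high-latency",
        "Namespace": "AWS/SageMaker",
        "MetricName": "ModelLatency",  # CloudWatch reports this in microseconds
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "Statistic": "Average",
        "Period": 60,              # seconds per evaluation period (assumed)
        "EvaluationPeriods": 3,    # alarm after 3 consecutive breaches (assumed)
        "Threshold": threshold_ms * 1000,  # convert ms to microseconds
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [topic_arn],  # hypothetical SNS topic to notify
    }


def create_alarm(params):
    """Create the alarm; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder runs without the AWS SDK

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Keeping the parameter builder separate from the API call makes it possible to review or unit-test the alarm definition without touching an AWS account.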


Amazon SageMaker provides out-of-the-box integration with Amazon CloudWatch, which collects near-real-time utilization metrics for the training job instance, such as CPU, memory, and GPU utilization of the training job container.

### b. AWS CloudTrail
**AWS CloudTrail** captures API calls and related events made by or on behalf of your AWS account and delivers the log files to an Amazon S3 bucket that you specify. You can identify which users and accounts called AWS, the source IP address from which the calls were made, and when the calls occurred.
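
To make the "who, from where, and when" idea concrete, here is a minimal sketch that pulls those fields out of CloudTrail events. It assumes you query the event history with `lookup_events` rather than reading the log files from S3, and that you filter on the SageMaker event source; the function names are illustrative.

```python
import json


def summarize_event(event):
    """Extract caller, source IP, and time from one lookup_events item."""
    # CloudTrailEvent holds the full record as a JSON string.
    detail = json.loads(event["CloudTrailEvent"])
    return {
        "event_name": detail.get("eventName"),
        "caller_arn": detail.get("userIdentity", {}).get("arn"),
        "source_ip": detail.get("sourceIPAddress"),
        "event_time": detail.get("eventTime"),
    }


def recent_sagemaker_calls(max_results=20):
    """Fetch recent SageMaker API calls; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so summarize_event runs without the AWS SDK

    response = boto3.client("cloudtrail").lookup_events(
        LookupAttributes=[
            {"AttributeKey": "EventSource", "AttributeValue": "sagemaker.amazonaws.com"}
        ],
        MaxResults=max_results,
    )
    return [summarize_event(e) for e in response["Events"]]
```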

### c. AWS Auto Scaling
With AWS Auto Scaling, you configure and manage scaling for your resources through a scaling plan. The scaling plan uses dynamic scaling and predictive scaling to automatically scale your application’s resources. The scaling plan lets you choose scaling strategies to define how to optimize your resource utilization. You can optimize for availability, for cost, or a balance of both. 

To determine the scaling policy for automatic scaling in Amazon SageMaker, load test the endpoint to find how many requests per second (RPS) it can sustain. Then configure automatic scaling and observe how the model behaves when it scales out. Expected behavior is lower latency and fewer or no errors with automatic scaling.
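
One way to turn a load-test result into a scaling policy is sketched below. It assumes a target-tracking policy on the predefined `SageMakerVariantInvocationsPerInstance` metric, with the target set to the peak RPS of one instance times a safety factor times 60 (the metric is per minute). The endpoint name, variant name, capacity limits, cooldowns, and safety factor of 0.5 are placeholder assumptions.

```python
def invocations_per_instance_target(max_rps, safety_factor=0.5):
    """Per-minute target derived from the load-tested peak RPS of one instance."""
    return max_rps * safety_factor * 60


def build_scaling_requests(endpoint_name, variant_name, max_rps,
                           min_instances=2, max_instances=6):
    """Return kwargs for register_scalable_target and put_scaling_policy."""
    resource_id = f"endpoint/{endpoint_name}/variant/{variant_name}"
    dimension = "sagemaker:variant:DesiredInstanceCount"
    target = {
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }
    policy = {
        "PolicyName": f"{endpoint_name}-invocations-target",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": resource_id,
        "ScalableDimension": dimension,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": invocations_per_instance_target(max_rps),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            "ScaleInCooldown": 300,   # seconds (assumed)
            "ScaleOutCooldown": 60,   # seconds (assumed)
        },
    }
    return target, policy


def apply_scaling(target, policy):
    """Register the target and attach the policy; needs boto3 and credentials."""
    import boto3  # imported lazily so the builders run without the AWS SDK

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(**target)
    client.put_scaling_policy(**policy)
```

For example, an instance that sustained 20 RPS in testing would get a target of 600 invocations per instance per minute with the default safety factor.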

### Part-Two Recommend and implement the appropriate ML services and features for a given problem

**Important Topics**
* Amazon SageMaker Spark containers
* Amazon SageMaker build your own containers
* Amazon AI services
    * Amazon Translate
    * Amazon Lex
    * Amazon Polly
    * Amazon Transcribe
    * Amazon Rekognition
    * Amazon Comprehend

**Data Ingestion and Transformation Services**
* AWS Glue 
* Amazon EMR
* Amazon Kinesis

**Model Building, Training, Tuning, and Evaluation Services**
* Amazon SageMaker

**Amazon AI Services**<br>
* Rekognition - Vision, Image Analysis, Face Recognition, Image classification
* Polly - Speech, Text to speech, speech enabled products
* Lex - Used to build conversational applications for voice and text
* Translate - Delivers fast, high-quality, affordable, and customizable language translation.
* Transcribe - Automatically convert speech to text
* Comprehend - Derive and understand valuable insights from text within documents
* Forecast - A time-series forecasting service based on machine learning (ML) and built for business metrics analysis.
* Personalize - Quickly build and deploy curated recommendations and intelligent user segmentation at scale using machine learning (ML). 
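
The list above can be read as a lookup from problem type to service. The sketch below encodes it that way; the task labels are made up for illustration, and falling back to Amazon SageMaker for anything not covered by a managed AI service is an assumption, not an official rule.

```python
# Task labels are illustrative; the service mapping follows the list above.
SERVICE_BY_TASK = {
    "image analysis": "Amazon Rekognition",
    "text to speech": "Amazon Polly",
    "conversational interface": "Amazon Lex",
    "translation": "Amazon Translate",
    "speech to text": "Amazon Transcribe",
    "text insights": "Amazon Comprehend",
    "time-series forecasting": "Amazon Forecast",
    "recommendations": "Amazon Personalize",
}


def recommend_service(task):
    """Return the managed AI service for a task, else fall back to SageMaker."""
    return SERVICE_BY_TASK.get(task.strip().lower(), "Amazon SageMaker")
```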

### Part-Three Apply basic AWS security practices to ML solutions

**Important Topics**
* Security on Amazon SageMaker
* Infrastructure security on Amazon SageMaker
* What is a:
    * VPC
    * Security group
    * NAT gateway
    * Internet gateway
* AWS Key Management Service (AWS KMS)
* AWS Identity and Access Management (IAM)

There are two ways to encrypt data stored in Amazon S3:
1. Client side: you encrypt the data before uploading it to Amazon S3.
2. Server side: Amazon S3 encrypts the data at rest, using one of three key-management options:
    * **SSE-S3**: Amazon S3 manages both the data keys and the master encryption keys.
    * **SSE-C**: you supply and manage the encryption key.
    * **SSE-KMS**: AWS manages the data key, but you manage the customer master key in AWS KMS.
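
The three server-side options map onto different arguments of the S3 `put_object` call. The following is a minimal sketch of that mapping; the helper name, bucket, and key values are hypothetical, and real code would also need to handle key storage and rotation.

```python
def sse_put_args(bucket, key, body, mode, kms_key_id=None, customer_key=None):
    """Return put_object kwargs for the chosen server-side encryption mode."""
    args = {"Bucket": bucket, "Key": key, "Body": body}
    if mode == "SSE-S3":
        # Amazon S3 manages the keys.
        args["ServerSideEncryption"] = "AES256"
    elif mode == "SSE-KMS":
        # AWS KMS holds the key you manage; omit the ID to use the default key.
        args["ServerSideEncryption"] = "aws:kms"
        if kms_key_id is not None:
            args["SSEKMSKeyId"] = kms_key_id
    elif mode == "SSE-C":
        # You supply the key with every request.
        if customer_key is None:
            raise ValueError("SSE-C requires a customer-provided key")
        args["SSECustomerAlgorithm"] = "AES256"
        args["SSECustomerKey"] = customer_key
    else:
        raise ValueError(f"unknown encryption mode: {mode}")
    return args


def upload(args):
    """Upload the object; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder runs without the AWS SDK

    boto3.client("s3").put_object(**args)
```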


### Part-Four Deploy and operationalize ML solutions

**Important Topics**
* A/B testing with Amazon SageMaker
* Amazon SageMaker endpoints
    * Production variants
    * Endpoint configuration
* Using Lambda with Amazon SageMaker

**Deploying a model using Amazon SageMaker hosting services is a three-step process**<br>
1. Create a model in Amazon SageMaker - You need:
    * The Amazon S3 path where the model artifacts are stored 
    * The Docker registry path for the image that contains the inference code 
    * A name that you can use for subsequent deployment steps
<br>
2. Create an endpoint configuration for an HTTPS endpoint - You need:
    * The name of one or more models in production variants
    * The ML compute instances that you want Amazon SageMaker to launch to host each production variant. When hosting models in production, you can configure the endpoint to elastically scale the deployed ML compute instances. For each production variant, you specify the number of ML compute instances that you want to deploy. When you specify two or more instances, Amazon SageMaker launches them in multiple Availability Zones. This ensures continuous availability. Amazon SageMaker manages deploying the instances.
<br>
3. Create an HTTPS endpoint - You need:
    * To provide the endpoint configuration to Amazon SageMaker. The service launches the ML compute instances and deploys the model or models as specified in the configuration.
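
The three steps above can be sketched with the SageMaker API as follows. All names, the image URI, the S3 path, the instance type, and the IAM role ARN are placeholders; the execution role is an extra input the API accepts that is not listed above. An instance count of two is assumed so that the variant spans more than one Availability Zone.

```python
def build_deployment_requests(name, image_uri, model_data_url, role_arn,
                              instance_type="ml.m5.large", instance_count=2):
    """Return kwargs for create_model, create_endpoint_config, create_endpoint."""
    config_name = f"{name}-config"
    # Step 1: the model ties the inference image to the S3 model artifacts.
    model = {
        "ModelName": name,
        "PrimaryContainer": {"Image": image_uri, "ModelDataUrl": model_data_url},
        "ExecutionRoleArn": role_arn,
    }
    # Step 2: the endpoint configuration lists the production variants.
    endpoint_config = {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",
                "ModelName": name,
                "InstanceType": instance_type,
                "InitialInstanceCount": instance_count,
                "InitialVariantWeight": 1.0,
            }
        ],
    }
    # Step 3: the endpoint references the configuration by name.
    endpoint = {"EndpointName": f"{name}-endpoint", "EndpointConfigName": config_name}
    return model, endpoint_config, endpoint


def deploy(model, endpoint_config, endpoint):
    """Run the three calls in order; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder runs without the AWS SDK

    sm = boto3.client("sagemaker")
    sm.create_model(**model)
    sm.create_endpoint_config(**endpoint_config)
    sm.create_endpoint(**endpoint)
```

For A/B testing, a second entry in `ProductionVariants` with its own `ModelName` and `InitialVariantWeight` would split traffic between two models behind the same endpoint.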