


### [AWS SageMaker MLOps](https://docs.aws.amazon.com/sagemaker/latest/dg/mlops.html)

### SageMaker Steps

- Create a model in SageMaker Inference by pointing to model artifacts stored in Amazon S3 and a container image.
- Select an inference option. For more information, see Inference options.
- Create a SageMaker Inference endpoint configuration by choosing the instance type and number of instances you need behind the endpoint. You can use Amazon SageMaker Inference Recommender to get recommendations for instance types. For Serverless Inference, you only need to provide the memory configuration you need based on your model size.
- Create a SageMaker Inference endpoint.
- Invoke your endpoint to receive an inference as a response.
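The steps above can be sketched in Python. This is a minimal sketch, not a definitive implementation: it assumes `sm_client` and `runtime_client` behave like the Boto3 `sagemaker` and `sagemaker-runtime` clients, and the resource-name suffixes, the `AllTraffic` variant name, the default instance type, and the `text/csv` content type are illustrative assumptions.

```python
from typing import Any


def deploy_realtime_endpoint(sm_client: Any, name: str, image_uri: str,
                             model_data_url: str, role_arn: str,
                             instance_type: str = "ml.m5.large",
                             instance_count: int = 1) -> str:
    """Create a model, an endpoint configuration, and an endpoint; return the endpoint name."""
    # The model points at the S3 model artifacts and the container image.
    sm_client.create_model(
        ModelName=name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )

    # The endpoint configuration chooses the instance type and instance count.
    config_name = f"{name}-config"  # hypothetical naming scheme
    sm_client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InstanceType": instance_type,
            "InitialInstanceCount": instance_count,
        }],
    )

    # The endpoint is created from the configuration (creation is asynchronous).
    endpoint_name = f"{name}-endpoint"  # hypothetical naming scheme
    sm_client.create_endpoint(EndpointName=endpoint_name,
                              EndpointConfigName=config_name)
    return endpoint_name


def invoke_endpoint(runtime_client: Any, endpoint_name: str, payload: str) -> str:
    """Send a CSV payload to the endpoint and return the response body as text."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=payload,
    )
    return response["Body"].read().decode("utf-8")
```

With Boto3 installed and credentials configured, the clients would typically come from `boto3.client("sagemaker")` and `boto3.client("sagemaker-runtime")`. Because endpoint creation is asynchronous, wait until the endpoint is in service before invoking it.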


### SageMaker Experiments: track experiments for ML processes

### Workflows
    
1. Pipelines:
    * An Amazon SageMaker Model Building Pipelines pipeline is a series of interconnected steps defined using the Pipelines SDK.
    * You can also build a pipeline without the SDK by using the pipeline definition JSON schema.
    * The pipeline definition encodes the pipeline as a directed acyclic graph (DAG) that can be exported as a JSON definition.
    * The DAG describes the requirements for, and relationships between, each step of the pipeline. Its structure is determined by the data dependencies between steps.
    * These data dependencies are created when the properties of one step's output are passed as the input to another step.
    * A pipeline is composed of a name, parameters, and steps. Pipeline names must be unique within an (account, region) pair.

2. Pipeline steps
    - Step Types
    - Step Properties
    - Step Parallelism
    - Data Dependency Between Steps
    - Custom Dependency Between Steps
    - Use a Custom Image in a Step

3. Supported parameter types
    - `ParameterString`: a string parameter.
    - `ParameterInteger`: an integer parameter.
    - `ParameterFloat`: a float parameter.
    - `ParameterBoolean`: a Boolean parameter (Python `bool`).
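
To make the DAG idea from the Pipelines item concrete, here is a minimal sketch in plain Python. It does not use the Pipelines SDK. The step names (`Preprocess`, `Train`, `Evaluate`, `Register`) are hypothetical, and the standard-library `graphlib` module stands in for the scheduler to show how data dependencies alone determine a valid execution order.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each step maps to the set of steps whose outputs it consumes.
dependencies = {
    "Preprocess": set(),
    "Train": {"Preprocess"},              # trains on the preprocessed data
    "Evaluate": {"Train", "Preprocess"},  # needs the model and the test split
    "Register": {"Evaluate"},             # registers the model after evaluation
}


def execution_order(deps: dict) -> list:
    """Return one valid execution order; raises graphlib.CycleError if deps contain a cycle."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A step runs only after every step it depends on has finished. Steps with no dependency path between them could run in parallel.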

Note: As you use SageMaker Pipelines to create workflows and orchestrate your ML training steps, you might need to go through multiple experimentation phases. Instead of running the entire pipeline from start to finish, you might want to iterate over particular steps only. SageMaker Pipelines supports selective execution of pipeline steps to help you optimize your ML training. For more information, see
[Selective execution of pipeline steps](https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines-selective-ex.html).
      
      
### Deploy model for inference 
Amazon SageMaker offers the following four options to deploy models for inference.

- Real-time inference for inference workloads with real-time, interactive, low latency requirements.

- Batch transform for offline inference with large datasets.

- Asynchronous inference for near-real-time inference with large inputs that require longer preprocessing times.

- Serverless inference for inference workloads that have idle periods between traffic spurts.

[Feature matrix for comparison](https://docs.aws.amazon.com/sagemaker/latest/dg/model-deploy-feature-matrix.html)


### Register and Deploy Models with Model Registry
1) [Model group](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry-model-group.html): A Model Group contains a group of versioned models. Create a Model Group by using either the AWS SDK for Python (Boto3) or the Amazon SageMaker Studio console.

2) [Register model version](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry-version.html)

     You can register an Amazon SageMaker model by creating a model version that specifies the model group to which it belongs. A model version must include both the model artifacts (the trained weights of a model) and the inference code for the model.
3) [Deploy model from registry](https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry-deploy.html)
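
The first two registry steps can be sketched as follows. This is a minimal sketch, assuming `sm_client` behaves like the Boto3 `sagemaker` client. The description text, the `text/csv` content types, and the default approval status are illustrative assumptions.

```python
from typing import Any


def create_model_group(sm_client: Any, group_name: str) -> None:
    """Create a Model Group that will hold versioned models."""
    sm_client.create_model_package_group(
        ModelPackageGroupName=group_name,
        ModelPackageGroupDescription="Versioned models (illustrative description)",
    )


def register_model_version(sm_client: Any, group_name: str, image_uri: str,
                           model_data_url: str,
                           approval_status: str = "PendingManualApproval") -> str:
    """Register a model version in the group and return its ARN."""
    response = sm_client.create_model_package(
        ModelPackageGroupName=group_name,
        ModelApprovalStatus=approval_status,
        # A version needs both the inference container and the model artifacts.
        InferenceSpecification={
            "Containers": [{"Image": image_uri, "ModelDataUrl": model_data_url}],
            "SupportedContentTypes": ["text/csv"],
            "SupportedResponseMIMETypes": ["text/csv"],
        },
    )
    return response["ModelPackageArn"]
```

Keeping group creation separate from version registration lets you create the group once and register many versions into it.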


### Tracking and troubleshooting SageMaker

Lineage tracking in Studio is centered around a directed acyclic graph (DAG). The DAG represents the steps in a pipeline. From the DAG you can track the lineage from any step to any other step.   
   - [Lineage tracking steps](https://docs.aws.amazon.com/sagemaker/latest/dg/pipelines-lineage-tracking.html)
   - [Deployment troubleshooting](https://docs.aws.amazon.com/sagemaker/latest/dg/deploy-model-reference.html)
   
### [Project Template](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-projects-templates-sm.html) 

Amazon SageMaker provides project templates that create the infrastructure you need to create an MLOps solution for continuous integration and continuous deployment (CI/CD) of ML models. Use these templates to process data, extract features, train and test models, register the models in the SageMaker model registry, and deploy the models for inference. You can customize the seed code and the configuration files to suit your requirements.

   

### [MLOps Workload Orchestrator](https://docs.aws.amazon.com/solutions/latest/mlops-workload-orchestrator/solution-overview.html)

Deploy a robust pipeline that uses managed automation tools and machine learning (ML) services to simplify ML model development and production.
   
### [FAQs](https://docs.aws.amazon.com/sagemaker/latest/dg/mlopsfaq.html)

1. What is the `@step` decorator, and when should you use it?
2. How do you pass data between steps?
3. How do you troubleshoot an MLOps pipeline or identify why it failed?
4. How do you define a SageMaker pipeline and all of its features?
5. How do you track a pipeline?


### Use cases and getting started
1. [Automate Machine Learning Workflows steps](https://aws.amazon.com/tutorials/machine-learning-tutorial-mlops-automate-ml-workflows/)  

2. [SageMaker deployment tutorial using real-time inference (built and tested)](https://aws.amazon.com/tutorials/machine-learning-tutorial-deploy-model-to-real-time-inference-endpoint/)

3. [xgboost_customer_churn_studio](https://github.com/aws/amazon-sagemaker-examples/blob/main/aws_sagemaker_studio/getting_started/xgboost_customer_churn_studio.ipynb)

4. [fraud detection](https://github.com/aws/amazon-sagemaker-examples/tree/main/end_to_end/fraud_detection)

5. [SageMaker and GitLab](https://aws.amazon.com/blogs/machine-learning/build-mlops-workflows-with-amazon-sagemaker-projects-gitlab-and-gitlab-pipelines)

6. [GitLab and SageMaker](https://github.com/aws-samples/sagemaker-custom-project-templates/tree/main/mlops-template-gitlab)

7. [MLOps reference](https://towardsdatascience.com/a-practical-guide-to-mlops-in-aws-sagemaker-part-i-1d28003f565)

8. [NLP sentiment MLOps](https://github.com/aws/amazon-sagemaker-examples/tree/main/end_to_end/nlp_mlops_company_sentiment)
    1. Model: FinBERT
    2. Dataset: NewsCatcher API 

9. [Comparing model metrics with SageMaker Pipelines and SageMaker Model Registry](https://github.com/aws/amazon-sagemaker-examples/blob/main/sagemaker-pipeline-compare-model-versions/notebook.ipynb)

10. [Build machine learning workflows with Amazon SageMaker Processing and AWS Step Functions Data Science SDK](https://github.com/aws/amazon-sagemaker-examples/blob/main/step-functions-data-science-sdk/step_functions_mlworkflow_processing/step_functions_mlworkflow_scikit_learn_data_processing_and_model_evaluation.ipynb)


## Other components 

### S3 Storage classes 
- S3 Standard
- S3 Standard-Infrequent Access (IA)
- S3 One Zone-Infrequent Access (IA)
- S3 Glacier Instant Retrieval
- S3 Glacier Flexible Retrieval
- S3 Glacier Deep Archive


### Shadow tests 

With Amazon SageMaker you can evaluate any changes to your model serving infrastructure by comparing its performance against the currently deployed infrastructure.


### [CORS](https://docs.aws.amazon.com/AmazonS3/latest/userguide/cors.html)

### [Data Encryption](https://docs.aws.amazon.com/whitepapers/latest/logical-separation/encrypting-data-at-rest-and--in-transit.html)

### [Step Functions](https://aws.amazon.com/step-functions/)

AWS Step Functions provides serverless orchestration for modern applications. Orchestration centrally manages a workflow by breaking it into multiple steps, adding flow logic, and tracking the inputs and outputs between the steps.
Explore Step Functions use cases [here](https://aws.amazon.com/step-functions/use-cases/).







### [AWS Lambda](https://docs.aws.amazon.com/lambda/latest/dg/welcome.html)

- AWS Lambda is a compute service that lets you run code without provisioning or managing servers.

https://www.youtube.com/watch?v=EBSdyoO3goc

Four guidelines:
1. No servers to provision or maintain
2. Scales with usage
3. Pay for value
4. Built-in availability and fault tolerance




#### AWS Lambda

AWS Lambda is a serverless computing service provided by Amazon Web Services (AWS). Users of AWS Lambda create functions, which are self-contained applications written in one of the supported languages and runtimes, and upload them to AWS Lambda, which executes those functions in an efficient and flexible manner. This is explained in the [official documentation](https://docs.aws.amazon.com/lambda/latest/dg/welcome.html) and also defined [here](https://www.serverless.com/aws-lambda).
For the Python implementation, see the [Lambda documentation](https://docs.aws.amazon.com/lambda/latest/dg/welcome.html)
and the [Lambda handler](https://docs.aws.amazon.com/lambda/latest/dg/python-handler.html) reference.
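
A minimal Python handler might look like the sketch below. The `(event, context)` signature is what Lambda calls for Python functions. The `name` key in the event and the shape of the returned dictionary are illustrative assumptions, not something Lambda requires.

```python
import json


def lambda_handler(event, context):
    """Entry point that Lambda invokes; `event` carries the input, `context` the runtime info."""
    # "name" is a hypothetical field in the incoming event.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

The function can be exercised locally by calling `lambda_handler({"name": "SageMaker"}, None)` before it is uploaded.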


##### Calling a SageMaker endpoint using Amazon API Gateway and AWS Lambda
The following diagram shows how the deployed model is called using a serverless architecture. Starting from the client side, a client script calls an Amazon API Gateway API action and passes parameter values. API Gateway is the layer that exposes the API to the client. It also seals off the backend so that AWS Lambda runs in a protected private network. API Gateway passes the parameter values to the Lambda function. The Lambda function parses the values and sends them to the SageMaker model endpoint. The model performs the prediction and returns the predicted value to Lambda. The Lambda function parses the returned value and sends it back to API Gateway, which responds to the client with that value.

[Calling Sagemaker endpoint using AWS API gateway and Lambda](https://aws.amazon.com/blogs/machine-learning/call-an-amazon-sagemaker-model-endpoint-using-amazon-api-gateway-and-aws-lambda/)

![AWS serverless](./predictive_maintainence/images/serverless_architecture.jpg)
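
The Lambda part of this flow can be sketched as follows. This is a minimal sketch under stated assumptions: `runtime_client` behaves like the Boto3 `sagemaker-runtime` client, API Gateway delivers the request body as a JSON string under `event["body"]`, and the body has a hypothetical `data` field holding a CSV row. The `make_handler` factory is also a hypothetical helper, used so the handler can be tried without AWS access.

```python
import json
from typing import Any, Callable


def make_handler(runtime_client: Any, endpoint_name: str) -> Callable:
    """Build a Lambda handler bound to one SageMaker endpoint."""

    def lambda_handler(event, context):
        # Parse the parameter values passed through by API Gateway.
        body = json.loads(event["body"])
        payload = body["data"]  # hypothetical field: one CSV-formatted row

        # Send the values to the SageMaker model endpoint.
        response = runtime_client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="text/csv",
            Body=payload,
        )

        # Parse the returned value and hand it back to API Gateway.
        prediction = response["Body"].read().decode("utf-8").strip()
        return {
            "statusCode": 200,
            "body": json.dumps({"prediction": prediction}),
        }

    return lambda_handler
```

In a deployed function, the handler would typically be created once at module load, for example from `boto3.client("sagemaker-runtime")` and an endpoint name read from an environment variable.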



##### Lambda function for processing input data

Create a batch transform function: \
https://docs.aws.amazon.com/sagemaker/latest/dg/batch-transform-data-processing.html

[preprocessing using Lambda](https://docs.aws.amazon.com/kinesisanalytics/latest/dev/lambda-preprocessing.html)

https://docs.aws.amazon.com/kinesisanalytics/latest/dev/lambda-preprocessing-functions.html

#### Use cases

https://aws.amazon.com/blogs/machine-learning/call-an-amazon-sagemaker-model-endpoint-using-amazon-api-gateway-and-aws-lambda/



### [Amazon EC2](https://aws.amazon.com/ec2/instance-types/)

Amazon EC2 provides a wide selection of instance types optimized to fit different use cases. Instance types comprise varying combinations of CPU, memory, storage, and networking capacity and give you the flexibility to choose the appropriate mix of resources for your applications. Each instance type includes one or more instance sizes, allowing you to scale your resources to the requirements of your target workload.

    



### Unfolding use cases: from intuition to insight to workable solutions

#### Build machine learning workflows with Amazon SageMaker Processing and AWS Step Functions Data Science SDK

#### The high-level steps are:
1. Run a SageMaker processing job using `ProcessingStep` of the AWS Step Functions Data Science SDK to run a scikit-learn script that cleans, pre-processes, performs feature engineering, and splits the input data into train and test sets.
2. Run a training job using `TrainingStep` of the AWS Step Functions Data Science SDK on the pre-processed training data to train a model.
3. Run a processing job on the pre-processed test data to evaluate the trained model's performance using `ProcessingStep` of the AWS Step Functions Data Science SDK.
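
The three steps above chain into a single state machine. The sketch below does not use the Data Science SDK. Instead it builds a simplified Amazon States Language definition by hand, to show the shape of the workflow. The state names and the input field names are hypothetical. The `Parameters` blocks are deliberately incomplete, since a real job also needs a role, resources, and input and output configuration.

```python
import json


def build_definition() -> dict:
    """Return a simplified state machine: preprocess, then train, then evaluate."""
    return {
        "Comment": "Illustrative ML workflow (job parameters omitted)",
        "StartAt": "Preprocess",
        "States": {
            "Preprocess": {
                "Type": "Task",
                # Step Functions service integration that waits for the job to finish.
                "Resource": "arn:aws:states:::sagemaker:createProcessingJob.sync",
                "Parameters": {"ProcessingJobName.$": "$.PreprocessingJobName"},
                "ResultPath": None,  # keep the original execution input for later states
                "Next": "Train",
            },
            "Train": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
                "Parameters": {"TrainingJobName.$": "$.TrainingJobName"},
                "ResultPath": None,
                "Next": "Evaluate",
            },
            "Evaluate": {
                "Type": "Task",
                "Resource": "arn:aws:states:::sagemaker:createProcessingJob.sync",
                "Parameters": {"ProcessingJobName.$": "$.EvaluationJobName"},
                "End": True,
            },
        },
    }


def state_order(definition: dict) -> list:
    """Follow the Next links from StartAt and list the states in execution order."""
    order, current = [], definition["StartAt"]
    while current is not None:
        order.append(current)
        current = definition["States"][current].get("Next")
    return order


definition = build_definition()
print(json.dumps(definition, indent=2))
```

Each state runs only after the previous one succeeds, which mirrors how the SDK steps are chained in the notebook.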
