### What is the machine learning workflow?

<img src="part-6_images/ml_workflow.png" alt="Machine Learning workflow" style="width: 500px;">

### How does deployment fit into the machine learning workflow?

<img src="part-6_images/mlworkflow-deployment-chars.png" alt="Machine Learning workflow - Deployment" style="width: 500px;">


SageMaker workflow: https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-mlconcepts.html

GCP workflow: https://cloud.google.com/ml-engine/docs/tensorflow/ml-solutions-overview

Azure workflow: https://docs.microsoft.com/en-us/azure/machine-learning/service/overview-what-is-azure-ml

### What is cloud computing?

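Cloud computing means using IT resources (storage, computing power, applications) that are hosted on remote servers and accessed over the internet, instead of running them on local hardware. A familiar example is storing your data in Google Drive or OneDrive rather than on a flash drive or HDD.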

### Why would we use cloud computing for deploying machine learning models?

*Benefits of Cloud Computing*

- Reduced Investments and Proportional Costs (providing cost reduction)
- Increased Scalability (providing simplified capacity planning)
- Increased Availability and Reliability (providing organizational agility)

*Risks*

- (Potential) Increase in Security Vulnerabilities
- Reduced Operational Governance Control (over cloud resources)
- Limited Portability Between Cloud Providers
- Multi-regional Compliance and Legal Issues


### What does it mean for a model to be deployed?

Deployment to production can simply be thought of as a method that integrates a machine learning model into an existing production environment so that the model can be used to make decisions or predictions based upon data input into the model. 

#### Paths to Deployment:

1. Python model is recoded into the programming language of the production environment.
2. Model is coded in Predictive Model Markup Language (PMML) or Portable Format for Analytics (PFA).
3. Python model is converted into a format that can be used in the production environment.

The last option is the easiest and fastest way to move a Python model from modeling directly to deployment. It is also the approach most similar to what is used for deployment within Amazon's SageMaker.

Most popular machine learning frameworks, such as PyTorch, TensorFlow, and scikit-learn, have methods that convert Python models into an intermediate standard format.
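As a rough illustration of the third path, the sketch below exports a model's parameters to a language-neutral document and then makes a prediction from that document alone. It is a simplified stand-in, not what any framework actually does: the linear model, the JSON layout, and the names `export_model` and `load_and_predict` are all assumptions made for this example.

```python
import json

# Hypothetical trained model: a linear model reduced to its parameters.
trained_model = {"weights": [0.5, -1.2, 3.0], "bias": 0.25}


def export_model(model):
    """Serialize the parameters to a language-neutral JSON document."""
    return json.dumps({"type": "linear", "version": 1, **model})


def load_and_predict(serialized, features):
    """Production-side code: it needs only the exported document,
    not the code that was used to train the model."""
    spec = json.loads(serialized)
    weighted_sum = sum(w * x for w, x in zip(spec["weights"], features))
    return weighted_sum + spec["bias"]


serialized = export_model(trained_model)
prediction = load_and_predict(serialized, [1.0, 2.0, 3.0])
print(prediction)
```

The point of the design is that the production side depends on the exported format rather than on the modeling code, which is what lets the two environments differ.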

### Production Environments

The production environment is simply the application that customers use to receive predictions from the deployed model.

<img src="part-6_images/prod_environment.png" alt="Production environment" style="width: 500px;"/>

#### Production Environment and the Endpoint

The endpoint is an interface that:

- Allows the application to send user data to the model and
- Receives predictions back from the model based upon that user data.

One way to picture the endpoint as this interface is to think of a Python program where:

- the endpoint itself is like a function call
- the function itself would be the model and
- the Python program is the application.

<img src="part-6_images/endpointprogram-2.png" alt="Endpoint as a program" style="width: 500px;"/>
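The analogy can be written out as a tiny program. This is only a sketch of the analogy itself: the function names and the prediction rule inside `model` are made up for illustration, and no real endpoint is involved.

```python
def model(user_data):
    """Stands in for the deployed model: maps input data to a prediction."""
    # Hypothetical rule, used only so the example returns something.
    return "positive" if sum(user_data) > 0 else "negative"


def application(user_data):
    """Stands in for the application that customers use."""
    # This call plays the role of the endpoint:
    # user data goes in, a prediction comes back.
    prediction = model(user_data)
    return f"Predicted sentiment: {prediction}"


result = application([0.4, -0.1, 0.9])
print(result)
```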

### Endpoint and REST API

Communication between the application and the model is done through the endpoint, which is an **API**.
- An API is a set of rules that enable programs to communicate with each other
- In this case, the API follows the REST architecture, which uses HTTP **requests** and **responses** to enable communication

<img src="part-6_images/httpmethods.png" alt="HTTP methods" style="width: 600px;"/>


The **HTTP request** that’s sent from your application to your model is composed of four parts:
- Endpoint, in the form of a URL
- HTTP method, e.g. GET, POST
- HTTP header, contains additional information such as the format of the data in the message
- Message, contains the user's data

The **HTTP response** sent from your model to your application is composed of three parts:
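- HTTP status code, e.g. 200 for a request that was successfully received and processed
- HTTP header, contains additional information such as the format of the data in the message
- Message, contains the predictions of the model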

Often the user's data and the prediction will be in JSON or CSV format.
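To make the parts of a request and a response concrete, here is a minimal sketch using Python's standard library. The endpoint URL, the feature values, and the response contents are all hypothetical, and nothing is sent over the network: the request is only constructed, and the response is simulated as a plain dictionary.

```python
import json
from urllib.request import Request

# Hypothetical endpoint URL; a real one is issued by the hosting service.
ENDPOINT_URL = "https://example.com/models/demo/predict"

# Made-up user data for illustration.
user_data = {"features": [5.1, 3.5, 1.4, 0.2]}

# The four parts of the HTTP request.
request = Request(
    url=ENDPOINT_URL,                              # endpoint
    method="POST",                                 # HTTP method
    headers={"Content-Type": "application/json"},  # HTTP header
    data=json.dumps(user_data).encode("utf-8"),    # message
)

# The three parts of the HTTP response, simulated here rather than received.
simulated_response = {
    "status": 200,                                    # HTTP status code
    "headers": {"Content-Type": "application/json"},  # HTTP header
    "body": json.dumps({"prediction": 1}),            # message
}

# The application reads the prediction only if the request succeeded.
if simulated_response["status"] == 200:
    prediction = json.loads(simulated_response["body"])["prediction"]
    print(prediction)
```

In a real application the request would be sent with something like `urllib.request.urlopen(request)`, and the status code, headers, and body would come from the returned response object.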

Finally, the application acts as an intermediate layer, which means that it has to:
- Format the user's data so that it can easily be put into the HTTP request message and used by the model.
- Translate the predictions from the HTTP response message into a form that is easy for the application's users to understand.
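The two responsibilities of the application can be sketched as a pair of helper functions. The function names, the CSV and JSON layouts, and the labels are assumptions chosen for this example; a real model would define its own input and output formats.

```python
import csv
import io
import json


def format_request_body(rows):
    """Turn user input (a list of feature rows) into a CSV message body."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def translate_response(body, labels):
    """Turn a JSON response body into text that a user can read."""
    predictions = json.loads(body)["predictions"]
    return [labels[p] for p in predictions]


# Outgoing direction: user data becomes the request message.
body = format_request_body([[5.1, 3.5], [6.2, 2.9]])

# Incoming direction: a hypothetical response message becomes readable labels.
readable = translate_response('{"predictions": [0, 1]}', ["not spam", "spam"])
print(body)
print(readable)
```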

### Containers

Both the application and the model require a computing environment in which they can run and remain available. This can be achieved with **containers**, which are essentially standardized collections of all the software and libraries needed to run an application. A very popular example is Docker.

Using Docker as an example, a container:
- Can contain all types of different software.
- Has a structure that enables it to be created, saved, used, and deleted through a set of common tools.
- Works with that common tool set regardless of the software it contains.

<img src="part-6_images/container-1.png" alt="Container architecture" style="width: 600px;"/>

A container can easily be created from a script file that contains the instructions needed to replicate that particular container (in Docker's case, a Dockerfile).

<img src="part-6_images/container-2.png" alt="Container architecture 2" style="width: 600px;"/>


### What are the crucial characteristics associated with deploying models?

- Model versioning: lets us see which version of a model is deployed.
- Model monitoring: makes it easy to monitor the performance of deployed models.
- Model updating and routing: makes it easy to update a deployed model that fails to meet its performance metrics, and to route user requests to different model variants so that their performance can be compared.
- Predictions
    - On-demand: lets users retrieve predictions in real time.
    - Batch: suits business decisions that do not require real-time predictions. For example, imagine a business uses a complex model to predict customer satisfaction across a number of its products and needs these estimates for a weekly report. This would require processing customer data through a batch prediction request on a weekly basis.
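The characteristics above can be sketched together in a few lines. This is a toy model of the ideas, not how any cloud service implements them: the two model versions, the traffic weights, and the function names are all hypothetical.

```python
import random

# Two hypothetical model versions, each reduced to a plain function.
MODELS = {
    "v1": lambda x: x * 2,
    "v2": lambda x: x * 2 + 1,
}


def route(weights, rng):
    """Pick a model version at random, in proportion to its traffic weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions])[0]


def predict_on_demand(x, weights, rng):
    """One request in, one prediction out, tagged with the serving version."""
    version = route(weights, rng)
    return {"version": version, "prediction": MODELS[version](x)}


def predict_batch(xs, version):
    """Score a whole collection of inputs in one pass with a fixed version."""
    return [MODELS[version](x) for x in xs]


rng = random.Random(0)

# On-demand: most traffic goes to v1, a small share goes to the v2 variant.
single = predict_on_demand(10, {"v1": 0.9, "v2": 0.1}, rng)

# Batch: e.g. a weekly job that scores all collected inputs at once.
weekly = predict_batch([1, 2, 3], "v1")
print(single, weekly)
```

Tagging each on-demand prediction with its version is what makes it possible to compare variants later, and shifting the weights is one simple way to roll an update out gradually.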