<a href="https://colab.research.google.com/github/venkatacrc/Notes/blob/master/ML_GCP/ProductionML.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

##Production ML Systems
Source: [Production Machine Learning Systems](https://www.coursera.org/learn/gcp-production-ml-systems/)

###Course Outline
1. Architecting Production ML Systems
  1. Intro
  1. Components of an ML system
    1. Data Analysis and Validation
    1. Data Transformation + Trainer
    1. Tuner + Model Evaluation and Validation
    1. Serving
    1. Orchestration + workflow
    1. Integrated Frontend + Storage
  1. Design Decisions
    1. Training Design Decisions
    1. Serving Design Decisions
  1. Serving on Cloud MLE
    1. Lab
  1. Designing an Architecture from Scratch
1. Ingesting Data for Cloud-based analytics and ML
  1. Intro
  1. Data Scenarios
    * Data On-Premise
    * Large Datasets
    * Data on Other Clouds
    * Existing Databases
  1. Demos
    * Load data into BigQuery
    * Automatic ETL Pipelines into GCP
1. Designing Adaptable ML systems
1. Designing High Performance ML Systems
1. Hybrid ML Systems

###Architecting Production ML Systems
1. What's in a Production ML System
  * Data Collection
  * Data Verification
  * Machine Resource Management
  * Feature Extraction
  * Process Management Tools
  * Configuration
  * Monitoring
  * Analysis Tools
  * Serving Infrastructure
  * ML Code
1. Training Design Decisions
1. Serving Design Decisions
1. Serving on CMLE (scalability)
1. Designing an Architecture from Scratch

![](https://drive.google.com/uc?id=1Z3J8cNIxAgssBpCSOgty-NB2TRDEwAzj)

####Other Components of ML System

![](https://drive.google.com/uc?id=1QgTOw5GRP2j7VLlyO05HNNPKNIQtvjpE)

Reuse generic software frameworks:
* TensorFlow
* TF Serving
* Apache Spark
* Apache Beam

Use managed services:
* Cloud Dataproc
* Cloud Dataflow
* Cloud ML Engine







##The Components of an ML System

###Data Ingestion

![](https://drive.google.com/uc?id=1tNqCCmtoBl441XrlWaJIdo_Tww1ZGF-C)

**Streaming Data Ingestion Pipeline Architecture**

$
\left.\begin{array}{c}
\text{Applications} \\
\text{Devices} \\
\text{Databases}
\end{array}\right\} \rightarrow
\left.\begin{array}{c}
\text{(Ingest)}\\
\text{Cloud Pub/Sub}
\end{array}\right\} \rightarrow
\left.\begin{array}{c}
\text{(Process)}\\
\text{Cloud Dataflow}
\end{array}\right\} \rightarrow \left\{
\begin{array}{c}
\text{(Analyze)}\\
\text{Data Studio | Third Party Tools}\\
\uparrow \\
\text{Cloud BigQuery} \mapsto \text{Data Warehouse}\\
\text{Cloud MLE} \mapsto \text{Predictive Analytics}\\
\text{Cloud BigTable} \mapsto \text{Caching \& Serving}
\end{array}\right.$

####General Data Ingestion

Ingestion takes structured data (BigQuery), streaming data (Pub/Sub), and unstructured data (Cloud Storage), covering text, audio, image, video, and tabular data, and produces TFRecord or CSV files.

Read$\rightarrow$Process$\rightarrow$Write

###Data Validation
Is the data healthy or not?
1. Is the new distribution similar enough to the old one? (5 number summary, modes, likelihood of distribution)
1. Are all expected features present?
1. Are any unexpected features present?
1. Does the feature have the expected type?
1. Does an expected proportion of the examples contain the feature?
1. Do the examples have the expected number of values for the feature?
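
Checks 2 to 6 above can be sketched in plain Python. This is only an illustration under stated assumptions: the `SCHEMA` layout, the `validate` helper, and the thresholds are hypothetical and are not taken from the course or from any GCP service. The distribution check (item 1) is not covered here.

```python
from collections import Counter

# Hypothetical schema: feature name -> (expected type, minimum fraction of
# examples that must contain the feature, expected number of values).
SCHEMA = {
    "mother_age": (int, 0.95, 1),
    "gestation_weeks": (float, 0.90, 1),
}


def validate(examples, schema=SCHEMA):
    """Return a sorted list of anomalies found in `examples`.

    Each example is a dict mapping a feature name to a list of values.
    """
    anomalies = set()
    present = Counter()
    for example in examples:
        for name, values in example.items():
            if name not in schema:
                anomalies.add(f"unexpected feature: {name}")
                continue
            expected_type, _, expected_count = schema[name]
            present[name] += 1
            if len(values) != expected_count:
                anomalies.add(f"wrong number of values: {name}")
            if not all(isinstance(v, expected_type) for v in values):
                anomalies.add(f"wrong type: {name}")
    # Presence checks run once over the whole batch.
    for name, (_, min_fraction, _) in schema.items():
        if present[name] == 0:
            anomalies.add(f"missing feature: {name}")
        elif present[name] / len(examples) < min_fraction:
            anomalies.add(f"feature too rare: {name}")
    return sorted(anomalies)
```

An empty result means the batch passed every check; any returned string names the failed check and the feature involved.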

Data Validation Services
  * Datalab
  * DataStudio
  * Cloud Reliability




###Data Transformation
For feature wrangling

Data Transformation Services
  * Dataflow
  * Dataproc
  * Dataprep

###Trainer & Tuner
The trainer must support data and model parallelism, scale to a large number of workers, and provide monitoring, logging, experimentation, and hyperparameter tuning.
  * Cloud ML Engine(Managed service)
    1. Scalable
    1. Integrated with Tuner, Logging, Serving components
    1. Experiment-oriented(A/B Testing)
    1. Open
  * GKE(Kubeflow)

###Model Evaluation and Validation
Tools:
  * TFX Model Analysis

A good model is hard to find. Check it for:
1. Model safeness
1. Prediction quality

Release, Develop, and Test Cycle

###Serving
Tools:
  * ML Engine
  * TF Serving(GKE)

Requirements:
* Low latency
* Highly efficient
* Scale horizontally
* Reliable and robust
* Easy to update versions (multi-armed bandit testing)

###Logging
Tools:
  * Cloud Reliability





###Shared Config and Utilities
Quiz: If changes are made to the trainer, what component(s) might also need to change?
Answer: Potentially all of them

Configuration Remedies:
1. Establish a common architecture for both R&D and production deployment
1. Embed the teams together, so that engineering can influence the design of code from its inception

**Orchestration** glues all the components together
Tools:
  * Cloud Composer (managed Apache Airflow)
  * Argo (GKE)

Steps to Compose a Workflow in Cloud Composer
1. Define the Ops
1. Arrange into a DAG
1. Upload to Environment
1. Explore DAG Run in Web UI




```
# A basic workflow (schematic: `params` stands for each operator's arguments)

# BigQuery training data query
t1 = BigQueryOperator(params)

# BigQuery training data export to GCS
t2 = BigQueryToCloudStorageOperator(params)

# ML Engine training job
t3 = MLEngineTrainingOperator(params)

# App Engine deploy new version
t4 = AppEngineVersionOperator(params)

# Establish dependencies: each task runs after the one before it
t1 >> t2 >> t3 >> t4
```

###Integrated Frontend
Tools:
  * ML Engine
  * TensorBoard

http://projector.tensorflow.org/

Debug TensorFlow in real time with line-by-line execution.

###Pipeline Storage
Tools:
  * GCS Google Cloud Storage


##Training Design Decisions
Static vs. Dynamic Training

In static training we run the steps below from top to bottom only once, whereas in dynamic training we run them repeatedly.
  * Acquire Data
  * Transform Data
  * Train model
  * Test model
  * Deploy Model

Statically Trained Models|Dynamically Trained Models
---|---
Trained once, offline|Training data is added over time
Easy to build and test|Harder to engineer; requires progressive validation
Easily becomes stale|Updated versions are synced out regularly, so the model adapts to changes
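
Progressive validation, named in the table above, scores each new example with the current model before that example is used for training. The sketch below is a minimal illustration, assuming a toy model that predicts the running mean of the labels seen so far; the function name and the model are hypothetical and not from the course.

```python
def progressive_validation(labels):
    """Return the mean absolute error measured progressively.

    Every label is first predicted by the model trained on all earlier
    labels, and only afterwards added to the training data.
    """
    total, count = 0.0, 0          # state of the toy running-mean model
    abs_error_sum = 0.0
    for y in labels:
        prediction = total / count if count else 0.0  # predict before training
        abs_error_sum += abs(y - prediction)
        total += y                                     # then train on the example
        count += 1
    return abs_error_sum / len(labels) if labels else 0.0
```

Because every example is scored before the model has seen it, the metric stays honest even though the model keeps training on the same stream.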

![](https://drive.google.com/uc?id=1BpMDbQDmlkWHMfBAGnzC48PKVlLVYkoW)


![](https://drive.google.com/uc?id=1rGy-_MB2P1FnQGiap88PZ_F4Cz3UspgP)



![](https://drive.google.com/uc?id=1W2F3j3pktQrhDGLc702lk92_s-Pkjp_B)

##Serving Design Decisions

###Static Vs Dynamic Serving

Static serving uses a lookup table of precomputed labels to answer prediction requests.

####Architecting a Static Serving Model
1. Change Cloud MLE from an online to a batch prediction job
1. Make the model accept keys as input and pass them through
1. Write predictions to a data warehouse (e.g. BigQuery)

Dynamic serving, in contrast, runs the model on demand to generate the label for each prediction request.

Static|Dynamic
---|---
Higher storage cost|Lower storage cost
Low, fixed latency|Variable latency
Low maintenance|Higher maintenance
Space intensive|Compute intensive

* **Peakedness** is how concentrated the distribution is
* **Cardinality** (size of the input space) is the number of values in the set

When cardinality is low, use static serving.

Hybrid solutions optimize for both types of prediction workloads: the most frequent results are cached, and the tail is computed on demand.
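
The hybrid approach can be sketched as a small class that precomputes predictions for the most frequent inputs and runs the model for everything else. This is a minimal sketch under assumptions: `HybridServer`, `model_fn`, `request_log`, and `table_size` are hypothetical names, and a real system would keep the table in a store such as BigQuery or Bigtable rather than in memory.

```python
from collections import Counter


class HybridServer:
    """Serve frequent inputs from a lookup table and the tail on demand."""

    def __init__(self, model_fn, request_log, table_size):
        self.model_fn = model_fn
        # Static part: precompute predictions for the most frequent keys.
        top_keys = [key for key, _ in Counter(request_log).most_common(table_size)]
        self.table = {key: model_fn(key) for key in top_keys}
        self.on_demand_calls = 0

    def predict(self, key):
        if key in self.table:          # static serving: lookup only
            return self.table[key]
        self.on_demand_calls += 1      # dynamic serving: run the model
        return self.model_fn(key)
```

The more peaked the request distribution, the larger the share of traffic a small table absorbs, which is why peakedness and cardinality drive the choice.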

Problem|Inference Style
---|---
Spam| Dynamic
Voice to Text | Dynamic/Hybrid
Shopping ad conversion rate | Static

```
gcloud ml-engine predict --model $MODEL_NAME \
                    --version $VERSION_NAME \
                    --json-instances $INPUT_DATA_FILE
```

###Lab: Invoke ML predictions with Google App Engine(GAE)

![](https://drive.google.com/uc?id=1Gdm2SibRslIJC46gFga_zGJF_VvZ80U4)

Repo:
git clone https://github.com/GoogleCloudPlatform/training-data-analyst

1. Build an App on GAE that makes REST calls(Web API requests) to CMLE


```
Use "gcloud config set project [PROJECT_ID]" to change to a different project.
student_02_febba43ae9df@cloudshell:~ (qwiklabs-gcp-02-3e892ea7122e)$ gcloud auth list
           Credentialed Accounts
ACTIVE  ACCOUNT
*       student-02-febba43ae9df@qwiklabs.net

To set the active account, run:
    $ gcloud config set account `ACCOUNT`

student_02_febba43ae9df@cloudshell:~ (qwiklabs-gcp-02-3e892ea7122e)$ gcloud config list project
[core]
project = qwiklabs-gcp-02-3e892ea7122e

Your active configuration is: [cloudshell-29148]
```

####Start Cloud Shell
Google Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on the Google Cloud. Google Cloud Shell provides command-line access to your GCP resources.

1. In GCP console, on the top right toolbar, click the Open Cloud Shell button.
1. Click Continue.

It takes a few moments to provision and connect to the environment. When you are connected, you are already authenticated, and the project is set to your PROJECT_ID.

gcloud is the command-line tool for Google Cloud Platform. It comes pre-installed on Cloud Shell and supports tab-completion.

You can list the active account name with this command:

```
gcloud auth list
```

Output:

```
Credentialed accounts:
 - <myaccount>@<mydomain>.com (active)
```

Example output:

```
Credentialed accounts:
 - google1623327_student@qwiklabs.net
```

You can list the project ID with this command:

```
gcloud config list project
```

Output:

```
[core]
project = <project_ID>
```

Example output:

```
[core]
project = qwiklabs-gcp-44776a13dea667a6
```

Full documentation of gcloud is available on Google Cloud gcloud Overview.
####Copy trained model
**Step 1**

Set necessary variables and create a bucket:

```
REGION=us-central1
BUCKET=$(gcloud config get-value project)
TFVERSION=1.7
gsutil mb -l ${REGION} gs://${BUCKET}
```

**Step 2**

Copy trained model into your bucket:

```
gsutil -m cp -R gs://cloud-training-demos/babyweight/trained_model gs://${BUCKET}/babyweight
```

####Deploy trained model
**Step 1**

Set necessary variables:

```
MODEL_NAME=babyweight
MODEL_VERSION=ml_on_gcp
MODEL_LOCATION=$(gsutil ls gs://${BUCKET}/babyweight/export/exporter/ | tail -1)
```

**Step 2**

Deploy trained model:

```
gcloud ai-platform models create ${MODEL_NAME} --regions $REGION
gcloud ai-platform versions create ${MODEL_VERSION} --model ${MODEL_NAME} --origin ${MODEL_LOCATION} --runtime-version $TFVERSION
```
####Code for your frontend
**Step 1**

Clone the course repository:

```
cd ~
git clone https://github.com/GoogleCloudPlatform/training-data-analyst
```

**Step 2**

You can use the Cloud Shell code editor to view and edit the contents of these files.

Click the icon on the top right of your Cloud Shell window to launch Code Editor.

Once launched, navigate to the ~/training-data-analyst/courses/machine_learning/deepdive/06_structured/labs/serving directory.

**Step 3**

Open the application/main.py and application/templates/form.html files and notice the #TODOs within the code. These need to be replaced with code. The next section tells you how.

####Modify main.py
**Step 1**

Open the main.py file by clicking on it. Notice the lines with # TODO for setting credentials and the api to use.

Set the credentials to use Google Application Default Credentials (recommended way to authorize calls to our APIs when building apps deployed on AppEngine):

```
credentials = GoogleCredentials.get_application_default()
```

Specify the api name (ML Engine API) and version to use:

```
api = discovery.build('ml', 'v1', credentials=credentials)
```

**Step 2**

Scroll further down in main.py and look for the next #TODO in the method get_prediction(). In there, specify, using the parent variable, the name of your trained model deployed on Cloud MLE:

```
parent = 'projects/%s/models/%s' % (project, model_name)
```

**Step 3**

Now that you have all the pieces for making the call to your model, build the call request by specifying it in the prediction variable:

```
prediction = api.projects().predict(body=input_data, name=parent).execute()
```

**Step 4**

The final #TODO (scroll towards bottom) is to get gestation_weeks from the form data and cast into a float within the features array:

```
features['gestation_weeks'] = float(data['gestation_weeks'])
```

**Step 5**

Save the changes you made using the File > Save button on the top left of your code editor window.

####Modify form.html
form.html is the front-end of your app. The user fills in data (features) about the mother based on which we will make the predictions using our trained model.

**Step 1**

In code editor, navigate to the application/templates directory and click to open the form.html file.

**Step 2**

There is one #TODO item here. Look for the div segment for Plurality and add options for other plurality values (2, 3, etc).

```
<md-option value="2">Twins</md-option>
<md-option value="3">Triplets</md-option>
```

**Step 3**

Save the changes you made using the File > Save button on the top left of your code editor window.

####Deploy and test your app
**Step 1**

In Cloud Shell, run the deploy.sh script to install required dependencies and deploy your app engine app to the cloud.

```
cd training-data-analyst/courses/machine_learning/deepdive/06_structured/labs/serving
./deploy.sh
```

Note: Choose a region for App Engine when prompted and follow the prompts during this process.

**Step 2**

Go to the url https://<PROJECT-ID>.appspot.com and start making predictions.

Note: Replace <PROJECT-ID> with your Project ID.


###Lab: Build a system that predicts the traffic levels on roads.



##Ingesting Data
![](https://drive.google.com/uc?id=19likwvnYXzUbdswOd1Tkn710LtHMckKn)

Data On-Premise


```
# include -m for multi-threading
gsutil -m cp -r [src dir] gs://[bucket_name]
```

![](https://drive.google.com/uc?id=1DM9JVVmMjShYsC5QF-6mJcUx_XmQvrBJ)

Large Datasets
* For about 60 TB of data, use Transfer Appliance

Cloud-to-Cloud Transfer

Existing Databases
![](https://drive.google.com/uc?id=1GZGNi_EXQOrceurnHk-yi1yFYLeKahQd)

Ingest data into BigQuery
* Types supported
  * CSV, JSON, AVRO, ORC, Parquet



##Automatic ETL Pipelines into GCP

Managed Airflow: Cloud Composer

ETL Pattern 1: Push Solution Architecture (best for on-demand)

![](https://drive.google.com/uc?id=17D21dX8W_NwbHmEOTXLYA_xlcoRs5OHa)

ETL Pattern 2: [Pull Solution Architecture](https://cloud.google.com/blog/products/gcp/designing-etl-architecture-for-a-cloud-native-data-warehouse-on-google-cloud-platform)

[Datalake](https://cloud.google.com/solutions/build-a-data-lake-on-gcp)




#WEEK2

##Designing Adaptable ML Systems

Objectives:
1. Recognize various data dependencies
1. Make cost-conscious engineering decisions
1. Mitigate model pollution
1. Implement a pipeline that is immune to one type of dependency
1. Debug the causes of observed model behavior

Modularity and dependency management became easier with Maven, Gradle, and pip.
Containers eliminate infrastructure dependencies.

Mismanaged dependencies are costly:
1. Losses in prediction quality
1. Decreases to system stability
1. Decreases in team productivity

###Adapting to Data

1. Changing Distributions
  * Monitor descriptive statistics for your inputs and outputs
  * Monitor your residuals as a function of your inputs
  * Use custom weights in your loss function to emphasize data recency
  * Use dynamic training architecture and regularly retrain your model
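
Emphasizing recency in the loss can be done with per-example weights that shrink with the age of the example. The sketch below assumes an exponential decay with a 30-day half-life and a squared-error loss; both choices, and the function names, are illustrative assumptions rather than anything prescribed by the course.

```python
def recency_weights(ages_in_days, half_life_days=30.0):
    """Exponential-decay weights: an example loses half its weight every half-life."""
    return [0.5 ** (age / half_life_days) for age in ages_in_days]


def weighted_squared_error(labels, predictions, weights):
    """Weighted mean squared error, normalised by the total weight."""
    total = sum(w * (y - p) ** 2 for y, p, w in zip(labels, predictions, weights))
    return total / sum(weights)
```

With these weights, an error on yesterday's example costs more than the same error on an example from two months ago, so the fit leans toward the current distribution.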

###Training-Serving Skew
1. A discrepancy between how you handle data in the training and serving pipelines
1. A change in the data between when you train and when you serve
1. A feedback loop between your model and your algorithm

How Code can Create Training/Serving Skew
* Different library versions that are functionally equivalent but optimized differently
* Different library versions that are not functionally equivalent
* Re-implemented functions
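
One way to avoid skew from re-implemented functions is to keep a single preprocessing function and call it from both the batch training path and the online serving path. This is a minimal sketch: `preprocess`, `make_training_rows`, `serve`, and the `is_multiple` feature are hypothetical, while `gestation_weeks` and plurality are borrowed from the babyweight lab.

```python
def preprocess(raw):
    """Single source of truth for feature transformation.

    Both the training pipeline and the serving path call this function,
    so a change here reaches both at once.
    """
    return {
        "gestation_weeks": float(raw["gestation_weeks"]),
        "is_multiple": 1.0 if int(raw["plurality"]) > 1 else 0.0,
    }


def make_training_rows(raw_rows):
    # Batch path: transform every historical record.
    return [preprocess(row) for row in raw_rows]


def serve(raw_request, model_fn):
    # Online path: transform a single request with the same code.
    return model_fn(preprocess(raw_request))
```

Pinning the library versions used by both paths covers the other two causes in the list; sharing the function only covers re-implementation.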

Lab: the training data is batch and the serving data is streaming.
![](https://drive.google.com/uc?id=1CB7PspWfvMZJv1mT9IiXT-eG5yFGIo3j)






```
student_03_8c0375482826@cloudshell:~ (qwiklabs-gcp-03-bb0ca9c3aac2)$ gcloud ai-platform models create ${MODEL_NAME} --regions $REGION             
Created ml engine model [projects/qwiklabs-gcp-03-bb0ca9c3aac2/models/babyweight].
student_03_8c0375482826@cloudshell:~ (qwiklabs-gcp-03-bb0ca9c3aac2)$ gcloud ai-platform versions create ${MODEL_VERSION} --model ${MODEL_NAME} --origin ${MODEL_LOCATION} --runtime-version $TFVERSION
Creating version (this might take a few minutes)......done.
student_03_8c0375482826@cloudshell:~ (qwiklabs-gcp-03-bb0ca9c3aac2)$ cd ~
student_03_8c0375482826@cloudshell:~ (qwiklabs-gcp-03-bb0ca9c3aac2)$ git clone https://github.com/GoogleCloudPlatform/training-data-analyst
Cloning into 'training-data-analyst'...
remote: Enumerating objects: 60, done.
remote: Counting objects: 100% (60/60), done.
remote: Compressing objects: 100% (49/49), done.
remote: Total 29663 (delta 30), reused 27 (delta 11), pack-reused 29603
Receiving objects: 100% (29663/29663), 279.01 MiB | 28.57 MiB/s, done.
Resolving deltas: 100% (18288/18288), done.
student_03_8c0375482826@cloudshell:~ (qwiklabs-gcp-03-bb0ca9c3aac2)$
student_03_8c0375482826@cloudshell:~ (qwiklabs-gcp-03-bb0ca9c3aac2)$
student_03_8c0375482826@cloudshell:~ (qwiklabs-gcp-03-bb0ca9c3aac2)$ cd ~/training-data-analyst/courses/machine_learning/deepdive/06_structured/labs/serving
student_03_8c0375482826@cloudshell:~/training-data-analyst/courses/machine_learning/deepdive/06_structured/labs/serving (qwiklabs-gcp-03-bb0ca9c3aac2)$ ./what_to_fix.sh
./application/main.py:credentials = # TODO
./application/main.py:api = # TODO
./application/main.py:  parent = # TODO
./application/main.py:  prediction = # TODO
./application/main.py:  features['gestation_weeks'] = # TODO: get gestation_weeks and cast to float
./application/templates/form.html:            <!-- TODO: add options for other plurality values -->
./pipeline/src/main/java/com/google/cloud/training/mlongcp/BabyweightMLService.java:  private static final String PROJECT = "cloud-training-demos"; // TODO: put in your project name here
./pipeline/src/main/java/com/google/cloud/training/mlongcp/BabyweightMLService.java:  private static String       VERSION = "ml_on_gcp"; // TODO:put in your version name here

```