# Training a scikit-learn model on AI Platform Training
[AI Platform Training](https://cloud.google.com/ml-engine/docs/training-overview) can train TensorFlow, Keras, scikit-learn, and XGBoost models, as well as custom containers, on Google Cloud Platform. [AI Platform Prediction](https://cloud.google.com/ml-engine/docs/prediction-overview) hosts your trained machine learning models in the cloud and uses them to infer target values for new data.

In this notebook we will do three things:
+ Train a scikit-learn model using AI Platform Training.
+ Host our trained model as an API using AI Platform Prediction. 
+ Call our API to get a prediction. 

First, have a look at the [AI Platform GUI](https://console.cloud.google.com/ai-platform). In the console's left-hand menu you will see Jobs and Models. Jobs is where you monitor your training jobs, and Models is where you find your deployed models.

Before we get started, have a look at the scikit-learn model code and how the folders are structured. The code is in the folder `scikit-caip/trainer`.

First, we need to make sure that the required dependencies are installed. Run the following cell to install the needed libraries at the required versions.

In [None]:
!sudo pip3 install -r requirements.txt

You can use `pip freeze` to check the installed libraries and their versions. Only run the next cell if needed.

In [None]:
!pip freeze
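
If you only want to know whether a single library is available, scanning the full `pip freeze` output is not necessary. The following is a minimal sketch, using only the Python standard library, that checks whether a module can be imported. The helper name `is_installed` is our own and not part of the workshop code.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None


# The pandas-gbq package is imported as `pandas_gbq` (underscore, not hyphen).
print("pandas_gbq installed:", is_installed("pandas_gbq"))
```

If this prints `False`, run the install cell that follows.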

Only run the next cell if pandas-gbq is not properly installed.

In [None]:
!sudo pip install pandas-gbq

### Exercise One: Train on the Cloud using AI Platform Training
We are going to train a model using Google Cloud AI Platform. Have a look at how the code is structured.

Change these parameters before running the next cell:

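+ `--staging-bucket=gs://specify_your_bucket`: replace the value with your own staging bucket.
+ Job name: replace *your_unique_jobname* with a unique job name of your own (for example *marketing_v1_11*).
+ `--pathoutput gs://specify_your_bucket`: replace the value with your own output bucket.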
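
Job names cannot be reused within a project, so a timestamp suffix is a convenient way to get a fresh one for every run. Below is a minimal sketch. The helper `make_job_name` and the `marketing` prefix are hypothetical, and the naming rule it enforces (start with a letter, then only letters, digits, and underscores) is our understanding of the AI Platform requirements, so check the documentation if a name is rejected.

```python
import re
from datetime import datetime, timezone

# Assumed naming rule: a letter first, then letters, digits, or underscores.
JOB_NAME_PATTERN = r"[A-Za-z][A-Za-z0-9_]*"


def make_job_name(prefix="marketing"):
    """Build a job name such as marketing_20190101_120000 (UTC timestamp)."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    name = "{}_{}".format(prefix, stamp)
    if not re.fullmatch(JOB_NAME_PATTERN, name):
        raise ValueError("Invalid job name: {!r}".format(name))
    return name


print(make_job_name())
```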

Your job is submitted successfully when you see the following:

    Job your_unique_jobname submitted successfully.
    
Tips:
+ If you get stuck or want to understand which parameters `gcloud ai-platform` takes, have a look at the [documentation](https://cloud.google.com/ml-engine/docs/packaging-trainer).
+ `--staging-bucket` specifies the Cloud Storage location where you want to stage your training and dependency packages. Your Google Cloud project must have access to this Cloud Storage bucket, and the bucket should be in the same region where you run the job.
+ `--pathoutput` specifies where the trained model (joblib) is written.
+ Everything below `-- \` is an argument passed to the application itself; these arguments are application dependent.
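
To make the last tip concrete: the arguments after `-- \` are handed to the training module, which has to parse them itself. The sketch below shows one way that could look with `argparse`. It is an illustration only, not the actual contents of `trainer/task.py`; the help texts, defaults, and the function name `parse_args` are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the application arguments that follow `--` in the gcloud command."""
    parser = argparse.ArgumentParser(description="Hypothetical trainer entry point")
    parser.add_argument("--pathdata", help="Location of the training CSV file")
    parser.add_argument("--pathoutput", required=True,
                        help="Where the trained model (joblib) is written")
    parser.add_argument("--storage", default="GCS",
                        help="Data source selector (assumed default)")
    parser.add_argument("--bqtable", help="BigQuery table holding the data")
    # parse_known_args ignores extra flags instead of failing on them.
    args, _unknown = parser.parse_known_args(argv)
    return args


args = parse_args([
    "--pathdata", "gs://erwinh-public-data/scikit/data/marketing-data.csv",
    "--pathoutput", "gs://specify_your_bucket",
    "--storage", "GCS",
])
print(args.storage, args.pathoutput)
```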

After running the next cell, you can go to the [console](http://console.cloud.google.com/ai-platform/jobs) to monitor your job. Also have a look at the logs to see the model output and performance.

In [1]:
!gcloud ai-platform jobs submit training your_unique_jobname \
   --staging-bucket=gs://specify_your_bucket \
   --region=us-central1 \
   --module-name=trainer.task \
   --package-path=trainer \
   --runtime-version 1.14 \
   --python-version 3.5 \
   -- \
   --pathdata gs://erwinh-public-data/scikit/data/marketing-data.csv \
   --pathoutput gs://specify_your_bucket \
   --storage GCS \
   --bqtable kfp-primer-workshop.marketing_data.raw

Job [marketing_v1_12] submitted successfully.
Your job is still active. You may view the status of your job with the command

  $ gcloud ai-platform jobs describe marketing_v1_12

or continue streaming the logs with the command

  $ gcloud ai-platform jobs stream-logs marketing_v1_12
jobId: marketing_v1_12
state: QUEUED
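
The `describe` output shown above is plain `key: value` text, so you can also follow the job from Python instead of refreshing the console. The sketch below reads the `state` field and polls until the job reaches a terminal state. The helper names are our own, and how you fetch the text is left to a callable you supply; the `gcloud` call in the comment is one assumed way to do it.

```python
import time

# Job states after which nothing further happens.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def parse_job_state(describe_output):
    """Return the value of the `state:` line, or None if it is absent."""
    for line in describe_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "state":
            return value.strip()
    return None


def wait_for_job(fetch_output, interval_seconds=30, max_polls=120):
    """Poll until a terminal state is seen; return the last state observed."""
    state = None
    for _ in range(max_polls):
        state = parse_job_state(fetch_output())
        if state in TERMINAL_STATES:
            break
        time.sleep(interval_seconds)
    return state


# Possible wiring in the notebook (assumption, not executed here):
#   import subprocess
#   fetch = lambda: subprocess.check_output(
#       ["gcloud", "ai-platform", "jobs", "describe", "your_unique_jobname"],
#       universal_newlines=True)
#   print(wait_for_job(fetch))
print(parse_job_state("jobId: marketing_v1_12\nstate: QUEUED"))
```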


# Please stop here

### Exercise Two: Deploy model using AI Platform 
When the model has been trained successfully, we can take the trained model and publish it as an API. First change the following and give your model a unique name:

    MODEL_NAME="<your_model_name>"

Then we will use gcloud to create our model on AI Platform by running: `gcloud ai-platform models create`.

Run the following cell and check that your model has been created. Go to the [console](https://console.cloud.google.com) -> AI Platform -> Models -> <your_model_name>. Here you will find your model. Creation can take a few minutes.

In [16]:
!gcloud ai-platform models create <your_model_name>

Created ml engine model [projects/erwinh-ml-demos/models/marketingpredictor].


### Exercise Three: Deploy your model as an API using AI Platform

We are going to deploy a model using Google Cloud AI Platform. You have to write the `gcloud` command to deploy a version of your trained model. 

To do:
+ Write the `gcloud` command to deploy the trained model artifact to Google AI Platform. 
+ Have a look at the [documentation](https://cloud.google.com/ml-engine/docs/deploying-models#create_a_model_resource). 
+ Hint: have a look at `gcloud ai-platform versions create`.
+ Don't forget to start your gcloud command in the next cell with `!`.
+ Go to the [console](https://console.cloud.google.com) -> AI Platform -> Models -> <your_model_name> and check that your model version has been created. Creation can take a few minutes.

In [None]:
!gcloud

### Exercise Four: Getting a prediction
After deploying our model we can call the API to get a prediction. 

To do:
+ Write the `gcloud` command to get a prediction.
+ Have a look at the [documentation](https://cloud.google.com/ml-engine/docs/deploying-models#create_a_model_resource).
+ Hint: `gcloud ai-platform predict`
+ Don't forget to start your command with `!`

After running the cell you should see something like this:

    [True, True]
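
The prediction command needs input instances in a file. For scikit-learn models, the file passed to `--json-instances` typically holds one JSON list of feature values per line, and you get one prediction back per line. The sketch below writes such a file. The helper `write_instances` is our own, and the feature values are made-up placeholders: the real rows must match the columns the model was trained on, which are defined in the trainer code.

```python
import json
import os
import tempfile


def write_instances(path, rows):
    """Write one JSON list per line (newline-delimited JSON)."""
    with open(path, "w") as handle:
        for row in rows:
            handle.write(json.dumps(list(row)) + "\n")


# Hypothetical feature rows; replace with values matching your model's inputs.
rows = [[35, 1, 0, 120.5], [52, 0, 1, 80.0]]
instances_path = os.path.join(tempfile.gettempdir(), "input.json")
write_instances(instances_path, rows)
print("Wrote", len(rows), "instances to", instances_path)
```

Two rows in the file correspond to the two values in the example output above.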

In [20]:
!gcloud

[False, True]


Copyright 2019 Google Inc. All Rights Reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.