![title](datarobot.jpg)

# DataRobot Python API Starter Activity 
# Part 3 — Predictions

## Pre-requisites


*Before starting this activity (Part 3), be sure to complete Part 1 (including waiting for Autopilot to complete).*
    


## Part 3 Objectives

In Part 3, you will:
- Deploy the model recommended by DataRobot
- Request wine quality predictions in batch and real-time modes

The goal in this activity is to predict the quality rating for a particular wine based on various characteristics of the wine, such as its acidity and alcohol and sugar content.

The wine data comes from the University of California, Irvine Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/wine+quality. 

Citation: *P. Cortez, A. Cerdeira, F. Almeida, T. Matos and J. Reis.
Modeling wine preferences by data mining from physicochemical properties. In Decision Support Systems, Elsevier, 47(4):547-553, 2009.*

---

## Setting Up Your Application

Run the same client setup you used in Part 1.

**Tip**: Before you run any code, we recommend enabling the Jupyter interactive shell setting that displays every output a cell produces rather than only the last one. This is particularly helpful as you work through the exercises.

In [None]:
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"

import datarobot as dr
import pandas as pd

In [1]:
# Remove or comment out this line if you are using DataRobot Notebooks
dr.Client(config_path='drconfig.yaml')

<datarobot.rest.RESTClientObject at 0x11701d950>

## 1. Viewing and Deploying Models 

### 1a. Retrieve the recommended model for the project

#### Note: Section 1a will be familiar to those of you who have completed Part 2

When `wait_for_autopilot()` returns, Autopilot has trained an initial set of models in the project. Because you are working in a separate notebook from the one where you started the project, you need to retrieve a reference to that project from DataRobot.

- We will start by using the DataRobot Client API `Project` class to retrieve an existing DataRobot project of our choice.

- To do this, you will need a project ID.

- There are a number of ways to get a handle to your project. For example:

    - You can go to the DataRobot application UI, open the project, and extract the project ID from the URL. For example:
        - `app.datarobot.com/projects/`**`60faf10710c1574209c6ddb0`**`/models`
    

    - You can get a list of all of your projects using `Project.list()` and find the right one by name. Try that now:
    

In [None]:
for p in dr.Project.list():
    print(p.id, p.project_name)
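
If the list is long, filtering by name narrows it down. The sketch below assumes only that each item exposes `id` and `project_name`, as the `Project` objects printed above do. `find_projects_by_name` is a hypothetical helper, not part of the DataRobot client, and the stand-in objects exist only so the snippet runs without an API connection.

```python
from types import SimpleNamespace


def find_projects_by_name(projects, fragment):
    """Return projects whose name contains `fragment` (case-insensitive)."""
    needle = fragment.lower()
    return [p for p in projects if needle in (p.project_name or "").lower()]


# Stand-in objects for illustration; in the notebook, pass dr.Project.list().
demo_projects = [
    SimpleNamespace(id="id-a", project_name="Python wine quality 2023-12-11"),
    SimpleNamespace(id="id-b", project_name="Some other project"),
]

matches = find_projects_by_name(demo_projects, "wine quality")
for p in matches:
    print(p.id, p.project_name)
```

In the notebook, `find_projects_by_name(dr.Project.list(), 'wine quality')` would return the candidate projects, and you can copy the ID you want from the result.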

Using the ID, you can get the right `Project` object.

In [2]:
projectId = '65774a827fce32b668b7c673'  # replace with your project ID
project = dr.Project.get(projectId)
print(project)

Project(Python wine quality 2023-12-11)


- A DataRobot Client API `Model` object represents a model calculated by DataRobot.
- The `Model` class provides numerous ways to evaluate, interact with and test models.
- See more in the docs [here](https://datarobot-public-api-client.readthedocs-hosted.com/en/latest-release/autodoc/api_reference.html#model).

Let's get a list of `Model` objects representing all the project's models.

In [3]:
models = project.get_models()
for m in models:
    print(m.id, m.model_type)

65774c39174d4613acef2eb0 RandomForest Regressor
65774b0758bfa0dcc1b287f5 RandomForest Regressor
65774b0758bfa0dcc1b287f3 Light Gradient Boosted Trees Regressor with Early Stopping
65774b0758bfa0dcc1b287f7 eXtreme Gradient Boosted Trees Regressor
65774b0758bfa0dcc1b287f4 Light Gradient Boosting on ElasticNet Predictions 
65774b0758bfa0dcc1b287f6 RuleFit Regressor
65774b0758bfa0dcc1b287f2 Generalized Additive2 Model
65774b0758bfa0dcc1b287f0 Ridge Regressor
65774b0758bfa0dcc1b287f1 Elastic-Net Regressor (mixing alpha=0.5 / Least-Squares Loss)


Evaluating and comparing the various characteristics of the models and choosing one to deploy is covered in the *Part 2 Notebook*.

To proceed with deployment, let's use the "recommended model" chosen automatically by DataRobot.

In [4]:
recommendedModel = dr.ModelRecommendation.get(project.id).get_model()
print(recommendedModel.id, recommendedModel.model_type)

65774c39174d4613acef2eb0 RandomForest Regressor


DataRobot automatically recommends the most accurate modeling approach and prepares a model for deployment.<br>
<br>
See more about the model recommendation process in DataRobot in the docs [here](https://docs.datarobot.com/en/docs/modeling/reference/model-detail/model-rec-process.html).<br>
See more about DataRobot's "prepare for deployment" in the docs [here](https://docs.datarobot.com/en/docs/modeling/reference/model-detail/model-rec-process.html#prepare-a-model-for-deployment).

----

### 1b. Deploy the recommended model

Now that you have selected a model (the recommended model, in this case), the next step is to deploy it to a prediction server. This makes it available for real-time or batch predictions.

**NOTE: This operation is different depending on whether you are using a trial or pay-as-you-go account, full Managed AI Cloud account, or an on-premises/self-managed DataRobot installation.**

### For **Managed AI Cloud Accounts** & **self-managed DataRobot** installations:

For full (as opposed to trial) accounts, you must specify a default prediction server.

Let's list the available prediction servers to help make our choice.

In [5]:
dr.PredictionServer.list()

[PredictionServer(https://datarobot-university.dynamic.orm.datarobot.com)]

In [6]:
predictionServer = dr.PredictionServer.list()[0]

deployment = dr.Deployment.create_from_learning_model(
    model_id=recommendedModel.id, 
    label='Wine Quality',
    description='Model for scoring wine quality',
    default_prediction_server_id=predictionServer.id
)
deployment

Deployment(Wine Quality)

### For **Trial** or **pay-as-you-go** accounts:

You don't specify a prediction server when deploying.

In [None]:
deployment = dr.Deployment.create_from_learning_model(
    model_id=recommendedModel.id, 
    label='Wine Quality',
    description='Model for scoring wine quality'
)
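
The two cases differ only in whether `default_prediction_server_id` is passed, so one code path can cover both. The sketch below builds the keyword arguments from whatever `dr.PredictionServer.list()` returns. It assumes that an account without dedicated prediction servers gets an empty list, which is worth confirming for your own account. `deployment_kwargs` is a hypothetical helper, and the stand-in server object and IDs are placeholders so the snippet runs offline.

```python
from types import SimpleNamespace


def deployment_kwargs(model_id, prediction_servers, label, description):
    """Build arguments for dr.Deployment.create_from_learning_model()."""
    kwargs = {"model_id": model_id, "label": label, "description": description}
    if prediction_servers:
        # Full accounts: use the first available server as the default.
        kwargs["default_prediction_server_id"] = prediction_servers[0].id
    return kwargs


# Stand-in for one item of dr.PredictionServer.list() (placeholder ID).
fake_servers = [SimpleNamespace(id="server-1")]

with_server = deployment_kwargs(
    "model-1", fake_servers, "Wine Quality", "Model for scoring wine quality"
)
without_server = deployment_kwargs(
    "model-1", [], "Wine Quality", "Model for scoring wine quality"
)
print(with_server)
print(without_server)

# In the notebook (creates a deployment, so run it only once):
# deployment = dr.Deployment.create_from_learning_model(
#     **deployment_kwargs(recommendedModel.id, dr.PredictionServer.list(),
#                         'Wine Quality', 'Model for scoring wine quality'))
```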

Just as with creating a project, your work in the API is reflected in the GUI as well. For any deployment created via the API, you can print the hyperlink to take you directly to the deployment in the GUI.

In [7]:
## Get hyperlink to deployment in the GUI
deployment.get_uri()

'https://app.datarobot.com/deployments/657765c0580b0b6e2f6ef874/overview'
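
The 24-character ID in that URL identifies the deployment. If you return to this work in a new session, you can recover the ID from a saved link and pass it to `dr.Deployment.get()`. Below is a minimal sketch of the parsing step using only the standard library. `deployment_id_from_url` is a hypothetical helper, and the pattern assumes the `/deployments/<id>/` URL layout shown above.

```python
import re


def deployment_id_from_url(url):
    """Extract the 24-character hex ID that follows '/deployments/'."""
    match = re.search(r"/deployments/([0-9a-f]{24})(?:/|$)", url)
    if match is None:
        raise ValueError(f"No deployment ID found in {url!r}")
    return match.group(1)


url = "https://app.datarobot.com/deployments/657765c0580b0b6e2f6ef874/overview"
deployment_id = deployment_id_from_url(url)
print(deployment_id)

# In the notebook: deployment = dr.Deployment.get(deployment_id)
```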

### *Congratulations, you've just deployed a model into production!*

------

## 2. Requesting Predictions

For this activity, we have provided you with a small test dataset containing wines and their feature values. You will practice scoring this data to predict the `quality` target using the batch prediction method and the real-time prediction method.

Review the data in the `winequality-white-score.csv` file.

### Request batch predictions

Start a prediction job that passes in the scoring data from the provided data file and saves the predictions to a local file called `winequality-white-predictions-231211.csv`.

A "passthrough column" allows you to pass a column value to the prediction engine, which will be included unchanged in the output. In this example, including a unique ID for each wine allows you to easily correlate rows in the scoring dataset with rows in the predictions output.

The prediction job might take a minute or so.
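
A passthrough column has to exist in the scoring data, so a quick look at the file header before submitting the job can save a round trip. The sketch below reads only the header row with the standard `csv` module. `missing_columns` is a hypothetical helper, and the in-memory sample is a stand-in for the real file that shows only a few of its columns.

```python
import csv
import io


def missing_columns(csv_file, required):
    """Return the names in `required` that are absent from the CSV header."""
    header = next(csv.reader(csv_file), [])
    return [name for name in required if name not in header]


# Stand-in for the scoring file; the real file has more feature columns.
sample = io.StringIO("wine_id,alcohol,pH\n100,11.45,3.15\n")
print(missing_columns(sample, ["wine_id"]))

# In the notebook:
# with open('./winequality-white-score.csv', newline='') as f:
#     assert not missing_columns(f, ['wine_id'])
```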

**NOTE**: If you are using DataRobot Notebooks and cannot use the 'localFile' option, you can use the [`predict_batch()`](https://datarobot-public-api-client.readthedocs-hosted.com/en/latest-release/reference/predictions/batch_predictions.html#make-batch-predictions-with-a-deployment) method with:  
```python
df_score = pd.read_csv('https://datarobot.box.com/shared/static/27n2c6xdhkmv9jw68om07vrmy48uz1qd.csv')

pred_res = deployment.predict_batch(source=df_score)
```

In [8]:
job = dr.BatchPredictionJob.score(
    deployment=deployment.id,
    passthrough_columns=['wine_id'],
    intake_settings={
        'type': 'localFile',
        'file': './winequality-white-score.csv'
    },
    output_settings={
        'type': 'localFile',
        'path': './winequality-white-predictions-231211.csv'
    }
)

As defined above, the output has been written to a file called `winequality-white-predictions-231211.csv`.<br>
Let's read it into our notebook using Pandas to take a closer look at the prediction results.

**Note**: If your account has the [Model Deployment Approval Workflow](https://app.datarobot.com/docs/mlops/governance/dep-admin.html) enabled, the output will include a column called `DEPLOYMENT_APPROVAL_STATUS`. For this activity, you can disregard those values.

In [9]:
pred_res = pd.read_csv('winequality-white-predictions-231211.csv')
pred_res.head(n=5)

Unnamed: 0,quality_PREDICTION,DEPLOYMENT_APPROVAL_STATUS,wine_id
0,6.214605,APPROVED,100
1,6.557977,APPROVED,101
2,5.651748,APPROVED,102
3,6.450522,APPROVED,103
4,5.818415,APPROVED,104
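
Because `quality` ratings in the source data are whole numbers while the regression model returns fractional values, it can help to look at both the raw predictions and their rounded equivalents. The sketch below uses the five rows printed above (without the index column) and the standard library only. Rounding to the nearest integer is an illustrative choice made here, not something DataRobot applies to the output.

```python
import csv
import io
from collections import Counter
from statistics import mean

# The five prediction rows shown above, without the index column.
sample_csv = """quality_PREDICTION,DEPLOYMENT_APPROVAL_STATUS,wine_id
6.214605,APPROVED,100
6.557977,APPROVED,101
5.651748,APPROVED,102
6.450522,APPROVED,103
5.818415,APPROVED,104
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Key each prediction by its passthrough wine_id.
predictions = {int(r["wine_id"]): float(r["quality_PREDICTION"]) for r in rows}

average = mean(predictions.values())
rounded_counts = Counter(round(p) for p in predictions.values())

print(f"mean prediction: {average:.3f}")
print("rounded ratings:", dict(rounded_counts))
```

With the full predictions file, the same two summaries are available in Pandas through `pred_res['quality_PREDICTION'].mean()` and `pred_res['quality_PREDICTION'].round().value_counts()`.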


To gain insights into each prediction being made, you can also include Prediction Explanations in the output. 

Prediction Explanations illustrate what drives predictions on a row-by-row basis. Read more about Prediction Explanations in the documentation [here](https://docs.datarobot.com/en/docs/modeling/analyze-models/understand/pred-explain/predex-overview.html). 

In [10]:
job = dr.BatchPredictionJob.score(
    deployment=deployment.id,
    passthrough_columns=['wine_id'],
    intake_settings={
        'type': 'localFile',
        'file': './winequality-white-score.csv'
    },
    output_settings={
        'type': 'localFile',
        'path': './winequality-white-predictions-top3predexp.csv'
    },
    max_explanations=3
)

In [11]:
pred_pe_res = pd.read_csv('winequality-white-predictions-top3predexp.csv')
pred_pe_res.head(n=5)

Unnamed: 0,quality_PREDICTION,EXPLANATION_1_FEATURE_NAME,EXPLANATION_1_STRENGTH,EXPLANATION_1_ACTUAL_VALUE,EXPLANATION_1_QUALITATIVE_STRENGTH,EXPLANATION_2_FEATURE_NAME,EXPLANATION_2_STRENGTH,EXPLANATION_2_ACTUAL_VALUE,EXPLANATION_2_QUALITATIVE_STRENGTH,EXPLANATION_3_FEATURE_NAME,EXPLANATION_3_STRENGTH,EXPLANATION_3_ACTUAL_VALUE,EXPLANATION_3_QUALITATIVE_STRENGTH,DEPLOYMENT_APPROVAL_STATUS,wine_id
0,6.214605,alcohol,0.113683,11.45,++,chlorides,0.09515,0.021,++,pH,-0.094613,3.15,--,APPROVED,100
1,6.557977,density,0.32927,0.9897,+++,alcohol,0.240535,12.05,+++,fixed acidity,0.094113,5.0,++,APPROVED,101
2,5.651748,alcohol,-0.26673,9.7,---,chlorides,0.133633,0.032,++,residual sugar,0.105565,12.4,++,APPROVED,102
3,6.450522,alcohol,0.209306,11.9,+++,residual sugar,-0.179142,1.6,--,pH,0.138487,3.33,++,APPROVED,103
4,5.818415,alcohol,-0.152883,10.0,---,free sulfur dioxide,-0.0968,16.0,---,pH,0.059776,3.49,++,APPROVED,104
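
The explanation columns come in a wide layout, with one group of four columns per explanation. Reshaping them into one record per wine and rank makes questions such as "which feature is most often the strongest driver?" easy to answer. The sketch below does this with the standard library on the five rows printed above. `explanations_long` is a hypothetical helper, and the column names are taken from the output shown here.

```python
import csv
import io
from collections import Counter

SUFFIXES = ["FEATURE_NAME", "STRENGTH", "ACTUAL_VALUE", "QUALITATIVE_STRENGTH"]
header = (
    ["quality_PREDICTION"]
    + [f"EXPLANATION_{n}_{s}" for n in (1, 2, 3) for s in SUFFIXES]
    + ["DEPLOYMENT_APPROVAL_STATUS", "wine_id"]
)

# The five rows shown above, without the index column or header line.
sample_rows = """\
6.214605,alcohol,0.113683,11.45,++,chlorides,0.09515,0.021,++,pH,-0.094613,3.15,--,APPROVED,100
6.557977,density,0.32927,0.9897,+++,alcohol,0.240535,12.05,+++,fixed acidity,0.094113,5.0,++,APPROVED,101
5.651748,alcohol,-0.26673,9.7,---,chlorides,0.133633,0.032,++,residual sugar,0.105565,12.4,++,APPROVED,102
6.450522,alcohol,0.209306,11.9,+++,residual sugar,-0.179142,1.6,--,pH,0.138487,3.33,++,APPROVED,103
5.818415,alcohol,-0.152883,10.0,---,free sulfur dioxide,-0.0968,16.0,---,pH,0.059776,3.49,++,APPROVED,104
"""


def explanations_long(rows, max_explanations=3):
    """Flatten EXPLANATION_<n>_* columns into one record per (wine, rank)."""
    records = []
    for row in rows:
        for rank in range(1, max_explanations + 1):
            prefix = f"EXPLANATION_{rank}_"
            feature = row[prefix + "FEATURE_NAME"]
            if not feature:
                continue  # skip empty explanation slots
            records.append({
                "wine_id": int(row["wine_id"]),
                "rank": rank,
                "feature": feature,
                "strength": float(row[prefix + "STRENGTH"]),
            })
    return records


rows = list(csv.DictReader(io.StringIO(sample_rows), fieldnames=header))
long_form = explanations_long(rows)

# Count how often each feature appears as the first-ranked explanation.
top_drivers = Counter(r["feature"] for r in long_form if r["rank"] == 1)
print(top_drivers.most_common())
```

For the real output file, `csv.DictReader(open(path, newline=''))` without `fieldnames` reads the header from the file itself, so the `header` list above is only needed for this inline sample.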
