## Getting Predictions and Prediction Explanations

**Author**: Thodoris Petropoulos

**Label**: Model Deployment

### Scope

This notebook shows how to get predictions and prediction explanations from a trained model using the DataRobot Python API.

### Background

There are two main ways to get predictions out of DataRobot from Python: the Modeling API and the Prediction API.

**Modeling API**: You can use the Modeling API if you work in Python or R, and there are multiple ways to interact with it.

**Prediction API**: Any project can be scored through the Prediction API if you have prediction servers. This is a simple REST API. Click on a model in the UI, then "Deploy Model" and "Activate now". You'll have access to a Python code snippet to help you interact with it. You can also deploy the model through the Python API.


For the purposes of this tutorial, we will focus on the Modeling API. Note that this method of scoring uses modeling workers, so if someone else is using those workers for modeling, your prediction request has to wait. This method of scoring is good for testing but not for deployment. For actual deployment, please deploy the model as a REST API through DataRobot's UI or through the API.

### Requirements

- Python version 3.7.3
- DataRobot API version 2.19.0

Small adjustments might be needed depending on the Python version and DataRobot API version you are using.

Full documentation of the Python package can be found here: https://datarobot-public-api-client.readthedocs-hosted.com

It is assumed you already have a DataRobot <code>Project</code> object and a DataRobot <code>Model</code> object.

#### Import Libraries

In [None]:
import datarobot as dr

#### Requesting Predictions

Before requesting predictions, upload the dataset you wish to score via <code>Project.upload_dataset</code>. Previously uploaded datasets can be listed with <code>Project.get_datasets</code>. When uploading the dataset, you can provide the path to a local file, a file object, raw file content, a pandas.DataFrame object, or the URL of a publicly available dataset.

In [None]:
# Upload the prediction dataset
dataset_from_path = project.upload_dataset('path/file')

# Request predictions for the uploaded dataset
predict_job = model.request_predictions(dataset_from_path.id)

# Block until the prediction job finishes, then fetch the results
predictions = predict_job.get_result_when_complete()

predictions.head()
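
Since the upload step also accepts a file object, a small scoring set can be assembled in memory instead of being written to disk first. The following is a minimal sketch using only the standard library. The helper name `records_to_csv_buffer` and the sample records are hypothetical, and the final call to `project.upload_dataset` is left as a comment because it needs a live project.

```python
import csv
import io


def records_to_csv_buffer(records):
    """Serialize a list of dicts into an in-memory, UTF-8 encoded CSV file object."""
    text = io.StringIO(newline="")
    writer = csv.DictWriter(text, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    # Encode to bytes; a freshly created BytesIO is positioned at the start
    return io.BytesIO(text.getvalue().encode("utf-8"))


# Hypothetical rows; the column names must match the training data
records = [
    {"feature_a": 1.5, "feature_b": "red"},
    {"feature_a": 2.0, "feature_b": "blue"},
]
csv_buffer = records_to_csv_buffer(records)

# With a live project (assumed to exist, as stated in the requirements):
# dataset_from_buffer = project.upload_dataset(csv_buffer)
```

A pandas.DataFrame can be passed directly, so this route is mainly useful when the rows come from plain Python structures.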

#### Requesting Prediction Explanations

To create prediction explanations for a particular model and dataset, you must first compute feature impact for the model via <code>dr.Model.get_or_request_feature_impact()</code> and then initialize prediction explanations for that model.

In [None]:
model.get_or_request_feature_impact()

pei = dr.PredictionExplanationsInitialization.create(project.id, model.id)

# Wait for the prediction explanations initialization to finish
pei.get_result_when_complete()

# Request prediction explanations for the uploaded dataset
pe_job = dr.PredictionExplanations.create(project.id, model.id, dataset_from_path.id)

# Block until the job completes
pe = pe_job.get_result_when_complete()

df_pe = pe.get_all_as_dataframe()
df_pe.head()

#### Time Series Projects Caveats

Prediction datasets for time series projects are uploaded the same way as for other projects. However, when uploading one, you can specify an additional parameter, forecastPoint (<code>forecast_point</code> in the Python client). The forecast point identifies the point in time relative to which predictions are generated. If you do not specify one, the server chooses the most recent possible forecast point. The forecast window set in the project's partitioning options determines how far into the future from the forecast point predictions are calculated.

**Important Note**:
When uploading a dataset for scoring in a Time Series project, you need to include the actual values from previous dates, depending on the feature derivation setup. For example, if the feature derivation window is -10 to -1 days and you want to forecast sales for the next 3 days, your dataset would look like this:

| date       | sales | Known_in_advance_feature |
|------------|-------|--------------------------|
| 01/01/2019 | 130   | AAA                      |
| 02/01/2019 | 123   | VVV                      |
| 03/01/2019 | 412   | BBB                      |
| 04/01/2019 | 321   | DDD                      |
| 05/01/2019 | 512   | DDD                      |
| 06/01/2019 | 623   | VVV                      |
| 07/01/2019 | 356   | CCC                      |
| 08/01/2019 | 133   | AAA                      |
| 09/01/2019 | 356   | CCC                      |
| 10/01/2019 | 654   | DDD                      |
| 11/01/2019 |       | BBB                      |
| 12/01/2019 |       | CCC                      |
| 13/01/2019 |       | DDD                      |

DataRobot will detect your forecast point as 10/01/2019, calculate the lag features, and make predictions for the missing dates.
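
The layout above can also be generated programmatically. The sketch below uses only the standard library to append empty-target rows for the forecast horizon and to report the last date that has an actual value, which corresponds to the forecast point described above. The helper names `build_scoring_rows` and `latest_date_with_actuals` are hypothetical, and the values are copied from the table.

```python
from datetime import date, timedelta


def build_scoring_rows(history, known_in_advance, horizon_days):
    """Return history rows followed by `horizon_days` rows with an empty target.

    history: list of (date, sales) tuples in chronological order.
    known_in_advance: dict mapping every date (past and future) to its value.
    """
    rows = [
        {"date": day, "sales": sales, "Known_in_advance_feature": known_in_advance[day]}
        for day, sales in history
    ]
    last_day = history[-1][0]
    for offset in range(1, horizon_days + 1):
        day = last_day + timedelta(days=offset)
        # The target stays empty for the dates to be forecast
        rows.append(
            {"date": day, "sales": None, "Known_in_advance_feature": known_in_advance[day]}
        )
    return rows


def latest_date_with_actuals(rows):
    """Return the most recent date whose target value is present."""
    return max(row["date"] for row in rows if row["sales"] is not None)


start = date(2019, 1, 1)
sales = [130, 123, 412, 321, 512, 623, 356, 133, 356, 654]
feature_values = ["AAA", "VVV", "BBB", "DDD", "DDD", "VVV", "CCC",
                  "AAA", "CCC", "DDD", "BBB", "CCC", "DDD"]

history = [(start + timedelta(days=i), value) for i, value in enumerate(sales)]
known_in_advance = {start + timedelta(days=i): v for i, v in enumerate(feature_values)}

rows = build_scoring_rows(history, known_in_advance, horizon_days=3)
forecast_point = latest_date_with_actuals(rows)
```

Here `forecast_point` is 10 January 2019, matching the date DataRobot is described as detecting for this dataset.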

#### Getting Predictions from a DataRobot Deployment

If you have used MLOps to deploy a model (DataRobot or Custom), you will have access to an API that you can call using an API client. Below is a Python script implementing such a client. You can create your own API client in the language of your choice!

In [None]:
"""
Usage:
    python datarobot-predict.py <input-file.csv>
 
This example uses the requests library which you can install with:
    pip install requests
We highly recommend that you update SSL certificates with:
    pip install -U urllib3[secure] certifi
"""
import sys
import json
import requests
 
API_URL = 'Find this in Deployment -> Overview -> Summary -> Endpoint'
API_KEY = 'YOUR_API_KEY'
DATAROBOT_KEY = 'Find this in Deployment -> Predictions -> Prediction API -> Single mode -> on top of the code sample'
 
DEPLOYMENT_ID = 'YOUR_DEPLOYMENT_ID'
MAX_PREDICTION_FILE_SIZE_BYTES = 52428800  # 50 MB
 
 
class DataRobotPredictionError(Exception):
    """Raised if there are issues getting predictions from DataRobot"""
 
 
def make_datarobot_deployment_predictions(data, deployment_id):
    """
    Make predictions on data provided using DataRobot deployment_id provided.
    See docs for details:
         https://app.eu.datarobot.com/docs/users-guide/predictions/api/new-prediction-api.html
 
    Parameters
    ----------
    data : str
        Feature1,Feature2
        numeric_value,string
    deployment_id : str
        The ID of the deployment to make predictions with.
 
    Returns
    -------
    Response schema:
        https://app.eu.datarobot.com/docs/users-guide/predictions/api/new-prediction-api.html#response-schema
 
    Raises
    ------
    DataRobotPredictionError if there are issues getting predictions from DataRobot
    """
    # Set HTTP headers. The charset should match the contents of the file.
    headers = {
        'Content-Type': 'text/plain; charset=UTF-8',
        'Authorization': 'Bearer {}'.format(API_KEY),
        'DataRobot-Key': DATAROBOT_KEY,
    }
 
    url = API_URL.format(deployment_id=deployment_id)
    # Make API request for predictions
    predictions_response = requests.post(
        url,
        data=data,
        headers=headers,
    )
    _raise_dataroboterror_for_status(predictions_response)
    # Return a Python dict following the schema in the documentation
    return predictions_response.json()
 
 
def _raise_dataroboterror_for_status(response):
    """Raise DataRobotPredictionError if the request fails along with the response returned"""
    try:
        response.raise_for_status()
    except requests.exceptions.HTTPError:
        err_msg = '{code} Error: {msg}'.format(
            code=response.status_code, msg=response.text)
        raise DataRobotPredictionError(err_msg)
 
 
def main(filename, deployment_id):
    """
    Return an exit code on script completion or error. Codes > 0 are errors to the shell.
    Also useful as a usage demonstration of
    `make_datarobot_deployment_predictions(data, deployment_id)`
    """
    if not filename:
        print(
            'Input file is required argument. '
            'Usage: python datarobot-predict.py <input-file.csv>')
        return 1
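    # Read the file as bytes; the context manager closes the handle promptly
    with open(filename, 'rb') as input_file:
        data = input_file.read()
    # len() gives the payload size; sys.getsizeof() would add object overhead
    data_size = len(data)
    if data_size >= MAX_PREDICTION_FILE_SIZE_BYTES:
        print(
            'Input file is too large: {} bytes. '
            'Max allowed size is: {} bytes.'.format(
                data_size, MAX_PREDICTION_FILE_SIZE_BYTES))
        return 1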
    try:
        predictions = make_datarobot_deployment_predictions(data, deployment_id)
    except DataRobotPredictionError as exc:
        print(exc)
        return 1
    print(json.dumps(predictions, indent=4))
    return 0
 
 
if __name__ == "__main__":
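    # Fall back to None so main() can print the usage message when no file is given
    filename = sys.argv[1] if len(sys.argv) > 1 else None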
    sys.exit(main(filename, DEPLOYMENT_ID))
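
The script prints the raw JSON response. For downstream use it can be convenient to flatten that response into one record per scored row. The sketch below assumes the response has a top-level `data` list whose items carry `rowId`, `prediction`, and, for classification, `predictionValues` entries with `label` and `value`. Verify this against the response schema linked in the docstring above. The helper name `flatten_predictions` and the sample payload are hypothetical.

```python
def flatten_predictions(response):
    """Turn a prediction response dict into one flat dict per scored row."""
    rows = []
    for item in response.get("data", []):
        row = {"row_id": item.get("rowId"), "prediction": item.get("prediction")}
        # Classification responses are assumed to list one entry per class
        for class_entry in item.get("predictionValues", []):
            row["probability_{}".format(class_entry["label"])] = class_entry["value"]
        rows.append(row)
    return rows


# Hypothetical payload shaped like the assumed schema, not real model output
sample_response = {
    "data": [
        {
            "rowId": 0,
            "prediction": "yes",
            "predictionValues": [
                {"label": "yes", "value": 0.8},
                {"label": "no", "value": 0.2},
            ],
        },
        {
            "rowId": 1,
            "prediction": "no",
            "predictionValues": [
                {"label": "yes", "value": 0.1},
                {"label": "no", "value": 0.9},
            ],
        },
    ]
}

flat = flatten_predictions(sample_response)
```

In the script above, the same helper could be applied to the return value of `make_datarobot_deployment_predictions` before printing or saving it.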
 