## Controlling DataRobot Optimizer Through Python

This notebook shows how to interact with the DataRobot Optimizer App from Python, using the example Lending Club dataset **"Lending Club Sample 30.csv"**.
In this example, we look for the combination of values for revol_util, inq_last_6mths, loan_amnt, and dti that minimizes the predicted probability of a loan going bad.

### Result

A dataframe with the original features, the optimized prediction, and the optimized features. The optimized columns have the prefix **opt_**


### Assumptions

1. A model has been deployed 
2. An Optimization App has been created
3. The dataset and Jupyter notebook are in the same folder
4. The outcome of the optimization is added to the original dataframe

### Steps

1. Change key_dict 
    1. Get the URL from the application (see figure below)
    2. Put the values in key_dict
    3. Put the name of the dataset in ts_settings["filename"]
2. Read file into a dataframe
3. result_df = perform_optimization(data_df)


### Functions

*get_optimization* : makes a post request to perform optimization.  It returns optimized values for the constraint features, and the predicted target

*get_constraints*  : accesses the constraint features and their ranges from the Optimizer App

*create_constrain_from_df* : builds the constraint list from a dataframe instead of reading it from the app. To choose which features to constrain, change the "cfeatures" list in ts_settings and pass a dataframe from which the min and the max of each feature are estimated

*set_optimizer* : prepares the elements required by the Optimizer App



### Get the URL with the application ID and the token
The URL is available from the share icon in the top menu of the Optimization App.

<img src="Picture1.gif">
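The share URL carries both values needed for key_dict. As a minimal sketch, assuming the URL always has the form `https://<app id>.apps.datarobot.com/...?token=<token>` (as in the commented example in the next cell), the two parts can be extracted with the standard library. The helper name `parse_share_url` and the placeholder URL are hypothetical and not part of the original notebook.

```python
from urllib.parse import urlparse, parse_qs


def parse_share_url(share_url):
    """Split an Optimizer App share URL into (app_id, token).

    Assumes the app ID is the first label of the host name and the token
    is passed in the `token` query parameter.
    """
    parsed = urlparse(share_url)
    app_id = parsed.hostname.split(".")[0]
    tokens = parse_qs(parsed.query).get("token")
    if not tokens:
        raise ValueError("share URL has no 'token' query parameter")
    return app_id, tokens[0]


# Placeholder URL, not a real application
example_url = (
    "https://0123456789abcdef01234567.apps.datarobot.com/settings"
    "?token=EXAMPLE_TOKEN&lang=en&theme=light"
)
key_dict = {"Complete_data": parse_share_url(example_url)}
print(key_dict)
```

Building key_dict this way avoids copying the ID and the token by hand.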

In [62]:
import requests  # you could also use aiohttp instead of requests to make calls asynchronously
import json
import pandas as pd

# Get the ID and token from the Optimizer App by clicking the share triangle (in the top menu of the UI).
# Example share URL:
#https://606c32b7462514a41c85c492.apps.datarobot.com/settings?token=xMdWLVsq2C2dNvMCTMlbDzbZ6IcF_GwA4E76iYtJTGs&lang=en&theme=light

key_dict = {
    'Complete_data': (
                 '',  # ID
                 '',  # Token
                ),
}
#Set constraint features, and filename

ts_settings = {"cfeatures" :['revol_util','inq_last_6mths','loan_amnt','dti'],
              "filename":"Lending Club Sample 30.csv"}


In [63]:
def get_optimization(app, constraints):
    '''
    Given the credentials and data, get the best target under the constrained features
    :param app: 'Complete_data' in key_dict
    :param constraints: a dict containing the constrained features, the data, and the optimization type
    :return: optimization performance, and values tried for each constrained feature
    '''
    app_id, token = key_dict[app]
    url = f'https://{app_id}.apps.datarobot.com/api/optimize'
    headers = {"Authorization": f"Bearer {token}"}
    return requests.post(url, json=constraints, headers=headers).json()
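`get_optimization` returns whatever JSON the server sends and sets no timeout on the request. A more defensive variant might look like the sketch below. It is not part of the original notebook: the name `get_optimization_checked`, the `session` parameter, and the 60-second default are assumptions, and `session` is expected to behave like `requests.Session()`.

```python
def get_optimization_checked(session, app_id, token, constraints, timeout=60):
    """POST the constraints and return the parsed JSON, failing loudly on errors.

    `session` is any object with a requests-style `post` method,
    for example `requests.Session()`.
    """
    url = f"https://{app_id}.apps.datarobot.com/api/optimize"
    headers = {"Authorization": f"Bearer {token}"}
    response = session.post(url, json=constraints, headers=headers, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error body
    results = response.json()
    if "optimized_simulation" not in results:
        raise KeyError("response has no 'optimized_simulation' entry")
    return results
```

With the notebook's settings this could be called as `get_optimization_checked(requests.Session(), *key_dict['Complete_data'], constraints)`.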


In [64]:
def get_constraints(key_dict):
    '''
    Get a list of constraint features and their minimum and maximum values from the Optimization App  
    :param key_dict: the dictionary with the required app id and token
    :return: a list of constraint features and their min and max
    '''
    app_id, token = key_dict['Complete_data']
    url = f'https://{app_id}.apps.datarobot.com/api/application'
    headers = {"Authorization": f"Bearer {token}"}
    constraints = requests.get(url, headers=headers).json()
    constrain_list = constraints['constraints']
    # Drop 'feature_type' from each constraint entry (modifies the dicts in place)
    for x in constrain_list:
        x.pop('feature_type', None)
    return constrain_list

In [65]:
#You can build your constraint list from full dataset
#The min and the max are calculated from the full dataset
def create_constrain_from_df(df_tmp,cfeatures):
    '''
    Create a list of constraint features and their minimum and maximum values.  We assume we get the minimum and maximum 
    of the constraint features from a dataset
    :param df_tmp: dataframe 
    :param cfeatures: the name of the constrained features
    :return: a list of constraint features and their min and max
    '''
    constrain_list = []
    for c_name in cfeatures:
        # Cast to plain floats: NumPy integer types are not JSON serializable
        c_dict = {
            'feature': c_name,
            'info': {
                'min': float(df_tmp[c_name].min()),
                'max': float(df_tmp[c_name].max()),
                'is_int': False,
            },
        }
        constrain_list.append(c_dict)
    return constrain_list
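To see the structure this function produces, here is a small self-contained sketch on made-up data. The helper `constraints_from_df` is a hypothetical restatement of the function above that casts the bounds to plain Python floats so they can be serialized to JSON; the sample values are invented.

```python
import json

import pandas as pd


def constraints_from_df(df, cfeatures):
    """Build one {'feature', 'info'} entry per constrained feature."""
    return [
        {
            "feature": name,
            "info": {
                "min": float(df[name].min()),
                "max": float(df[name].max()),
                "is_int": False,
            },
        }
        for name in cfeatures
    ]


# Invented sample rows, only to illustrate the output format
sample_df = pd.DataFrame({"loan_amnt": [1000, 25000, 14000], "dti": [0.5, 22.4, 7.3]})
constrain_list = constraints_from_df(sample_df, ["loan_amnt", "dti"])
print(json.dumps(constrain_list, indent=2))
```

To use ranges derived from the data instead of the ones stored in the app, the call to `get_constraints` in `perform_optimization` would be replaced by a call to `create_constrain_from_df`.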

In [66]:
def set_optimizer(constrain_list,tmp1):
    '''
    Create the constraints dictionary. It must contain the constraint list and the row to be optimized
    :param constrain_list: list of features and their constraints
    :param tmp1: a row converted to a JSON-compatible dict
    :return: a dict
    '''
    constraints = {
        'settings': {
            # Specify if you want to maximize or minimize
            'targetDirection': 'min',

            # Specify your flex feature ranges
            'constraints': constrain_list,
        },
        # 'optimization': {'method': 'exhaust'},  # optional

        # Specify your fixed values
        'datapoint': tmp1,
    }
    return constraints
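The payload is a plain nested dictionary, so the optimization direction can be exposed as a parameter. The sketch below is a hypothetical variant (`build_payload` and its `direction` argument are not in the original). It assumes `'max'` is accepted alongside `'min'`, which the "maximize or minimize" comment in `set_optimizer` suggests but this notebook does not exercise.

```python
import json


def build_payload(constrain_list, datapoint, direction="min"):
    """Assemble the request body for one row.

    direction: 'min' or 'max' (assumed to be the only accepted values).
    """
    if direction not in ("min", "max"):
        raise ValueError("direction must be 'min' or 'max'")
    return {
        "settings": {
            "targetDirection": direction,
            "constraints": constrain_list,
        },
        "datapoint": datapoint,
    }


# Invented constraint and row, only to show the resulting structure
constrain_list = [{"feature": "dti", "info": {"min": 0.0, "max": 30.0, "is_int": False}}]
datapoint = {"loan_amnt": 10000.0, "dti": 12.5, "grade": "A"}
payload = build_payload(constrain_list, datapoint, direction="min")
print(json.dumps(payload, indent=2))
```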

In [67]:
def perform_optimization(df):
    '''
    Perform a batch optimization and add the results to the original dataframe
    :param df: dataframe with the dataset to be optimized
    :return: the original dataframe with the optimized prediction and optimized features added.
             The optimized feature names are prefixed with opt_
    '''
    #create constrain_list
    constrain_list = get_constraints(key_dict)
    #create new features and set them to 0.00
    df["opt_prediction"] = 0.00
    for feature in ts_settings["cfeatures"]:
        df["opt_"+feature] = 0.00

    # Build the constraints for each row and request an optimization from the DataRobot Optimizer App
    for index, row in df.iterrows():
        tmp1 = json.loads(row.to_json())
        constraints = set_optimizer(constrain_list, tmp1)
        for app in key_dict:  # if using aiohttp, you could start multiple optimizations and have them run simultaneously
            results = get_optimization(app, constraints)
            #Get the optimized target: prediction, and the values of the constrained features that resulted in the optimized target
            tp = results['optimized_simulation']
            df.loc[index,"opt_prediction"] = tp['prediction']
            for feature in tp['features']:
                f_name = feature["name"]
                df.loc[index,"opt_"+f_name] = feature["value"]
    return df
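The loop above sends one request at a time. The code comments mention aiohttp as a way to run requests concurrently; a thread pool from the standard library is a simpler alternative that keeps the blocking `requests` calls. The sketch below is an outline under stated assumptions, not the notebook's method: `optimize_rows`, `flatten_result`, and the `optimize_one` callback are hypothetical names, and the stub stands in for a real call such as `lambda payload: get_optimization('Complete_data', payload)`.

```python
from concurrent.futures import ThreadPoolExecutor


def flatten_result(results):
    """Turn one optimizer response into a flat dict of opt_ columns."""
    sim = results["optimized_simulation"]
    flat = {"opt_prediction": sim["prediction"]}
    for feature in sim["features"]:
        flat["opt_" + feature["name"]] = feature["value"]
    return flat


def optimize_rows(payloads, optimize_one, max_workers=4):
    """Run optimize_one on every payload concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return [flatten_result(r) for r in pool.map(optimize_one, payloads)]


def fake_optimize(payload):
    """Stub standing in for the HTTP call; returns dti halved."""
    return {
        "optimized_simulation": {
            "prediction": 0.1,
            "features": [{"name": "dti", "value": payload["datapoint"]["dti"] / 2}],
        }
    }


payloads = [{"datapoint": {"dti": 10.0}}, {"datapoint": {"dti": 4.0}}]
rows = optimize_rows(payloads, fake_optimize)
print(rows)
```

The resulting list can be turned into columns with `pd.DataFrame(rows, index=df.index)` and joined to the original dataframe.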


# RUN THIS 

In [None]:
data_df = pd.read_csv(ts_settings["filename"])
result_df = perform_optimization(data_df)

In [55]:
result_df[['opt_prediction','opt_revol_util','opt_inq_last_6mths','opt_loan_amnt','opt_dti']]


Unnamed: 0,opt_prediction,opt_revol_util,opt_inq_last_6mths,opt_loan_amnt,opt_dti
0,0.048569,1.383494,8.192054,27291.322287,0.134797
1,0.291327,48.439644,3.468059,28272.498063,1.917662
2,0.21389,94.653879,2.516591,1048.202766,7.263397
3,0.276012,89.474905,1.336023,29563.737299,19.058788
4,0.053004,46.291825,1.916218,27783.82429,18.628171
5,0.379326,92.36791,2.879611,24905.041876,22.379689
6,0.379326,92.36791,2.879611,24905.041876,22.379689
7,0.379326,92.36791,2.879611,24905.041876,22.379689
8,0.379326,92.36791,2.879611,24905.041876,22.379689
9,0.379326,92.36791,2.879611,24905.041876,22.379689


In [8]:
# See example of output from the optimizer
results

{'non_flexible_features': {'Target (1)': None,
  'Target (1) Prediction Value': None,
  'addr_state': None,
  'annual_inc': None,
  'custom_id': None,
  'delinq_2yrs': None,
  'desc': None,
  'earliest_cr_line': None,
  'emp_length': None,
  'emp_title': None,
  'funded_amnt': None,
  'grade': None,
  'home_ownership': None,
  'initial_list_status': None,
  'installment': None,
  'int_rate': None,
  'is_bad': None,
  'mths_since_last_delinq': None,
  'mths_since_last_major_derog': None,
  'mths_since_last_record': None,
  'open_acc': None,
  'opt_dti': 0.0,
  'opt_inq_last_6mths': 0.0,
  'opt_loan_amnt': 0.0,
  'opt_prediction': 0.0,
  'opt_revol_util': 0.0,
  'policy_code': None,
  'pub_rec': None,
  'purpose': None,
  'pymnt_plan': None,
  'revol_bal': None,
  'sub_grade': None,
  'term': None,
  'title': None,
  'total_acc': None,
  'url': None,
  'verification_status': None,
  'zip_code': None},
 'target': 'max',
 'constraints': [{'feature': 'revol_util',
   'feature_type': 'Numeri

In [9]:
df.loc[3,"opt_prediction"]

0.2709511209

In [61]:
result_df

Unnamed: 0,custom_id,loan_amnt,funded_amnt,term,int_rate,installment,grade,sub_grade,emp_title,emp_length,...,mths_since_last_major_derog,policy_code,is_bad,Target (1),Target (1) Prediction Value,opt_prediction,opt_revol_util,opt_inq_last_6mths,opt_loan_amnt,opt_dti
0,7.0,14000.0,8725.0,60 months,7.51%,174.88,A,A4,Peninsula Counseling Center,10+ years,...,,1.0,0.0,0.0,0.99874,0.048569,1.383494,8.192054,27291.322287,0.134797
1,8.0,3975.0,3975.0,60 months,17.58%,100.04,D,D4,Health Plan of Nevada,6 years,...,,1.0,0.0,0.0,0.910816,0.291327,48.439644,3.468059,28272.498063,1.917662
2,9.0,25000.0,25000.0,36 months,15.58%,873.76,D,D3,John Deere,2 years,...,,1.0,0.0,0.0,0.722703,0.21389,94.653879,2.516591,1048.202766,7.263397
3,10.0,10000.0,10000.0,36 months,8.00%,313.37,A,A3,,< 1 year,...,,1.0,0.0,0.0,0.953509,0.276012,89.474905,1.336023,29563.737299,19.058788
4,11.0,10000.0,10000.0,36 months,6.62%,307.04,A,A2,,3 years,...,,1.0,0.0,0.0,0.988155,0.053004,46.291825,1.916218,27783.82429,18.628171
5,,,,,,,,,,,...,,,,,,0.379326,92.36791,2.879611,24905.041876,22.379689
6,,,,,,,,,,,...,,,,,,0.379326,92.36791,2.879611,24905.041876,22.379689
7,,,,,,,,,,,...,,,,,,0.379326,92.36791,2.879611,24905.041876,22.379689
8,,,,,,,,,,,...,,,,,,0.379326,92.36791,2.879611,24905.041876,22.379689
9,,,,,,,,,,,...,,,,,,0.379326,92.36791,2.879611,24905.041876,22.379689
