# Integrating Key Performance Indicators and Hyperparameters Into Model Manager
SAS Model Manager generates certain Key Performance Indicators (KPIs) automatically based on user-created performance definitions. However, we may want to include other KPIs that measure aspects of a model not tracked by SAS Model Manager.

This can be done by performing local tests on models we've passed to SAS Model Manager, then passing up the resulting values as custom KPI values.

For certain Python models, python-sasctl will also generate a JSON file containing the hyperparameters of the model, making them easily accessible for future use.
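
As a rough illustration of what such a file can hold, the sketch below serializes a scikit-learn estimator's hyperparameters using `get_params()`. This is not the code python-sasctl runs internally, and the layout of the file that python-sasctl writes may differ; the variable names are made up for this example.

```python
import json

from sklearn.tree import DecisionTreeClassifier

# Illustrative only: collect the estimator's hyperparameters as a plain dict.
example_model = DecisionTreeClassifier(max_depth=3, random_state=42)
example_params = example_model.get_params()

# For this estimator every value is JSON-serializable (numbers, strings, None).
example_json = json.dumps(example_params, indent=4, sort_keys=True)
print(example_json)
```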

### Python Package Imports

In [1]:
# Standard Library
from pathlib import Path
import warnings

# Third Party
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Application Specific
import sasctl.pzmm as pzmm
from sasctl import Session
from sasctl.pzmm.model_parameters import ModelParameters as mp

# Global Package Options
pd.options.mode.chained_assignment = None  # default="warn"
warnings.simplefilter(action="ignore", category=FutureWarning)

### Building the Model
For more information on building models for SAS Model Manager, please reference the [binary classification model](/pzmm_binary_classification_model_import.ipynb), [regression model](/pzmm_regression_model_import.ipynb), and [multiclass classification model](/pzmm_multi_classification_model_import.ipynb) notebooks.

In [2]:
hmeq_data = pd.read_csv("data/hmeq.csv", sep=",")

In [3]:
predictorColumns = ["LOAN", "MORTDUE", "VALUE", "YOJ", "DEROG", "DELINQ", "CLAGE", "NINQ", "CLNO", "DEBTINC"]

target_column = "BAD"
x = hmeq_data[predictorColumns]
y = hmeq_data[target_column]

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=42)

x_test.fillna(x_test.mean(), inplace=True)
x_train.fillna(x_train.mean(), inplace=True)

In [4]:
tree_model = DecisionTreeClassifier(random_state=42)
tree_model = tree_model.fit(x_train, y_train)

In [5]:
y_tree_predict = tree_model.predict(x_test)
y_tree_proba = tree_model.predict_proba(x_test)

In [13]:
path = Path.cwd() / "data/hmeqModels/DTC_KPIs/"
prefix = "DTC_KPIsV1"
score_metrics = ["EM_CLASSIFICATION", "EM_EVENTPROBABILITY"]
pzmm.PickleModel.pickle_trained_model(tree_model, prefix, path)

Model DTC_KPIsV1 was successfully pickled and saved to C:\Users\sclind\Documents\Python Scripts\GitHub\sassoftware\python-sasctl\examples\data\hmeqModels\DTC_KPIs\DTC_KPIsV1.pickle.


In [14]:
pzmm.JSONFiles.write_var_json(hmeq_data[predictorColumns], is_input=True, json_path=path)

output_var = pd.DataFrame(columns=score_metrics, data=[["A", 0.5]])
pzmm.JSONFiles.write_var_json(output_var, is_input=False, json_path=path)

pzmm.JSONFiles.write_model_properties_json(
    model_name=prefix,
    model_desc=f"Description for the {prefix} model.",
    target_variable=target_column,
    model_algorithm="Decision tree",
    model_function="Classification",
    target_values=["1", "0"],
    json_path=path,
    modeler="sasdemo"
)

pzmm.JSONFiles.write_file_metadata_json(model_prefix=prefix, json_path=path)

inputVar.json was successfully written and saved to C:\Users\sclind\Documents\Python Scripts\GitHub\sassoftware\python-sasctl\examples\data\hmeqModels\DTC_KPIs\inputVar.json
outputVar.json was successfully written and saved to C:\Users\sclind\Documents\Python Scripts\GitHub\sassoftware\python-sasctl\examples\data\hmeqModels\DTC_KPIs\outputVar.json
ModelProperties.json was successfully written and saved to C:\Users\sclind\Documents\Python Scripts\GitHub\sassoftware\python-sasctl\examples\data\hmeqModels\DTC_KPIs\ModelProperties.json
fileMetadata.json was successfully written and saved to C:\Users\sclind\Documents\Python Scripts\GitHub\sassoftware\python-sasctl\examples\data\hmeqModels\DTC_KPIs\fileMetadata.json


In [15]:
import getpass

# Prompt for credentials instead of storing them in the notebook
username = getpass.getpass("Username: ")
password = getpass.getpass("Password: ")
host = "demo.sas.com"  # Replace with the hostname of your SAS Viya environment

sess = Session(host, username, password, protocol="http")

train_proba = tree_model.predict_proba(x_train)

train_data = pd.concat([y_train.reset_index(drop=True), pd.Series(data=train_proba[:, 1])], axis=1)
test_data = pd.concat([y_test.reset_index(drop=True), pd.Series(data=y_tree_proba[:, 1])], axis=1)

pzmm.JSONFiles.calculate_model_statistics(
    target_value=1, 
    train_data=train_data, 
    test_data=test_data, 
    json_path=path
)

dmcas_fitstat.json was successfully written and saved to C:\Users\sclind\Documents\Python Scripts\GitHub\sassoftware\python-sasctl\examples\data\hmeqModels\DTC_KPIs\dmcas_fitstat.json
dmcas_roc.json was successfully written and saved to C:\Users\sclind\Documents\Python Scripts\GitHub\sassoftware\python-sasctl\examples\data\hmeqModels\DTC_KPIs\dmcas_roc.json
dmcas_lift.json was successfully written and saved to C:\Users\sclind\Documents\Python Scripts\GitHub\sassoftware\python-sasctl\examples\data\hmeqModels\DTC_KPIs\dmcas_lift.json


In [16]:
model = pzmm.ImportModel.import_model(
    model_files=path, 
    model_prefix=prefix, 
    project="HMEQ_CustomKPIsV3", 
    input_data=x, 
    predict_method= [tree_model.predict_proba, [int, int]], 
    score_metrics=score_metrics, 
    target_values=["1", "0"],
    model_file_name=prefix + ".pickle",
    missing_values=True,
    overwrite_model=True
)

Model score code was written successfully to C:\Users\sclind\Documents\Python Scripts\GitHub\sassoftware\python-sasctl\examples\data\hmeqModels\DTC_KPIs\score_DTC_KPIsV1.py and uploaded to SAS Model Manager.
All model files were zipped to C:\Users\sclind\Documents\Python Scripts\GitHub\sassoftware\python-sasctl\examples\data\hmeqModels\DTC_KPIs.



Model was successfully imported into SAS Model Manager as DTC_KPIsV1 with the following UUID: 615e4328-6341-415e-97c9-654f6d3e2baa.


### Updating Model and Project Properties
In order to allow for performance definitions to be run in SAS Model Manager, certain properties need to be set for both the model and the project.

In [17]:
from sasctl._services.model_repository import ModelRepository as mr

model = mr.get_model(model[0].id)

model["targetEvent"] = "1"
model["targetVariable"] = "BAD"
model["function"] = "Classification"
model["targetLevel"] = "Binary"
model["eventProbVar"] = "EM_EVENTPROBABILITY"

model = mr.update_model(model)

In [18]:
project = mr.get_project(model.projectName)

variables = model["inputVariables"] + model["outputVariables"]

project["targetVariable"] = "BAD"
project["variables"] = variables
project["targetLevel"] = "Binary"
project["targetEventValue"] = "1"
project["classTargetValues"] = ".5"
project["function"] = "Classification"
project["eventProbabilityVariable"] = "EM_EVENTPROBABILITY"

project = mr.update_project(project)

### Hyperparameter Generation
If the hyperparameter JSON file was not generated automatically, the following code block generates it and adds it to SAS Model Manager.

In [19]:
mp.generate_hyperparameters(tree_model, prefix, path)

with open(path / f"{prefix}Hyperparameters.json", "r") as f:
    mr.add_model_content(model, f, f"{prefix}Hyperparameters.json")
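
A minimal sketch of how one might check whether the file already exists before regenerating it. The helper names below are hypothetical and not part of python-sasctl; the only thing taken from this notebook is the `{prefix}Hyperparameters.json` file name used in the cell above.

```python
from pathlib import Path


def hyperparameter_file(model_dir, model_prefix):
    """Return the expected location of the hyperparameter file (hypothetical helper)."""
    return Path(model_dir) / f"{model_prefix}Hyperparameters.json"


def needs_generation(model_dir, model_prefix):
    """Return True when no hyperparameter file is present in model_dir."""
    return not hyperparameter_file(model_dir, model_prefix).is_file()
```

With the variables from this notebook, `needs_generation(path, prefix)` could guard the call to `mp.generate_hyperparameters`.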

Once the model has been uploaded to SAS Model Manager, custom hyperparameters can be added to the hyperparameter JSON file using the `add_hyperparameters` function. The custom hyperparameters are passed in as keyword arguments.

In [20]:
mp.add_hyperparameters(model, example=1)
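
To illustrate the keyword-argument pattern in isolation, here is a small sketch of how `name=value` pairs can be folded into an existing dictionary. The function below is hypothetical and only mimics the calling convention; it does not reproduce how python-sasctl stores the values or how the file on the server is structured.

```python
def merge_custom_hyperparameters(existing, **kwargs):
    """Return a copy of `existing` with each keyword argument added as a key (hypothetical helper)."""
    updated = dict(existing)  # copy so the caller's dictionary is left untouched
    updated.update(kwargs)
    return updated


current = {"max_depth": None, "random_state": 42}
updated = merge_custom_hyperparameters(current, example=1)
print(updated)
```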


### Performance Definition
To create a performance definition, we first have to upload the data for the performance definition to run on. All tables used for performance definitions should be named using the following format:

{Table Prefix}\_{Time}\_{Time Label}
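
As a quick sketch, the naming pattern can be built with an f-string. The helper name is hypothetical; the prefix and quarter labels match the files used in this notebook.

```python
def performance_table_name(table_prefix, time, time_label):
    """Build a table name of the form {Table Prefix}_{Time}_{Time Label}."""
    return f"{table_prefix}_{time}_{time_label}"


table_names = [performance_table_name("HMEQPERF", q, f"Q{q}") for q in range(1, 5)]
print(table_names)
```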

In [23]:
from sasctl._services.cas_management import CASManagement as cas

for quarter in range(1, 5):  # avoid reusing `x`, which holds the predictor data
    cas.upload_file(
        file=f"data/HMEQPERF_{quarter}_Q{quarter}.csv",
        name=f"HMEQPERF_{quarter}_Q{quarter}",
    )
    print(quarter)

Unexpected exception formatting exception. Falling back to standard exception


Traceback (most recent call last):
  File "C:\Users\sclind\.conda\envs\dev-py38\lib\site-packages\IPython\core\interactiveshell.py", line 3433, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-23-5f03f39f6a35>", line 4, in <module>
    cas.upload_file(
  File "C:\Users\sclind\.conda\envs\dev-py38\lib\site-packages\sasctl\_services\cas_management.py", line 433, in upload_file
    tbl = cls.post(
  File "C:\Users\sclind\.conda\envs\dev-py38\lib\site-packages\sasctl\_services\service.py", line 113, in post
    return cls.request("post", *args, **kwargs)
  File "C:\Users\sclind\.conda\envs\dev-py38\lib\site-packages\sasctl\_services\service.py", line 93, in request
    return core.request(verb, path, session, format_, **kwargs)
  File "C:\Users\sclind\.conda\envs\dev-py38\lib\site-packages\sasctl\core.py", line 2033, in request
    raise HTTPError(
urllib.error.HTTPError: HTTP Error 409: {"version":2,"httpStatusCode":409,"errorCode":12226,"message":"

After uploading the data, the performance definition can be created. When the performance definition is run, the KPIs are generated within SAS Model Manager.

In [24]:
from sasctl._services.model_management import ModelManagement as mm

performance_task = mm.create_performance_definition(table_prefix="hmeqperf", project=project, scoring_required=True)

In [25]:
performance_definition = mm.list_performance_definitions(filter=f"eq(projectId,\"{project.id}\")")

performance_job = mm.execute_performance_definition(performance_definition[0].id)

Once the performance definition has run, the hyperparameter JSON file can be updated to include the KPIs that were generated. This step is optional, but it can be helpful when analyzing which hyperparameters lead to better KPIs.

In [26]:
mp.update_kpis(project)

  if pd.__version__ >= StrictVersion("1.0.3"):


SystemError: No KPI table exists for project HMEQ_CustomKPIsV3. Please confirm that the performance definition completed or custom KPIs have been uploaded successfully.

### Custom KPIs
It is also possible to generate custom key performance indicators and pass them up to SAS Model Manager. Below, using the same data sets that were used in the SAS performance definition, the Jaccard score is calculated and then passed up to the KPI table in SAS Model Manager.

In [None]:
from sklearn.metrics import jaccard_score

jaccard_list = list()
time_labels = list()
time_sks = list()
name = ["jaccard" for _ in range(4)]

# Score each quarterly table locally and collect one Jaccard value per quarter
for quarter in range(1, 5):
    test_data = pd.read_csv(f"data/HMEQPERF_{quarter}_Q{quarter}.csv")
    x_test = test_data[predictorColumns]
    y_test = test_data[target_column]
    test_data_predictions = tree_model.predict(x_test)
    jaccard = jaccard_score(y_test, test_data_predictions)
    jaccard_list.append(jaccard)
    time_labels.append(f"Q{quarter}")
    time_sks.append(quarter)

#TODO: allow option to add multiple of same custom KPI
model = mr.get_model(model)
mm.create_custom_kpi(
    model=model.id,
    project=project,
    kpiName=name,
    kpiValue=jaccard_list,
    timeLabel=time_labels,
    timeSK=time_sks
)
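
For reference, the Jaccard score for a binary target is TP / (TP + FP + FN), which is what `jaccard_score` computes with its default settings. The sketch below reproduces that formula by hand on made-up labels, so the numbers are illustrative and unrelated to the HMEQ data; the function name is hypothetical.

```python
def binary_jaccard(y_true, y_pred, positive=1):
    """Compute TP / (TP + FP + FN) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    denominator = tp + fp + fn
    # Return 0.0 when there are no positive labels or predictions at all
    return tp / denominator if denominator else 0.0


toy_true = [1, 0, 1, 1, 0]
toy_pred = [1, 1, 0, 1, 0]
toy_score = binary_jaccard(toy_true, toy_pred)  # 2 TP, 1 FP, 1 FN
print(toy_score)
```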

Once the KPIs have been generated, the hyperparameter file can be updated, and the new KPIs will appear in the file.

In [None]:
import json

# Use the project and model objects created earlier in this notebook
mp.update_kpis(project)

hyperparameters = mp.get_hyperparameters(model)

print(json.dumps(hyperparameters, indent=4))