# Viewing and Editing Model Version Sets

In this notebook we'll show how we can:
- Load a model version set
- View a list of models and associated version set labels
- Update version labels

When we update the version labels we'll adopt a 'Champion-Challenger' convention: one model in the set is labeled the champion and is used for production inference, while the others are challengers. The label can move over time. For example, if monitoring shows drift in a model, we may retrain a set of models and relabel the champion based on fit statistics.

Inference batch jobs then pull in whichever model currently carries the champion label.

### Load Dependencies

In [48]:
import ads
import tempfile
from ads.model import SklearnModel
from ads.model import ModelVersionSet
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import pandas as pd
import numpy as np
from ads.common.model_metadata import UseCaseType
import json
import joblib

### Connect to Model Version Set

In [None]:
project_ocid = 'ocid1.datascienceproject...'
compartment_ocid = 'ocid1.compartment...'

In [5]:
# This time we look up the model version set by name, so we don't need to know its OCID
mvs = ModelVersionSet.from_name(name='demo-model-version-set', compartment_id=compartment_ocid)

### View Models in Set

In [None]:
mvs

In [43]:
num_models = len(mvs.models())

In [44]:
print('There are', num_models, 'models in the Model Version Set')

There are 2 models in the Model Version Set


### Compare Models in Model Version Set

In [95]:
def compare_models(models, hold_out_data, actuals):
    '''Score each model on the hold-out data and return one comparison row per model.

    `models` is the list of models in the version set, as returned by `mvs.models()`.
    '''
    rows = []

    for model_meta in models:
        # download the model artifact into a throwaway directory and load the estimator
        with tempfile.TemporaryDirectory() as temp_dir:
            SklearnModel.from_model_catalog(model_meta.id, artifact_dir=temp_dir, ignore_conda_error=True)
            model = joblib.load(temp_dir + '/model.joblib')

        # score hold-out
        preds = model.predict(hold_out_data)

        # create classification report
        rpt = classification_report(actuals, preds, output_dict=True)

        # NOTE: these positional lookups depend on the order of the metadata items.
        # In the output below, index 0 appears to hold the framework version and
        # index 5 the framework name, so check the order for your own models.
        model_taxonomy = model_meta.defined_metadata_list.to_dict()
        framework = model_taxonomy['data'][0]['value']
        algo = model_taxonomy['data'][2]['value']
        hyperparms = json.dumps(model_taxonomy['data'][4]['value'])
        objective = model_taxonomy['data'][5]['value']

        # collect one row per model
        rows.append({
            'display_name': model_meta.display_name,
            'version': model_meta.version_id,
            'version_label': model_meta.version_label,
            'framework': framework,
            'algo': algo,
            'objective': objective,
            'accuracy': rpt['accuracy'],
            'f1': rpt['weighted avg']['f1-score'],
            'precision': rpt['weighted avg']['precision'],
            'recall': rpt['weighted avg']['recall'],
            'hyperparams': hyperparms,
        })

    return pd.DataFrame(rows)
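
The comparison function reads taxonomy values by list position, which only works while the items come back in a fixed order. As a minimal sketch of an alternative, the helper below turns the `{'data': [...]}` structure into a dictionary keyed by item name. The helper name `taxonomy_by_key`, the presence of a `'key'` field on each item, and the key names `Framework`, `FrameworkVersion` and `Algorithm` are assumptions here, so inspect the output of `defined_metadata_list.to_dict()` for your own models before relying on them.

```python
def taxonomy_by_key(taxonomy_dict):
    """Map each metadata item name to its value, ignoring item order.

    Assumes every item looks like {'key': ..., 'value': ...}.
    """
    return {item["key"]: item["value"] for item in taxonomy_dict.get("data", [])}


# Illustrative input only; a real one would come from
# model_meta.defined_metadata_list.to_dict()
example = {
    "data": [
        {"key": "FrameworkVersion", "value": "1.7.2"},
        {"key": "Algorithm", "value": "LogisticRegression"},
        {"key": "Framework", "value": "scikit-learn"},
    ]
}

meta = taxonomy_by_key(example)
print(meta.get("Framework"), meta.get("FrameworkVersion"), meta.get("Algorithm"))
```

Using `.get()` returns `None` for an item that is missing instead of raising, which keeps the comparison loop running when a model was saved without full metadata.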




In [81]:
X_test=pd.read_csv('data/x_test.csv')
y_test=pd.read_csv('data/y_test.csv')

In [96]:
model_comparison = compare_models(mvs.models(),X_test,y_test)
model_comparison.head()



| | display_name | version | version_label | framework | algo | objective | accuracy | f1 | precision | recall | hyperparams |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Updated model | 2 | Version 2 | 1.7.2 | RandomForestClassifier | scikit-learn | 1.0 | 1.0 | 1.0 | 1.0 | `{"bootstrap": "True", "ccp_alpha": "0.0", "class_weight": "None", "criterion": "gini", "max_depth": "None", "max_features": "sqrt", "max_leaf_nodes": "None", "max_samples": "None", "min_impurity_decrease": "0.0", "min_samples_leaf": "1", "min_samples_split": "2", "min_weight_fraction_leaf": "0.0", "monotonic_cst": "None", "n_estimators": "100", "n_jobs": "None", "oob_score": "False", "random_state": "None", "verbose": "0", "warm_start": "False"}` |
| 1 | Initial model | 1 | Version 1 | 1.7.2 | LogisticRegression | scikit-learn | 1.0 | 1.0 | 1.0 | 1.0 | `{"C": "1.0", "class_weight": "None", "dual": "False", "fit_intercept": "True", "intercept_scaling": "1", "l1_ratio": "None", "max_iter": "100", "multi_class": "deprecated", "n_jobs": "None", "penalty": "l2", "random_state": "None", "solver": "lbfgs", "tol": "0.0001", "verbose": "0", "warm_start": "False"}` |
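
One way to turn a comparison table like this into Champion-Challenger labels is to rank the rows on a single fit statistic. The sketch below is only an illustration of that idea: the helper name `propose_labels`, the choice of `f1` as the deciding metric, and the toy data are assumptions, and a real promotion rule may weigh several metrics or require a minimum improvement.

```python
import pandas as pd


def propose_labels(comparison, metric="f1",
                   champion="Champion Model", challenger="Challenger Model"):
    """Return a Series of proposed labels, indexed like `comparison`."""
    # idxmax returns the first row holding the maximum, so the earliest row wins a tie
    best = comparison[metric].idxmax()
    labels = pd.Series(challenger, index=comparison.index, name="proposed_label")
    labels.loc[best] = champion
    return labels


# Toy data, not results from the models in this notebook
toy = pd.DataFrame({
    "display_name": ["model-a", "model-b", "model-c"],
    "f1": [0.91, 0.95, 0.95],
})
toy["proposed_label"] = propose_labels(toy)
print(toy)
```

When the metric is tied, as it is for the two models above, the first row in the table is proposed as champion, so the row order acts as the tie-break.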


### Update Model Version Set Labels

In [97]:
def update_model_version_label(model, new_label):
    '''Set the version label of a single model from the model version set.'''
    # load the catalog entry into a throwaway directory, then update its label
    with tempfile.TemporaryDirectory() as temp_dir:
        catalog_model = SklearnModel.from_model_catalog(model.id, artifact_dir=temp_dir, ignore_conda_error=True)
        catalog_model.update(version_label=new_label)
    print('model version set label updated!')


In [98]:
update_model_version_label(mvs.models()[0],'Champion Model')



model version set label updated!


In [99]:
update_model_version_label(mvs.models()[1],'Challenger Model')



model version set label updated!


### Validate That the Version Set Labels Were Updated

In [100]:
mvs = ModelVersionSet.from_name(name='demo-model-version-set',compartment_id=compartment_ocid)

In [101]:
print(mvs.models()[0].version_label,mvs.models()[0].version_id)

Champion Model 2


In [102]:
print(mvs.models()[1].version_label,mvs.models()[1].version_id)

Challenger Model 1
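
With the labels in place, an inference batch job can select the champion by label rather than by position in the list. The following is a minimal sketch of that lookup; `find_by_label` is a hypothetical helper, and the `SimpleNamespace` objects are stand-ins for the entries of `mvs.models()`, of which only the `id` and `version_label` attributes are used.

```python
from types import SimpleNamespace


def find_by_label(models, label):
    """Return the single model whose version_label equals `label`."""
    matches = [m for m in models if m.version_label == label]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one model labeled {label!r}, found {len(matches)}"
        )
    return matches[0]


# Stand-ins for the entries of mvs.models(); the ids are placeholders
models = [
    SimpleNamespace(id="model-2", version_label="Champion Model"),
    SimpleNamespace(id="model-1", version_label="Challenger Model"),
]

champion = find_by_label(models, "Champion Model")
print(champion.id)
```

Raising when zero or several models carry the label is a deliberate choice in this sketch: it stops a batch job from silently scoring with the wrong model if a relabeling step was only half completed.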
