<img src="https://storage.googleapis.com/arize-assets/arize-logo-white.jpg" width="200"/>

# Arize Tutorial: Surrogate Model Feature Importance

A surrogate model is an interpretable model trained to predict the predictions of a black box model. The goal is to approximate the black box model's predictions as closely as possible and then derive feature importance values from the interpretable surrogate. The benefit of this approach is that it does not require knowledge of the inner workings of the black box model.
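
As a minimal sketch of this idea, the snippet below fits a linear surrogate to the outputs of a stand-in black box and reads importance values off the surrogate. Everything in it is an illustrative assumption: the synthetic inputs, the `black_box` function, and the choice of ordinary least squares as the surrogate (the tutorial itself uses a LightGBM surrogate through `MimicExplainer`).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))  # synthetic inputs, for illustration only


def black_box(X):
    # Stand-in for an opaque model; the surrogate never looks inside it.
    return 3.0 * X[:, 0] - 1.0 * X[:, 1] + 0.2 * np.sin(X[:, 2])


# Step 1: collect the black box's predictions (not the ground-truth labels).
y_black_box = black_box(X)

# Step 2: fit an interpretable surrogate (ordinary least squares) to them.
A = np.column_stack([X, np.ones(len(X))])  # add an intercept column
coef, *_ = np.linalg.lstsq(A, y_black_box, rcond=None)

# Step 3: measure fidelity, i.e. how closely the surrogate mimics the black box.
y_surrogate = A @ coef
ss_res = np.sum((y_black_box - y_surrogate) ** 2)
ss_tot = np.sum((y_black_box - y_black_box.mean()) ** 2)
fidelity_r2 = 1.0 - ss_res / ss_tot

# Step 4: read feature importance off the surrogate
# (here: |coefficient| scaled by each feature's standard deviation).
importance = np.abs(coef[:-1]) * X.std(axis=0)
print(f"fidelity R^2 = {fidelity_r2:.3f}")
print("importance per feature:", np.round(importance, 3))
```

The fidelity score matters because importance values from a surrogate are only as trustworthy as its agreement with the black box.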

In this tutorial, we use the `MimicExplainer` from the `interpret_community` library to generate feature importance values from a surrogate model using only the prediction outputs of a black box model. Both [classification](#classification) and [regression](#regression) examples are provided below, and the feature importance values are logged to Arize using the Pandas [logger](https://docs.arize.com/arize/api-reference/python-sdk/arize.pandas).

# Install and import the `interpret_community` library

In [9]:
!pip install -q interpret==0.2.7 interpret-community==0.22.0
from interpret_community.mimic.mimic_explainer import (
    MimicExplainer,
    LGBMExplainableModel,
)

<a name="classification"></a>
# Classification Example
### Generate example
In this example we'll use a support vector machine (SVM) as our black box model. Only the prediction outputs of the SVM model are needed to train the surrogate model. Feature importance values are then generated from the surrogate model and sent to Arize.

In [10]:
import pandas as pd
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC

bc = load_breast_cancer()

feature_names = bc.feature_names
target_names = bc.target_names
data, target = bc.data, bc.target

df = pd.DataFrame(data, columns=feature_names)

model = SVC(probability=True).fit(df, target)

# Map class indices to names: 0 -> "malignant", 1 -> "benign"
prediction_label = pd.Series(map(lambda v: target_names[v], model.predict(df)))
# Score is the predicted probability of class 1
prediction_score = pd.Series(map(lambda v: v[1], model.predict_proba(df)))
actual_label = pd.Series(map(lambda v: target_names[v], target))
actual_score = pd.Series(target)

### Generate feature importance values
Note that the model itself is not used here. Only its prediction outputs are used.
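
The cell that follows wraps the stored scores in a function returning one `[1 - p, p]` row per example, that is, class probabilities in the order (class 0, class 1). As a small sketch of that shape, the same matrix can be built in one vectorized step. The `to_proba_matrix` helper and the `scores` values are hypothetical and only serve to illustrate the layout.

```python
import numpy as np


def to_proba_matrix(positive_scores):
    """Turn 1-D class-1 scores into an (n, 2) probability matrix."""
    p = np.asarray(positive_scores, dtype=float)
    # Column 0 holds P(class 0) = 1 - p, column 1 holds P(class 1) = p.
    return np.column_stack([1.0 - p, p])


scores = [0.1, 0.5, 0.9]  # made-up scores, for illustration only
proba = to_proba_matrix(scores)
print(proba.shape)  # (3, 2)
```

Each row sums to 1, which is what a two-class probability output looks like.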

In [33]:
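# The explainer calls this function in place of the black box model.
# It ignores its input and returns the stored SVM scores as an (n, 2) array
# of [P(class 0), P(class 1)], so it is only valid for the rows of `df`, in order.
def model_func(_):
    return np.array(list(map(lambda p: [1 - p, p], prediction_score)))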


explainer = MimicExplainer(
    model_func,
    df,
    LGBMExplainableModel,
    augment_data=False,
    is_function=True,
)

feature_importance_values = pd.DataFrame(
    explainer.explain_local(df).local_importance_values, columns=feature_names
)

feature_importance_values

Unnamed: 0,mean radius,mean texture,mean perimeter,mean area,mean smoothness,mean compactness,mean concavity,mean concave points,mean symmetry,mean fractal dimension,radius error,texture error,perimeter error,area error,smoothness error,compactness error,concavity error,concave points error,symmetry error,fractal dimension error,worst radius,worst texture,worst perimeter,worst area,worst smoothness,worst compactness,worst concavity,worst concave points,worst symmetry,worst fractal dimension
0,0.018108,0.138541,0.077393,0.015277,0.005145,0.080906,-0.016905,-0.643807,-0.042103,0.021538,-0.062563,-0.072320,-0.132991,-0.226836,-0.024982,0.001385,-0.008986,0.021477,-0.060072,-0.014799,-0.372269,0.460105,-1.056927,-0.682384,0.037168,0.004643,-0.150165,-1.097867,-0.081158,-0.133718
1,0.060940,0.049884,0.089396,0.029083,0.037934,0.078933,-0.020549,-0.655449,0.001232,0.032916,-0.053794,-0.036639,-0.047667,-0.208763,0.022451,-0.037539,0.005764,-0.005375,0.020566,-0.046258,-0.546568,0.040514,-1.118227,-0.741861,0.006268,-0.040353,-0.032484,-0.985273,0.021430,0.009588
2,0.018205,-0.047854,0.068413,0.020678,0.004345,0.103632,-0.028856,-0.772027,-0.001734,-0.004789,-0.066018,-0.118009,-0.094730,-0.209220,-0.002162,0.058385,0.021198,0.014575,-0.001813,-0.000490,-0.541401,-0.018137,-0.902831,-0.607974,-0.009130,0.015945,-0.146762,-0.955678,0.017115,0.009473
3,-0.047839,-0.012862,-0.137044,-0.028348,-0.042262,0.101697,-0.024009,-0.878298,-0.069611,0.021210,-0.101758,-0.007154,-0.073681,-0.047703,-0.089667,0.006967,-0.006839,0.008236,-0.013382,0.010059,0.075693,-0.258503,0.470571,0.860682,-0.679373,-0.002332,-0.219645,-2.472767,-0.065581,-0.127581
4,0.061742,0.218800,0.110820,0.015215,-0.013631,-0.009401,-0.033375,-0.642792,-0.019196,0.018381,-0.088982,-0.142218,-0.165266,-0.263277,-0.030444,-0.000005,-0.008663,0.022542,0.001264,-0.025966,-0.385342,0.444826,-1.060118,-0.917084,-0.014753,-0.038165,-0.158720,-0.738017,0.008146,-0.032212
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
564,0.028755,-0.158062,0.091706,0.020887,-0.018577,0.065849,-0.006534,-0.732744,-0.006353,0.013473,-0.067669,0.023710,-0.090861,-0.172663,-0.061976,0.049752,0.006937,0.018800,-0.054642,-0.007213,-0.538320,-0.047435,-0.863377,-0.499762,0.007036,-0.066063,-0.151718,-0.900800,-0.103035,-0.021701
565,0.025618,-0.102315,0.066744,0.016527,-0.001696,0.092922,-0.004896,-0.734690,0.003628,-0.012860,-0.047477,0.090344,-0.131177,-0.194746,-0.000419,0.058898,0.020499,0.008065,0.007598,0.030133,-0.529617,-0.279627,-0.806108,-0.699274,-0.075069,-0.052195,-0.162254,-0.717028,-0.007273,0.031213
566,0.021270,-0.142339,0.081855,0.022344,0.093014,0.114570,0.051181,-0.623286,-0.086434,0.009751,-0.014388,-0.008984,-0.036364,-0.239864,-0.012502,0.059692,-0.004276,0.017378,-0.044175,-0.010367,-0.764227,-0.424005,-1.017428,-0.788748,-0.023779,0.002408,-0.153181,-0.326826,0.053247,-0.009958
567,0.022906,-0.017950,0.075247,0.016336,-0.000332,0.091722,-0.009136,-0.714674,-0.051145,0.005447,-0.044413,0.031540,-0.075651,-0.156925,-0.002790,0.030921,-0.010858,0.005159,0.010155,0.002927,-0.528196,-0.333554,-0.834152,-0.492522,0.032410,0.002644,-0.141578,-0.904549,-0.027490,-0.111809
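
The table above holds one importance value per feature per row (local importance). One common way to summarize it into a single global ranking, which the tutorial itself does not do, is to average the absolute values per feature. The sketch below assumes only pandas and copies the first three rows of three columns from the table above. In the notebook, the same expression could be applied to `feature_importance_values` directly.

```python
import pandas as pd

# First three rows of three columns from the local importance table above.
local_importance = pd.DataFrame(
    {
        "mean radius": [0.018108, 0.060940, 0.018205],
        "worst perimeter": [-1.056927, -1.118227, -0.902831],
        "worst concave points": [-1.097867, -0.985273, -0.955678],
    }
)

# Mean absolute local importance per feature, largest first.
global_importance = local_importance.abs().mean().sort_values(ascending=False)
print(global_importance)
```

Taking absolute values first keeps positive and negative contributions from cancelling each other out.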


### Send data to Arize
Set up the Arize client. We'll be using the Pandas logger. First, copy your Arize `API_KEY` and `ORGANIZATION_KEY` from the admin page linked below.
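
Rather than pasting keys into the notebook, one option is to read them from environment variables. This is a sketch under stated assumptions: the `load_arize_keys` helper and the variable names `ARIZE_ORGANIZATION_KEY` and `ARIZE_API_KEY` are hypothetical and are not defined by Arize or by this tutorial.

```python
import os


def load_arize_keys(env=os.environ):
    """Read the two keys from environment variables (names are assumptions)."""
    organization_key = env.get("ARIZE_ORGANIZATION_KEY")
    api_key = env.get("ARIZE_API_KEY")
    missing = [
        name
        for name, value in (
            ("ARIZE_ORGANIZATION_KEY", organization_key),
            ("ARIZE_API_KEY", api_key),
        )
        if not value
    ]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return organization_key, api_key


# Usage with an explicit mapping, so the sketch runs without real credentials.
keys = load_arize_keys(
    {"ARIZE_ORGANIZATION_KEY": "org-123", "ARIZE_API_KEY": "key-456"}
)
print(keys)  # ('org-123', 'key-456')
```

The returned pair could then be passed to `Client(organization_key=..., api_key=...)` as in the cell below.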

[![Button_Open.png](https://storage.googleapis.com/arize-assets/fixtures/Button_Open.png)](https://app.arize.com/admin)

In [None]:
!pip install -q arize
from arize.pandas.logger import Client, Schema
from arize.utils.types import ModelTypes, Environments

ORGANIZATION_KEY = "ORGANIZATION_KEY"
API_KEY = "API_KEY"

arize_client = Client(organization_key=ORGANIZATION_KEY, api_key=API_KEY)

if ORGANIZATION_KEY == "ORGANIZATION_KEY" or API_KEY == "API_KEY":
    raise ValueError("❌ NEED TO CHANGE ORGANIZATION AND/OR API_KEY")
else:
    print("✅ Import and Setup Arize Client Done! Now we can start using Arize!")

Helper functions to simulate prediction IDs and timestamps.

In [13]:
import uuid
from datetime import datetime, timedelta

# Prediction ID is required for logging any dataset
def generate_prediction_ids(df):
    return pd.Series((str(uuid.uuid4()) for _ in range(len(df))), index=df.index)


# OPTIONAL: We can directly specify when inferences were made
def simulate_production_timestamps(df, days=30):
    t = datetime.now()
    current_t, earlier_t = t.timestamp(), (t - timedelta(days=days)).timestamp()
    return pd.Series(np.linspace(earlier_t, current_t, num=len(df)), index=df.index)

Assemble Pandas DataFrame as a production dataset with prediction IDs and timestamps.

In [14]:
feature_importance_values_column_names_mapping = {
    f"{feat}": f"{feat} (feature importance)" for feat in feature_names
}

production_dataset = pd.concat(
    [
        pd.DataFrame(
            {
                "prediction_id": generate_prediction_ids(df),
                "prediction_ts": simulate_production_timestamps(df),
                "prediction_label": prediction_label,
                "actual_label": actual_label,
                "prediction_score": prediction_score,
                "actual_score": actual_score,
            }
        ),
        df,
        feature_importance_values.rename(
            columns=feature_importance_values_column_names_mapping
        ),
    ],
    axis=1,
)

production_dataset

Unnamed: 0,prediction_id,prediction_ts,prediction_label,actual_label,prediction_score,actual_score,mean radius,mean texture,mean perimeter,mean area,mean smoothness,mean compactness,mean concavity,mean concave points,mean symmetry,mean fractal dimension,radius error,texture error,perimeter error,area error,smoothness error,compactness error,concavity error,concave points error,symmetry error,fractal dimension error,worst radius,worst texture,worst perimeter,worst area,worst smoothness,worst compactness,worst concavity,worst concave points,worst symmetry,worst fractal dimension,mean radius (feature importance),mean texture (feature importance),mean perimeter (feature importance),mean area (feature importance),mean smoothness (feature importance),mean compactness (feature importance),mean concavity (feature importance),mean concave points (feature importance),mean symmetry (feature importance),mean fractal dimension (feature importance),radius error (feature importance),texture error (feature importance),perimeter error (feature importance),area error (feature importance),smoothness error (feature importance),compactness error (feature importance),concavity error (feature importance),concave points error (feature importance),symmetry error (feature importance),fractal dimension error (feature importance),worst radius (feature importance),worst texture (feature importance),worst perimeter (feature importance),worst area (feature importance),worst smoothness (feature importance),worst compactness (feature importance),worst concavity (feature importance),worst concave points (feature importance),worst symmetry (feature importance),worst fractal dimension (feature importance)
0,3f6e7a06-d87e-452c-b5ee-d16acd53a8f5,1.634424e+09,malignant,malignant,0.046753,0,17.99,10.38,122.80,1001.0,0.11840,0.27760,0.30010,0.14710,0.2419,0.07871,1.0950,0.9053,8.589,153.40,0.006399,0.04904,0.05373,0.01587,0.03003,0.006193,25.380,17.33,184.60,2019.0,0.16220,0.66560,0.7119,0.2654,0.4601,0.11890,0.018108,0.138541,0.077393,0.015277,0.005145,0.080906,-0.016905,-0.643807,-0.042103,0.021538,-0.062563,-0.072320,-0.132991,-0.226836,-0.024982,0.001385,-0.008986,0.021477,-0.060072,-0.014799,-0.372269,0.460105,-1.056927,-0.682384,0.037168,0.004643,-0.150165,-1.097867,-0.081158,-0.133718
1,593d3d42-bb52-485c-9ebf-5a345518f4b5,1.634429e+09,malignant,malignant,0.046696,0,20.57,17.77,132.90,1326.0,0.08474,0.07864,0.08690,0.07017,0.1812,0.05667,0.5435,0.7339,3.398,74.08,0.005225,0.01308,0.01860,0.01340,0.01389,0.003532,24.990,23.41,158.80,1956.0,0.12380,0.18660,0.2416,0.1860,0.2750,0.08902,0.060940,0.049884,0.089396,0.029083,0.037934,0.078933,-0.020549,-0.655449,0.001232,0.032916,-0.053794,-0.036639,-0.047667,-0.208763,0.022451,-0.037539,0.005764,-0.005375,0.020566,-0.046258,-0.546568,0.040514,-1.118227,-0.741861,0.006268,-0.040353,-0.032484,-0.985273,0.021430,0.009588
2,658d2822-3beb-489d-8650-1fd4d0e2c918,1.634434e+09,malignant,malignant,0.046781,0,19.69,21.25,130.00,1203.0,0.10960,0.15990,0.19740,0.12790,0.2069,0.05999,0.7456,0.7869,4.585,94.03,0.006150,0.04006,0.03832,0.02058,0.02250,0.004571,23.570,25.53,152.50,1709.0,0.14440,0.42450,0.4504,0.2430,0.3613,0.08758,0.018205,-0.047854,0.068413,0.020678,0.004345,0.103632,-0.028856,-0.772027,-0.001734,-0.004789,-0.066018,-0.118009,-0.094730,-0.209220,-0.002162,0.058385,0.021198,0.014575,-0.001813,-0.000490,-0.541401,-0.018137,-0.902831,-0.607974,-0.009130,0.015945,-0.146762,-0.955678,0.017115,0.009473
3,6c10f0e2-7952-41b0-af85-84857a06973e,1.634438e+09,malignant,malignant,0.046739,0,11.42,20.38,77.58,386.1,0.14250,0.28390,0.24140,0.10520,0.2597,0.09744,0.4956,1.1560,3.445,27.23,0.009110,0.07458,0.05661,0.01867,0.05963,0.009208,14.910,26.50,98.87,567.7,0.20980,0.86630,0.6869,0.2575,0.6638,0.17300,-0.047839,-0.012862,-0.137044,-0.028348,-0.042262,0.101697,-0.024009,-0.878298,-0.069611,0.021210,-0.101758,-0.007154,-0.073681,-0.047703,-0.089667,0.006967,-0.006839,0.008236,-0.013382,0.010059,0.075693,-0.258503,0.470571,0.860682,-0.679373,-0.002332,-0.219645,-2.472767,-0.065581,-0.127581
4,ed34ee71-279f-4686-9acd-7abb3f0db15e,1.634443e+09,malignant,malignant,0.046730,0,20.29,14.34,135.10,1297.0,0.10030,0.13280,0.19800,0.10430,0.1809,0.05883,0.7572,0.7813,5.438,94.44,0.011490,0.02461,0.05688,0.01885,0.01756,0.005115,22.540,16.67,152.20,1575.0,0.13740,0.20500,0.4000,0.1625,0.2364,0.07678,0.061742,0.218800,0.110820,0.015215,-0.013631,-0.009401,-0.033375,-0.642792,-0.019196,0.018381,-0.088982,-0.142218,-0.165266,-0.263277,-0.030444,-0.000005,-0.008663,0.022542,0.001264,-0.025966,-0.385342,0.444826,-1.060118,-0.917084,-0.014753,-0.038165,-0.158720,-0.738017,0.008146,-0.032212
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
564,1b417af0-3174-4360-b4a6-91f4d96af2f5,1.636998e+09,malignant,malignant,0.046747,0,21.56,22.39,142.00,1479.0,0.11100,0.11590,0.24390,0.13890,0.1726,0.05623,1.1760,1.2560,7.673,158.70,0.010300,0.02891,0.05198,0.02454,0.01114,0.004239,25.450,26.40,166.10,2027.0,0.14100,0.21130,0.4107,0.2216,0.2060,0.07115,0.028755,-0.158062,0.091706,0.020887,-0.018577,0.065849,-0.006534,-0.732744,-0.006353,0.013473,-0.067669,0.023710,-0.090861,-0.172663,-0.061976,0.049752,0.006937,0.018800,-0.054642,-0.007213,-0.538320,-0.047435,-0.863377,-0.499762,0.007036,-0.066063,-0.151718,-0.900800,-0.103035,-0.021701
565,0689f633-9334-4d52-8cd1-837940c6bdd7,1.637003e+09,malignant,malignant,0.046813,0,20.13,28.25,131.20,1261.0,0.09780,0.10340,0.14400,0.09791,0.1752,0.05533,0.7655,2.4630,5.203,99.04,0.005769,0.02423,0.03950,0.01678,0.01898,0.002498,23.690,38.25,155.00,1731.0,0.11660,0.19220,0.3215,0.1628,0.2572,0.06637,0.025618,-0.102315,0.066744,0.016527,-0.001696,0.092922,-0.004896,-0.734690,0.003628,-0.012860,-0.047477,0.090344,-0.131177,-0.194746,-0.000419,0.058898,0.020499,0.008065,0.007598,0.030133,-0.529617,-0.279627,-0.806108,-0.699274,-0.075069,-0.052195,-0.162254,-0.717028,-0.007273,0.031213
566,6d02f372-a2bb-4496-bf25-46e8f344417d,1.637007e+09,malignant,malignant,0.046765,0,16.60,28.08,108.30,858.1,0.08455,0.10230,0.09251,0.05302,0.1590,0.05648,0.4564,1.0750,3.425,48.55,0.005903,0.03731,0.04730,0.01557,0.01318,0.003892,18.980,34.12,126.70,1124.0,0.11390,0.30940,0.3403,0.1418,0.2218,0.07820,0.021270,-0.142339,0.081855,0.022344,0.093014,0.114570,0.051181,-0.623286,-0.086434,0.009751,-0.014388,-0.008984,-0.036364,-0.239864,-0.012502,0.059692,-0.004276,0.017378,-0.044175,-0.010367,-0.764227,-0.424005,-1.017428,-0.788748,-0.023779,0.002408,-0.153181,-0.326826,0.053247,-0.009958
567,ce84c7f3-99b6-416c-b4e9-181c5d30d02e,1.637012e+09,malignant,malignant,0.046801,0,20.60,29.33,140.10,1265.0,0.11780,0.27700,0.35140,0.15200,0.2397,0.07016,0.7260,1.5950,5.772,86.22,0.006522,0.06158,0.07117,0.01664,0.02324,0.006185,25.740,39.42,184.60,1821.0,0.16500,0.86810,0.9387,0.2650,0.4087,0.12400,0.022906,-0.017950,0.075247,0.016336,-0.000332,0.091722,-0.009136,-0.714674,-0.051145,0.005447,-0.044413,0.031540,-0.075651,-0.156925,-0.002790,0.030921,-0.010858,0.005159,0.010155,0.002927,-0.528196,-0.333554,-0.834152,-0.492522,0.032410,0.002644,-0.141578,-0.904549,-0.027490,-0.111809


Send DataFrame to Arize.
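
Before logging, it may be worth confirming that every column the schema will reference actually exists in the DataFrame, since a renamed or missing importance column is easy to overlook. The helper below is a hypothetical sketch, not part of the Arize SDK, and the column names are a shortened illustration modelled on the production dataset above.

```python
def find_missing_columns(dataframe_columns, required_columns):
    """Return required column names absent from the DataFrame, in order."""
    available = set(dataframe_columns)
    return [name for name in required_columns if name not in available]


# Illustrative column names modelled on the production dataset above.
feature_columns = ["mean radius", "mean texture"]
importance_columns = [f"{feat} (feature importance)" for feat in feature_columns]
dataframe_columns = (
    ["prediction_id", "prediction_ts", "prediction_label", "actual_label"]
    + feature_columns
    + importance_columns[:1]  # simulate one importance column being absent
)

required = ["prediction_id", "prediction_ts"] + feature_columns + importance_columns
missing = find_missing_columns(dataframe_columns, required)
print(missing)  # ['mean texture (feature importance)']
```

In the notebook, `dataframe_columns` would be `production_dataset.columns`, and the required names would come from the values passed to `Schema`.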

In [None]:
# Define a Schema() object for Arize to pick up data from the correct columns for logging
production_schema = Schema(
    prediction_id_column_name="prediction_id",  # REQUIRED
    timestamp_column_name="prediction_ts",
    prediction_label_column_name="prediction_label",
    prediction_score_column_name="prediction_score",
    actual_label_column_name="actual_label",
    actual_score_column_name="actual_score",
    feature_column_names=feature_names,
    shap_values_column_names=feature_importance_values_column_names_mapping,
)

# arize_client.log returns a Response object from the `requests` library
response = arize_client.log(
    dataframe=production_dataset,
    schema=production_schema,
    model_id="surrogate_model_example_classification",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    path="inferences.bin",
)

# If successful, the server will return a status_code of 200
if response.status_code != 200:
    print(
        f"❌ logging failed with response code {response.status_code}, {response.text}"
    )
else:
    print(
        f"✅ You have successfully logged {len(production_dataset)} data points to Arize!"
    )

<a name="regression"></a>
# Regression Example
### Generate example
In this example we'll use a support vector regressor (`SVR`) as our black box model. Only the prediction outputs of the SVR model are needed to train the surrogate model. Feature importance values are then generated from the surrogate model and sent to Arize.

In [16]:
import pandas as pd
import numpy as np
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()

# Use only 1,000 data points for a speedier example
data_reg = housing.data[:1000]
target_reg = housing.target[:1000]
feature_names_reg = housing.feature_names

df_reg = pd.DataFrame(data_reg, columns=feature_names_reg)

from sklearn.svm import SVR

model_reg = SVR().fit(df_reg, target_reg)

prediction_label_reg = pd.Series(model_reg.predict(df_reg))
actual_label_reg = pd.Series(target_reg)

### Generate feature importance values
Note that the model itself is not used here. Only its prediction outputs are used.

In [17]:
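# The explainer calls this function in place of the black box model.
# It ignores its input and returns the stored SVR predictions, so it is
# only valid for the rows of `df_reg`, in order.
def model_func_reg(_):
    return np.array(prediction_label_reg)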


explainer_reg = MimicExplainer(
    model_func_reg,
    df_reg,
    LGBMExplainableModel,
    augment_data=False,
    is_function=True,
)

feature_importance_values_reg = pd.DataFrame(
    explainer_reg.explain_local(df_reg).local_importance_values,
    columns=feature_names_reg,
)

feature_importance_values_reg

Unnamed: 0,MedInc,HouseAge,AveRooms,AveBedrms,Population,AveOccup,Latitude,Longitude
0,0.015311,-0.002339,-0.000323,-0.000011,-0.179580,0.000087,-0.000761,2.031643e-05
1,0.015648,0.003556,0.000510,-0.001013,0.235775,0.000251,-0.000751,-2.424062e-04
2,0.002295,-0.005026,0.000525,0.000115,-0.131106,0.000258,-0.000344,-3.687946e-06
3,0.000629,-0.005884,0.000581,0.000116,-0.115065,0.000062,-0.000355,2.577298e-06
4,-0.000214,-0.006022,0.000452,0.000048,-0.112893,0.000112,-0.000341,-2.379053e-07
...,...,...,...,...,...,...,...,...
995,-0.001465,0.004850,0.000106,0.000529,0.732082,0.000627,-0.000586,-7.360537e-03
996,0.015947,0.004487,-0.000017,-0.000008,-0.198665,0.000013,-0.000715,-5.934504e-04
997,-0.000803,0.004343,-0.000384,0.000560,0.245006,0.000255,0.000214,3.893223e-04
998,0.000144,0.006334,-0.000519,-0.000915,0.198935,-0.000431,0.000118,2.966419e-04


Assemble Pandas DataFrame as a production dataset with prediction IDs and timestamps.

In [18]:
feature_importance_values_column_names_mapping_reg = {
    f"{feat}": f"{feat} (feature importance)" for feat in feature_names_reg
}

production_dataset_reg = pd.concat(
    [
        pd.DataFrame(
            {
                "prediction_id": generate_prediction_ids(df_reg),
                "prediction_ts": simulate_production_timestamps(df_reg),
                "prediction_label": prediction_label_reg,
                "actual_label": actual_label_reg,
            }
        ),
        df_reg,
        feature_importance_values_reg.rename(
            columns=feature_importance_values_column_names_mapping_reg
        ),
    ],
    axis=1,
)

production_dataset_reg

Unnamed: 0,prediction_id,prediction_ts,prediction_label,actual_label,MedInc,HouseAge,AveRooms,AveBedrms,Population,AveOccup,Latitude,Longitude,MedInc (feature importance),HouseAge (feature importance),AveRooms (feature importance),AveBedrms (feature importance),Population (feature importance),AveOccup (feature importance),Latitude (feature importance),Longitude (feature importance)
0,5ff849a6-8980-4e9d-a1bb-7d3620faaad6,1.634424e+09,1.774298,4.526,8.3252,41.0,6.984127,1.023810,322.0,2.555556,37.88,-122.23,0.015311,-0.002339,-0.000323,-0.000011,-0.179580,0.000087,-0.000761,2.031643e-05
1,efe396fa-38d8-4d91-ba38-811610e14825,1.634427e+09,2.192030,3.585,8.3014,21.0,6.238137,0.971880,2401.0,2.109842,37.86,-122.22,0.015648,0.003556,0.000510,-0.001013,0.235775,0.000251,-0.000751,-2.424062e-04
2,dbbb46d2-4c5c-4fa8-b1a8-67fb64ae8ce9,1.634430e+09,1.819093,3.521,7.2574,52.0,8.288136,1.073446,496.0,2.802260,37.85,-122.24,0.002295,-0.005026,0.000525,0.000115,-0.131106,0.000258,-0.000344,-3.687946e-06
3,86086f75-d0d9-433e-8299-fb9cc9cfc70c,1.634432e+09,1.833490,3.413,5.6431,52.0,5.817352,1.073059,558.0,2.547945,37.85,-122.25,0.000629,-0.005884,0.000581,0.000116,-0.115065,0.000062,-0.000355,2.577298e-06
4,b295a43e-2eec-4086-bb3f-9a222b4b7a4b,1.634435e+09,1.834195,3.422,3.8462,52.0,6.281853,1.081081,565.0,2.181467,37.85,-122.25,-0.000214,-0.006022,0.000452,0.000048,-0.112893,0.000112,-0.000341,-2.379053e-07
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,22281192-29ca-417b-89fa-fda745bd74ae,1.637006e+09,2.693768,1.924,4.8624,11.0,5.680000,1.044706,5826.0,2.741647,37.71,-121.75,-0.001465,0.004850,0.000106,0.000529,0.732082,0.000627,-0.000586,-7.360537e-03
996,201564c0-94fe-4cc4-aa2b-d31313419c4c,1.637009e+09,1.760488,4.188,9.1531,25.0,5.811765,0.952941,254.0,2.988235,37.74,-121.77,0.015947,0.004487,-0.000017,-0.000008,-0.198665,0.000013,-0.000715,-5.934504e-04
997,f2195a5e-a2fa-471b-a442-b0e4bd17fa0b,1.637011e+09,2.201814,2.168,4.7361,22.0,6.080220,1.036264,2474.0,2.718681,37.70,-121.80,-0.000803,0.004343,-0.000384,0.000560,0.245006,0.000255,0.000214,3.893223e-04
998,625b0c7d-418e-4861-ab5f-88a2412fa348,1.637014e+09,2.158937,2.155,5.4324,17.0,5.975831,0.965257,2222.0,3.356495,37.69,-121.80,0.000144,0.006334,-0.000519,-0.000915,0.198935,-0.000431,0.000118,2.966419e-04


Send DataFrame to Arize.

In [None]:
# Define a Schema() object for Arize to pick up data from the correct columns for logging
production_schema_reg = Schema(
    prediction_id_column_name="prediction_id",  # REQUIRED
    timestamp_column_name="prediction_ts",
    prediction_label_column_name="prediction_label",
    actual_label_column_name="actual_label",
    feature_column_names=feature_names_reg,
    shap_values_column_names=feature_importance_values_column_names_mapping_reg,
)

# arize_client.log returns a Response object from the `requests` library
response_reg = arize_client.log(
    dataframe=production_dataset_reg,
    schema=production_schema_reg,
    model_id="surrogate_model_example_regression",
    model_type=ModelTypes.NUMERIC,
    environment=Environments.PRODUCTION,
    path="inferences.bin",
)

# If successful, the server will return a status_code of 200
if response_reg.status_code != 200:
    print(
        f"❌ logging failed with response code {response_reg.status_code}, {response_reg.text}"
    )
else:
    print(
        f"✅ You have successfully logged {len(production_dataset_reg)} data points to Arize!"
    )

## Conclusion
You now know how to log surrogate model feature importance values to the Arize platform. Go to [Arize](https://app.arize.com/) to analyze and monitor the logged feature importance values.

### Overview
Arize is an end-to-end ML observability and model monitoring platform. The platform is designed to help ML engineers and data science practitioners surface and fix issues with ML models in production faster with:
- Automated ML monitoring and model monitoring
- Workflows to troubleshoot model performance
- Real-time visualizations for model performance monitoring, data quality monitoring, and drift monitoring
- Model prediction cohort analysis
- Pre-deployment model validation
- Integrated model explainability

### Website
Visit us at: https://arize.com/model-monitoring/

### Additional Resources
- [What is ML observability?](https://arize.com/what-is-ml-observability/)
- [Playbook to model monitoring in production](https://arize.com/the-playbook-to-monitor-your-models-performance-in-production/)
- [Using statistical distance metrics for ML monitoring and observability](https://arize.com/using-statistical-distance-metrics-for-machine-learning-observability/)
- [ML infrastructure tools for data preparation](https://arize.com/ml-infrastructure-tools-for-data-preparation/)
- [ML infrastructure tools for model building](https://arize.com/ml-infrastructure-tools-for-model-building/)
- [ML infrastructure tools for production](https://arize.com/ml-infrastructure-tools-for-production-part-1/)
- [ML infrastructure tools for model deployment and model serving](https://arize.com/ml-infrastructure-tools-for-production-part-2-model-deployment-and-serving/)
- [ML infrastructure tools for ML monitoring and observability](https://arize.com/ml-infrastructure-tools-ml-observability/)

Visit the [Arize Blog](https://arize.com/blog) and [Resource Center](https://arize.com/resource-hub/) for more resources on ML observability and model monitoring.
