# Arize Tutorial: Boston House Prices

Let's get started with Arize! ✨

Arize helps you visualize your model performance, understand drift & data quality issues, and share insights learned from your models. 

In this tutorial, we will build a model to predict Boston house prices. Because the model predicts a numeric value, we log it with the model type `ModelTypes.NUMERIC`. We will then load the model's training inferences and test inferences into Arize. 🚀

### Running This Notebook
1. Click "Open in playground" to create a copy of this notebook for yourself.
2. Save a copy in Google Drive for yourself.
3. Step through each section below, pressing play on the code blocks to run the cells.
4. In Step 2, use the organization key and API key from your own Arize account.


[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Arize-ai/client_python/blob/main/arize/examples/tutorials/Arize_Tutorial_Boston_House_Prices.ipynb)


## Step 1: Load Data and Build Model

In [None]:
import numpy as np
import pandas as pd
import uuid
import matplotlib.pyplot as plt
from sklearn import ensemble
from sklearn import datasets
from sklearn.utils import shuffle
from sklearn.metrics import mean_squared_error

###############################################################################
# Load data
# Note: load_boston was removed in scikit-learn 1.2, so this cell needs an older release.
boston = datasets.load_boston()
X, y = shuffle(boston.data, boston.target, random_state=13)
X = X.astype(np.float32)
offset = int(X.shape[0] * 0.9)
X_train, y_train = X[:offset], y[:offset]
X_test, y_test = X[offset:], y[offset:]

###############################################################################
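# Fit regression model
# Note: the 'ls' (least squares) loss was renamed 'squared_error' in scikit-learn 1.0
# and removed in 1.2.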
params = {'n_estimators': 500, 'max_depth': 4, 'min_samples_split': 0.5,
          'learning_rate': 0.01, 'loss': 'ls'}
clf = ensemble.GradientBoostingRegressor(**params)
clf.fit(X_train, y_train)
train_inferences = clf.predict(X_train)
test_inferences = clf.predict(X_test)

print('Step 1 ✅: Load Data & Build Model Done!')

Step 1 ✅: Load Data & Build Model Done!
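
The cell above imports `mean_squared_error` but never calls it. A quick sanity check before logging anything is to compare predictions against actuals. The sketch below shows the arithmetic in plain Python with made-up numbers; the helper name `mse` and the demo values are illustrative only. In the notebook itself, `mean_squared_error(y_test, test_inferences)` would compute the same quantity on the real test split.

```python
def mse(actual, predicted):
    """Mean squared error: the average of the squared differences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Made-up house prices, not values from the Boston dataset.
demo_actual = [24.0, 21.5, 30.0, 18.0]
demo_predicted = [22.0, 21.5, 33.0, 19.0]

# Squared errors are 4, 0, 9 and 1, so the mean is 14 / 4 = 3.5.
demo_mse = mse(demo_actual, demo_predicted)
print(demo_mse)
```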


## Step 2: Import and Setup Arize Client 

In [None]:
# This notebook was run with arize 2.1.1 (see the install log below); other releases may differ.
!pip install arize

from arize.api import Client
from arize.types import ModelTypes

ORGANIZATION_KEY = 'YOUR ORGANIZATION KEY'
API_KEY = 'YOUR API KEY'
arize = Client(organization_key=ORGANIZATION_KEY, api_key=API_KEY)

print('Step 2 ✅: Import and Setup Arize Client Done! Now we can start using Arize!')


Collecting arize
  Using cached https://files.pythonhosted.org/packages/80/07/6ba0d938a1075ee29ae6bdb962a51704eef597381b1ed578625b7bab8f71/arize-2.1.1-py2.py3-none-any.whl
Collecting requests-futures==1.0.0
  Using cached https://files.pythonhosted.org/packages/47/c4/fd48d1ac5110a5457c71ac7cc4caa93da10a80b8de71112430e439bdee22/requests-futures-1.0.0.tar.gz
Collecting protobuf==3.12.0
  Downloading https://files.pythonhosted.org/packages/c9/bf/5416042df9e48e89c60bdb116d9f1a72f2d014f6839a85148d63e6ae52dc/protobuf-3.12.0-cp37-cp37m-manylinux1_x86_64.whl (1.3MB)
Collecting googleapis-common-protos==1.51.0
  Downloading https://files.pythonhosted.org/packages/05/46/168fd780f594a4d61122f7f3dc0561686084319ad73b4febbf02ae8b32cf/googleapis-common-protos-1.51.0.tar.gz
Building wheels for collected packages: requests-futures, googleapis-common-protos
  Building wheel for requests-futures (setup.py) ... done
  Create

Step 2 ✅: Import and Setup Arize Client Done! Now we can start using Arize!
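
Hardcoding keys in a notebook you plan to share makes them easy to leak. One alternative is to read them from environment variables. This is a minimal sketch: the variable names `ARIZE_ORGANIZATION_KEY` and `ARIZE_API_KEY` and the helper `load_arize_credentials` are hypothetical choices for this example, not something the Arize client is known to read on its own.

```python
import os


def load_arize_credentials(env=None):
    """Return (organization_key, api_key) read from environment variables.

    The variable names are hypothetical; use whatever your team prefers.
    """
    env = os.environ if env is None else env
    names = ("ARIZE_ORGANIZATION_KEY", "ARIZE_API_KEY")
    # Fail early with a clear message instead of sending empty keys.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return env[names[0]], env[names[1]]
```

With both variables set, `ORGANIZATION_KEY, API_KEY = load_arize_credentials()` can replace the two hardcoded assignments, and the `Client(...)` call stays as written in the cell above.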


## Step 3: Log Training Inferences to Arize

In [None]:
features_df = pd.DataFrame(X_train)
features_df.columns = boston.feature_names
train_predictions_df = pd.DataFrame(train_inferences)
train_actual_labels_df = pd.DataFrame(y_train)
ids_df = pd.DataFrame([str(uuid.uuid4()) for _ in range(len(train_predictions_df))])

responses = arize.log_training_records(
    model_id='boston_house_prices',
    model_version='1.0',
    model_type=ModelTypes.NUMERIC,
    prediction_labels=train_predictions_df,
    actual_labels=train_actual_labels_df,
    features=features_df,
    )
    
## Listen to response code to ensure successful delivery
import concurrent.futures as cf
for response in cf.as_completed(responses):
  res = response.result()
  if res.status_code != 200:
    print(f'future failed with response code {res.status_code}, {res.text}')

print('Step 3 ✅: If no errors showed up, you have sent Training Inferences!')
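
The response-checking loop is repeated after every logging call in this notebook, so it can be factored into a small helper. The sketch below assumes, as the cell above does, that each logging call returns futures whose results expose `status_code` and `text`. The helper name `collect_failures` and the `fake_response` stand-in are hypothetical and exist only so the example runs without an Arize account.

```python
import concurrent.futures as cf
from types import SimpleNamespace


def collect_failures(futures):
    """Wait for every future and return (status_code, text) for each non-200 response."""
    failures = []
    for future in cf.as_completed(futures):
        res = future.result()
        if res.status_code != 200:
            failures.append((res.status_code, res.text))
    return failures


def fake_response(status_code, text=""):
    """Stand-in for an HTTP response; not part of the Arize client."""
    return SimpleNamespace(status_code=status_code, text=text)


# Simulate one successful and one failed delivery.
with cf.ThreadPoolExecutor(max_workers=2) as pool:
    demo_futures = [
        pool.submit(fake_response, 200, "ok"),
        pool.submit(fake_response, 401, "unauthorized"),
    ]
    demo_failures = collect_failures(demo_futures)

print(demo_failures)  # [(401, 'unauthorized')]
```

In the notebook, `collect_failures(responses)` would return an empty list when every record was delivered.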

## Step 4: Log Test Inferences to Arize

In [None]:
"""
Note: In a real workflow, you would log the test set with log_validation.
Here, purely as an example, we treat the test set as the production data
you might send for this model. Production data is logged with
log_bulk_predictions and log_bulk_actuals.
"""

features_df = pd.DataFrame(X_test)
features_df.columns = boston.feature_names
test_predictions_df = pd.DataFrame(test_inferences)
test_actual_labels_df = pd.DataFrame(y_test)
ids_df = pd.DataFrame([str(uuid.uuid4()) for _ in range(len(test_predictions_df))])

# First we log the predictions. We are using log_bulk_predictions since we are sending more than 1 prediction. 
responses = arize.log_bulk_predictions(
    model_id='boston_house_prices',
    model_version='1.0',
    model_type=ModelTypes.NUMERIC,
    prediction_ids=ids_df,
    prediction_labels=test_predictions_df,
    features=features_df,
    )

## Listen to response code to ensure successful delivery of predictions
import concurrent.futures as cf
for response in cf.as_completed(responses):
  res = response.result()
  if res.status_code != 200:
    print(f'future failed with response code {res.status_code}, {res.text}')

# Next, we log the actuals. We are using log_bulk_actuals since we are sending more than 1 actual. 
responses = arize.log_bulk_actuals(
    model_id='boston_house_prices',
    prediction_ids=ids_df, # Pass in the same IDs to match the predictions & actuals. 
    actual_labels=test_actual_labels_df
    )

## Listen to response code to ensure successful delivery of actuals
for response in cf.as_completed(responses):
  res = response.result()
  if res.status_code != 200:
    print(f'future failed with response code {res.status_code}, {res.text}')

print('Step 4 ✅: If no errors showed up, you have successfully sent in Test Inferences!')
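
The reason the same `ids_df` goes to both calls is that predictions and actuals arrive separately and can only be paired through a shared ID. The sketch below illustrates that idea with plain dictionaries and made-up prices. It is a conceptual illustration only, not a description of how Arize performs the match internally.

```python
import uuid

predictions = [24.1, 18.7, 33.0]  # made-up predicted prices
actuals = [22.9, 19.4, 35.2]      # made-up observed prices

# One ID per prediction, generated the same way as ids_df above.
prediction_ids = [str(uuid.uuid4()) for _ in predictions]

# Predictions are known first; actuals arrive later. Both are keyed by ID.
logged_predictions = dict(zip(prediction_ids, predictions))
logged_actuals = dict(zip(prediction_ids, actuals))

# Joining on the shared ID recovers the (prediction, actual) pairs.
joined = {
    pid: (logged_predictions[pid], logged_actuals[pid])
    for pid in logged_predictions
    if pid in logged_actuals
}

# If the actuals were sent with freshly generated IDs, nothing would match.
mismatched_actuals = {str(uuid.uuid4()): a for a in actuals}
unmatched = [pid for pid in logged_predictions if pid not in mismatched_actuals]

print(len(joined), len(unmatched))
```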