# New Relic ML Performance Monitoring - Bring Your Own Data

[ml-performance-monitoring](https://github.com/newrelic-experimental/ml-performance-monitoring) provides a Python library for sending machine learning models' inference data and performance metrics into New Relic. 
<br>
By using this package, you can easily and quickly monitor your model, directly from a Jupyter notebook or a cloud service. 
<br>
The package is ML framework agnostic and can be quickly integrated.
<br>
It is based on the [newrelic-telemetry-sdk-python](https://github.com/newrelic/newrelic-telemetry-sdk-python) library.


This notebook provides an example of sending the inference data and metrics of an XGBoost model to New Relic.

<U>Note</U>: this notebook uses the following libraries:
* numpy
* pandas
* scikit-learn
* xgboost

### 0. Install libraries

In [1]:
!pip3 install git+https://github.com/newrelic-experimental/ml-performance-monitoring.git

In [2]:
!pip3 install numpy==1.21.4 pandas==1.3.5 scikit-learn==1.0.2 xgboost==0.90

### 1. Import libraries

In [3]:
import numpy as np
import pandas as pd
import xgboost as xgb
from sklearn.metrics import mean_squared_error

from ml_performance_monitoring.monitor import MLPerformanceMonitoring

### 2. Load the Boston housing prices dataset and split it into train and test sets

In [4]:
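# load_boston is deprecated since scikit-learn 1.0 and removed in 1.2;
# the version pinned above (1.0.2) still provides it.
from sklearn.datasets import load_boston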
from sklearn.model_selection import train_test_split

boston_dataset = load_boston()
X, y = (
    boston_dataset["data"],
    boston_dataset["target"],
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123
)

In [5]:
print(X_train[:5], y_train[:5])

[[3.51000e-02 9.50000e+01 2.68000e+00 0.00000e+00 4.16100e-01 7.85300e+00
  3.32000e+01 5.11800e+00 4.00000e+00 2.24000e+02 1.47000e+01 3.92780e+02
  3.81000e+00]
 [9.72418e+00 0.00000e+00 1.81000e+01 0.00000e+00 7.40000e-01 6.40600e+00
  9.72000e+01 2.06510e+00 2.40000e+01 6.66000e+02 2.02000e+01 3.85960e+02
  1.95200e+01]
 [1.39140e-01 0.00000e+00 4.05000e+00 0.00000e+00 5.10000e-01 5.57200e+00
  8.85000e+01 2.59610e+00 5.00000e+00 2.96000e+02 1.66000e+01 3.96900e+02
  1.46900e+01]
 [1.22040e-01 0.00000e+00 2.89000e+00 0.00000e+00 4.45000e-01 6.62500e+00
  5.78000e+01 3.49520e+00 2.00000e+00 2.76000e+02 1.80000e+01 3.57980e+02
  6.65000e+00]
 [1.36000e-02 7.50000e+01 4.00000e+00 0.00000e+00 4.10000e-01 5.88800e+00
  4.76000e+01 7.31970e+00 3.00000e+00 4.69000e+02 2.11000e+01 3.96900e+02
  1.48000e+01]] [48.5 17.1 23.1 28.4 18.9]


### 3. Fit an XGBoost regression model

In [6]:
xg_reg = xgb.XGBRegressor(
    objective="reg:squarederror",
    colsample_bytree=0.3,
    learning_rate=0.1,
    max_depth=5,
    alpha=10,
    n_estimators=10,
)
xg_reg.fit(X_train, y_train)

XGBRegressor(alpha=10, base_score=0.5, booster='gbtree', callbacks=None,
             colsample_bylevel=1, colsample_bynode=1, colsample_bytree=0.3,
             early_stopping_rounds=None, enable_categorical=False,
             eval_metric=None, gamma=0, gpu_id=-1, grow_policy='depthwise',
             importance_type=None, interaction_constraints='',
             learning_rate=0.1, max_bin=256, max_cat_to_onehot=4,
             max_delta_step=0, max_depth=5, max_leaves=0, min_child_weight=1,
             missing=nan, monotone_constraints='()', n_estimators=10, n_jobs=0,
             num_parallel_tree=1, predictor='auto', random_state=0,
             reg_alpha=10, ...)

### 4. Predict the test set results

In [7]:
y_pred = xg_reg.predict(X_test)
y_pred

array([ 9.107417 , 17.003168 , 23.554592 , 11.127839 , 18.847708 ,
       14.83594  , 17.920063 ,  7.8613553, 12.147531 , 18.064953 ,
       18.074614 , 14.411147 , 11.660895 , 15.656607 , 14.011221 ,
       14.031893 , 14.247164 , 23.057888 , 13.79595  , 11.450888 ,
       12.218057 , 15.310874 , 20.628437 , 23.554592 , 15.16144  ,
       13.139177 , 11.691522 , 15.382896 , 15.382896 , 11.007488 ,
       14.734173 , 18.862646 ,  8.471523 , 15.349405 , 15.463427 ,
       19.77869  , 15.875313 ,  9.95701  , 12.919248 , 23.60624  ,
       18.295853 , 12.589657 , 14.179869 , 21.97658  , 12.664099 ,
       16.489174 , 14.038502 , 15.660043 , 12.919248 , 14.453838 ,
       18.862646 , 17.252022 , 14.038502 ,  7.9073224, 14.332668 ,
       10.871179 , 10.826269 ,  8.5319   , 19.534363 ,  8.196141 ,
       12.403829 , 14.235118 , 10.42806  , 13.162136 , 14.247164 ,
       15.931118 , 16.565668 ,  9.804545 , 14.915205 , 18.295853 ,
       13.15519  , 15.318869 , 12.919248 , 16.430174 , 11.8543

### 5. Record inference data to New Relic

The `MLPerformanceMonitoring` parameters:
* Required parameters:
   * `model_name` - must be unique per model.
   * `insert_key` - [Get your key](https://one.newrelic.com/launcher/api-keys-ui.api-keys-launcher) (also referred to as `ingest - license`) and either pass it directly or set it as the environment variable `NEW_RELIC_INSERT_KEY`. [Click here](https://docs.newrelic.com/docs/apis/intro-apis/new-relic-api-keys/#license-key) for more details and instructions.

* Optional parameters:
   * `metadata` (dictionary) - will be added to each event (row) of the data.
   * `send_data_metrics` (boolean) - send data metrics (statistics) to New Relic (`False` by default).
   * `features_columns` (list) - the feature names, in the same order as the columns of X.
   * `labels_columns` (list) - the label names, in the same order as the columns of y.

Note: the parameters `features_columns` and `labels_columns` are only relevant when sending the data as an np.array. When the data is sent as DataFrames, the column names of X and y are used as the feature and label names, respectively. If you send your data as an np.array without `features_columns` and `labels_columns`, the names appear in New Relic as "feature_{n}" and "label_{n}", numbered in the order of the features/labels.
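
To make the role of `metadata`, `features_columns`, and `labels_columns` more concrete, here is a rough sketch of how each row of data could be turned into one flat event with the metadata copied onto it. This is an illustration only: `rows_to_events` is a hypothetical helper, the sample values are made up, and the event layout the library actually sends to New Relic may differ.

```python
from typing import Any, Dict, List, Sequence


def rows_to_events(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    features_columns: List[str],
    labels_columns: List[str],
    metadata: Dict[str, Any],
) -> List[Dict[str, Any]]:
    """Hypothetical helper: build one flat dict per row of (X, y)."""
    events = []
    for features, label in zip(X, y):
        event = dict(metadata)  # the same metadata is repeated on every event
        event.update(zip(features_columns, features))  # name -> feature value
        event[labels_columns[0]] = label  # single-label case, for brevity
        events.append(event)
    return events


# Made-up sample rows, using two of the Boston feature names.
events = rows_to_events(
    X=[[0.1, 20.0], [0.3, 0.0]],
    y=[24.5, 18.2],
    features_columns=["CRIM", "ZN"],
    labels_columns=["target"],
    metadata={"environment": "notebook"},
)
print(events[0])
```

The point of the sketch is only that the column names label the values in each row and that the metadata travels with every row.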


5.1 Define monitoring parameters

In [8]:
metadata = {"environment": "notebook", "dataset": "Boston housing prices"}
model_version = "1.0"
features_columns, labels_columns = (
    list(boston_dataset["feature_names"]),
    ["target"],
)

5.2 Create model monitor

In [9]:
insert_key = None

ml_monitor = MLPerformanceMonitoring(
    insert_key=insert_key,  # set the environment variable NEW_RELIC_INSERT_KEY or send your insert key here
    model_name="XGBoost Regression on Boston housing Dataset",
    metadata=metadata,
    send_data_metrics=True,
    features_columns=features_columns,
    labels_columns=labels_columns,
    label_type="numeric",
    model_version=model_version
)
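
If you would rather not leave the key in the notebook, one option is to resolve it from the environment before creating the monitor. The snippet below is a minimal sketch: `resolve_insert_key` is a hypothetical helper (not part of the library), and it assumes the key is stored in the `NEW_RELIC_INSERT_KEY` environment variable mentioned above.

```python
import os
from typing import Optional


def resolve_insert_key(explicit_key: Optional[str] = None) -> str:
    """Hypothetical helper: prefer an explicit key, else the environment variable."""
    key = explicit_key or os.environ.get("NEW_RELIC_INSERT_KEY")
    if not key:
        # Fail early with a clear message instead of sending data without a key.
        raise ValueError(
            "No insert key found: pass one explicitly or set NEW_RELIC_INSERT_KEY."
        )
    return key
```

With this in place, `insert_key = resolve_insert_key()` could replace `insert_key = None` in the cell above, so a missing key is reported before any data is recorded.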

5.3 Send your data as an np.array

In [None]:
ml_monitor.record_inference_data(X=X_test, y=y_pred)

5.4 Send your data as a pd.DataFrame

In [10]:
# Wrap the test features and the predictions in DataFrames;
# the column names become the feature and label names in New Relic.
X_df = pd.DataFrame(X_test, columns=features_columns)
y_pred_df = pd.DataFrame(y_pred, columns=labels_columns)
X_df.head()

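|   | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT |
|---|------|----|-------|------|-----|----|-----|-----|-----|-----|---------|---|-------|
| 0 | 51.1358 | 0.0 | 18.1 | 0.0 | 0.597 | 5.757 | 100.0 | 1.413 | 24.0 | 666.0 | 20.2 | 2.6 | 10.11 |
| 1 | 0.05735 | 0.0 | 4.49 | 0.0 | 0.449 | 6.63 | 56.1 | 4.4377 | 3.0 | 247.0 | 18.5 | 392.3 | 6.53 |
| 2 | 0.03578 | 20.0 | 3.33 | 0.0 | 0.4429 | 7.82 | 64.5 | 4.6947 | 5.0 | 216.0 | 14.9 | 387.31 | 3.76 |
| 3 | 12.0482 | 0.0 | 18.1 | 0.0 | 0.614 | 5.648 | 87.6 | 1.9512 | 24.0 | 666.0 | 20.2 | 291.55 | 14.1 |
| 4 | 0.0315 | 95.0 | 1.47 | 0.0 | 0.403 | 6.975 | 15.3 | 7.6534 | 3.0 | 402.0 | 17.0 | 396.9 | 4.56 |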


In [None]:
ml_monitor.record_inference_data(X=X_df, y=y_pred_df)

In [11]:
y_pred_df.head()

|   | target |
|---|--------|
| 0 | 9.107417 |
| 1 | 17.003168 |
| 2 | 23.554592 |
| 3 | 11.127839 |
| 4 | 18.847708 |


### 6. Record metrics to New Relic
You can stream custom metrics to New Relic to monitor your model's performance or its data. These metrics are sent to NRDB as [metric data](https://docs.newrelic.com/docs/data-apis/ingest-apis/metric-api/introduction-metric-api/).

In [12]:
rmse = round(np.sqrt(mean_squared_error(y_test, y_pred)), 3)
print(f"RMSE: {rmse}")

RMSE: 10.517
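
For reference, RMSE is the square root of the mean squared difference between the true and predicted values. The sketch below computes it by hand on made-up numbers, using only the standard library; `rmse` is an illustrative helper and not part of the notebook or the library.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error of two equally long, non-empty sequences."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Errors are 1, 0 and -2, so the mean squared error is 5 / 3.
print(round(rmse([3.0, 5.0, 7.0], [2.0, 5.0, 9.0]), 3))  # prints 1.291
```

Because the errors are squared before averaging, a few large misses raise RMSE more than many small ones.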


In [None]:
metrics = {
    "RMSE": rmse,
}
ml_monitor.record_metrics(metrics=metrics)
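
The metrics dictionary can hold more than one entry. As a sketch of what else might be tracked for a regression model, the hypothetical helper below computes the mean absolute error and R² with the standard library only. The metric names `MAE` and `R2` are illustrative choices, not names required by the library.

```python
from typing import Dict, Sequence


def extra_regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """Hypothetical helper: MAE and R^2, rounded to three decimals."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true values are identical")
    return {
        "MAE": round(sum(abs(e) for e in errors) / n, 3),
        "R2": round(1 - ss_res / ss_tot, 3),
    }


# Made-up values: errors are 1, 0 and -2.
print(extra_regression_metrics([3.0, 5.0, 7.0], [2.0, 5.0, 9.0]))
```

In the notebook, the result could be merged into the dictionary before sending, for example `metrics.update(extra_regression_metrics(y_test.tolist(), y_pred.tolist()))` followed by `ml_monitor.record_metrics(metrics=metrics)`.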

### 7. Monitor and alert
Done! Check your application in the [New Relic UI](https://one.newrelic.com/nr1-core?filters=%28domain%20%3D%20%27MLOPS%27%20AND%20type%20%3D%20%27MACHINE_LEARNING_MODEL%27%29) to see the real-time data.