# <span style="font-weight:bold; font-size: 3rem; color:#1EB182;"><img src="../images/icon102.png" width="38px"></img> **Hopsworks Feature Store** </span><span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 04: Monitoring Pipeline</span>


## 🗒️ This notebook is divided into the following sections:

1. Fetch the model and corresponding feature view to read logs.
2. Use the logs to monitor the model for drift.

In [1]:
import hopsworks
import plotly.graph_objects as go
from datetime import datetime, timedelta, timezone

## Connect to Hopsworks
Establish a connection to the Hopsworks feature store.

In [2]:
project = hopsworks.login()

fs = project.get_feature_store()

mr = project.get_model_registry()

2025-11-21 13:37:26,420 INFO: Initializing external client
2025-11-21 13:37:26,420 INFO: Base URL: https://10.87.45.81:28181
2025-11-21 13:37:27,200 INFO: Python Engine initialized.

Logged in to project, explore it here https://10.87.45.81:28181/p/120


## Fetch the model to be monitored and its feature view

In [47]:
retrieved_model = mr.get_model(
    name="xgboost_fraud_batch_model",
    version=1,
)

# Get the feature view used to train the model
feature_view = retrieved_model.get_feature_view() 

2025-11-21 08:31:19,586 INFO: Initializing for batch retrieval of feature vectors


## Read the required logs from the feature view.

In the example below, we read the logs written during the last day.

In [48]:
# Read the logs written during the last day.
# "Now" is computed once so that both bounds derive from the same instant.
end_time = datetime.now(timezone.utc)
start_time = end_time - timedelta(days=1)

logs = feature_view.read_log(start_time=start_time, end_time=end_time)
logs.head()


Finished: Reading data from Hopsworks, using Hopsworks Feature Query Service (1.79s) 


| | request_id | td_version | model_name | model_version | request_parameters | cc_num | datetime | tid | predicted_fraud_label | category | amount | age_at_transaction | days_until_card_expires | loc_delta | trans_volume_mstd | trans_volume_mavg | trans_freq | loc_delta_mavg | label_encoder_category_ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | 1 | xgboost_fraud_batch_model | 1 | | 4697111123445202 | 1970-01-01 00:27:22.492684 | 28c23357b139d6cf4013f0764c5c6271 | 0 | Health/Beauty | 65.38 | 41.589951 | 651.668009 | 0.618005 | 0.0 | 65.38 | 1.0 | 0.618005 | 5 |
| 1 | | 1 | xgboost_fraud_batch_model | 1 | | 4939322191531264 | 1970-01-01 00:27:28.621675 | ddb8056813a538dd7c3bf4070be252c5 | 0 | Cash Withdrawal | 56.33 | 50.817176 | 31.730613 | 0.020371 | 0.0 | 56.33 | 1.0 | 0.020371 | 0 |
| 2 | | 1 | xgboost_fraud_batch_model | 1 | | 4387069901220731 | 1970-01-01 00:27:21.289398 | 78c51655f4717a157910523997c90d4e | 0 | Health/Beauty | 89.95 | 17.946315 | 1000.594931 | 0.194714 | 0.0 | 89.95 | 1.0 | 0.194714 | 5 |
| 3 | | 1 | xgboost_fraud_batch_model | 1 | | 4862825903962339 | 1970-01-01 00:27:22.148729 | 361b4594e1b09eec37da7587fcd089e0 | 0 | Health/Beauty | 76.25 | 66.105071 | 1537.64897 | 0.484752 | 0.0 | 76.25 | 1.0 | 0.484752 | 5 |
| 4 | | 1 | xgboost_fraud_batch_model | 1 | | 4160806832774853 | 1970-01-01 00:27:23.751245 | 04e31d48d7812c75504c8caed971193b | 0 | Cash Withdrawal | 94.36 | 67.640818 | 1763.101331 | 8.8e-05 | 0.0 | 94.36 | 1.0 | 8.8e-05 | 0 |


## Visualize the distributions of model input features and training dataset features

In [72]:
model_training_dataset_version = retrieved_model.get_training_dataset_provenance().accessible[0].version

In [73]:
X_train, X_test, y_train, y_test = feature_view.get_train_test_split(model_training_dataset_version)

Finished: Reading data from Hopsworks, using Hopsworks Feature Query Service (1.65s) 
2025-11-21 08:43:46,672 INFO: Profiling dataframe in Python Engine
2025-11-21 08:43:47,446 INFO: Profiling dataframe in Python Engine
2025-11-21 08:43:47,461 INFO: Profiling dataframe in Python Engine


In [80]:
# Create the figure
fig = go.Figure()

# Add first histogram
fig.add_trace(go.Histogram(
    x=X_train['amount'],
    name='Training Data',
    opacity=0.6
))

# Add second histogram
fig.add_trace(go.Histogram(
    x=logs['amount'],
    name='Model Input',
    opacity=0.6
))

# Overlay histograms
fig.update_layout(
    barmode='overlay',
    title='Distribution Comparison - amount',
    xaxis_title='amount',
    yaxis_title='Count'
)

fig.show()
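The overlaid histograms give a visual impression of drift. A single number can complement them. The following is a minimal sketch of the Population Stability Index (PSI) computed over equal-width bins. It is not part of Hopsworks or Plotly: `population_stability_index` is a hypothetical helper, and the synthetic lists stand in for `X_train['amount']` and `logs['amount']`.

```python
import math
import random


def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between two numeric samples, binned over the range of `expected`."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant sample

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = min(max(int((v - lo) / width), 0), n_bins - 1)
            counts[idx] += 1
        # eps avoids log(0) and division by zero for empty bins.
        return [max(c / len(values), eps) for c in counts]

    e, a = bin_fractions(expected), bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Synthetic stand-ins (assumptions) for the training and logged "amount" columns.
rng = random.Random(0)
train_amount = [rng.gauss(70, 15) for _ in range(5000)]
same_amount = [rng.gauss(70, 15) for _ in range(5000)]
shifted_amount = [rng.gauss(90, 15) for _ in range(5000)]

psi_same = population_stability_index(train_amount, same_amount)
psi_shifted = population_stability_index(train_amount, shifted_amount)
print(f"PSI (same distribution):    {psi_same:.4f}")
print(f"PSI (shifted distribution): {psi_shifted:.4f}")
```

A PSI near zero suggests the logged inputs resemble the training data, and larger values indicate a stronger shift. The number of bins and any alert threshold are choices to tune for each feature.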

## Integrating with third-party libraries
Once the logs are stored, you can read them back and pass them to other model monitoring tools, such as nannyml.

Here is an example of nannyml's univariate drift detection applied to the logged features.

In [81]:
! pip install nannyml --quiet

In [82]:
import nannyml as nml

In [84]:
feature_column_names = ["amount", "label_encoder_category_"]

In [102]:
univariate_calculator = nml.UnivariateDriftCalculator(
    column_names=feature_column_names,
    chunk_size=10000
)
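With `chunk_size=10000`, drift is evaluated per chunk of 10,000 rows rather than once over the whole dataset. The sketch below illustrates the idea of size-based chunking with plain Python. It is not nannyml's implementation: `split_into_chunks` is a hypothetical helper, the data is synthetic, and nannyml's handling of a trailing partial chunk may differ from what is shown here.

```python
def split_into_chunks(rows, chunk_size):
    """Yield consecutive slices of `rows` holding at most `chunk_size` items."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


# Synthetic stand-in (assumption) for a logged feature column:
# the value jumps from 50 to 80 after row 20,000.
amounts = [50.0] * 20_000 + [80.0] * 5_000

# One summary statistic per chunk; this sketch keeps the trailing partial chunk.
chunk_means = [
    sum(chunk) / len(chunk) for chunk in split_into_chunks(amounts, 10_000)
]
print(chunk_means)
```

Smaller chunks react faster to a change but give noisier per-chunk statistics. Larger chunks are more stable but can dilute a short-lived shift.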

In [103]:

univariate_calculator.fit(X_train)

<nannyml.drift.univariate.calculator.UnivariateDriftCalculator at 0x334db7f90>

In [104]:
univariate_drift = univariate_calculator.calculate(logs)

In [105]:
univariate_drift.plot()
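To give some intuition for what a univariate drift score measures, the sketch below computes the Jensen-Shannon distance, one of the univariate methods nannyml offers, between two categorical samples. It is a standalone illustration and not nannyml's code: `jensen_shannon_distance` is a hypothetical helper, the samples are synthetic stand-ins for `label_encoder_category_`, and base-2 logarithms are assumed so the result lies between 0 and 1.

```python
import math
from collections import Counter


def jensen_shannon_distance(reference, analysis):
    """Jensen-Shannon distance (base-2 logs) between two categorical samples."""
    categories = sorted(set(reference) | set(analysis))
    ref_counts, ana_counts = Counter(reference), Counter(analysis)
    p = [ref_counts[c] / len(reference) for c in categories]
    q = [ana_counts[c] / len(analysis) for c in categories]
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]  # mixture distribution

    def kl(a, b):
        # Kullback-Leibler divergence; terms with a_i == 0 contribute nothing.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    divergence = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(divergence, 0.0))


# Synthetic encoded categories (assumptions), e.g. 0 and 5 as in the logs above.
reference = [0] * 50 + [5] * 50
identical = [5] * 50 + [0] * 50
disjoint = [1] * 100

js_identical = jensen_shannon_distance(reference, identical)
js_disjoint = jensen_shannon_distance(reference, disjoint)
print(js_identical, js_disjoint)
```

A distance of 0 means the two samples have the same category frequencies. A distance of 1 means they share no categories at all.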