# <span style="font-weight:bold; font-size: 3rem; color:#1EB182;"><img src="../../images/icon102.png" width="38px"></img> **Hopsworks Feature Store** </span><span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 03: Batch Inference</span>


## 🗒️ This notebook is divided into the following sections:

1. Load batch data.
2. Predict using model from Model Registry.

## <span style='color:#ff5f27'> 📝 Imports</span>

In [1]:
from xgboost import XGBClassifier

## <span style="color:#ff5f27;"> 📡 Connecting to Hopsworks Model Registry </span>

In [2]:
import hopsworks

project = hopsworks.login()

# Get the model registry
mr = project.get_model_registry()

2025-11-21 16:52:41,831 INFO: Initializing external client
2025-11-21 16:52:41,831 INFO: Base URL: https://10.87.45.81:28181
2025-11-21 16:52:42,391 INFO: Python Engine initialized.

Logged in to project, explore it here https://10.87.45.81:28181/p/120


## <span style='color:#ff5f27'>🚀 Fetch and test the model</span>

You can now start making predictions with your model. First, retrieve it from the Hopsworks Model Registry.

In [3]:
# Retrieve the model from the model registry
retrieved_model = mr.get_model(
    name="xgboost_fraud_batch_model",
    version=1,
)

# Download the saved model files to a local directory
saved_model_dir = retrieved_model.download()

Downloading: 0.000%|          | 0/203340 elapsed<00:00 remaining<?

Downloading model artifact (0 dirs, 1 files)... 

Downloading: 0.000%|          | 0/95827 elapsed<00:00 remaining<?

Downloading model artifact (1 dirs, 2 files)... DONE

In [4]:
# Initialize the model
model = XGBClassifier()

# Load the model from a saved JSON file
model.load_model(saved_model_dir + "/model.json")
model

| Parameter | Value |
|---|---|
| objective | 'binary:logistic' |
| base_score | [0.0014511199] |
| booster | 'gbtree' |
| callbacks | |
| colsample_bylevel | |
| colsample_bynode | |
| colsample_bytree | |
| device | |
| early_stopping_rounds | |
| enable_categorical | False |
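
The load step above builds the model path by string concatenation and assumes the downloaded artifact contains a `model.json` file. As a minimal sketch, the path can be built with `pathlib` and checked before loading. The helper name `resolve_model_file` is hypothetical and is not part of Hopsworks or XGBoost.

```python
from pathlib import Path


def resolve_model_file(saved_model_dir, filename="model.json"):
    """Return the path to the model file, failing early if it is missing.

    Hypothetical helper for illustration; `filename` defaults to the name
    used in this notebook.
    """
    model_path = Path(saved_model_dir) / filename
    if not model_path.is_file():
        raise FileNotFoundError(f"Expected model file not found: {model_path}")
    return model_path
```

In the notebook, this could be used as `model.load_model(str(resolve_model_file(saved_model_dir)))`, which raises a clear error if the artifact layout differs from what is expected.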


## <span style="color:#ff5f27;"> ⚙️ Feature View Retrieval</span>


In [5]:
# Retrieve the feature view ('transactions_view_fraud_batch_fv') associated with the model
feature_view = retrieved_model.get_feature_view()

2025-11-21 16:52:46,945 INFO: Initializing for batch retrieval of feature vectors


---
## <span style="color:#ff5f27;">🔮  Batch Prediction </span>

Batch data can be fetched from Hopsworks with the feature view's `get_batch_data` function.

Setting `logging_data` to `True` makes Hopsworks fetch all the required logging metadata, such as untransformed features, event time, and serving keys, and save it in an attribute called `hopsworks_logging_data` on the returned dataframe.

When this dataframe is passed to the `log` function, Hopsworks automatically extracts the required data and inserts it into the logging feature group.
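
The mechanism described above can be pictured with a small toy model. This is only a sketch of the idea (metadata travelling as an attribute on the returned batch), not Hopsworks' implementation: the `Batch` class, `get_batch`, `build_log_rows`, and the sample key values are all hypothetical, and only the column names `cc_num`, `datetime`, and `tid` are taken from this notebook's logging schema.

```python
# Toy model of the idea; NOT how Hopsworks implements it.
LOG_KEY_COLUMNS = ["cc_num", "datetime", "tid"]  # subset of the logged columns


class Batch(list):
    """A list of feature rows that can also carry extra attributes."""


def get_batch(feature_rows, key_rows, logging_data=False):
    """Return the feature rows, optionally attaching the keys as an attribute."""
    batch = Batch(feature_rows)
    if logging_data:
        batch.hopsworks_logging_data = key_rows
    return batch


def build_log_rows(batch, predictions):
    """Combine features, predictions, and any attached metadata into log rows."""
    metadata = getattr(batch, "hopsworks_logging_data", None)
    log_rows = []
    for i, (features, prediction) in enumerate(zip(batch, predictions)):
        keys = metadata[i] if metadata is not None else {}
        # Columns that cannot be found in the metadata are set to None.
        row = {column: keys.get(column) for column in LOG_KEY_COLUMNS}
        row.update(features)
        row["predicted_fraud_label"] = prediction
        log_rows.append(row)
    return log_rows


# Made-up rows for illustration only.
features = [{"amount": 94.39}, {"amount": 72.32}]
keys = [
    {"cc_num": 1, "datetime": "2025-01-01", "tid": "a"},
    {"cc_num": 2, "datetime": "2025-01-02", "tid": "b"},
]

with_metadata = build_log_rows(get_batch(features, keys, logging_data=True), [0, 0])
without_metadata = build_log_rows(get_batch(features, keys), [0, 0])

print(with_metadata[0]["tid"])     # a
print(without_metadata[0]["tid"])  # None
```

In this toy version, a batch fetched without logging metadata still produces log rows, but the key columns are empty.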

In [6]:
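# Initialize batch scoring with training dataset version 1
feature_view.init_batch_scoring(1)

# Fetch the batch data. With logging_data=False no logging metadata is attached;
# pass logging_data=True to include it.
batch_data = feature_view.get_batch_data(logging_data=False)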

# Display the first 3 rows
batch_data.head(3)

Finished: Reading data from Hopsworks, using Hopsworks Feature Query Service (4.60s) 


| | amount | age_at_transaction | days_until_card_expires | loc_delta | trans_volume_mstd | trans_volume_mavg | trans_freq | loc_delta_mavg | label_encoder_category_ |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 94.39 | 79.296388 | 1095.818322 | 0.0 | 0.0 | 94.39 | 1.0 | 0.0 | 8 |
| 1 | 72.32 | 79.296728 | 1095.69441 | 0.154888 | 15.605847 | 83.355 | 2.0 | 0.077444 | 1 |
| 2 | 84.45 | 79.302222 | 1093.689097 | 0.223899 | 0.0 | 84.45 | 1.0 | 0.223899 | 4 |


In [7]:
# Make predictions on the batch data
predictions = model.predict(batch_data)
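
Before logging, it can help to check how the predicted labels are distributed. The sketch below counts each label and computes the share flagged as fraud. It assumes that label `1` means fraud, which this notebook does not state explicitly, and `summarize_predictions` is a hypothetical helper name.

```python
from collections import Counter


def summarize_predictions(predictions):
    """Count each predicted label and compute the share equal to 1.

    Assumption: label 1 marks a transaction predicted as fraud.
    """
    labels = [int(p) for p in predictions]
    counts = Counter(labels)
    total = len(labels)
    fraud_share = counts[1] / total if total else 0.0
    return {"total": total, "counts": dict(counts), "fraud_share": fraud_share}


# Made-up labels for illustration; in the notebook you would pass `predictions`.
summary = summarize_predictions([0, 0, 1, 0])
print(summary)
```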

---
## <span style="color:#ff5f27;">🔮  Logging Prediction along with Features for Monitoring </span>

Log the features together with the predictions made by the model. These logs can later be read to monitor the model's performance and detect drift.
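
As a rough illustration of what detecting drift can mean for a single numeric feature, the sketch below compares the mean of current values against a reference sample, measured in units of the reference standard deviation. This is one simple heuristic and not the monitoring method used by Hopsworks. The function names, the sample values, and the `0.5` threshold are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def standardized_mean_shift(reference, current):
    """Return how far the current mean moved, in units of the reference std."""
    reference_std = stdev(reference)
    if reference_std == 0:
        raise ValueError("Reference values are constant; the shift is undefined.")
    return abs(mean(current) - mean(reference)) / reference_std


def has_drifted(reference, current, threshold=0.5):
    """Flag drift when the standardized shift exceeds an arbitrary threshold."""
    return standardized_mean_shift(reference, current) > threshold


# Made-up feature values for illustration.
reference = [10.0, 20.0, 30.0, 40.0]
shifted = [50.0, 60.0, 70.0, 80.0]

print(has_drifted(reference, reference))  # False
print(has_drifted(reference, shifted))    # True
```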

In [8]:
# Log the batch data along with the predictions made.
feature_view.log(batch_data, predictions=predictions, model=retrieved_model)

2025-11-21 16:52:55,497 INFO: The following columns : `category`, `cc_num`, `datetime`, `request_id`, `request_parameters`, `tid` are missing in the logged dataframe. Setting them to None.


Uploading Dataframe: 100.00% |██████████| Rows 105092/105092 | Elapsed Time: 00:21 | Remaining Time: 00:00


[(Job('transactions_view_fraud_batch_fv_1_log_1_offline_fg_materialization', 'PYSPARK'),
  None)]

In [12]:
# Pause the scheduled materialization job and materialize the log manually
feature_view.pause_logging()
feature_view.materialize_log(wait=True)

Launching job: transactions_view_fraud_batch_fv_1_log_1_offline_fg_materialization
Job started successfully, you can follow the progress at 
https://10.87.45.81:28181/p/120/jobs/named/transactions_view_fraud_batch_fv_1_log_1_offline_fg_materialization/executions
2025-11-21 16:56:23,454 INFO: Waiting for execution to finish. Current state: SUBMITTED. Final status: UNDEFINED
2025-11-21 16:56:26,539 INFO: Waiting for execution to finish. Current state: RUNNING. Final status: UNDEFINED
2025-11-21 16:57:58,897 INFO: Waiting for execution to finish. Current state: AGGREGATING_LOGS. Final status: SUCCEEDED
2025-11-21 16:57:58,975 INFO: Waiting for log aggregation to finish.
2025-11-21 16:58:07,355 INFO: Execution finished successfully.


[Job('transactions_view_fraud_batch_fv_1_log_1_offline_fg_materialization', 'PYSPARK')]

In [13]:
# read untransformed log
feature_view.read_log().head(3)


Finished: Reading data from Hopsworks, using Hopsworks Feature Query Service (9.64s) 


| | request_id | td_version | model_name | model_version | request_parameters | cc_num | datetime | tid | predicted_fraud_label | category | amount | age_at_transaction | days_until_card_expires | loc_delta | trans_volume_mstd | trans_volume_mavg | trans_freq | loc_delta_mavg | label_encoder_category_ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | 1 | xgboost_fraud_batch_model | 1 | | | NaT | | 0 | | 86.88 | 44.593532 | 1060.360903 | 0.176381 | 0.0 | 86.88 | 1.0 | 0.176381 | 4 |
| 1 | | 1 | xgboost_fraud_batch_model | 1 | | | NaT | | 0 | | 15.38 | 100.021841 | 1680.02787 | 0.382512 | 0.0 | 15.38 | 1.0 | 0.382512 | 4 |
| 2 | | 1 | xgboost_fraud_batch_model | 1 | | | NaT | | 0 | | 100.4 | 69.515179 | 491.959525 | 0.337089 | 0.0 | 100.4 | 1.0 | 0.337089 | 0 |


In [14]:
# read logs
feature_view.read_log().head(3)


Finished: Reading data from Hopsworks, using Hopsworks Feature Query Service (9.54s) 


| | request_id | td_version | model_name | model_version | request_parameters | cc_num | datetime | tid | predicted_fraud_label | category | amount | age_at_transaction | days_until_card_expires | loc_delta | trans_volume_mstd | trans_volume_mavg | trans_freq | loc_delta_mavg | label_encoder_category_ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | | 1 | xgboost_fraud_batch_model | 1 | | | NaT | | 0 | | 86.88 | 44.593532 | 1060.360903 | 0.176381 | 0.0 | 86.88 | 1.0 | 0.176381 | 4 |
| 1 | | 1 | xgboost_fraud_batch_model | 1 | | | NaT | | 0 | | 15.38 | 100.021841 | 1680.02787 | 0.382512 | 0.0 | 15.38 | 1.0 | 0.382512 | 4 |
| 2 | | 1 | xgboost_fraud_batch_model | 1 | | | NaT | | 0 | | 100.4 | 69.515179 | 491.959525 | 0.337089 | 0.0 | 100.4 | 1.0 | 0.337089 | 0 |


---
## <span style="color:#ff5f27;">⏭️ **Next:** Part 04: Model Monitoring</span>

In the following notebook you will monitor your model using the logs that have been written.