# <span style="font-weight:bold; font-size: 3rem; color:#1EB182;"><img src="../images/icon102.png" width="38px"></img> **Hopsworks Feature Store** </span><span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 04: Batch Inference</span>

## 🗒️ This notebook is divided into the following sections:

1. Load batch data.
2. Predict using the model from the Model Registry.

## <span style='color:#ff5f27'> 📝 Imports</span>

In [None]:
import joblib
import datetime
import time
import pandas as pd

## <span style="color:#ff5f27;"> 📡 Connecting to Hopsworks Feature Store </span>

In [None]:
import hopsworks

project = hopsworks.login()

fs = project.get_feature_store()

## <span style="color:#ff5f27;"> ⚙️ Feature View Retrieval</span>


In [None]:
# Retrieve the 'air_quality_fv' feature view
feature_view = fs.get_feature_view(
    name='air_quality_fv',
    version=1,
)

## <span style="color:#ff5f27;">🗄 Model Registry</span>


In [None]:
# Retrieve the model registry
mr = project.get_model_registry()

## <span style="color:#ff5f27;">🪝 Retrieving model from Model Registry</span>

In [None]:
# Retrieving the 'air_quality_xgboost_model' from the model registry
retrieved_model = mr.get_model(
    name="air_quality_xgboost_model",
    version=1,
)

# Downloading the saved model artifacts to a local directory
saved_model_dir = retrieved_model.download()

In [None]:
# Loading the XGBoost regressor model and label encoder from the saved model directory
retrieved_xgboost_model = joblib.load(saved_model_dir + "/xgboost_regressor.pkl")
retrieved_encoder = joblib.load(saved_model_dir + "/label_encoder.pkl")

# Displaying the retrieved XGBoost regressor model
retrieved_xgboost_model

## <span style="color:#ff5f27;">✨ Load Batch Data for the Last 30 Days</span>

Next, fetch the batch data to score: the rows served by the feature view from the last 30 days.

In [None]:
# Getting the current date
today = datetime.date.today()

# Calculating a date threshold 30 days ago from the current date
date_threshold = today - datetime.timedelta(days=30)

# Displaying the date threshold as a string
str(date_threshold)
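
The look-back window above depends on the day the notebook runs, so its start date changes from run to run. The sketch below shows the same arithmetic with a fixed reference date so the result is reproducible. The helper name `batch_window_start` and the reference date are hypothetical and are not part of the tutorial.

```python
import datetime


def batch_window_start(today: datetime.date, days: int = 30) -> datetime.date:
    """Return the first date of a look-back window that ends at `today`."""
    return today - datetime.timedelta(days=days)


# Hypothetical fixed reference date, used instead of datetime.date.today()
reference_day = datetime.date(2024, 3, 31)
window_start = batch_window_start(reference_day)

print(window_start.isoformat())  # 2024-03-01
```

Passing the date in as an argument, rather than calling `datetime.date.today()` inside the helper, makes it easy to re-score a past window.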

In [None]:
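# Initializing batch scoring with training dataset version 1, so that any
# transformation functions use the statistics computed for that version
feature_view.init_batch_scoring(1)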

# Retrieving batch data from the feature view with a start time set to the date threshold
batch_data = feature_view.get_batch_data(start_time=date_threshold)

### <span style="color:#ff5f27;">🤖 Making the predictions</span>

In [None]:
# Transforming the 'city_name' column in the batch data using the retrieved label encoder
encoded = retrieved_encoder.transform(batch_data['city_name'])

# Appending the encoded values as 'city_name_encoded'. Assigning the array directly
# keeps rows aligned even when the batch data index is not 0..n-1, which
# pd.concat with a freshly built DataFrame does not guarantee.
X_batch = batch_data.assign(city_name_encoded=encoded)

# Dropping the columns that are not model inputs ('date', 'city_name', 'unix_time')
X_batch = X_batch.drop(columns=['date', 'city_name', 'unix_time'])

# Extracting the target variable 'pm2_5' from the batch data
y_batch = X_batch.pop('pm2_5')
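
The encoding step relies on the encoder having seen every city during training. The sketch below mimics, in plain Python, the mapping that a fitted label encoder applies. It assumes the saved encoder behaves like scikit-learn's `LabelEncoder` (sorted classes mapped to `0..n-1`, with an error on unseen labels); the notebook does not state which encoder class was saved. The function names and city values are hypothetical.

```python
def fit_label_mapping(labels):
    """Map each distinct label to an integer code, in sorted label order."""
    return {label: code for code, label in enumerate(sorted(set(labels)))}


def transform_labels(mapping, labels):
    """Encode labels with a fitted mapping; fail loudly on unseen labels."""
    unseen = sorted(set(labels) - set(mapping))
    if unseen:
        raise ValueError(f"labels not seen during fitting: {unseen}")
    return [mapping[label] for label in labels]


# Hypothetical city names standing in for the training data
training_cities = ["Paris", "Berlin", "London", "Paris"]
mapping = fit_label_mapping(training_cities)

# Encoding a batch that contains only known cities
codes = transform_labels(mapping, ["London", "Berlin", "Paris"])
print(codes)  # [1, 0, 2]
```

If the batch data contains a city that was added after the model was trained, the transform step fails rather than silently producing a wrong code, which is usually the safer behavior.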

In [None]:
# Making predictions on the batch data using the retrieved XGBoost regressor model
predictions = retrieved_xgboost_model.predict(X_batch)

# Displaying the first 5 predictions
predictions[:5]
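
The target values in `y_batch` were extracted above but are not used afterwards. One possible use is to compare them with the predictions, for example with the root mean squared error (RMSE). The sketch below is a plain-Python illustration with made-up numbers. The `rmse` helper is hypothetical, and the tutorial does not prescribe a particular metric.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one value is required")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up PM2.5 values for illustration only
observed = [10.0, 12.0, 14.0]
predicted = [11.0, 12.0, 12.0]

score = rmse(observed, predicted)
print(round(score, 3))  # 1.291
```

In the notebook this could be called as `rmse(list(y_batch), list(predictions))`, assuming `y_batch` has no missing values.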

---
## <span style="color:#ff5f27;">👾 Now try out the Streamlit App!</span>

In [None]:
!python3 -m streamlit run streamlit_app.py

---

### <span style="color:#ff5f27;">🥳 <b> Next Steps </b> </span>
Congratulations, you've now completed the Air Quality tutorial for Managed Hopsworks.

Check out our other tutorials at ➡ https://github.com/logicalclocks/hopsworks-tutorials

Or documentation at ➡ https://docs.hopsworks.ai