# <span style="font-weight:bold; font-size: 3rem; color:#1EB182;"><img src="../images/icon102.png" width="38px"></img> **Hopsworks Feature Store** </span><span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 04: Batch Inference</span>

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/air_quality/4_air_quality_batch_inference.ipynb)

## 🗒️ This notebook is divided into the following sections:

1. Load batch data.
2. Predict using model from Model Registry.

## <span style='color:#ff5f27'> 📝 Imports </span>

In [None]:
import joblib
import datetime
import time
import pandas as pd

## <span style="color:#ff5f27;"> 📡 Connecting to Hopsworks Feature Store </span>

In [None]:
import hopsworks

project = hopsworks.login()

fs = project.get_feature_store()

## <span style="color:#ff5f27;"> ⚙️ Feature View Retrieval</span>


In [None]:
feature_view = fs.get_feature_view(
    name='air_quality_fv',
    version=1,
)

## <span style="color:#ff5f27;">🗄 Model Registry</span>


In [None]:
mr = project.get_model_registry()

## <span style="color:#ff5f27;">🪝 Retrieving model from Model Registry</span>

In [None]:
retrieved_model = mr.get_model(
    name="air_quality_xgboost_model",
    version=1,
)
saved_model_dir = retrieved_model.download()

In [None]:
retrieved_xgboost_model = joblib.load(saved_model_dir + "/xgboost_regressor.pkl")
retrieved_encoder = joblib.load(saved_model_dir + "/label_encoder.pkl")
retrieved_xgboost_model

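## <span style="color:#ff5f27;">✨ Load Batch Data for the Last 30 Days</span>

First, compute a date threshold 30 days in the past, then fetch the feature data recorded since that date. The feature view is initialized for batch scoring against version 1 of the training dataset that you created in the previous notebook.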

In [None]:
today = datetime.date.today()
date_threshold = today - datetime.timedelta(days=30)
str(date_threshold)
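To make the date arithmetic concrete, here is a small standard-library sketch of how a 30-day threshold selects recent rows. The fixed date, the sample readings, the `example_*` names, and the inclusive `>=` comparison are all assumptions for illustration; in the notebook the filtering is performed by `get_batch_data` through its `start_time` argument, not by code like this.

```python
import datetime

# Fixed "today" so the example is reproducible (hypothetical value)
example_today = datetime.date(2024, 3, 31)
example_threshold = example_today - datetime.timedelta(days=30)

# Hypothetical (date, pm2_5) readings
readings = [
    (datetime.date(2024, 2, 15), 12.4),
    (datetime.date(2024, 3, 1), 9.8),
    (datetime.date(2024, 3, 20), 15.1),
]

# Keep only readings on or after the threshold (inclusive boundary assumed)
recent = [(day, value) for day, value in readings if day >= example_threshold]

print(example_threshold)
print(len(recent))
```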

In [None]:
start_of_cell = time.time()

feature_view.init_batch_scoring(training_dataset_version=1)
batch_data = feature_view.get_batch_data(start_time=date_threshold)

end_of_cell = time.time()
print(f"Took {round(end_of_cell - start_of_cell, 2)} sec.\n")

### <span style="color:#ff5f27;">🤖 Making the predictions</span>

In [None]:
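# Encode city names with the encoder retrieved from the Model Registry
encoded = retrieved_encoder.transform(batch_data['city_name'])

# Drop the raw date, city name, and timestamp columns, then attach the
# encoded city as a new column. Assigning the array positionally avoids the
# index misalignment that pd.concat(axis=1) can cause when batch_data does
# not have a default 0..n-1 index.
X_batch = batch_data.drop(columns=['date', 'city_name', 'unix_time'])
X_batch['city_name_encoded'] = encoded

# Separate the target so that only the model's input features remain
y_batch = X_batch.pop('pm2_5')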
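The encoding step can be pictured with a plain-Python sketch. It assumes `retrieved_encoder` is a scikit-learn `LabelEncoder` (the file name `label_encoder.pkl` suggests this, but the notebook does not confirm it), which maps each distinct label to its position in the sorted list of classes seen during fitting. The helper names and city names below are hypothetical.

```python
def fit_label_mapping(labels):
    """Map each distinct label to its index in the sorted list of labels."""
    classes = sorted(set(labels))
    return {label: index for index, label in enumerate(classes)}


def encode(labels, mapping):
    """Replace each label with its integer code; unseen labels raise KeyError."""
    return [mapping[label] for label in labels]


# Hypothetical city names seen at training time
training_cities = ["Paris", "Berlin", "Paris", "Amsterdam"]
mapping = fit_label_mapping(training_cities)

# Encoding a batch reuses the mapping learned during training
encoded_batch = encode(["Paris", "Berlin", "Paris", "Amsterdam"], mapping)
print(mapping)
print(encoded_batch)
```

Because the codes depend on the classes seen during fitting, the batch must be encoded with the encoder saved alongside the model rather than a newly fitted one. A city that was absent at training time cannot be encoded.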

In [None]:
predictions = retrieved_xgboost_model.predict(X_batch)
predictions[:5]
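Since `y_batch` holds the observed `pm2_5` values for the same rows, the predictions can be compared against them. The following is a minimal sketch of a root-mean-square error calculation using only the standard library; the `rmse` helper and the sample numbers are hypothetical and are not part of the original notebook.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length, non-empty sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("inputs must not be empty")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical observed and predicted PM2.5 values
observed = [10.0, 12.0, 8.0, 15.0]
predicted = [11.0, 12.0, 6.0, 15.0]

score = rmse(observed, predicted)
print(round(score, 3))
```

In the notebook this could be called as `rmse(y_batch.tolist(), predictions.tolist())`, assuming the batch rows contain no missing `pm2_5` values.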

---
## <span style="color:#ff5f27;">👾 Now try out the Streamlit App!</span>

In [None]:
# Install dependencies for the Streamlit app
!pip3 install geopy streamlit streamlit-folium folium --quiet

In [None]:
!python3 -m streamlit run streamlit_app.py

---

### <span style="color:#ff5f27;">🥳 <b> Next Steps  </b> </span>
Congratulations! You've now completed the Air Quality tutorial for Managed Hopsworks.

Check out our other tutorials on ➡ https://github.com/logicalclocks/hopsworks-tutorials

Or documentation at ➡ https://docs.hopsworks.ai