## Inference Pipeline

In [1]:
%load_ext autoreload
%autoreload 2

In [9]:
from datetime import datetime, timedelta
import pytz
import pandas as pd

current_date = pd.to_datetime(datetime.now(pytz.utc)).floor('h')
print(f'{current_date=}')

current_date=Timestamp('2024-10-15 20:00:00+0000', tz='UTC')


The next cell fetches a batch of raw features, just as we did in the **frontend.py** app: it downloads the rides from the past 28 days. As in the frontend app, this function fails unless the most recent hour of data is already in the feature store. Because the feature pipeline runs on GitHub Actions, those features are typically not available for roughly the first 15-25 minutes of each hour. This is why this notebook should only be run after the feature_pipeline has completed.

In [11]:
from src.inference import load_batch_of_features_from_store

features = load_batch_of_features_from_store(current_date)

Connection closed.
Connected. Call `.close()` to terminate connection gracefully.
Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/1049751
Connected. Call `.close()` to terminate connection gracefully.
Fetching rides from 2024-09-17 20:00:00+00:00 to 2024-10-15 19:00:00+00:00
Finished: Reading data from Hopsworks, using Hopsworks Feature Query Service (8.31s) 
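
The window boundaries in the log above are consistent with a start 28 days before `current_date` and an end one hour before it. Here is a minimal sketch of that arithmetic, assuming this is how the window is derived; the body of `load_batch_of_features_from_store` is not shown in this notebook, and `fetch_window` is a hypothetical helper.

```python
from datetime import timedelta

import pandas as pd


def fetch_window(current_date: pd.Timestamp, days: int = 28):
    """Return (start, end) of the feature batch: `days` back, up to one hour ago.

    Hypothetical helper; the real function in src.inference may differ.
    """
    fetch_from = current_date - timedelta(days=days)
    fetch_to = current_date - timedelta(hours=1)
    return fetch_from, fetch_to


current_date = pd.Timestamp("2024-10-15 20:00:00", tz="UTC")
fetch_from, fetch_to = fetch_window(current_date)

# Number of hourly observations per location, counting both endpoints
n_hours = int((fetch_to - fetch_from) / pd.Timedelta(hours=1)) + 1

print(f"Fetching rides from {fetch_from} to {fetch_to}")
print(f"{n_hours=}")
```

With the timestamp used in this run, the window spans 672 hourly values (28 days times 24 hours) for each pickup location.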


##### Load the Model

Next, we connect to the project and its model registry, download the model to a local file, load it, and use it to generate predictions from the new features.

The **get_model_predictions** function returns a dataframe with the pickup_location_id and the predicted demand, rounded to the nearest integer. Because the features cover the hourly rides over the last 28 days up to one hour ago, the predictions are the expected number of rides for the current hour.
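
To make the shape of that result concrete, here is a rough sketch of a function with the same contract. It is not the code in `src.inference`: the stand-in model, the `rides_previous_*` column names, and the assumption that every column except `pickup_location_id` is a model input are all hypothetical.

```python
import numpy as np
import pandas as pd


class MeanOfLagsModel:
    """Hypothetical stand-in for the trained model: averages the lag columns."""

    def predict(self, X: pd.DataFrame) -> np.ndarray:
        return X.mean(axis=1).to_numpy()


def sketch_get_model_predictions(model, features: pd.DataFrame) -> pd.DataFrame:
    # Assumption: every column except pickup_location_id is a model input
    X = features.drop(columns=["pickup_location_id"])
    raw = model.predict(X)
    return pd.DataFrame({
        "pickup_location_id": features["pickup_location_id"].to_numpy(),
        # Round to the nearest whole ride, kept as float like the output below
        "predicted_demand": np.round(raw, 0),
    })


# Toy features with made-up column names and values
features = pd.DataFrame({
    "pickup_location_id": [4, 263],
    "rides_previous_2_hour": [1.0, 100.0],
    "rides_previous_1_hour": [2.4, 112.0],
})
predictions = sketch_get_model_predictions(MeanOfLagsModel(), features)
print(predictions)
```

Any object exposing a scikit-learn style `predict` method could be passed in place of the stand-in.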

In [12]:
from src.inference import (
    load_model_from_registry,
    get_model_predictions
)

model = load_model_from_registry()
predictions = get_model_predictions(model=model, features=features)

Connection closed.
Connected. Call `.close()` to terminate connection gracefully.
Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/1049751
Connected. Call `.close()` to terminate connection gracefully.


In [13]:
predictions

| | pickup_location_id | predicted_demand |
|---|---|---|
| 0 | 1 | 0.0 |
| 1 | 2 | 0.0 |
| 2 | 3 | 0.0 |
| 3 | 4 | 2.0 |
| 4 | 5 | 0.0 |
| ... | ... | ... |
| 260 | 261 | 43.0 |
| 261 | 262 | 51.0 |
| 262 | 263 | 106.0 |
| 263 | 264 | 58.0 |


These predictions will be stored in their own feature group. The next steps closely follow what we did in the feature_pipeline: we add a pickup_hour column and a pickup_ts column (Hopsworks does not like datetime columns, so we also store the timestamp as an integer number of milliseconds since the Unix epoch), then create the feature group and write the data to it.

In [14]:
predictions['pickup_hour'] = current_date
predictions['pickup_ts'] = predictions['pickup_hour'].astype('int64') // 10**6

In [15]:
predictions

| | pickup_location_id | predicted_demand | pickup_hour | pickup_ts |
|---|---|---|---|---|
| 0 | 1 | 0.0 | 2024-10-15 20:00:00+00:00 | 1729022400000 |
| 1 | 2 | 0.0 | 2024-10-15 20:00:00+00:00 | 1729022400000 |
| 2 | 3 | 0.0 | 2024-10-15 20:00:00+00:00 | 1729022400000 |
| 3 | 4 | 2.0 | 2024-10-15 20:00:00+00:00 | 1729022400000 |
| 4 | 5 | 0.0 | 2024-10-15 20:00:00+00:00 | 1729022400000 |
| ... | ... | ... | ... | ... |
| 260 | 261 | 43.0 | 2024-10-15 20:00:00+00:00 | 1729022400000 |
| 261 | 262 | 51.0 | 2024-10-15 20:00:00+00:00 | 1729022400000 |
| 262 | 263 | 106.0 | 2024-10-15 20:00:00+00:00 | 1729022400000 |
| 263 | 264 | 58.0 | 2024-10-15 20:00:00+00:00 | 1729022400000 |
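
The `astype('int64') // 10**6` conversion relies on the column being stored with nanosecond resolution. As an illustration only, the sketch below computes the same epoch-millisecond value without depending on the underlying resolution, and converts it back to a timestamp; `to_epoch_ms` is a hypothetical helper that is not part of this project.

```python
import pandas as pd


def to_epoch_ms(timestamps: pd.Series) -> pd.Series:
    """Convert a tz-aware datetime Series to integer milliseconds since the Unix epoch."""
    epoch = pd.Timestamp("1970-01-01", tz="UTC")
    # Timedelta floor-division yields plain integers, whatever the stored resolution
    return (timestamps - epoch) // pd.Timedelta(milliseconds=1)


hours = pd.Series(pd.to_datetime(["2024-10-15 20:00:00"], utc=True))
pickup_ts = to_epoch_ms(hours)

# Round trip: integer milliseconds back to a tz-aware timestamp
restored = pd.to_datetime(pickup_ts, unit="ms", utc=True)

print(pickup_ts.iloc[0])  # 1729022400000, the value shown in the table above
print(restored.iloc[0])
```

The round trip is handy when reading the predictions back later, since `pickup_ts` is what the feature group uses as its event time.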


In [16]:
from src.inference import get_feature_store
import src.config as config

feature_store = get_feature_store()

feature_group = feature_store.get_or_create_feature_group(
    name=config.FEATURE_GROUP_MODEL_PREDICTIONS,
    version=1,
    description='Predictions generated by model',
    primary_key=['pickup_location_id', 'pickup_ts'],
    event_time='pickup_ts'
)

Connection closed.
Connected. Call `.close()` to terminate connection gracefully.
Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/1049751
Connected. Call `.close()` to terminate connection gracefully.


In [17]:
# write the data to the new feature group
feature_group.insert(predictions, write_options={'wait_for_job': True})

Feature Group created successfully, explore it at 
https://c.app.hopsworks.ai:443/p/1049751/fs/1041478/fg/1284432


Uploading Dataframe: 0.00% |          | Rows 0/265 | Elapsed Time: 00:00 | Remaining Time: ?

Launching job: model_predictions_feature_group_1_offline_fg_materialization
Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/1049751/jobs/named/model_predictions_feature_group_1_offline_fg_materialization/executions


(<hsfs.core.job.Job at 0x18bcdcedd20>, None)