### Model Training Pipeline

This notebook will be very similar to notebook **10_lightgbm_with_hyperparameter_tuning**. The major difference is that we will be reading the data from the feature store instead of a local parquet file. We will need the config.py variables so we will start by loading that in.

In [22]:
import warnings
warnings.filterwarnings('ignore')

In [23]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [24]:
# import config file for HOPSWORKS vars
import src.config as config

Connect to the project and get our feature store. Since we have already created the feature store, we do not need to include the description, primary keys, and event time keys as we previously did.

In [25]:
import hopsworks

# connect to our project
project = hopsworks.login(
    project=config.HOPSWORKS_PROJECT_NAME,
    api_key_value=config.HOPSWORKS_API_KEY
)

# get feature store
feature_store = project.get_feature_store()

# get feature group
feature_group = feature_store.get_or_create_feature_group(
    name = config.FEATURE_GROUP_NAME,
    version=config.FEATURE_GROUP_VERSION
)

Connection closed.
Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/1049751
Connected. Call `.close()` to terminate connection gracefully.


##### Feature View

Next, we create the feature view. We code this so that we can create the feature view if it is not set up yet and if it is set up, we just get it.

A feature view is a set of features that come from one or more feature groups. In the feature view we can join features from multiple feature groups into a final dataset. A feature view is metadata which basically means "data about data" that organizes and manages the data we give it.

In [26]:
# create feature view if it is not already set up
try:
    feature_store.create_feature_view(
        name = config.FEATURE_VIEW_NAME,
        version=2,
        # grab every variable in our feature group - we only have one feature group
        query=feature_group.select_all()
    )
except:
    print('Feature view already created.')

# get feature view whether it was created before, or now
feature_view = feature_store.get_feature_view(
    name=config.FEATURE_VIEW_NAME,
    version=2
)

Feature view already created.


##### Create Training Data

From our feature view we will call the **training_data** method. This method will give us back two variables. The first variable is the actual time series training data which is taken from our feature_view. The second variable is more metadata about the data that include things such as names, datatypes, and statistics. Since we wont be using them we use **_** which tells python to ignore this information so that we only receive the training data.

In [27]:
ts_data, _ = feature_view.training_data(
    description='Hourly time series data',
)

Finished: Reading data from Hopsworks, using Hopsworks Feature Query Service (23.40s) 


In [28]:
# should be time series so we will have to convert this
ts_data.drop(columns = 'pickup_ts')

# sort by `pickup_location_id` and `pickup_hour`
ts_data.sort_values(by=['pickup_location_id', 'pickup_hour'], inplace=True)
ts_data

Unnamed: 0,pickup_hour,rides,pickup_location_id,pickup_ts
4925621,2022-01-01 00:00:00+00:00,0,1,1.640995e+12
310700,2022-01-01 01:00:00+00:00,0,1,1.640999e+12
854840,2022-01-01 02:00:00+00:00,0,1,1.641002e+12
3642105,2022-01-01 03:00:00+00:00,0,1,1.641006e+12
193219,2022-01-01 04:00:00+00:00,1,1,1.641010e+12
...,...,...,...,...
3090440,2024-10-02 14:00:00+00:00,1,265,1.727878e+12
3090644,2024-10-02 15:00:00+00:00,4,265,1.727881e+12
3090978,2024-10-02 16:00:00+00:00,5,265,1.727885e+12
3091380,2024-10-02 17:00:00+00:00,8,265,1.727888e+12


In [29]:
# sort this data 
ts_data.sort_values(by = ['pickup_location_id', 'pickup_hour'], inplace = True)
ts_data

Unnamed: 0,pickup_hour,rides,pickup_location_id,pickup_ts
4925621,2022-01-01 00:00:00+00:00,0,1,1.640995e+12
310700,2022-01-01 01:00:00+00:00,0,1,1.640999e+12
854840,2022-01-01 02:00:00+00:00,0,1,1.641002e+12
3642105,2022-01-01 03:00:00+00:00,0,1,1.641006e+12
193219,2022-01-01 04:00:00+00:00,1,1,1.641010e+12
...,...,...,...,...
3090440,2024-10-02 14:00:00+00:00,1,265,1.727878e+12
3090644,2024-10-02 15:00:00+00:00,4,265,1.727881e+12
3090978,2024-10-02 16:00:00+00:00,5,265,1.727885e+12
3091380,2024-10-02 17:00:00+00:00,8,265,1.727888e+12


In [30]:
from src.plot import plot_ts

# plot time square to make sure we are getting the data we need 
plot_ts(ts_data, locations=[43])

##### Features and Targets

The next step is to create the features and targets for our training data as we did previously.

In [31]:
from src.data import transform_ts_data_into_features_and_target

# call our function to create the features which are the last month of hourly demand
features, targets = transform_ts_data_into_features_and_target(
    ts_data,
    input_seq_len=24*28, # one month
    step_size=23,  # step size of 23 ensures that not every target is for midnight hour
)

features_and_target = features.copy()
features_and_target['target_rides_next_hour'] = targets

print(f'{features_and_target.shape=}')

AssertionError: 

In [12]:
print(f'{features_and_target.shape=}')

features_and_target.shape=(227386, 675)


##### Split into Training and Testing

In [13]:
features_and_target.head()

Unnamed: 0,rides_previous_672_hour,rides_previous_671_hour,rides_previous_670_hour,rides_previous_669_hour,rides_previous_668_hour,rides_previous_667_hour,rides_previous_666_hour,rides_previous_665_hour,rides_previous_664_hour,rides_previous_663_hour,...,rides_previous_7_hour,rides_previous_6_hour,rides_previous_5_hour,rides_previous_4_hour,rides_previous_3_hour,rides_previous_2_hour,rides_previous_1_hour,pickup_hour,pickup_location_id,target_rides_next_hour
0,0.0,0.0,0.0,0.0,1.0,1.0,0.0,2.0,0.0,0.0,...,2.0,0.0,1.0,0.0,0.0,0.0,0.0,2022-01-29 00:00:00+00:00,1,0.0
1,0.0,0.0,0.0,0.0,0.0,0.0,4.0,1.0,2.0,1.0,...,0.0,0.0,0.0,0.0,0.0,1.0,0.0,2022-01-29 23:00:00+00:00,1,0.0
2,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,2.0,...,2.0,2.0,0.0,1.0,2.0,0.0,0.0,2022-01-30 22:00:00+00:00,1,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,1.0,0.0,1.0,2.0,1.0,0.0,1.0,2022-01-31 21:00:00+00:00,1,1.0
4,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,1.0,0.0,0.0,0.0,2022-02-01 20:00:00+00:00,1,0.0


In [14]:
import pandas as pd
# verify that the pickup_hour is in datetime format
features_and_target['pickup_hour'] = pd.to_datetime(features_and_target['pickup_hour'])

In [15]:
features_and_target['pickup_hour'].dtype

datetime64[ns, UTC]

In [16]:
from datetime import date, timedelta
from pytz import timezone
from src.data_split import train_test_split

# training data -> from January 2022 up until 1 month ago
# test data -> last month
cutoff_date = pd.to_datetime(date.today() - timedelta(days=28*3), utc=True)

print(f'{cutoff_date=}')

X_train, y_train, X_test, y_test = train_test_split(
    features_and_target,
    cutoff_date,
    target_column_name='target_rides_next_hour'   
)

print(f'{X_train.shape=}')
print(f'{y_train.shape=}')
print(f'{X_test.shape=}')
print(f'{y_test.shape=}')


cutoff_date=Timestamp('2024-07-09 00:00:00+0000', tz='UTC')
X_train.shape=(205671, 674)
y_train.shape=(205671,)
X_test.shape=(21715, 674)
y_test.shape=(21715,)


##### Model Creation

Here we will use the model that we found to be the best in our earlier exploration which was a LightGBM model. We will use the same process as before by using Optuna to find the best parameters for the model.

In [17]:
import numpy as np
from sklearn.model_selection import KFold, TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_absolute_error
import optuna

from src.model import get_pipeline

def objective(trial: optuna.trial.Trial) -> float:
    """
    Given a set of hyper-parameters, it trains a model and computes an average
    validation error based on a TimeSeriesSplit
    """
    # pick hyper-parameters
    hyperparams = {
        "metric": 'mae',
        "verbose": -1,
        "num_leaves": trial.suggest_int("num_leaves", 2, 256),
        "feature_fraction": trial.suggest_float("feature_fraction", 0.2, 1.0),
        "bagging_fraction": trial.suggest_float("bagging_fraction", 0.2, 1.0),
        "min_child_samples": trial.suggest_int("min_child_samples", 3, 100),   
    }
    
    # sort X_train by `pikup_hour` inplace
    # so the TimeSeriesSplit will split the data in a consistent way
    X_train.sort_values('pickup_hour', inplace=True)

    tss = TimeSeriesSplit(n_splits=2)
    scores = []
    for train_index, val_index in tss.split(X_train):

        # split data for training and validation
        X_train_, X_val_ = X_train.iloc[train_index, :], X_train.iloc[val_index,:]
        y_train_, y_val_ = y_train.iloc[train_index], y_train.iloc[val_index]
        
        # train the model
        pipeline = get_pipeline(**hyperparams)
        pipeline.fit(X_train_, y_train_)
        
        # evaluate the model
        y_pred = pipeline.predict(X_val_)
        mae = mean_absolute_error(y_val_, y_pred)

        scores.append(mae)
   
    # Return the mean score
    return np.array(scores).mean()

In [18]:
# make a study
study = optuna.create_study(direction = 'minimize')
# only use 3 trials so it doesn't take so long
study.optimize(objective, n_trials = 10)


[I 2024-10-01 15:24:36,164] A new study created in memory with name: no-name-5a8ae034-b6ef-476c-9c52-fa002612dfbf


Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 66, 'feature_fraction': 0.32496671891708023, 'bagging_fraction': 0.8966759586438529, 'min_child_samples': 58}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:
Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 66, 'feature_fraction': 0.32496671891708023, 'bagging_fraction': 0.8966759586438529, 'min_child_samples': 58}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:


[I 2024-10-01 15:25:42,599] Trial 0 finished with value: 23.772293270598283 and parameters: {'num_leaves': 66, 'feature_fraction': 0.32496671891708023, 'bagging_fraction': 0.8966759586438529, 'min_child_samples': 58}. Best is trial 0 with value: 23.772293270598283.


Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 229, 'feature_fraction': 0.24410649563746034, 'bagging_fraction': 0.29142011408207147, 'min_child_samples': 57}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:
Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 229, 'feature_fraction': 0.24410649563746034, 'bagging_fraction': 0.29142011408207147, 'min_child_samples': 57}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:


[I 2024-10-01 15:26:59,704] Trial 1 finished with value: 24.08087778636444 and parameters: {'num_leaves': 229, 'feature_fraction': 0.24410649563746034, 'bagging_fraction': 0.29142011408207147, 'min_child_samples': 57}. Best is trial 0 with value: 23.772293270598283.


Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 29, 'feature_fraction': 0.5949205164762663, 'bagging_fraction': 0.8522097028780602, 'min_child_samples': 49}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:
Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 29, 'feature_fraction': 0.5949205164762663, 'bagging_fraction': 0.8522097028780602, 'min_child_samples': 49}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:


[I 2024-10-01 15:28:09,709] Trial 2 finished with value: 23.535971860226994 and parameters: {'num_leaves': 29, 'feature_fraction': 0.5949205164762663, 'bagging_fraction': 0.8522097028780602, 'min_child_samples': 49}. Best is trial 2 with value: 23.535971860226994.


Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 120, 'feature_fraction': 0.6817020353921532, 'bagging_fraction': 0.7639464435658025, 'min_child_samples': 12}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:
Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 120, 'feature_fraction': 0.6817020353921532, 'bagging_fraction': 0.7639464435658025, 'min_child_samples': 12}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:


[I 2024-10-01 15:29:21,384] Trial 3 finished with value: 24.131529439827535 and parameters: {'num_leaves': 120, 'feature_fraction': 0.6817020353921532, 'bagging_fraction': 0.7639464435658025, 'min_child_samples': 12}. Best is trial 2 with value: 23.535971860226994.


Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 220, 'feature_fraction': 0.41638312670580935, 'bagging_fraction': 0.6463413993475755, 'min_child_samples': 20}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:
Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 220, 'feature_fraction': 0.41638312670580935, 'bagging_fraction': 0.6463413993475755, 'min_child_samples': 20}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:


[I 2024-10-01 15:30:55,970] Trial 4 finished with value: 24.403585464176793 and parameters: {'num_leaves': 220, 'feature_fraction': 0.41638312670580935, 'bagging_fraction': 0.6463413993475755, 'min_child_samples': 20}. Best is trial 2 with value: 23.535971860226994.


Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 189, 'feature_fraction': 0.7091432608455286, 'bagging_fraction': 0.48746197605241903, 'min_child_samples': 71}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:
Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 189, 'feature_fraction': 0.7091432608455286, 'bagging_fraction': 0.48746197605241903, 'min_child_samples': 71}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:


[I 2024-10-01 15:32:29,278] Trial 5 finished with value: 24.462503058506385 and parameters: {'num_leaves': 189, 'feature_fraction': 0.7091432608455286, 'bagging_fraction': 0.48746197605241903, 'min_child_samples': 71}. Best is trial 2 with value: 23.535971860226994.


Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 112, 'feature_fraction': 0.6409753647249099, 'bagging_fraction': 0.6969873082994154, 'min_child_samples': 65}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:
Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 112, 'feature_fraction': 0.6409753647249099, 'bagging_fraction': 0.6969873082994154, 'min_child_samples': 65}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:


[I 2024-10-01 15:33:44,175] Trial 6 finished with value: 24.052277867512302 and parameters: {'num_leaves': 112, 'feature_fraction': 0.6409753647249099, 'bagging_fraction': 0.6969873082994154, 'min_child_samples': 65}. Best is trial 2 with value: 23.535971860226994.


Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 153, 'feature_fraction': 0.3634695176293353, 'bagging_fraction': 0.20268874449696794, 'min_child_samples': 64}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:
Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 153, 'feature_fraction': 0.3634695176293353, 'bagging_fraction': 0.20268874449696794, 'min_child_samples': 64}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:


[I 2024-10-01 15:35:02,767] Trial 7 finished with value: 24.128790731625298 and parameters: {'num_leaves': 153, 'feature_fraction': 0.3634695176293353, 'bagging_fraction': 0.20268874449696794, 'min_child_samples': 64}. Best is trial 2 with value: 23.535971860226994.


Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 88, 'feature_fraction': 0.2966810086295178, 'bagging_fraction': 0.3246404180447704, 'min_child_samples': 51}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:
Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 88, 'feature_fraction': 0.2966810086295178, 'bagging_fraction': 0.3246404180447704, 'min_child_samples': 51}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:


[I 2024-10-01 15:36:04,686] Trial 8 finished with value: 23.79751875070458 and parameters: {'num_leaves': 88, 'feature_fraction': 0.2966810086295178, 'bagging_fraction': 0.3246404180447704, 'min_child_samples': 51}. Best is trial 2 with value: 23.535971860226994.


Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 44, 'feature_fraction': 0.42007906439095327, 'bagging_fraction': 0.4064473842616709, 'min_child_samples': 17}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:
Creating pipeline with hyperparameters: {'metric': 'mae', 'verbose': -1, 'num_leaves': 44, 'feature_fraction': 0.42007906439095327, 'bagging_fraction': 0.4064473842616709, 'min_child_samples': 17}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:


[I 2024-10-01 15:37:07,103] Trial 9 finished with value: 23.591488152909946 and parameters: {'num_leaves': 44, 'feature_fraction': 0.42007906439095327, 'bagging_fraction': 0.4064473842616709, 'min_child_samples': 17}. Best is trial 2 with value: 23.535971860226994.


In [19]:
# get the best parameters
best_params = study.best_trial.params
print(f'{best_params=}')

best_params={'num_leaves': 29, 'feature_fraction': 0.5949205164762663, 'bagging_fraction': 0.8522097028780602, 'min_child_samples': 49}


##### Train the Entire Training Data on Best Params

In [20]:
# create pipeline with the best parameters
pipeline = get_pipeline(**best_params)

# fit 
pipeline.fit(X_train, y_train)


Creating pipeline with hyperparameters: {'num_leaves': 29, 'feature_fraction': 0.5949205164762663, 'bagging_fraction': 0.8522097028780602, 'min_child_samples': 49}
Added feature transformer for average rides
Added temporal features engineer
Pipeline created:


In [21]:
# mean absolute error
predictions = pipeline.predict(X_test)
mae = mean_absolute_error(y_test, predictions)
print(f'{mae}')

22.219706686061432


##### MAE Per Location

We can investigate the MAE per location or time to see why the model is more or less accurate for a certain location or time. Chances are, the model will be less accurate during times or locations with higher variance.

In [22]:
predictions

array([18.28735298, 13.33808174, 13.42265312, ..., 21.20848261,
       22.03356164, 24.40409239])

In [23]:
segmented_mae = X_test.copy()
segmented_mae = segmented_mae[['pickup_location_id']]

In [24]:
segmented_mae

Unnamed: 0,pickup_location_id
0,1
1,1
2,1
3,1
4,1
...,...
21710,265
21711,265
21712,265
21713,265


In [25]:
# make predictions a series and concat it
mae_pred = pd.Series(predictions, index = segmented_mae.index, name = 'predictions')

In [26]:
mae_true = pd.Series(y_test, index = segmented_mae.index, name='y_true')

In [27]:
result = pd.concat([segmented_mae, mae_pred, mae_true], axis = 1)

In [28]:
result

Unnamed: 0,pickup_location_id,predictions,y_true
0,1,18.287353,2.0
1,1,13.338082,0.0
2,1,13.422653,0.0
3,1,5.809448,0.0
4,1,7.810569,0.0
...,...,...,...
21710,265,17.700605,2.0
21711,265,11.577626,4.0
21712,265,21.208483,2.0
21713,265,22.033562,2.0


In [29]:
def calculate_segmented_mae(data, segment_col, y_true_col='y_true', y_pred_col='predictions'):
    segments = data[segment_col].unique()
    mae_dict = {}
    for segment in segments:
        segment_mask = data[segment_col] == segment
        y_true_segment = data.loc[segment_mask, y_true_col]
        y_pred_segment = data.loc[segment_mask, y_pred_col]
        segment_mae = mean_absolute_error(y_true_segment, y_pred_segment)
        mae_dict[segment] = segment_mae
    return mae_dict

mae_dict = calculate_segmented_mae(result, 'pickup_location_id')

In [30]:
# Sort the dictionary by value
sorted_mae_dict = {k: v for k, v in sorted(mae_dict.items(), key=lambda item: item[1])}

# Display the sorted dictionary
print(sorted_mae_dict)

{224: 6.846083182851098, 232: 7.57695974352009, 41: 7.781794746305517, 42: 8.0301527574074, 265: 8.685225750533418, 24: 8.828517501083043, 74: 8.861854819276662, 65: 8.963221671114727, 226: 9.074887168835252, 244: 9.28761551796431, 45: 9.323091273412716, 116: 9.467529696152893, 145: 9.576023930131782, 209: 9.637103206283127, 7: 9.669713029493046, 152: 9.742126911687109, 33: 9.746090990656697, 146: 10.083782354924347, 25: 10.573363414792537, 243: 10.686470527403454, 223: 10.74482284295302, 193: 10.752248528554391, 88: 10.873492788752587, 260: 10.961072195982007, 66: 11.03219513156947, 97: 11.081316471273425, 216: 11.144517106983903, 52: 11.253373653284433, 10: 11.272489867420047, 225: 11.452407027995113, 49: 11.45880798408948, 228: 11.57108583092901, 179: 11.574007044265015, 247: 11.582076955590262, 1: 11.584797815870902, 61: 11.599208539952054, 40: 11.632555807951826, 189: 11.63358827492956, 130: 11.641953037997705, 168: 11.642886105301033, 188: 11.693286884781076, 256: 11.720547555335

In [31]:
plot_ts(ts_data, locations=[132])

##### Save the Model to Models Directory

In [32]:
import joblib
from src.paths import MODELS_DIR

# save the model to this directory
joblib.dump(pipeline, MODELS_DIR / 'model.pkl')

['C:\\Users\\ryans\\taxi_demand_predictor\\models\\model.pkl']

##### Hopsworks Model Registry

Now we define the format for input and output data for Hopsworks.

In [33]:
from hsml.schema import Schema
from hsml.model_schema import ModelSchema

# input format
input_schema = Schema(X_train)
output_schema = Schema(y_train)
model_schema = ModelSchema(input_schema=input_schema, output_schema=output_schema)

In [34]:
# model_registry
model_registry = project.get_model_registry()

model = model_registry.sklearn.create_model(
    name="taxi_demand_predictor_next_hour",
    metrics={"test_mae": mae},
    description="LightGBM regressor with a bit of hyper-parameter tuning",
    input_example=X_train.sample(),
    model_schema=model_schema
)


Connected. Call `.close()` to terminate connection gracefully.


We have to convert our path to a string in order to save the model.

In [35]:
model.save(str(MODELS_DIR / 'model.pkl'))

  0%|          | 0/6 [00:00<?, ?it/s]

Uploading: 0.000%|          | 0/332220 elapsed<00:00 remaining<?

Uploading: 0.000%|          | 0/4468 elapsed<00:00 remaining<?

Uploading: 0.000%|          | 0/60849 elapsed<00:00 remaining<?

Model created, explore it at https://c.app.hopsworks.ai:443/p/1049751/models/taxi_demand_predictor_next_hour/2


Model(name: 'taxi_demand_predictor_next_hour', version: 2)