# <span style="font-weight:bold; font-size: 3rem; color:#1EB182;">**Hopsworks Feature Store** </span><span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 02: Feature Pipeline</span>

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/advanced_tutorials/nyc_taxi_fares/2_feature_pipeline.ipynb)

## 🗒️ This notebook is divided into 2 sections:
1. **Data generation**
2. **Inserting new data into the Feature Store**

### <span style='color:#ff5f27'> 📝 Imports</span>

In [1]:
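import os
import time
from datetime import datetime

import pandas as pd

# Import only the helpers this notebook calls, instead of a wildcard import
from functions import (
    calculate_datetime_features,
    calculate_distance_features,
    generate_fares_data,
    generate_rides_data,
)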

___

## <span style="color:#ff5f27;"> 🪄 Generating new data</span>

### <span style='color:#ff5f27'> 🚖 Rides Data</span>

In [2]:
df_rides = generate_rides_data(150)

df_rides

| | ride_id | pickup_datetime | pickup_longitude | dropoff_longitude | pickup_latitude | dropoff_latitude | passenger_count | taxi_id | driver_id |
|---|---|---|---|---|---|---|---|---|---|
| 0 | b62b6b5e0c1bc7c38203d49f87aa038b | 1607249100000 | -74.46605 | -74.35951 | 41.08324 | 41.33374 | 1 | 89 | 38 |
| 1 | 0aca1f99442965e975b1caf8e2de3499 | 1592438200000 | -74.03295 | -73.19762 | 40.71218 | 41.03299 | 1 | 169 | 172 |
| 2 | e5c13e261fe0588934d959c9b28ce624 | 1607241700000 | -73.60920 | -73.54689 | 41.46554 | 40.80531 | 4 | 42 | 142 |
| 3 | e0c9e7899e8b96880ae7b23ed0666b09 | 1590086900000 | -73.51136 | -73.05219 | 41.32622 | 41.05065 | 2 | 5 | 101 |
| 4 | f39983333f042baeac963a22b54857b9 | 1602438300000 | -74.38961 | -74.27410 | 40.78734 | 41.48349 | 3 | 152 | 79 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 145 | 6376ae17d85d34e277ebaa45bdc85457 | 1608158400000 | -73.56905 | -73.75968 | 40.86125 | 41.34922 | 3 | 134 | 76 |
| 146 | cbdd9779c63df86c24f3c6071e9ccb1f | 1581316800000 | -73.88083 | -73.25228 | 41.15133 | 41.22743 | 3 | 191 | 173 |
| 147 | 5b7c928f230cc78160569e6e6893d985 | 1583733300000 | -72.93300 | -73.05207 | 40.84738 | 41.12884 | 4 | 199 | 77 |
| 148 | f5ddf5bc7f4ce2829bc0948765a99d6c | 1598523700000 | -73.99126 | -73.38686 | 41.77799 | 41.78662 | 1 | 186 | 66 |


In [3]:
df_rides = calculate_distance_features(df_rides)
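
The body of `calculate_distance_features` lives in `functions.py` and is not shown in this notebook. As a rough illustration of the kind of feature such a step might derive from the pickup and dropoff coordinates, here is a minimal sketch of a haversine (great-circle) distance. The helper name `haversine_km` and the column `distance_km` are hypothetical and may not match what the tutorial's function actually produces.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


# Two rows copied from the generated rides shown above
sample = pd.DataFrame({
    "pickup_latitude": [41.08324, 40.71218],
    "pickup_longitude": [-74.46605, -74.03295],
    "dropoff_latitude": [41.33374, 41.03299],
    "dropoff_longitude": [-74.35951, -73.19762],
})

# Hypothetical output column; the tutorial's own column names may differ
sample["distance_km"] = haversine_km(
    sample["pickup_latitude"],
    sample["pickup_longitude"],
    sample["dropoff_latitude"],
    sample["dropoff_longitude"],
)
```

The function is vectorised, so it accepts either scalars or whole pandas columns.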

In [4]:
df_rides = calculate_datetime_features(df_rides)
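
`calculate_datetime_features` is also defined in `functions.py`. In the output above, `pickup_datetime` looks like a Unix timestamp in milliseconds (for example `1607249100000`). Assuming that reading is right, a sketch of how calendar features could be derived from it follows. The helper `add_datetime_features` and the output column names are assumptions for illustration, not the tutorial's implementation.

```python
import pandas as pd


def add_datetime_features(df, ts_col="pickup_datetime"):
    """Return a copy of df with calendar features derived from an epoch-ms column."""
    out = df.copy()
    # Interpret the integers as milliseconds since the Unix epoch, in UTC
    ts = pd.to_datetime(out[ts_col], unit="ms", utc=True)
    out["year"] = ts.dt.year
    out["month"] = ts.dt.month
    out["weekday"] = ts.dt.weekday  # Monday=0 ... Sunday=6
    out["hour"] = ts.dt.hour
    return out


# Two timestamps copied from the generated rides shown above
sample = pd.DataFrame({"pickup_datetime": [1607249100000, 1592438200000]})
sample_features = add_datetime_features(sample)
```

The original timestamp column is kept as-is, and the new columns are added alongside it.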

In [5]:
# Save the newly generated ride_ids.
# They are reused below when generating the fares data.
ride_ids = df_rides.ride_id

In [6]:
for col in ["passenger_count", "taxi_id", "driver_id"]:
    df_rides[col] = df_rides[col].astype("int64")


### <span style='color:#ff5f27'> 💸 Fares Data</span>

In [7]:
df_fares = generate_fares_data(150)

df_fares

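| | total_fare | tolls | taxi_id | driver_id |
|---|---|---|---|---|
| 0 | 57 | 2 | 59 | 43 |
| 1 | 101 | 1 | 182 | 21 |
| 2 | 235 | 5 | 95 | 47 |
| 3 | 205 | 0 | 160 | 186 |
| 4 | 106 | 5 | 145 | 25 |
| ... | ... | ... | ... | ... |
| 145 | 158 | 4 | 182 | 78 |
| 146 | 93 | 0 | 121 | 101 |
| 147 | 177 | 0 | 150 | 80 |
| 148 | 144 | 0 | 37 | 166 |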


In [8]:
df_fares = df_fares.astype("int64")

In [9]:
# Attach the ride_ids created moments ago for rides_fg, so each fare maps to a ride
df_fares["ride_id"] = ride_ids
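
Assigning a pandas Series to a DataFrame column aligns on index labels, not on row position. That works in this notebook because both frames carry the default integer index, but it would silently produce missing values if either frame had been filtered or reindexed. A small sketch with made-up toy frames shows the difference:

```python
import pandas as pd

# Toy data for illustration only: rides has a non-default index
rides = pd.DataFrame({"ride_id": ["a", "b", "c"]}, index=[10, 11, 12])
fares = pd.DataFrame({"total_fare": [57, 101, 235]})  # default index 0, 1, 2

# Label-based alignment: no labels overlap, so every ride_id becomes NaN
aligned = fares.copy()
aligned["ride_id"] = rides["ride_id"]

# Positional assignment: drop the index by converting to a NumPy array
positional = fares.copy()
positional["ride_id"] = rides["ride_id"].to_numpy()
```

When the row order is the thing that matters, `.to_numpy()` (or `reset_index(drop=True)` on both sides) makes that intent explicit.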

In [10]:
for col in ["tolls", "total_fare"]:
    df_fares[col] = df_fares[col].astype("double")
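
The casts above exist because a feature group has a fixed schema, and dtype mismatches are easier to catch before inserting than after. Below is a sketch of a pre-insert check. The helper `check_dtypes` and the `expected` mapping are hypothetical: the mapping is inferred from the casts in this notebook, not read from the feature group itself.

```python
import pandas as pd


def check_dtypes(df, expected):
    """Return {column: (expected, actual)} for missing or mismatched columns."""
    problems = {}
    for col, want in expected.items():
        got = str(df[col].dtype) if col in df.columns else "missing"
        if got != want:
            problems[col] = (want, got)
    return problems


# Toy fares frame mirroring the casts applied above
toy_fares = pd.DataFrame({
    "total_fare": [57, 101],
    "tolls": [2, 1],
    "taxi_id": [59, 182],
    "driver_id": [43, 21],
}).astype("int64")
for col in ["tolls", "total_fare"]:
    toy_fares[col] = toy_fares[col].astype("double")  # "double" resolves to float64

# Assumed schema, inferred from the notebook's casts
expected = {
    "total_fare": "float64",
    "tolls": "float64",
    "taxi_id": "int64",
    "driver_id": "int64",
}

problems = check_dtypes(toy_fares, expected)
```

An empty `problems` dict means the frame matches the assumed schema. A non-empty one lists each offending column with the expected and actual dtype.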

___

## <span style="color:#ff5f27;"> 📡 Connecting to Hopsworks Feature Store </span>

In [11]:
import hopsworks

project = hopsworks.login()

fs = project.get_feature_store() 

Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/164





In [12]:
rides_fg = fs.get_or_create_feature_group(
    name="nyc_taxi_rides",
    version=1,
)

fares_fg = fs.get_or_create_feature_group(
    name="nyc_taxi_fares",
    version=1,
)

---

## <span style="color:#ff5f27;">⬆️ Uploading new data to the Feature Store</span>

In [13]:
rides_fg.insert(df_rides)

Uploading Dataframe: 0.00% |          | Rows 0/150 | Elapsed Time: 00:00 | Remaining Time: ?

Launching offline feature group backfill job...
Backfill Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/164/jobs/named/nyc_taxi_rides_1_offline_fg_backfill/executions


(<hsfs.core.job.Job at 0x247194effa0>, None)

In [14]:
fares_fg.insert(df_fares)

Uploading Dataframe: 0.00% |          | Rows 0/150 | Elapsed Time: 00:00 | Remaining Time: ?

Launching offline feature group backfill job...
Backfill Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/164/jobs/named/nyc_taxi_fares_1_offline_fg_backfill/executions


(<hsfs.core.job.Job at 0x2471945a1c0>, None)

---

## <span style="color:#ff5f27;">⏭️ **Next:** Part 03 </span>

In the next notebook, you will create a feature view and training dataset.