# <span style="font-weight:bold; font-size: 3rem; color:#1EB182;"><img src="images/icon102.png" width="38px"></img> **Hopsworks Feature Store** </span><span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 01: Backfill Features to the Feature Store</span>


## 🗒️ This notebook is divided into 3 sections:
1. Load the historical data.
2. Connect to the Hopsworks Feature Store.
3. Create feature groups and insert them into the Feature Store.

![tutorial-flow](images/01_featuregroups.png)

## <span style='color:#ff5f27'> 📝 Imports</span>

In [1]:
import pandas as pd

# Local helper module; only timestamp_2_time is used in this notebook.
from features import timestamp_2_time

## <span style='color:#ff5f27'> 💽 Loading Historical Data</span>


#### <span style='color:#ff5f27'> 👩🏻‍🔬 Air Quality Data</span>

In [2]:
df_air_quality = pd.read_csv('data/air_quality.csv')

df_air_quality.head()

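| | city | aqi | date | iaqi_h | iaqi_p | iaqi_pm10 | iaqi_t | o3_avg | o3_max | o3_min | pm10_avg | pm10_max | pm10_min | pm25_avg | pm25_max | pm25_min | uvi_avg | uvi_max | uvi_min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Sundsvall | 11 | 2022-09-06 | 34.3 | 1020.5 | 10 | 15.8 | 19 | 25 | 13 | 3 | 4 | 2 | 7 | 8 | 6 | 0 | 0 | 0 |
| 1 | Kyiv | 6 | 2022-09-06 | 98.78 | 1022.7 | 2 | 13.91 | 19 | 27 | 11 | 7 | 11 | 4 | 23 | 39 | 14 | 0 | 0 | 0 |
| 2 | Stockholm | 13 | 2022-09-06 | 59.0 | 1021.0 | 13 | 17.0 | 17 | 25 | 6 | 6 | 12 | 3 | 14 | 30 | 9 | 0 | 0 | 0 |
| 3 | Malmo | 23 | 2022-09-06 | 54.5 | 1019.0 | 4 | 17.3 | 27 | 32 | 24 | 5 | 5 | 3 | 10 | 11 | 8 | 1 | 1 | 1 |
| 4 | Stockholm | 1 | 2022-09-07 | 46.0 | 1022.0 | 1 | 16.0 | 18 | 26 | 8 | 7 | 10 | 4 | 16 | 21 | 8 | 1 | 1 | 1 |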


In [3]:
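# Convert the date strings to Unix timestamps in milliseconds.
df_air_quality['date'] = df_air_quality['date'].apply(timestamp_2_time)
# Sort by city and date, then expose the fresh 0..n-1 row index
# as an 'index' column (used later as the primary key).
df_air_quality.sort_values(by=['city', 'date'], inplace=True, ignore_index=True)
df_air_quality.reset_index(inplace=True)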

df_air_quality.head()

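| | index | city | aqi | date | iaqi_h | iaqi_p | iaqi_pm10 | iaqi_t | o3_avg | o3_max | o3_min | pm10_avg | pm10_max | pm10_min | pm25_avg | pm25_max | pm25_min | uvi_avg | uvi_max | uvi_min |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Kyiv | 6 | 1662411600000 | 98.78 | 1022.7 | 2 | 13.91 | 19 | 27 | 11 | 7 | 11 | 4 | 23 | 39 | 14 | 0 | 0 | 0 |
| 1 | 1 | Kyiv | 2 | 1662498000000 | 65.58 | 1020.0 | 1 | 21.02 | 22 | 32 | 10 | 9 | 12 | 5 | 29 | 43 | 15 | 0 | 0 | 0 |
| 2 | 2 | Kyiv | 4 | 1662584400000 | 99.9 | 1017.6 | 2 | 12.45 | 22 | 32 | 10 | 9 | 12 | 5 | 29 | 43 | 15 | 0 | 0 | 0 |
| 3 | 3 | Kyiv | 2 | 1662670800000 | 99.9 | 1021.3 | 1 | 10.0 | 20 | 33 | 6 | 10 | 17 | 5 | 33 | 56 | 16 | 0 | 0 | 0 |
| 4 | 4 | Malmo | 23 | 1662411600000 | 54.5 | 1019.0 | 4 | 17.3 | 27 | 32 | 24 | 5 | 5 | 3 | 10 | 11 | 8 | 1 | 1 | 1 |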
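
`timestamp_2_time` comes from the local `features` module, whose source is not shown in this notebook. Judging by the output above, it maps a `YYYY-MM-DD` string to a Unix timestamp in milliseconds. The following is a minimal sketch of such a conversion, not the original helper: the name `date_to_unix_ms` is hypothetical, and it assumes midnight UTC. The values shown above correspond to 21:00 UTC on the previous day, which suggests the original helper used the local time zone of the machine it ran on, so its results would differ from this sketch by that UTC offset.

```python
from datetime import datetime, timezone


def date_to_unix_ms(date_str: str) -> int:
    """Convert a 'YYYY-MM-DD' string to milliseconds since the Unix epoch.

    Hypothetical stand-in for timestamp_2_time; assumes midnight UTC.
    """
    # Parse the date and pin it to UTC so the result does not depend
    # on the time zone of the machine running the notebook.
    dt = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1000


print(date_to_unix_ms("2022-09-06"))  # 1662422400000
```

Pinning the time zone keeps backfilled and later-ingested rows comparable even when the pipeline runs on machines in different time zones.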


#### <span style='color:#ff5f27'> 🌦 Weather Data</span>

In [4]:
df_weather = pd.read_csv('data/weather.csv')

df_weather.head()

| | city | date | tempmax | tempmin | temp | feelslikemax | feelslikemin | feelslike | dew | humidity | ... | windgust | windspeed | winddir | pressure | cloudcover | visibility | solarradiation | solarenergy | uvindex | conditions |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Kyiv | 2022-09-06 | 17.7 | 4.6 | 11.5 | 17.7 | 4.6 | 11.5 | 1.8 | 55.3 | ... | 24.5 | 9.7 | 267.0 | 1022.3 | 34.8 | 24.1 | 227.5 | 19.6 | 7.0 | Partially cloudy |
| 1 | Sundsvall | 2022-09-06 | 13.0 | 3.0 | 8.6 | 13.0 | 0.1 | 7.4 | 5.5 | 81.8 | ... | 31.0 | 14.3 | 192.9 | 1024.1 | 90.8 | 15.3 | 116.1 | 10.1 | 5.0 | Overcast |
| 2 | Stockholm | 2022-09-06 | 15.9 | 7.8 | 12.0 | 15.9 | 7.1 | 11.8 | 7.0 | 73.6 | ... | 25.2 | 13.0 | 70.7 | 1022.0 | 59.5 | 15.3 | 132.5 | 11.6 | 5.0 | Partially cloudy |
| 3 | Malmo | 2022-09-06 | 20.6 | 12.4 | 15.9 | 20.6 | 12.4 | 15.9 | 9.3 | 66.6 | ... | 44.6 | 23.7 | 97.9 | 1018.4 | 69.0 | 15.2 | 157.3 | 13.7 | 5.0 | Partially cloudy |
| 4 | Stockholm | 2022-09-07 | 15.9 | 7.8 | 12.0 | 15.9 | 7.1 | 11.8 | 7.0 | 73.5 | ... | 25.2 | 13.0 | 69.9 | 1022.0 | 57.5 | 14.7 | 142.5 | 12.5 | 5.0 | Partially cloudy |


In [5]:
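# Convert the date strings to Unix timestamps in milliseconds.
df_weather['date'] = df_weather['date'].apply(timestamp_2_time)
# Sort by city and date, then expose the fresh 0..n-1 row index
# as an 'index' column (used later as the primary key).
df_weather.sort_values(by=['city', 'date'], inplace=True, ignore_index=True)
df_weather.reset_index(inplace=True)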

df_weather.head()

| | index | city | date | tempmax | tempmin | temp | feelslikemax | feelslikemin | feelslike | dew | ... | windgust | windspeed | winddir | pressure | cloudcover | visibility | solarradiation | solarenergy | uvindex | conditions |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Kyiv | 1662411600000 | 17.7 | 4.6 | 11.5 | 17.7 | 4.6 | 11.5 | 1.8 | ... | 24.5 | 9.7 | 267.0 | 1022.3 | 34.8 | 24.1 | 227.5 | 19.6 | 7.0 | Partially cloudy |
| 1 | 1 | Kyiv | 1662498000000 | 17.7 | 4.6 | 11.5 | 17.7 | 4.6 | 11.5 | 1.8 | ... | 24.5 | 9.7 | 267.0 | 1022.3 | 34.8 | 24.1 | 227.5 | 19.6 | 7.0 | Partially cloudy |
| 2 | 2 | Kyiv | 1662584400000 | 21.1 | 7.9 | 14.3 | 21.1 | 7.9 | 14.3 | 3.3 | ... | 24.8 | 9.7 | 132.9 | 1019.4 | 48.4 | 24.1 | 217.9 | 18.7 | 7.0 | Partially cloudy |
| 3 | 3 | Kyiv | 1662670800000 | 17.5 | 6.6 | 12.4 | 17.5 | 5.7 | 12.1 | 4.6 | ... | 36.4 | 14.8 | 87.8 | 1022.5 | 71.8 | 24.1 | 146.3 | 12.6 | 5.0 | Rain, Partially cloudy |
| 4 | 4 | Malmo | 1662411600000 | 20.6 | 12.4 | 15.9 | 20.6 | 12.4 | 15.9 | 9.3 | ... | 44.6 | 23.7 | 97.9 | 1018.4 | 69.0 | 15.2 | 157.3 | 13.7 | 5.0 | Partially cloudy |
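
Both DataFrames now carry an `index` column, which this notebook later declares as the primary key of each feature group. Before uploading, it can be worth confirming that the chosen key really identifies each row. The sketch below shows one way to do that with plain pandas; `check_primary_key` is a hypothetical helper, not part of pandas or Hopsworks, and the toy rows are made up for illustration.

```python
import pandas as pd


def check_primary_key(df: pd.DataFrame, key: list) -> None:
    """Raise ValueError if the key columns are missing, null, or not unique."""
    missing = [col for col in key if col not in df.columns]
    if missing:
        raise ValueError(f"missing key columns: {missing}")
    if df[key].isna().any().any():
        raise ValueError("primary key contains null values")
    if df.duplicated(subset=key).any():
        raise ValueError("primary key is not unique")


# Toy data (illustrative values only), prepared the same way as above.
toy = pd.DataFrame({
    "city": ["Malmo", "Kyiv", "Kyiv"],
    "date": [1662422400000, 1662508800000, 1662422400000],
})
toy.sort_values(by=["city", "date"], inplace=True, ignore_index=True)
toy.reset_index(inplace=True)  # adds an 'index' column holding 0..n-1

check_primary_key(toy, ["index"])         # the surrogate key used in this notebook
check_primary_key(toy, ["city", "date"])  # a natural-key alternative
```

The positional `index` key depends on the sort order of this particular backfill, whereas a natural key such as `city` plus `date` would stay stable if rows are added later; which one fits better depends on how the later parts of the tutorial ingest new data.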


## <span style="color:#ff5f27;"> 🔮 Connecting to Hopsworks Feature Store </span>

In [6]:
import hopsworks

# Log in to Hopsworks and get a handle to the project's feature store.
project = hopsworks.login()

fs = project.get_feature_store()

Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/167
Connected. Call `.close()` to terminate connection gracefully.


## <span style="color:#ff5f27;">🪄 Creating Feature Groups</span>

#### <span style='color:#ff5f27'> 👩🏻‍🔬 Air Quality Data</span>

In [7]:
air_quality_fg = fs.get_or_create_feature_group(
    name='air_quality_fg',
    description='Air Quality characteristics of each day',
    version=1,
    primary_key=['index'],
    online_enabled=True,
    event_time='date',  # name of the event-time feature (a string, not a list)
)

# Upload the DataFrame; this also launches the offline backfill job.
air_quality_fg.insert(df_air_quality)

Feature Group created successfully, explore it at 
https://c.app.hopsworks.ai:443/p/167/fs/109/fg/866


Uploading Dataframe: 0.00% |          | Rows 0/16 | Elapsed Time: 00:00 | Remaining Time: ?

Launching offline feature group backfill job...
Backfill Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/167/jobs/named/air_quality_fg_1_offline_fg_backfill/executions


(<hsfs.core.job.Job at 0x7f793369b430>, None)

#### <span style='color:#ff5f27'> 🌦 Weather Data</span>

In [8]:
weather_fg = fs.get_or_create_feature_group(
    name='weather_fg',
    description='Weather characteristics of each day',
    version=1,
    primary_key=['index'],
    online_enabled=True,
    event_time='date',  # name of the event-time feature (a string, not a list)
)

# Upload the DataFrame; this also launches the offline backfill job.
weather_fg.insert(df_weather)

Feature Group created successfully, explore it at 
https://c.app.hopsworks.ai:443/p/167/fs/109/fg/867


Uploading Dataframe: 0.00% |          | Rows 0/16 | Elapsed Time: 00:00 | Remaining Time: ?

Launching offline feature group backfill job...
Backfill Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/167/jobs/named/weather_fg_1_offline_fg_backfill/executions


(<hsfs.core.job.Job at 0x7f79336a6f40>, None)

---