# <span style="font-weight:bold; font-size: 3rem; color:#1EB182;"><img src="../../images/icon102.png" width="38px"></img> **Hopsworks Feature Store** </span><span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 02: Feature Pipeline</span>

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/advanced_tutorials/{project_name}/{notebook_name}.ipynb)


## 🗒️ This notebook is divided into the following sections:
1. Parse Data
2. Feature Group Insertion

In [None]:
!pip install hopsworks

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
%cd /content/drive/MyDrive/jim

/content/drive/MyDrive/jim


## <span style='color:#ff5f27'> 📝 Imports </span>

In [None]:
import pandas as pd
from datetime import datetime
import time 
import requests

from functions import *

---

## <span style='color:#ff5f27'> 👮🏻‍♂️ API Keys </span>

### Don't forget to create a `.env` configuration file that stores all the necessary environment variables (API keys):
![](images/api_keys_env_file.png)
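As a rough illustration of how such a file can be read, here is a minimal standard-library sketch. The helper `load_env_file` and the variable name `EXAMPLE_WEATHER_API_KEY` are hypothetical and are not part of this tutorial's `functions` module. In practice, the `python-dotenv` package (`from dotenv import load_dotenv`) is a common way to do the same thing.

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path):
    """Parse KEY=VALUE lines from a .env-style file into os.environ.

    Blank lines, comment lines starting with '#', and lines without '='
    are skipped. Variables that are already set are not overwritten.
    """
    loaded = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop optional surrounding quotes
        os.environ.setdefault(key, value)
        loaded[key] = os.environ[key]
    return loaded


# Demo with a throwaway file; the variable name and value are made up.
with tempfile.TemporaryDirectory() as tmp:
    env_path = Path(tmp) / ".env"
    env_path.write_text("# API keys\nEXAMPLE_WEATHER_API_KEY='abc123'\n")
    demo_env = load_env_file(env_path)

print(sorted(demo_env))
```

Keeping the keys in a `.env` file, and out of the notebook itself, means they are not stored with the notebook if it is shared.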

In [None]:
# `datetime` here is the class imported via `from datetime import datetime`
date_today = datetime.today().date()

---

## <span style='color:#ff5f27'> 🧙🏼‍♂️ Parsing Data </span>

In [None]:
cities = ["Beijing"]
city = "beijing"

# The loop variable is named `c` so it does not shadow `city`, which is used below
data_air_quality = [get_air_quality_data(c) for c in cities]

data_weather = get_weather_data_weekly(city, date_today)

In [None]:
print(data_weather)

            date  tempmax  tempmin  temp  feelslikemax  feelslikemin  \
0  1673395200000      8.2      4.0   6.3           6.6           2.0   
0  1673481600000      7.5      4.0   5.9           5.1           0.9   
0  1673568000000      8.6      6.0   7.3           7.3           3.6   
0  1673654400000      8.5      4.9   6.7           5.9           3.4   
0  1673740800000     10.0      1.9   5.1           8.5          -1.3   
0  1673827200000      3.1      0.2   1.8           0.3          -2.8   
0  1673913600000      2.5     -0.9   0.8           0.2          -4.3   

   feelslike  dew  humidity  precip  ...  snow  snowdepth  windspeed  winddir  \
0        4.6  2.9      79.0     9.6  ...   0.0        0.0       15.5    229.5   
0        3.0  1.2      72.5     7.0  ...   0.0        0.0       19.8    234.9   
0        4.7  3.1      75.9     6.7  ...   0.0        0.0       22.3    240.8   
0        4.5  3.6      81.0     8.9  ...   0.0        0.0       22.7    237.0   
0        2.5  2.9 

In [None]:
data_encoder_1(data_weather)

|  | tempmax | tempmin | temp | feelslikemax | feelslikemin | feelslike | dew | humidity | precip | precipprob | ... | snow | snowdepth | windspeed | winddir | cloudcover | visibility | solarradiation | solarenergy | uvindex | conditions |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.476758 | 0.479643 | 0.622159 | 0.572781 | 0.625626 | 0.877436 | 0.667548 | -0.417067 | -0.047268 | 1.706858 | ... | -0.660113 | -0.656892 | -1.316078 | -1.839834 | 0.587703 | -0.113229 | -1.374508 | -1.394978 | -1.020621 | 4 |
| 0 | 0.21719 | 0.479643 | 0.451371 | 0.083822 | 0.24024 | 0.281843 | -0.32543 | -1.625763 | -0.421304 | -0.370819 | ... | -0.660113 | -0.656892 | -0.320753 | -0.207305 | -0.57583 | 1.007351 | 0.329153 | 0.355325 | 0.408248 | 2 |
| 0 | 0.625083 | 1.329644 | 1.049132 | 0.800962 | 1.186187 | 0.914661 | 0.784369 | -0.993522 | -0.464462 | 1.188802 | ... | -0.660113 | -0.656892 | 0.257925 | 1.576383 | 0.490742 | 0.898027 | -0.398843 | -0.289524 | 0.408248 | 4 |
| 0 | 0.588002 | 0.862143 | 0.792948 | 0.3446 | 1.116117 | 0.840212 | 1.076421 | -0.04516 | -0.14797 | -0.888875 | ... | -0.660113 | -0.656892 | 0.350513 | 0.427567 | 1.003251 | 0.952689 | 0.103999 | 0.171082 | -1.020621 | 4 |
| 0 | 1.144219 | -0.412857 | 0.109793 | 1.19213 | -0.530531 | 0.09572 | 0.667548 | 0.977583 | 2.22572 | 0.147237 | ... | 0.746215 | -0.218964 | 1.669898 | 0.790351 | 1.003251 | -1.534454 | -0.954222 | -1.026494 | -1.020621 | 5 |
| 0 | -1.414382 | -1.135358 | -1.299215 | -1.480849 | -1.056057 | -1.318813 | -1.201586 | 0.940392 | -1.270079 | -0.632574 | ... | -0.258305 | 0.656892 | -1.316078 | -0.358465 | -1.933286 | 0.160083 | 1.950258 | 1.921385 | 1.837117 | 1 |
| 0 | -1.636869 | -1.602858 | -1.726187 | -1.513446 | -1.581582 | -1.691059 | -1.66887 | 1.163536 | 0.125364 | -1.15063 | ... | 2.152543 | 2.189639 | 0.674573 | -0.388697 | -0.57583 | -1.370467 | 0.344163 | 0.263203 | 0.408248 | 1 |
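`data_encoder_1` is defined in `functions`, which is not shown here, so what it does exactly is an assumption. The numeric columns in its output look consistent with z-score standardization using the population standard deviation. The sketch below applies that formula to the `tempmax` values from the printed weather data above; the helper name `standardize` is hypothetical.

```python
from statistics import fmean, pstdev

# tempmax values copied from the printed weather data above
tempmax = [8.2, 7.5, 8.6, 8.5, 10.0, 3.1, 2.5]


def standardize(values):
    """Return z-scores: (value - mean) / population standard deviation."""
    mean = fmean(values)
    std = pstdev(values)  # population std (divides by n, not n - 1)
    return [(v - mean) / std for v in values]


tempmax_scaled = standardize(tempmax)
print([round(z, 6) for z in tempmax_scaled])
```

The first and last results round to 0.476758 and -1.636869, matching the `tempmax` column of the encoded output. That supports the assumption for this column but does not confirm it. The integer values in `conditions` suggest a separate categorical encoding, which is not sketched here.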


---

## <span style='color:#ff5f27'> 🧑🏻‍🏫 Dataset Preparation </span>

#### <span style='color:#ff5f27'> 👩🏻‍🔬 Air Quality Data </span>

In [None]:
df_air_quality = get_air_quality_df(data_air_quality)

# Keep only the date and AQI columns
df_air_quality = df_air_quality.loc[:, ['date', 'aqi']]
print(df_air_quality)

            date  aqi
0  1673395200000   31
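The `date` values are 13-digit integers, and consecutive rows in the weather output differ by 86,400,000. Both facts suggest milliseconds since the Unix epoch, though that is an inference from the output and not something the notebook states. Under that assumption, a value can be converted to a readable UTC date with the standard library; the helper name `epoch_ms_to_utc` is hypothetical.

```python
from datetime import datetime, timezone


def epoch_ms_to_utc(ms):
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# Value taken from the air quality output above
ts = epoch_ms_to_utc(1673395200000)
print(ts.date().isoformat())
```

For a whole DataFrame column, `pd.to_datetime(df['date'], unit='ms')` performs the same conversion.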


#### <span style='color:#ff5f27'> 🌦 Weather Data </span>

In [None]:
df_weather = get_weather_df_0(data_weather)
df_weather.head()

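|  | date | tempmax | tempmin | temp | feelslikemax | feelslikemin | feelslike | dew | humidity | precip | ... | snow | snowdepth | windspeed | winddir | cloudcover | visibility | solarradiation | solarenergy | uvindex | conditions |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1673395200000 | 8.2 | 3.9 | 6.3 | 6.6 | 2.3 | 4.7 | 3.1 | 80.1 | 9.6 | ... | 0.0 | 0.0 | 15.5 | 233.4 | 97.0 | 21.2 | 7.5 | 0.6 | 1.0 | Rain, Overcast |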


---

## <span style="color:#ff5f27;"> 🔮 Connecting to Hopsworks Feature Store </span>

In [None]:
import hopsworks

project = hopsworks.login()

fs = project.get_feature_store() 



Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/5333
Connected. Call `.close()` to terminate connection gracefully.




In [None]:
air_quality_fg = fs.get_or_create_feature_group(
    name = 'aqi_fg',
    version = 1,
    primary_key = ['date'],
    online_enabled = True,
    event_time = 'date'
)

In [None]:
weather_fg = fs.get_or_create_feature_group(
    name = 'weather_data_fg',
    primary_key = ['date'],
    online_enabled = True,
    event_time = 'date',
    version = 1
)
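Both feature groups declare `date` as their primary key, so it may be worth checking that no two rows share a `date` before inserting. The sketch below works on plain dictionaries so that it runs without any dependencies. The helper `duplicate_keys` is hypothetical, and every sample row except the first is made up for illustration.

```python
from collections import Counter


def duplicate_keys(rows, key="date"):
    """Return the sorted key values that appear in more than one row."""
    counts = Counter(row[key] for row in rows)
    return sorted(k for k, n in counts.items() if n > 1)


rows = [
    {"date": 1673395200000, "aqi": 31},  # row from the output above
    {"date": 1673481600000, "aqi": 40},  # made-up row
    {"date": 1673395200000, "aqi": 35},  # made-up duplicate of the first date
]

dupes = duplicate_keys(rows)
print(dupes)
```

With pandas, `df.duplicated(subset=['date']).any()` gives an equivalent yes/no answer for a DataFrame.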

---

## <span style="color:#ff5f27;">⬆️ Uploading new data to the Feature Store</span>

In [None]:
air_quality_fg.insert(df_air_quality)

Feature Group created successfully, explore it at 
https://c.app.hopsworks.ai:443/p/5333/fs/5253/fg/14838


Uploading Dataframe: 0.00% |          | Rows 0/1 | Elapsed Time: 00:00 | Remaining Time: ?

Launching offline feature group backfill job...
Backfill Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/5333/jobs/named/zurich_aqi_fg_1_offline_fg_backfill/executions


(<hsfs.core.job.Job at 0x7fef860463a0>, None)

In [None]:
weather_fg.insert(df_weather)

RestAPIError: ignored
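The output above shows only the error name, which is not enough to diagnose the failure. One way to surface the full message is to wrap the insert call, print the exception details, and re-raise. This is a generic sketch: `insert_with_diagnostics` and the stand-in class `_FailingFeatureGroup` are hypothetical, and the stand-in exists only so the example runs without a Hopsworks connection.

```python
def insert_with_diagnostics(feature_group, df):
    """Call feature_group.insert(df); on failure, print the error and re-raise."""
    try:
        return feature_group.insert(df)
    except Exception as exc:
        # Print the exception type and its full message before propagating it
        print(f"Insert failed: {type(exc).__name__}: {exc}")
        raise


class _FailingFeatureGroup:
    """Hypothetical stand-in used only to exercise the helper offline."""

    def insert(self, df):
        raise RuntimeError("simulated failure")


try:
    insert_with_diagnostics(_FailingFeatureGroup(), df=None)
except RuntimeError as exc:
    caught = str(exc)
```

In the notebook, the call would be `insert_with_diagnostics(weather_fg, df_weather)`.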

---