# <span style="font-weight:bold; font-size: 3rem; color:#2656a3;">**Data Engineering and Machine Learning Operations in Business** </span> <span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 02: Feature Pipeline</span>

## <span style='color:#2656a3'> 🗒️ The notebook is divided into the following sections:
1. Parsing new data.
2. Inserting the new data into the Feature Store.

## <span style='color:#2656a3'> ⚙️ Import of libraries and packages

First, we make sure the Python packages required for this notebook are installed. Passing the `--quiet` flag to `pip install` after the library names keeps the installation output silent. Then we import all the necessary libraries.

In [1]:
# First we move one level up in the directory tree to reach the folder that holds our functions
%cd ..

# Now we import the functions from the features folder
# These are the functions we created to generate features for electricity prices, weather measures and calendar data
from features import electricity_prices, weather_measures, calendar

# We go back into the notebooks folder
%cd notebooks

/Users/tobiasmjensen/Documents/aau_bds/m5_data-engineering-and-mlops/exam_assigment/MLOPs-Assignment-
/Users/tobiasmjensen/Documents/aau_bds/m5_data-engineering-and-mlops/exam_assigment/MLOPs-Assignment-/notebooks
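
The `%cd ..` / `%cd notebooks` pair works, but it leaves the notebook in the wrong folder if the import fails halfway. As a possible alternative, the sketch below adds the project root to `sys.path` without changing the working directory. The helper name `add_project_root_to_path` is hypothetical and not part of the project; it assumes the notebook runs from a folder that sits directly below the project root.

```python
import sys
from pathlib import Path


def add_project_root_to_path(notebook_dir=None):
    """Put the parent of the notebook folder on sys.path and return it.

    Hypothetical helper: assumes the notebook folder sits directly
    below the project root that contains the `features` package.
    """
    notebook_dir = Path(notebook_dir) if notebook_dir is not None else Path.cwd()
    project_root = str(notebook_dir.resolve().parent)
    if project_root not in sys.path:
        # Insert first so the project folder is searched before later entries
        sys.path.insert(0, project_root)
    return project_root


project_root = add_project_root_to_path()
# After this call, `from features import electricity_prices` would be resolved
# against <project_root>/features while the working directory stays unchanged.
```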


In [2]:
# Importing the packages for the needed libraries for the Jupyter notebook
import pandas as pd
import requests

# Ignore all warnings (this also covers DeprecationWarning)
import warnings
warnings.filterwarnings('ignore')

## <span style='color:#2656a3'> 🪄 Parsing New Data
We fetch new data by setting `historical` to `False`, which returns real-time data instead of the full history. This is done for electricity prices and the forecast of renewable energy.

To provide up-to-date weather measures, we fetch a weather forecast for the next 5 days.

The calendar data does not change, so no new data is retrieved for it.

### <span style="color:#2656a3;">💸 Electricity Prices per day from Energinet

In [3]:
# Fetching non-historical electricity prices for area DK1
electricity_df = electricity_prices.electricity_prices(
    historical=False,
    area=["DK1"]
)
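
The internals of `electricity_prices.electricity_prices` are not shown in this notebook. As a rough illustration of what such a helper might do after downloading the data, the sketch below turns raw price records into a frame with a `dk1_spotpricedkk_kwh` column. The record field names (`HourDK`, `PriceArea`, `SpotPriceDKK`), the DKK per MWh unit and the function name `spot_records_to_frame` are assumptions, and the sample records are made up for illustration rather than real API output.

```python
import pandas as pd


def spot_records_to_frame(records, area="DK1"):
    """Convert raw spot-price records into an hourly price frame.

    Assumes each record holds `HourDK`, `PriceArea` and `SpotPriceDKK`,
    with the price given in DKK per MWh (an assumption, not confirmed here).
    """
    df = pd.DataFrame(records)
    # Keep only the requested price area
    df = df[df["PriceArea"] == area].copy()
    df["datetime"] = pd.to_datetime(df["HourDK"])
    price_col = f"{area.lower()}_spotpricedkk_kwh"
    # 1 MWh = 1000 kWh, so divide to get the price per kWh
    df[price_col] = df["SpotPriceDKK"] / 1000
    return df[["datetime", price_col]].sort_values("datetime").reset_index(drop=True)


# Illustrative records only, not real API output
sample_records = [
    {"HourDK": "2024-05-03T01:00:00", "PriceArea": "DK1", "SpotPriceDKK": 218.93},
    {"HourDK": "2024-05-03T00:00:00", "PriceArea": "DK1", "SpotPriceDKK": 222.14},
    {"HourDK": "2024-05-03T00:00:00", "PriceArea": "DK2", "SpotPriceDKK": 300.00},
]
prices = spot_records_to_frame(sample_records)
```

Naming the column after the area and the unit, as the notebook's output does, keeps the unit visible wherever the feature is used later.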

In [4]:
# Display the electricity dataframe
electricity_df

| | timestamp | datetime | date | hour | dk1_spotpricedkk_kwh |
|---|---|---|---|---|---|
| 0 | 1714694400000 | 2024-05-03 00:00:00 | 2024-05-03 | 0 | 0.22214 |
| 1 | 1714698000000 | 2024-05-03 01:00:00 | 2024-05-03 | 1 | 0.21893 |
| 2 | 1714701600000 | 2024-05-03 02:00:00 | 2024-05-03 | 2 | 0.22348 |
| 3 | 1714705200000 | 2024-05-03 03:00:00 | 2024-05-03 | 3 | 0.22385 |
| 4 | 1714708800000 | 2024-05-03 04:00:00 | 2024-05-03 | 4 | 0.22706 |
| 5 | 1714712400000 | 2024-05-03 05:00:00 | 2024-05-03 | 5 | 0.23825 |
| 6 | 1714716000000 | 2024-05-03 06:00:00 | 2024-05-03 | 6 | 0.26167 |
| 7 | 1714719600000 | 2024-05-03 07:00:00 | 2024-05-03 | 7 | 0.32045 |
| 8 | 1714723200000 | 2024-05-03 08:00:00 | 2024-05-03 | 8 | 0.31881 |
| 9 | 1714726800000 | 2024-05-03 09:00:00 | 2024-05-03 | 9 | 0.2886 |
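
The output carries four time columns: `timestamp` (milliseconds since the Unix epoch), `datetime`, `date` and `hour`. The sketch below shows one way these columns could be derived from a single datetime column with pandas. The helper name `add_time_columns` is hypothetical, and treating the naive datetimes as UTC is an assumption that happens to reproduce the timestamps shown in the output.

```python
import pandas as pd


def add_time_columns(df, datetime_col="datetime"):
    """Derive `timestamp` (epoch milliseconds), `date` and `hour` from a datetime column."""
    out = df.copy()
    out[datetime_col] = pd.to_datetime(out[datetime_col])
    # Naive datetimes are measured against a naive epoch, i.e. treated as UTC
    epoch = pd.Timestamp("1970-01-01")
    out["timestamp"] = (out[datetime_col] - epoch) // pd.Timedelta(milliseconds=1)
    out["date"] = out[datetime_col].dt.strftime("%Y-%m-%d")
    out["hour"] = out[datetime_col].dt.hour
    return out


hours = pd.DataFrame({"datetime": ["2024-05-03 00:00:00", "2024-05-03 01:00:00"]})
with_time = add_time_columns(hours)
print(with_time[["timestamp", "date", "hour"]])
```

Storing the timestamp as an integer avoids timezone ambiguity when the same rows are later joined with the weather data on the hour.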


### <span style="color:#2656a3;"> 🌤 Weather Measurements from Open Meteo

#### <span style="color:#2656a3;"> 🌈 Forecast Weather Measures

In [5]:
# Fetching weather forecast measures for the next 5 days
weather_forecast_df = weather_measures.forecast_weather_measures(
    forecast_length=5
)
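
How `forecast_weather_measures` talks to Open Meteo is not shown here. As a sketch under stated assumptions, the request parameters for the Open-Meteo forecast endpoint could be assembled as below. The hourly variable names are taken from the columns of the resulting dataframe. The helper name `build_forecast_params` and the coordinates are placeholders and are not taken from the project.

```python
OPEN_METEO_FORECAST_URL = "https://api.open-meteo.com/v1/forecast"

# Hourly variables matching the columns of the weather forecast dataframe
HOURLY_VARIABLES = [
    "temperature_2m",
    "relative_humidity_2m",
    "precipitation",
    "rain",
    "snowfall",
    "weather_code",
    "cloud_cover",
    "wind_speed_10m",
    "wind_gusts_10m",
]


def build_forecast_params(latitude, longitude, forecast_length=5):
    """Build query parameters for an hourly forecast request (hypothetical helper)."""
    if forecast_length < 1:
        raise ValueError("forecast_length must be at least 1 day")
    return {
        "latitude": latitude,
        "longitude": longitude,
        "hourly": ",".join(HOURLY_VARIABLES),
        "forecast_days": forecast_length,
    }


# Placeholder coordinates, not taken from the notebook
params = build_forecast_params(latitude=57.0, longitude=10.0, forecast_length=5)
# The request itself would then look like:
#   requests.get(OPEN_METEO_FORECAST_URL, params=params)
```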

In [6]:
# Display the weather forecast dataframe
weather_forecast_df

| | timestamp | datetime | date | hour | temperature_2m | relative_humidity_2m | precipitation | rain | snowfall | weather_code | cloud_cover | wind_speed_10m | wind_gusts_10m |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1714694400000 | 2024-05-03 00:00:00 | 2024-05-03 | 0 | 14.3 | 65.0 | 0.0 | 0.0 | 0.0 | 1.0 | 25.0 | 20.5 | 36.0 |
| 1 | 1714698000000 | 2024-05-03 01:00:00 | 2024-05-03 | 1 | 13.6 | 69.0 | 0.0 | 0.0 | 0.0 | 0.0 | 12.0 | 21.6 | 37.4 |
| 2 | 1714701600000 | 2024-05-03 02:00:00 | 2024-05-03 | 2 | 13.0 | 72.0 | 0.0 | 0.0 | 0.0 | 0.0 | 7.0 | 20.9 | 37.4 |
| 3 | 1714705200000 | 2024-05-03 03:00:00 | 2024-05-03 | 3 | 12.7 | 73.0 | 0.0 | 0.0 | 0.0 | 1.0 | 26.0 | 19.8 | 34.6 |
| 4 | 1714708800000 | 2024-05-03 04:00:00 | 2024-05-03 | 4 | 12.4 | 73.0 | 0.0 | 0.0 | 0.0 | 2.0 | 54.0 | 18.7 | 33.8 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 115 | 1715108400000 | 2024-05-07 19:00:00 | 2024-05-07 | 19 | 12.0 | 41.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.2 | 10.8 |
| 116 | 1715112000000 | 2024-05-07 20:00:00 | 2024-05-07 | 20 | 10.7 | 49.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.6 | 8.3 |
| 117 | 1715115600000 | 2024-05-07 21:00:00 | 2024-05-07 | 21 | 9.6 | 56.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.2 | 5.4 |
| 118 | 1715119200000 | 2024-05-07 22:00:00 | 2024-05-07 | 22 | 8.7 | 58.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.3 | 5.8 |
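
A 5-day hourly forecast should contain 5 × 24 = 120 rows with no missing hours. Before uploading, a quick check along the following lines could confirm that. This is a minimal sketch: the helper name `check_hourly_coverage` is hypothetical, and the demo frame is synthetic rather than the real forecast.

```python
import pandas as pd


def check_hourly_coverage(df, forecast_length, datetime_col="datetime"):
    """Report the expected and actual row counts and whether any hourly step is missing."""
    stamps = pd.to_datetime(df[datetime_col]).sort_values()
    # Difference between consecutive timestamps; every step should be one hour
    steps = stamps.diff().dropna()
    return {
        "expected_rows": forecast_length * 24,
        "actual_rows": len(df),
        "has_gaps": bool((steps != pd.Timedelta(hours=1)).any()),
    }


# Synthetic stand-in for weather_forecast_df: 120 consecutive hours
demo = pd.DataFrame(
    {"datetime": pd.date_range("2024-05-03", periods=120, freq=pd.Timedelta(hours=1))}
)
report = check_hourly_coverage(demo, forecast_length=5)
print(report)
```

In the notebook, the same call would be `check_hourly_coverage(weather_forecast_df, forecast_length=5)`.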


## <span style="color:#2656a3;"> 📡 Connecting to Hopsworks Feature Store

We connect to the Hopsworks Feature Store to access the Feature Groups and upload the new data into them.

In [7]:
# Importing the hopsworks module
import hopsworks

# Logging in to the Hopsworks project
project = hopsworks.login()

# Getting the feature store from the project
fs = project.get_feature_store()

Connected. Call `.close()` to terminate connection gracefully.

Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/554133
Connected. Call `.close()` to terminate connection gracefully.


In [8]:
# Retrieve the feature groups
electricity_fg = fs.get_feature_group(
    name="electricity_prices",
    version=1,
)

weather_fg = fs.get_feature_group(
    name="weather_measurements",
    version=1,
)

### <span style="color:#2656a3;"> ⬆️ Uploading new data to the Feature Store
Here we upload the new data to the retrieved Feature Groups.
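
An insert can fail, or silently drop information, when the dataframe's columns do not match the Feature Group's schema. A small pre-insert check such as the sketch below can catch this early. The helper name `missing_and_extra_columns` is hypothetical, and the expected column list is an assumption based on the dataframe shown earlier, not on the actual Feature Group definition.

```python
import pandas as pd


def missing_and_extra_columns(df, expected_columns):
    """Return (missing, extra) column names relative to the expected schema."""
    actual = set(df.columns)
    expected = set(expected_columns)
    return sorted(expected - actual), sorted(actual - expected)


# Assumed schema for the electricity Feature Group (taken from the displayed dataframe)
expected_electricity_columns = [
    "timestamp", "datetime", "date", "hour", "dk1_spotpricedkk_kwh",
]

# Small stand-in frame that lacks the price column and has one unexpected column
candidate = pd.DataFrame(
    {"timestamp": [1714694400000], "datetime": ["2024-05-03 00:00:00"],
     "date": ["2024-05-03"], "hour": [0], "note": ["demo"]}
)
missing, extra = missing_and_extra_columns(candidate, expected_electricity_columns)
print("missing:", missing, "extra:", extra)
```

In the notebook, the check would be run on `electricity_df` and `weather_forecast_df` right before the corresponding `insert` calls.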

In [9]:
# Inserting the electricity_df into the feature group named electricity_fg
electricity_fg.insert(electricity_df, 
                      write_options={"wait_for_job" : False})

Uploading Dataframe: 0.00% |          | Rows 0/24 | Elapsed Time: 00:00 | Remaining Time: ?

Launching job: electricity_prices_1_offline_fg_materialization
Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/554133/jobs/named/electricity_prices_1_offline_fg_materialization/executions


(<hsfs.core.job.Job at 0x3058ab890>, None)

In [10]:
# Inserting the weather_forecast_df into the feature group named weather_fg
weather_fg.insert(weather_forecast_df, 
                  write_options={"wait_for_job" : False})

Uploading Dataframe: 0.00% |          | Rows 0/120 | Elapsed Time: 00:00 | Remaining Time: ?

Launching job: weather_measurements_1_offline_fg_materialization
Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/554133/jobs/named/weather_measurements_1_offline_fg_materialization/executions


(<hsfs.core.job.Job at 0x3058f5d10>, None)

---
## <span style="color:#2656a3;">⏭️ **Next:** Part 03: Training </span>

Next, we will create a feature view and a training dataset. We will then train a model and save it in the model registry.