# <span style="font-width:bold; font-size: 3rem; color:#2656a3;">**Data Engineering and Machine Learning Operations in Business** </span> <span style="font-width:bold; font-size: 3rem; color:#333;">- Part 02: Feature Pipeline</span>

## 🗒️ This notebook is divided into the following sections:
1. Parse new data.
2. Insert new data into the Feature Store.

## <span style='color:#2656a3'> ⚙️ Import of libraries and packages

First, we'll import our own helper modules from the `features` folder, which sits one level above this notebook. Then we'll import the third-party libraries the notebook needs.

In [1]:
# First we have to go one level up in our directory so we can find the folder with our functions
%cd ..

# Now we can import the functions we have created from the features folder
from features import electricity_prices, weater_measures

# Go back into the notebooks folder
%cd notebooks

/Users/tobiasmjensen/Documents/aau_bds/m5_data-engineering-and-mlops/exam_assigment/MLOPs-Assignment-
/Users/tobiasmjensen/Documents/aau_bds/m5_data-engineering-and-mlops/exam_assigment/MLOPs-Assignment-/notebooks


In [2]:
# Import the libraries needed in this notebook
import pandas as pd
import requests

# Ignore warnings
import warnings 
warnings.filterwarnings('ignore')

## <span style='color:#2656a3'> 🪄 Parsing new data

### <span style="color:#2656a3;">💸 Electricity prices per day from Energinet

In [3]:
# Fetching non-historical electricity prices
electricity_df = electricity_prices.electricity_prices(
    historical=False,
    area=["DK1"]
)

In [4]:
# Display the dataframe
electricity_df

Unnamed: 0,timestamp,time,date,dk1_spotpricedkk_kwh
0,1713916800000,2024-04-24 00:00:00,2024-04-24,0.60093
1,1713920400000,2024-04-24 01:00:00,2024-04-24,0.56915
2,1713924000000,2024-04-24 02:00:00,2024-04-24,0.56609
3,1713927600000,2024-04-24 03:00:00,2024-04-24,0.54513
4,1713931200000,2024-04-24 04:00:00,2024-04-24,0.56729
5,1713934800000,2024-04-24 05:00:00,2024-04-24,0.61048
6,1713938400000,2024-04-24 06:00:00,2024-04-24,0.75842
7,1713942000000,2024-04-24 07:00:00,2024-04-24,0.9562
8,1713945600000,2024-04-24 08:00:00,2024-04-24,0.79878
9,1713949200000,2024-04-24 09:00:00,2024-04-24,0.68494
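
The `timestamp` column holds the same instant as `time`, expressed as milliseconds since the Unix epoch. The first row's 1713916800000 corresponds to 2024-04-24 00:00:00 when the time is read as UTC. The sketch below shows one way such a column could be derived with pandas. The helper name `add_epoch_ms` is hypothetical, and this is not necessarily how `electricity_prices` computes it.

```python
import pandas as pd


def add_epoch_ms(df: pd.DataFrame, time_col: str = "time") -> pd.DataFrame:
    """Return a copy of df with a `timestamp` column in epoch milliseconds.

    Assumption: the values in `time_col` are timezone-naive and represent UTC.
    """
    out = df.copy()
    times = pd.to_datetime(out[time_col])
    # Subtract the epoch and count whole milliseconds; this works regardless
    # of the internal datetime resolution pandas picks.
    out["timestamp"] = (times - pd.Timestamp("1970-01-01")) // pd.Timedelta(milliseconds=1)
    return out


sample = pd.DataFrame({"time": ["2024-04-24 00:00:00", "2024-04-24 01:00:00"]})
sample_with_ts = add_epoch_ms(sample)
print(sample_with_ts["timestamp"].tolist())  # [1713916800000, 1713920400000]
```

The two printed values match the `timestamp` entries of rows 0 and 1 in the table above.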


### <span style="color:#2656a3;">☀️💨 Forecast Renewable Energy next day from Energinet

In [5]:
# Fetching non-historical forecast data
forecast_renewable_energy_df = electricity_prices.forecast_renewable_energy(
    historical=False,
    area=["DK1"]
)

In [6]:
# Display the dataframe
forecast_renewable_energy_df

Unnamed: 0,timestamp,time,date,dk1_offshore_wind_forecastintraday_kwh,dk1_onshore_wind_forecastintraday_kwh,dk1_solar_forecastintraday_kwh
0,1713916800000,2024-04-24 00:00:00,2024-04-24,0.709417,0.680333,0.000119
1,1713920400000,2024-04-24 01:00:00,2024-04-24,0.717667,0.772833,0.000101
2,1713924000000,2024-04-24 02:00:00,2024-04-24,0.69525,0.826167,0.00032
3,1713927600000,2024-04-24 03:00:00,2024-04-24,0.666958,0.824083,0.000324
4,1713931200000,2024-04-24 04:00:00,2024-04-24,0.652,0.82475,0.000289
5,1713934800000,2024-04-24 05:00:00,2024-04-24,0.654625,0.688625,0.000939
6,1713938400000,2024-04-24 06:00:00,2024-04-24,0.490333,0.639958,0.038443
7,1713942000000,2024-04-24 07:00:00,2024-04-24,0.450042,0.536542,0.139543
8,1713945600000,2024-04-24 08:00:00,2024-04-24,0.491125,0.4925,0.360318
9,1713949200000,2024-04-24 09:00:00,2024-04-24,0.497542,0.509417,0.600893


### <span style="color:#2656a3;"> 🌤 Weather measurements from Open Meteo

#### <span style="color:#2656a3;"> 🕰️ Historical Weather Measures

In [7]:
# Fetching non-historical weather data
#historical_weather_df = weater_measures.historical_weater_measures(
#    historical=False
#)

In [8]:
# Display the first 5 rows of the dataframe
#historical_weather_df.head()

#### <span style="color:#2656a3;"> 🌈 Weather Forecast

In [9]:
# Fetching weather forecast data for the next 5 days
weather_forecast_df = weater_measures.forecast_weater_measures(
    forecast_length=5
)

In [10]:
# Display the first 5 rows of the dataframe
weather_forecast_df.head(5)

Unnamed: 0,timestamp,date,time,temperature_2m,relative_humidity_2m,precipitation,rain,snowfall,weather_code,cloud_cover,wind_speed_10m,wind_gusts_10m
0,1713916800000,2024-04-24,2024-04-24 00:00:00,5.4,94,0.0,0.0,0.0,3,100,12.6,22.3
1,1713920400000,2024-04-24,2024-04-24 01:00:00,5.2,90,0.0,0.0,0.0,3,100,15.8,28.1
2,1713924000000,2024-04-24,2024-04-24 02:00:00,4.7,88,0.0,0.0,0.0,3,99,20.2,35.6
3,1713927600000,2024-04-24,2024-04-24 03:00:00,4.6,90,0.1,0.1,0.0,51,100,23.0,39.6
4,1713931200000,2024-04-24,2024-04-24 04:00:00,4.7,91,0.3,0.3,0.0,51,88,21.2,41.0


In [11]:
# Converting to float to align with the Hopsworks feature group, which converts these columns to float automatically
weather_forecast_df['relative_humidity_2m'] = weather_forecast_df['relative_humidity_2m'].astype(float)
weather_forecast_df['weather_code'] = weather_forecast_df['weather_code'].astype(float)
weather_forecast_df['cloud_cover'] = weather_forecast_df['cloud_cover'].astype(float)
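
The three casts above can also be written as a single `astype` call with a column-to-dtype mapping. This is a minimal sketch on a small sample frame; the helper name `cast_to_float` and the constant `FLOAT_COLUMNS` are hypothetical and only serve the illustration.

```python
import pandas as pd

# Columns that arrive as integers but are stored as floats in the feature group
FLOAT_COLUMNS = ["relative_humidity_2m", "weather_code", "cloud_cover"]


def cast_to_float(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Return a copy of df with the given columns cast to float64."""
    return df.astype({col: "float64" for col in columns})


# Small sample mirroring the first two rows of the forecast table
sample = pd.DataFrame(
    {
        "temperature_2m": [5.4, 5.2],
        "relative_humidity_2m": [94, 90],
        "weather_code": [3, 3],
        "cloud_cover": [100, 100],
    }
)

converted = cast_to_float(sample, FLOAT_COLUMNS)
print(converted.dtypes)
```

Because `astype` returns a new DataFrame, the original frame is left unchanged. That makes it easy to compare before and after.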

In [12]:
weather_forecast_df.head(5)

Unnamed: 0,timestamp,date,time,temperature_2m,relative_humidity_2m,precipitation,rain,snowfall,weather_code,cloud_cover,wind_speed_10m,wind_gusts_10m
0,1713916800000,2024-04-24,2024-04-24 00:00:00,5.4,94.0,0.0,0.0,0.0,3.0,100.0,12.6,22.3
1,1713920400000,2024-04-24,2024-04-24 01:00:00,5.2,90.0,0.0,0.0,0.0,3.0,100.0,15.8,28.1
2,1713924000000,2024-04-24,2024-04-24 02:00:00,4.7,88.0,0.0,0.0,0.0,3.0,99.0,20.2,35.6
3,1713927600000,2024-04-24,2024-04-24 03:00:00,4.6,90.0,0.1,0.1,0.0,51.0,100.0,23.0,39.6
4,1713931200000,2024-04-24,2024-04-24 04:00:00,4.7,91.0,0.3,0.3,0.0,51.0,88.0,21.2,41.0


## <span style="color:#2656a3;"> 📡 Connecting to Hopsworks Feature Store

First we will connect to Hopsworks Feature Store so we can access and create Feature Groups.

In [13]:
import hopsworks

project = hopsworks.login()

fs = project.get_feature_store()

Connected. Call `.close()` to terminate connection gracefully.







Logged in to project, explore it here https://c.app.hopsworks.ai:443/p/554133
Connected. Call `.close()` to terminate connection gracefully.


In [14]:
# Retrieve feature groups
weather_fg = fs.get_feature_group(
    name="weather_measurements",
    version=1,
)

electricity_fg = fs.get_feature_group(
    name="electricity_prices",
    version=1,
)

forecast_renewable_energy_fg = fs.get_feature_group(
    name="forecast_renewable_energy",
    version=1,
)

### <span style="color:#2656a3;"> ⬆️ Uploading new data to the Feature Store
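
Before inserting, it can help to confirm that a DataFrame has the columns and dtypes you expect the feature group to hold. The sketch below is a plain pandas check and does not use the Hopsworks API. The function name `find_schema_problems` and the `expected` mapping are assumptions for illustration, not the actual feature group schema.

```python
import pandas as pd


def find_schema_problems(df: pd.DataFrame, expected_dtypes: dict[str, str]) -> list[str]:
    """Compare df against an expected column -> dtype mapping and list mismatches."""
    problems = []
    for column, expected in expected_dtypes.items():
        if column not in df.columns:
            problems.append(f"missing column: {column}")
        elif str(df[column].dtype) != expected:
            problems.append(f"{column}: expected {expected}, got {df[column].dtype}")
    return problems


# Hypothetical expectation: cloud_cover should already be a float
expected = {"timestamp": "int64", "cloud_cover": "float64"}
sample = pd.DataFrame({"timestamp": [1713916800000], "cloud_cover": [100]})

problems = find_schema_problems(sample, expected)
print(problems)  # ['cloud_cover: expected float64, got int64']
```

An empty list means the frame matches the expectation. Any entries point to the columns that need casting or renaming before the upload.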

In [15]:
# Inserting the weather_forecast_df into the feature group named weather_fg
weather_fg.insert(weather_forecast_df)

Uploading Dataframe: 100.00% |██████████| Rows 120/120 | Elapsed Time: 00:07 | Remaining Time: 00:00


Launching job: weather_measurements_1_offline_fg_materialization
Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/554133/jobs/named/weather_measurements_1_offline_fg_materialization/executions


(<hsfs.core.job.Job at 0x17c4f9290>, None)

In [17]:
# Inserting the electricity_df into the feature group named electricity_fg
electricity_fg.insert(electricity_df)

Uploading Dataframe: 100.00% |██████████| Rows 24/24 | Elapsed Time: 00:08 | Remaining Time: 00:00


Launching job: electricity_prices_1_offline_fg_materialization
Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/554133/jobs/named/electricity_prices_1_offline_fg_materialization/executions


(<hsfs.core.job.Job at 0x175bbf110>, None)

In [18]:
# Inserting the forecast_renewable_energy_df into the feature group named forecast_renewable_energy_fg
forecast_renewable_energy_fg.insert(forecast_renewable_energy_df)

Uploading Dataframe: 100.00% |██████████| Rows 24/24 | Elapsed Time: 00:06 | Remaining Time: 00:00


Launching job: forecast_renewable_energy_1_offline_fg_materialization
Job started successfully, you can follow the progress at 
https://c.app.hopsworks.ai/p/554133/jobs/named/forecast_renewable_energy_1_offline_fg_materialization/executions


(<hsfs.core.job.Job at 0x175bbf410>, None)

---
## <span style="color:#2656a3;">⏭️ **Next:** Part 03: Training </span>

In the next notebook, you will use the data stored in the Feature Groups to train a model.