# <span style="font-weight:bold; font-size: 3rem; color:#1EB182;">**Hopsworks Feature Store** </span> <span style="font-weight:bold; font-size: 3rem; color:#333;">- Part 01: Backfill Features to the Feature Store</span>

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/logicalclocks/hopsworks-tutorials/blob/master/advanced_tutorials/electricity/1_backfill_feature_groups.ipynb)

**Note**: You may see an error when installing hopsworks on Colab; it is safe to ignore.

This is the first part of the advanced tutorial series on the Hopsworks Feature Store. In this first module, you will work with data on electricity prices and meteorological observations in Sweden.

The objective of this tutorial is to demonstrate how to work with the **Hopsworks Feature Store** for batch data, with the goal of training and deploying a model that can predict future electricity prices.

## 🗒️ This notebook is divided into 3 sections:
1. Load the data and process features,
2. Connect to the Hopsworks feature store,
3. Create feature groups and upload them to the feature store.

### <span style='color:#ff5f27'> 📝 Imports</span>

In [None]:
!pip install -U hopsworks --quiet

In [None]:
import pandas as pd
from datetime import datetime

from functions import *

# Suppress library warnings to keep the notebook output readable
import warnings
warnings.filterwarnings('ignore')

---

## <span style="color:#ff5f27;"> 💽 Load the historical data and 🛠️ Perform Feature Engineering</span>

The data you will use comes from three different sources:

- Electricity prices in Sweden per day from [NORD POOL](https://www.nordpoolgroup.com).
- Meteorological observations from the [Swedish Meteorological and Hydrological Institute](https://www.smhi.se/) (SMHI).
- National holidays in the Swedish calendar.


### <span style="color:#ff5f27;"> 🌤 Meteorological measurements from SMHI</span>

In [None]:
meteorological_measurements_df = fetch_smhi_measurements(historical_data=True)

In [None]:
meteorological_measurements_df

### <span style="color:#ff5f27;">💸 Electricity prices per day from NORD POOL</span>

In [None]:
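# Prices loaded from a hosted CSV file; lowercase the column names for consistency
electricity_prices_df1 = pd.read_csv("https://repo.hops.works/dev/davit/electricity/nordpol_electricity_intraday_prices.csv")
electricity_prices_df1.columns = list(map(str.lower, electricity_prices_df1.columns))

# Prices fetched by the tutorial's helper function
electricity_prices_df2 = fetch_electricity_prices(historical=True)

# Combine both sources; when a day appears in both, the first (CSV) row is kept
electricity_prices_df = pd.concat([electricity_prices_df1, electricity_prices_df2]).drop_duplicates(subset=['day'])

# Convert the 'day' string to epoch milliseconds (parsed in the machine's local timezone)
electricity_prices_df["timestamp"] = electricity_prices_df["day"].map(lambda x: int(datetime.strptime(x, "%Y-%m-%d").timestamp() * 1000))
electricity_prices_df.tail()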

In [None]:
electricity_prices_df.shape
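
The price cell above turns each `day` string into epoch milliseconds with `datetime.strptime(...).timestamp()`. For a naive `datetime`, `.timestamp()` interprets the value in the machine's local timezone, so the same day can map to different numbers on different machines. The sketch below shows one possible variant that pins the conversion to UTC midnight. The tutorial does not say which timezone is intended, so treating days as UTC is an assumption, and `day_to_epoch_ms` is a hypothetical helper that is not part of the tutorial's `functions` module.

```python
from datetime import datetime, timezone


def day_to_epoch_ms(day):
    """Convert a 'YYYY-MM-DD' string to epoch milliseconds at UTC midnight.

    Assumption: days are treated as UTC; the tutorial itself uses local time.
    """
    parsed = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


# Example call on a single date string
print(day_to_epoch_ms("2022-01-01"))
```

If this variant fits your data, it could be applied to the `day` column in the same way as the original lambda, for example with `electricity_prices_df["day"].map(day_to_epoch_ms)`.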

### <span style="color:#ff5f27;"> 📅 Calendar of Swedish holidays</span>

In [None]:
holidays_df = pd.read_csv("https://repo.hops.works/dev/davit/electricity/holidays.csv")
holidays_df

---

## <span style="color:#ff5f27;"> 📡 Connecting to Hopsworks Feature Store </span>

In [None]:
import hopsworks
project = hopsworks.login()
fs = project.get_feature_store()

---

## <span style="color:#ff5f27;"> 🪄 Creating Feature Groups </span>

A [feature group](https://docs.hopsworks.ai/feature-store-api/latest/generated/feature_group/) can be seen as a collection of conceptually related features. In this case, you will create three feature groups: one for the meteorological measurements from SMHI, one for the electricity prices from NORD POOL, and one for the Swedish holidays.
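
Each feature group below is declared with `primary_key=["day"]`, and two of them also set `event_time="timestamp"`. Before inserting, it can help to check that a DataFrame actually contains those columns and has one row per key. The following is a minimal sketch of such a check in plain pandas. `check_feature_group_frame` is a hypothetical helper, not part of the Hopsworks API, the toy data is made up for illustration, and the one-row-per-key expectation is an assumption about this dataset.

```python
import pandas as pd


def check_feature_group_frame(df, primary_key, event_time=None):
    """Return a list of problems found; an empty list means the frame looks OK."""
    problems = []
    required = list(primary_key) + ([event_time] if event_time else [])
    missing = [col for col in required if col not in df.columns]
    if missing:
        # Without the required columns the remaining checks cannot run
        problems.append(f"missing columns: {missing}")
        return problems
    if df[primary_key].isna().any().any():
        problems.append("null values in primary key columns")
    if df.duplicated(subset=primary_key).any():
        problems.append("duplicate primary key values")
    return problems


# Hypothetical toy data, not taken from the tutorial's sources
toy_df = pd.DataFrame({
    "day": ["2022-01-01", "2022-01-02", "2022-01-02"],
    "timestamp": [1640995200000, 1641081600000, 1641081600000],
    "price": [1.0, 2.0, 2.5],
})

problems = check_feature_group_frame(toy_df, primary_key=["day"], event_time="timestamp")
print(problems)
```

The same call could be made on `meteorological_measurements_df` or `electricity_prices_df` before the corresponding `insert`.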

In [None]:
meteorological_measurements_fg = fs.get_or_create_feature_group(
    name="meteorological_measurements",
    version=1,
    description="Meteorological measurements from SMHI",
    primary_key=["day"],
    online_enabled=True,
    event_time="timestamp"
)

In [None]:
meteorological_measurements_fg.insert(meteorological_measurements_df, write_options={"wait_for_job": False})

In [None]:
electricity_prices_fg = fs.get_or_create_feature_group(
    name="electricity_prices",
    version=1,
    description="Electricity prices from NORD POOL",
    primary_key=["day"],
    online_enabled=True,
    event_time="timestamp",
)

In [None]:
electricity_prices_fg.insert(electricity_prices_df, write_options={"wait_for_job": False})

In [None]:
swedish_holidays_fg = fs.get_or_create_feature_group(
    name="swedish_holidays",
    version=1,
    description="Swedish holidays calendar.",
    primary_key=["day"],
    online_enabled=True,
)

In [None]:
swedish_holidays_fg.insert(holidays_df, write_options={"wait_for_job": True})

---

## <span style="color:#ff5f27;">⏭️ **Next:** Part 02 </span>

In the next notebook, you will generate new data for the feature groups.