# 01. Ingestion: Fetching data from external APIs

This notebook **tests, explains, and documents** the ingestion pipeline that fetches raw data from the following sources:

- Hourly electricity demand [(ENTSO-E)](https://documenter.getpostman.com/view/7009892/2s93JtP3F6)
- Hourly weather data [(Open-Meteo)](https://open-meteo.com/en/docs/historical-weather-api?latitude=48.8534&longitude=2.3488&hourly=temperature_2m,relative_humidity_2m,wind_speed_10m,shortwave_radiation_instant)

## 1. Environment setup

In [1]:
import sys
import os

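# Add the repository root (one level above this notebook) to sys.path
# so that the `src` package can be imported.
PROJECT_ROOT = os.path.abspath("..")
if PROJECT_ROOT not in sys.path:
    sys.path.append(PROJECT_ROOT)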

PROJECT_ROOT

'/Users/bachirijihane/energy-intelligence-platform'

## 2. Imports

In [2]:
import pandas as pd

from src.ingestion.get_entsoe_demand import fetch_entsoe_demand_one_year, fetch_entsoe_demand_and_store
from src.ingestion.get_openmeteo_weather import fetch_openmeteo_weather_one_year, fetch_openmeteo_weather_and_store

## 3. Fetch and store demand data

In [3]:
# Parameters
country = "FR"
year = 2023

In [4]:
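# Fetch demand data and store it under data/raw/electricity_demand/
# country_code is the ENTSO-E EIC area code for France (RTE).
# The log below shows both 2023 and 2024 being fetched, so end_year appears to be inclusive.
fetch_entsoe_demand_and_store(
    country=country,
    country_code="10YFR-RTE------C",
    start_year=year,
    end_year=year + 1,
)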

[FETCH] ENTSO-E demand | FR | 2023
[SAVED] /Users/bachirijihane/energy-intelligence-platform/data/raw/electricity_demand/country=FR/year=2023/demand.parquet | rows=8734
[FETCH] ENTSO-E demand | FR | 2024
[SAVED] /Users/bachirijihane/energy-intelligence-platform/data/raw/electricity_demand/country=FR/year=2024/demand.parquet | rows=8786
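
The `[SAVED]` lines above show one file per year, laid out in Hive-style `country=.../year=...` folders. The sketch below shows how such paths could be built. `partition_path` is a hypothetical helper written for illustration, not a function from `src.ingestion`, and treating `end_year` as inclusive is an assumption based only on the log.

```python
from pathlib import Path


def partition_path(root, dataset, country, year, filename):
    """Build a Hive-style partition path (hypothetical helper, for illustration only)."""
    return Path(root) / dataset / f"country={country}" / f"year={year}" / filename


# Assumption: end_year is inclusive, as the fetch log suggests.
start_year, end_year = 2023, 2024
paths = [
    partition_path("data/raw", "electricity_demand", "FR", y, "demand.parquet")
    for y in range(start_year, end_year + 1)
]

for p in paths:
    print(p.as_posix())
```

Key-value folder names of this kind keep each year in its own file, so a single year can be re-fetched or read without touching the others.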


In [5]:
# Path to the raw demand file for the selected country and year
demand_path = f"../data/raw/electricity_demand/country={country}/year={year}/demand.parquet"

In [6]:
df_demand = pd.read_parquet(demand_path)

In [7]:
df_demand.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8734 entries, 0 to 8733
Data columns (total 3 columns):
 #   Column    Non-Null Count  Dtype              
---  ------    --------------  -----              
 0   datetime  8734 non-null   datetime64[ns, UTC]
 1   load_MW   8734 non-null   float64            
 2   country   8734 non-null   object             
dtypes: datetime64[ns, UTC](1), float64(1), object(1)
memory usage: 204.8+ KB
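
The 2023 demand file holds 8734 rows, while a non-leap year has 8760 hours; the 2024 log reports 8786 rows against the 8784 hours of a leap year. This notebook does not establish the cause, but the counts suggest missing or duplicated timestamps. Below is a minimal sketch of a coverage check. `hourly_coverage` is a hypothetical helper, and it assumes each yearly file is meant to cover the calendar year in UTC.

```python
import pandas as pd


def hourly_coverage(timestamps: pd.Series, year: int) -> dict:
    """Compare observed timestamps with a complete hourly UTC grid for `year`.

    Hypothetical helper; assumes the file should cover the UTC calendar year.
    """
    expected = pd.date_range(
        start=f"{year}-01-01 00:00",
        end=f"{year}-12-31 23:00",
        freq="h",
        tz="UTC",
    )
    observed = pd.DatetimeIndex(timestamps)
    return {
        "expected": len(expected),
        "observed": len(observed),
        # Hours of the grid that never appear in the data
        "missing": expected.difference(observed),
        # Number of rows that repeat an earlier timestamp
        "duplicated": int(observed.duplicated().sum()),
    }


# Small synthetic example: drop two hours and repeat one
full = pd.date_range("2023-01-01", "2023-12-31 23:00", freq="h", tz="UTC")
sample = pd.Series(full.delete([5, 100]).append(full[:1]))
report = hourly_coverage(sample, 2023)
print(report["expected"], report["observed"], len(report["missing"]), report["duplicated"])
```

In this notebook, the same check could be run as `hourly_coverage(df_demand["datetime"], year)`.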


## 4. Fetch and store weather data

In [8]:
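# Fetch weather data and store it under data/raw/weather/
fetch_openmeteo_weather_and_store(
    country=country,
    latitude=48.8534,  # Paris coordinates, used here as a single reference point for FR
    longitude=2.3488,
    start_year=year,
    end_year=year + 1,
)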

[FETCH] Open-Meteo weather | FR | 2023
[SAVED] /Users/bachirijihane/energy-intelligence-platform/data/raw/weather/country=FR/year=2023/weather.parquet | rows=8760
[FETCH] Open-Meteo weather | FR | 2024
[SAVED] /Users/bachirijihane/energy-intelligence-platform/data/raw/weather/country=FR/year=2024/weather.parquet | rows=8784


In [9]:
weather_path = f"../data/raw/weather/country={country}/year={year}/weather.parquet"

In [10]:
df_weather = pd.read_parquet(weather_path)

In [11]:
df_weather.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8760 entries, 0 to 8759
Data columns (total 6 columns):
 #   Column                       Non-Null Count  Dtype              
---  ------                       --------------  -----              
 0   datetime                     8760 non-null   datetime64[ns, UTC]
 1   temperature_2m               8760 non-null   float32            
 2   relative_humidity_2m         8760 non-null   float32            
 3   wind_speed_10m               8760 non-null   float32            
 4   shortwave_radiation_instant  8760 non-null   float32            
 5   country                      8760 non-null   object             
dtypes: datetime64[ns, UTC](1), float32(4), object(1)
memory usage: 273.9+ KB
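
The four weather columns match the `hourly` variables in the Open-Meteo link at the top of this notebook. As a rough sketch of what a one-year request might look like, the snippet below builds a query URL for the Open-Meteo historical archive endpoint without sending it. `build_archive_url` is a hypothetical helper; whether `fetch_openmeteo_weather_and_store` builds its request this way is an assumption, since its implementation is not shown here.

```python
from urllib.parse import urlencode

# Open-Meteo historical weather (archive) endpoint
ARCHIVE_URL = "https://archive-api.open-meteo.com/v1/archive"


def build_archive_url(latitude, longitude, year, variables):
    """Build a one-year hourly query URL (hypothetical helper, for illustration only)."""
    params = {
        "latitude": latitude,
        "longitude": longitude,
        "start_date": f"{year}-01-01",
        "end_date": f"{year}-12-31",
        "hourly": ",".join(variables),
        "timezone": "GMT",  # assumption: timestamps requested in GMT/UTC
    }
    # Keep commas unescaped so the variable list stays readable
    return f"{ARCHIVE_URL}?{urlencode(params, safe=',')}"


url = build_archive_url(
    latitude=48.8534,
    longitude=2.3488,
    year=2023,
    variables=[
        "temperature_2m",
        "relative_humidity_2m",
        "wind_speed_10m",
        "shortwave_radiation_instant",
    ],
)
print(url)
```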
