# Table of Contents
1. [Initial Setup & Testing](#1-initial-setup--testing)
   1. [Initialize client](#11-initialize-client)
   2. [Test Connection](#12-test-connection)
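2. [Core Data Extraction Functions](#2-core-data-extraction-functions)
   1. [Day-Ahead Prices](#21-day-ahead-prices)
   2. [Cross-Border Flows](#22-cross-border-flows)
   3. [Load Data](#23-load-data)
3. [Sample Data Collection Script](#3-sample-data-collection-script)
4. [Data Processing & Analysis Preparation](#4-data-processing--analysis-preparation)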

In [1]:
!pip install entsoe-py

Collecting entsoe-py
  Downloading entsoe_py-0.6.19-py3-none-any.whl.metadata (10 kB)
Collecting pandas>=2.2.0 (from entsoe-py)
  Downloading pandas-2.2.3-cp311-cp311-win_amd64.whl.metadata (19 kB)
Downloading entsoe_py-0.6.19-py3-none-any.whl (1.6 MB)

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pandasai 2.4.2 requires pandas==1.5.3, but you have pandas 2.2.3 which is incompatible.


# 1. Initial Setup & Testing

In [1]:
#%% Import necessary libraries
from entsoe import EntsoePandasClient
import pandas as pd
import numpy as np
import seaborn as sns

import matplotlib.pyplot as plt
import plotly.express as px

## 1.1. Initialize client  
The API client is initialized with an API key, which authenticates each request.

In [3]:
#%% Initialize client
API_KEY = "0f2f4222-713f-49d2-85b7-04ddd5f2dc1c"
client = EntsoePandasClient(api_key=API_KEY)
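
Hard-coding the key in a notebook exposes it to anyone who can read the file. A minimal sketch of one alternative, assuming the key is stored in an environment variable; the variable name `ENTSOE_API_KEY` and the helper `load_api_key` are hypothetical and not part of entsoe-py:

```python
import os


def load_api_key(env_var="ENTSOE_API_KEY"):
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your ENTSO-E API key"
        )
    return key


# Usage (the variable must be set before the notebook starts):
# client = EntsoePandasClient(api_key=load_api_key())
```

Failing early with a clear message avoids a less obvious authentication error on the first query.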

## 1.2. Test Connection  
This test confirms that the API key is valid and that the connection to the ENTSO-E platform works 
before running more complex queries.  
Later cells use a `COUNTRIES` dictionary that maps country names to their ENTSO-E bidding zone codes; it includes Belgium, the Netherlands, France and the Germany/Luxembourg zone, among others.
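
The cell that defines `COUNTRIES` does not appear in this notebook. A possible definition is sketched here, reconstructed from the country names printed in the load output of section 2.3. The codes `'BE'`, `'FR'`, `'NL'` and `'DE_LU'` appear elsewhere in the notebook; `'DE'` and `'LU'` are assumptions based on entsoe-py's area naming:

```python
# Assumed reconstruction: the names come from the printed output,
# the name-to-code mapping itself is not shown in the source.
COUNTRIES = {
    'Belgium': 'BE',
    'Netherlands': 'NL',
    'France': 'FR',
    'Germany_Luxembourg': 'DE_LU',
    'Germany': 'DE',
    'Luxembourg': 'LU',
}
```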

In [None]:
def test_api_connection():
    """Test if API connection works"""
    try:
        # Test with a simple query - Belgian day-ahead prices for the first week of January 2024
        start = pd.Timestamp('2024-01-01', tz='Europe/Brussels')
        end = pd.Timestamp('2024-01-07', tz='Europe/Brussels')
        
        prices = client.query_day_ahead_prices('BE', start=start, end=end)
        print("✅ API connection successful!")
        print(f"Retrieved {len(prices)} price points")
        print(prices.head())
        return True
    except Exception as e:
        print(f"❌ API connection failed: {e}")
        return False

# Run test
test_api_connection()

✅ API connection successful!
Retrieved 145 price points
2024-01-01 00:00:00+01:00    0.10
2024-01-01 01:00:00+01:00    0.01
2024-01-01 02:00:00+01:00    0.00
2024-01-01 03:00:00+01:00   -0.01
2024-01-01 04:00:00+01:00   -0.03
Freq: 60min, dtype: float64


True
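
The count of 145 is consistent with an hourly series that includes both endpoints: six full days of 24 hours plus the closing midnight timestamp. A quick check with pandas alone, using `pd.date_range` as a stand-in for the API response (an assumption; no request is made):

```python
import pandas as pd

start = pd.Timestamp('2024-01-01', tz='Europe/Brussels')
end = pd.Timestamp('2024-01-07', tz='Europe/Brussels')

# Hourly timestamps from start to end, both endpoints included
hourly_index = pd.date_range(start, end, freq='60min')

print(len(hourly_index))  # 6 days * 24 hours + 1 closing timestamp = 145
```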

# 2. Core Data Extraction Functions  
The ENTSO-E API provides access to different types of electricity market data. For our European Energy Market Analysis, we need three main data types:  

1. **Day-Ahead Prices** - Electricity prices set one day before delivery
2. **Cross-Border Flows** - Physical electricity flows between countries
3. **Load Data** - Actual electricity consumption by country

Let's extract them.

## 2.1. Day-Ahead Prices  
This subsection defines a function that fetches day-ahead electricity prices from the ENTSO-E (European Network of Transmission System Operators for Electricity) platform for multiple European countries.

In [15]:
def get_day_ahead_prices(countries, start_date, end_date):
    """Extract day-ahead electricity prices for multiple countries from ENTSO-E platform.
    
    Args:
        countries (dict): Dictionary mapping country names to their ENTSO-E bidding zone codes.
            Example: {'Belgium': 'BE', 'France': 'FR'}
        start_date (pd.Timestamp): Start date for data extraction, timezone-aware.
        end_date (pd.Timestamp): End date for data extraction, timezone-aware.
    
    Returns:
        dict: Dictionary containing pandas Series of day-ahead prices for each country.
            Keys are country names, values are price series in EUR/MWh.
            Example: {'Belgium': pd.Series(...), 'France': pd.Series(...)}
    
    Note:
        - Exceptions from the API are caught per country: the error is printed,
          that country is omitted from the result, and execution continues
        - Prices are in EUR/MWh
        - Data granularity is hourly for the sample period used in this notebook
        - Missing data points are possible due to API limitations
    """
    prices_data = {}
    
    for country_name, country_code in countries.items():
        try:
            print(f"Fetching day-ahead prices for {country_name}...")
            prices = client.query_day_ahead_prices(
                country_code, 
                start=start_date, 
                end=end_date
            )
            prices_data[country_name] = prices
            print(f"✅ {country_name}: {len(prices)} records")
        except Exception as e:
            print(f"❌ Failed to get {country_name} data: {e}")
    
    return prices_data
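
Because the function returns a dict of Series, the result can be lined up on a shared time index for side-by-side comparison. A minimal sketch, using small synthetic series in place of real API output (the values are made up for illustration):

```python
import pandas as pd

idx = pd.date_range('2024-01-01', periods=3, freq='60min', tz='Europe/Brussels')

# Synthetic stand-in for the dict returned by get_day_ahead_prices
prices_data = {
    'Belgium': pd.Series([10.0, 12.0, 11.0], index=idx),
    'France': pd.Series([9.0, 13.0, 10.0], index=idx),
}

# Dict keys become column names; rows are aligned on the timestamp index
wide_prices = pd.concat(prices_data, axis=1)

# Price spread between the two zones at each hour (EUR/MWh)
spread = wide_prices['Belgium'] - wide_prices['France']
print(spread.tolist())
```

Countries whose request failed are missing from the dict, so they do not appear as columns.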

## 2.2. Cross-Border Flows  
In this subsection, we focus on extracting cross-border physical electricity flows between European countries. 
These flows represent the actual electricity transferred between countries through interconnectors.

In [16]:
def get_cross_border_flows(from_country, to_country, start_date, end_date):
    """Extract physical electricity flows between two countries from ENTSO-E platform.
    
    Args:
        from_country (str): ENTSO-E bidding zone code for source country (e.g., 'FR' for France)
        to_country (str): ENTSO-E bidding zone code for destination country (e.g., 'BE' for Belgium)
        start_date (pd.Timestamp): Start date for data extraction, timezone-aware
        end_date (pd.Timestamp): End date for data extraction, timezone-aware
    
    Returns:
        pd.Series: Time series of physical flows in MW (positive values indicate flow in specified direction).
                  Returns None if request fails.
    
    Note:
        Exceptions from the API are caught: the error is printed and None is returned.
    """
    try:
        flows = client.query_crossborder_flows(
            from_country, to_country, 
            start=start_date, end=end_date
        )
        return flows
    except Exception as e:
        print(f"Failed to get flows {from_country}->{to_country}: {e}")
        return None
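
Each call covers one direction only, so a net position across a border needs the series for both directions. A sketch with synthetic numbers; `net_flow` is a hypothetical helper, not an entsoe-py function:

```python
import pandas as pd


def net_flow(a_to_b, b_to_a):
    """Net flow from A to B in MW; negative values mean B exports to A on balance."""
    # Align on timestamps; a timestamp present on one side only counts as zero on the other
    return a_to_b.sub(b_to_a, fill_value=0)


idx = pd.date_range('2024-01-01', periods=3, freq='60min', tz='Europe/Brussels')
fr_to_be = pd.Series([500.0, 0.0, 120.0], index=idx)
be_to_fr = pd.Series([0.0, 300.0, 20.0], index=idx)

print(net_flow(fr_to_be, be_to_fr).tolist())
```

Since `get_cross_border_flows` returns `None` on failure, both inputs should be checked before calling a helper like this.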

## 2.3. Load Data
This subsection deals with extracting actual electricity consumption (load) data.

In [12]:
def get_actual_load(countries, start_date, end_date):
    """Extract actual electricity consumption (load) data for multiple countries from ENTSO-E platform.
    
    Args:
        countries (dict): Dictionary mapping country names to their ENTSO-E bidding zone codes.
            Example: {'Belgium': 'BE', 'France': 'FR'}
        start_date (pd.Timestamp): Start date for data extraction, timezone-aware.
        end_date (pd.Timestamp): End date for data extraction, timezone-aware.
    
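    Returns:
        dict: Dictionary containing a pandas DataFrame of actual load for each country.
            Keys are country names; each value has a single 'Actual Load' column in MW.
            Example: {'Belgium': pd.DataFrame(...), 'France': pd.DataFrame(...)}
    
    Note:
        Exceptions from the API are caught per country: the error is printed,
        that country is omitted from the result, and execution continues.
        Resolution can differ by zone (e.g. 15-minute for Belgium, hourly for France
        in January 2024).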
    """
    load_data = {}
    
    for country_name, country_code in countries.items():
        try:
            load = client.query_load(
                country_code, 
                start=start_date, 
                end=end_date
            )
            load_data[country_name] = load
            print(f"✅ Load data for {country_name}: {len(load)} records")
        except Exception as e:
            print(f"❌ Failed to get load data for {country_name}: {e}")
    
    return load_data

In [14]:
get_actual_load(COUNTRIES, pd.Timestamp('2024-01-01', tz='Europe/Brussels'), 
                pd.Timestamp('2024-01-07', tz='Europe/Brussels'))

✅ Load data for Belgium: 576 records
✅ Load data for Netherlands: 576 records
✅ Load data for France: 144 records
✅ Load data for Germany_Luxembourg: 576 records
✅ Load data for Germany: 576 records
✅ Load data for Luxembourg: 576 records


{'Belgium':                            Actual Load
 2024-01-01 00:00:00+01:00       7521.0
 2024-01-01 00:15:00+01:00       7456.0
 2024-01-01 00:30:00+01:00       7425.0
 2024-01-01 00:45:00+01:00       7344.0
 2024-01-01 01:00:00+01:00       7335.0
 ...                                ...
 2024-01-06 22:45:00+01:00       9199.0
 2024-01-06 23:00:00+01:00       9307.0
 2024-01-06 23:15:00+01:00       9201.0
 2024-01-06 23:30:00+01:00       9200.0
 2024-01-06 23:45:00+01:00       8965.0
 
 [576 rows x 1 columns],
 'Netherlands':                            Actual Load
 2024-01-01 00:00:00+01:00      11257.0
 2024-01-01 00:15:00+01:00      11279.0
 2024-01-01 00:30:00+01:00      11291.0
 2024-01-01 00:45:00+01:00      11266.0
 2024-01-01 01:00:00+01:00      11312.0
 ...                                ...
 2024-01-06 22:45:00+01:00      13288.0
 2024-01-06 23:00:00+01:00      13147.0
 2024-01-06 23:15:00+01:00      12993.0
 2024-01-06 23:30:00+01:00      12871.0
 2024-01-06 23:45:00+01:00 
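
The record counts above differ because the zones report at different resolutions: 576 records over six days is one value every 15 minutes, while France's 144 is one value per hour. To compare them, the finer series can be averaged to hourly values. A sketch with synthetic data, assuming the mean is the appropriate aggregate for a power reading in MW:

```python
import pandas as pd

# Synthetic 15-minute load for two hours, shaped like the 'Actual Load' frames above
idx = pd.date_range('2024-01-01', periods=8, freq='15min', tz='Europe/Brussels')
load_15min = pd.DataFrame(
    {'Actual Load': [7500.0, 7400.0, 7300.0, 7200.0, 7100.0, 7000.0, 6900.0, 6800.0]},
    index=idx,
)

# Average the four quarter-hour readings within each hour
load_hourly = load_15min.resample('60min').mean()

print(load_hourly['Actual Load'].tolist())
```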

# 3. Sample Data Collection Script  
This section collects the European electricity market data in a single step. Instead of calling each extraction function by hand, the script below runs the whole data collection process for the European Energy Market Analysis project.

In [None]:
def collect_market_data(start_date='2023-01-01', end_date='2023-12-31'):
    """Collect comprehensive European electricity market data from ENTSO-E platform.
    
    Fetches three types of data:
    1. Day-ahead electricity prices for all countries in COUNTRIES dict
    2. Cross-border physical flows between Belgium and neighboring countries
    3. Actual electricity load (consumption) data for all countries
    
    Args:
        start_date (str): Start date in 'YYYY-MM-DD' format. Defaults to '2023-01-01'
        end_date (str): End date in 'YYYY-MM-DD' format, converted to midnight at the
            start of that day, so the default '2023-12-31' ends the range at 31 December 00:00
        
    Returns:
        dict: Dictionary containing three data types:
            - 'prices': Day-ahead prices by country (EUR/MWh)
            - 'flows': Cross-border flows between countries (MW)
            - 'loads': Actual electricity consumption by country (MW)
    """
    # Convert dates
    start = pd.Timestamp(start_date, tz='Europe/Brussels')
    end = pd.Timestamp(end_date, tz='Europe/Brussels')
    
    print(f"Collecting data from {start_date} to {end_date}")
    print("=" * 50)
    
    # 1. Day-ahead prices
    print("1. Collecting day-ahead prices...")
    prices = get_day_ahead_prices(COUNTRIES, start, end)
    
    # 2. Cross-border flows (key connections to Belgium)
    print("\n2. Collecting cross-border flows...")
    flows = {}
    key_connections = [
        ('FR', 'BE'),  # France to Belgium
        ('NL', 'BE'),  # Netherlands to Belgium  
        ('DE_LU', 'BE'),  # Germany-Luxembourg to Belgium
        ('BE', 'FR'),  # Belgium to France
        ('BE', 'NL'),  # Belgium to Netherlands
        ('BE', 'DE_LU')   # Belgium to Germany-Luxembourg
    ]
    
    for from_c, to_c in key_connections:
        flow_name = f"{from_c}_to_{to_c}"
        flows[flow_name] = get_cross_border_flows(from_c, to_c, start, end)
    
    # 3. Load data
    print("\n3. Collecting load data...")
    loads = get_actual_load(COUNTRIES, start, end)
    
    return {
        'prices': prices,
        'flows': flows, 
        'loads': loads
    }

# Execute data collection
# market_data = collect_market_data('2023-01-01', '2023-12-31')
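
Failed flow requests come back as `None`, and failed price or load requests are simply absent from their dicts, so it can help to check what was actually collected before moving on. A minimal sketch; `summarize_market_data` is a hypothetical helper, and the example input uses plain lists in place of pandas objects:

```python
def summarize_market_data(market_data):
    """Return the record count per entry, with None for entries that failed."""
    summary = {}
    for data_type, entries in market_data.items():
        summary[data_type] = {
            name: (None if values is None else len(values))
            for name, values in entries.items()
        }
    return summary


# Stand-in for the dict returned by collect_market_data
example = {
    'prices': {'Belgium': [10.0, 12.0]},
    'flows': {'FR_to_BE': [500.0, 0.0], 'NL_to_BE': None},
    'loads': {'Belgium': [7500.0, 7400.0, 7300.0]},
}
print(summarize_market_data(example))
```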

# 4. Data Processing & Analysis Preparation

In [19]:
def clean_and_prepare_data(market_data):
    """Combine per-country price series into one long-format DataFrame.

    Expects a dict with a 'prices' key, as returned by collect_market_data.
    """
    # Build one frame per country, skipping countries with no data
    frames = [
        pd.DataFrame({
            'datetime': prices.index,
            'price': prices.values,
            'country': country
        })
        for country, prices in market_data['prices'].items()
        if prices is not None
    ]
    if not frames:
        # Nothing was retrieved; return an empty frame with the expected layout
        return pd.DataFrame(columns=['price', 'country'],
                            index=pd.DatetimeIndex([], name='datetime'))
    
    # Concatenate once, then set datetime as index
    price_df = pd.concat(frames, ignore_index=True)
    price_df['datetime'] = pd.to_datetime(price_df['datetime'])
    price_df = price_df.set_index('datetime')
    
    return price_df

# Example usage:
# clean_data = clean_and_prepare_data(market_data)
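
The long format returned here suits plotting with a `hue` or `color` column. For comparisons between countries, such as correlations, a wide layout with one column per country is often easier to work with. A sketch using a small synthetic frame with the same columns (the values are made up):

```python
import pandas as pd

idx = pd.date_range('2024-01-01', periods=2, freq='60min', tz='Europe/Brussels')

# Synthetic long-format frame shaped like the output of clean_and_prepare_data
price_df = pd.DataFrame(
    {
        'price': [10.0, 12.0, 9.0, 13.0],
        'country': ['Belgium', 'Belgium', 'France', 'France'],
    },
    index=idx.append(idx).rename('datetime'),
)

# One row per timestamp, one column per country
wide = price_df.reset_index().pivot(index='datetime', columns='country', values='price')
print(wide.columns.tolist())
```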

In [None]:
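# clean_and_prepare_data reads market_data['prices'], and the query needs timezone-aware Timestamps
clean_and_prepare_data({'prices': get_day_ahead_prices(
    COUNTRIES, pd.Timestamp('2023-01-01', tz='Europe/Brussels'),
    pd.Timestamp('2023-12-31', tz='Europe/Brussels'))})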

In [None]:
get_day_ahead_prices(COUNTRIES, pd.Timestamp('2023-01-01', tz='Europe/Brussels'),
                     pd.Timestamp('2023-12-31', tz='Europe/Brussels'))

In [20]:
def visualize_prices(price_df):
    """Visualize day-ahead prices using seaborn and plotly"""
    
    # Both libraries read 'datetime' as a column, so move it out of the index once
    plot_df = price_df.reset_index()
    
    # Seaborn line plot
    plt.figure(figsize=(14, 7))
    sns.lineplot(data=plot_df, x='datetime', y='price', hue='country')
    plt.title('Day-Ahead Electricity Prices by Country')
    plt.xlabel('Date')
    plt.ylabel('Price (EUR/MWh)')
    plt.legend(title='Country')
    plt.grid()
    plt.show()
    
    # Plotly interactive plot
    fig = px.line(plot_df, x='datetime', y='price', color='country',
                  title='Day-Ahead Electricity Prices by Country',
                  labels={'datetime': 'Date', 'price': 'Price (EUR/MWh)', 'country': 'Country'})
    fig.show()