# Overview of energy consumption in the European Union


This chapter provides a basic overview of how energy consumption has evolved in the EU.

We will see:

1. The evolution of total EU consumption, broken down by energy source
2. The evolution of total EU consumption, broken down by consuming sector
3. The evolution of consumption per capita by country
4. For the last year, the relationship between consuming sector and energy source


We first import the libraries we are using, and a JSON file that contains the codelist subsets we are going to use with the datasets.

In [1]:
from utils import utils
import json
import plotly.express as px
import plotly.graph_objects as go


with open('utils/sub_codelists.json', 'r') as f:
    sub_codelists = json.load(f)

The required data are stored in the Eurostat SDMX Repository.

We need to get two datasets:
- The Energy Balances (simplified is enough for this analysis); and
- The population by country.

For each dataset, we need to provide the list of values required, since for this analysis we are not using all of them.

We have created a method that automatically loads SDMX data from a repository into a pysdmx Dataset, which includes a Pandas DataFrame with the data. It takes as arguments the name of the dataset, a dictionary with the constraints, and the agency.

We will also take advantage of a method we created to add the labels to the dataframe (in addition to the codes). Doing this is possible because with pysdmx we get both data and metadata in the same Dataset object.

In [4]:

# Define the constraints for the datasets
nrg_bal_constraints = {
    'freq': 'A',
    'nrg_bal': sub_codelists['nrg_bal_consumption_basic_plus_other'],
    'siec': sub_codelists['nrg_bal_siec_breakdown'],
    'unit': 'GWH',
    'geo': sub_codelists['geo_eu_countries'],
}

demo_pjan_constraints = {
    'freq': 'A',
    'unit': 'NR',
    'age': 'TOTAL',
    'sex': 'T',
    'geo': sub_codelists['geo_eu_countries'],
    }

# Get the data
nrg_bal = utils.get_dataset_with_selection('nrg_bal_s', nrg_bal_constraints, agency='estat')
# Add labels to the dataframe (in addition to the codes)
nrg_bal = utils.add_labels(nrg_bal, 'siec')
nrg_bal = utils.add_labels(nrg_bal, 'nrg_bal')


demo_pjan = utils.get_dataset_with_selection('demo_pjan', demo_pjan_constraints, agency='estat')
# Subtract 1 year from TIME_PERIOD so that the population figures line up with the nrg_bal_s data
demo_pjan.data['TIME_PERIOD'] = demo_pjan.data['TIME_PERIOD'] - 1


data_url: https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data/nrg_bal_s/A.FC_IND_E+FC_TRA_E+FC_OTH_AF_E+FC_OTH_CP_E+FC_OTH_FISH_E+FC_OTH_HH_E+FC_OTH_NSP_E.C0000X0350-0370+C0350-0370+P1000+S2000+G3000+O4000XBIO+RA000+W6100_6220+N900H+E7000+H8000.GWH.BE+BG+CZ+DK+DE+EE+IE+EL+ES+FR+HR+IT+CY+LV+LT+LU+HU+MT+NL+AT+PL+PT+RO+SI+SK+FI+SE?format=SDMX-CSV&startPeriod=2000
metadata_url: https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/dataflow/all/nrg_bal_s/latest?detail=full&references=descendants
data_url: https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data/demo_pjan/A.NR.TOTAL.T.BE+BG+CZ+DK+DE+EE+IE+EL+ES+FR+HR+IT+CY+LV+LT+LU+HU+MT+NL+AT+PL+PT+RO+SI+SK+FI+SE?format=SDMX-CSV&startPeriod=2000
metadata_url: https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/dataflow/all/demo_pjan/latest?detail=full&references=descendants
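
The data URLs printed above suggest how a constraints dictionary maps onto an SDMX REST key: the values of one dimension are joined with `+`, and the dimensions are joined with `.`. The sketch below illustrates that mapping. The functions `build_sdmx_key` and `build_data_url` are hypothetical and are not the implementation in `utils`; the dimension order is passed in by hand here, whereas a real helper would presumably read it from the dataflow metadata.

```python
def build_sdmx_key(constraints, dimension_order):
    """Join constraint values into an SDMX REST key such as 'A.NR.TOTAL.T.BE+BG'.

    Hypothetical helper: the dimension order is an assumption supplied by the caller.
    """
    parts = []
    for dim in dimension_order:
        value = constraints.get(dim, "")  # an empty position means "all values"
        if isinstance(value, (list, tuple)):
            value = "+".join(value)  # several codes for one dimension
        parts.append(value)
    return ".".join(parts)


def build_data_url(dataset, constraints, dimension_order, start_period=2000):
    """Assemble a data URL following the pattern of the URLs printed above."""
    base = "https://ec.europa.eu/eurostat/api/dissemination/sdmx/2.1/data"
    key = build_sdmx_key(constraints, dimension_order)
    return f"{base}/{dataset}/{key}?format=SDMX-CSV&startPeriod={start_period}"


# Shortened country list, for illustration only
demo_constraints = {"freq": "A", "unit": "NR", "age": "TOTAL", "sex": "T", "geo": ["BE", "BG", "CZ"]}
demo_order = ["freq", "unit", "age", "sex", "geo"]

example_key = build_sdmx_key(demo_constraints, demo_order)
example_url = build_data_url("demo_pjan", demo_constraints, demo_order)
print(example_url)
```

Keeping the key construction separate from the URL assembly makes it easy to check the key against the printed URLs before any request is sent.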


We first calculate and display the total energy consumed (in GWh) by the European Union, broken down by type of energy.

In [5]:
# Sum the observations of all countries per year and energy source
data_by_energy_source = nrg_bal.data.groupby(['TIME_PERIOD', 'siec_label'])['OBS_VALUE'].sum().reset_index()

# Rename the columns so that the chart axes and legend are self-explanatory
data_by_energy_source.rename(columns={
    'OBS_VALUE': 'Energy consumption in GWh',
    'TIME_PERIOD': 'Year',
    'siec_label': 'Energy source'
    }, inplace=True)

fig = px.area(data_by_energy_source, x='Year', y='Energy consumption in GWh', color='Energy source')
fig.show()

The visualization reveals:

- Oil and petroleum products (pink) dominate the energy mix throughout the period, accounting for roughly 40% of total consumption
- Electricity (light blue at bottom) and natural gas (purple) are the next largest contributors
- Renewables and biofuels (yellow) show modest growth over time but remain a relatively small portion of the overall mix
- Most energy sources maintain relatively stable consumption patterns with some slight fluctuations
- There appears to be a small decline in total energy consumption during 2020, possibly related to the pandemic
- Solid fossil fuels (dark blue at top) show a gradual decline toward the end of the period
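
Statements about shares of the mix, such as the "roughly 40%" above, can be checked numerically rather than read off the chart. The following is a minimal sketch on invented toy data (the numbers are not Eurostat figures). The helper `add_share_column` is hypothetical, and the name of the value column is passed as a parameter so that it can be pointed at the aggregated dataframe from the previous cell.

```python
import pandas as pd

# Invented toy data, for illustration only
toy = pd.DataFrame({
    "Year": [2021, 2021, 2021, 2022, 2022, 2022],
    "Energy source": ["Oil", "Gas", "Electricity"] * 2,
    "Energy consumption in GWh": [40.0, 35.0, 25.0, 30.0, 30.0, 40.0],
})


def add_share_column(df, value_col, year_col="Year"):
    """Return a copy of df with each row's share (%) of its year's total."""
    out = df.copy()
    yearly_total = out.groupby(year_col)[value_col].transform("sum")
    out["Share (%)"] = out[value_col] / yearly_total * 100
    return out


shares = add_share_column(toy, "Energy consumption in GWh")
print(shares)
```

Using `transform("sum")` keeps one row per year and energy source, so the share column can be plotted or filtered directly.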

Next, we calculate and display the total energy consumed (in GWh) by the European Union, broken down by consuming sector.

In [6]:
# Sum the observations of all countries per year and consuming sector
dataset_by_sector = nrg_bal.data.groupby(['TIME_PERIOD', 'nrg_bal_label'])['OBS_VALUE'].sum().reset_index()
dataset_by_sector.rename(columns={
    'OBS_VALUE': 'Energy consumption in GWh',
    'TIME_PERIOD': 'Year',
    'nrg_bal_label': 'Consuming sector'
    }, inplace=True)
fig = px.area(dataset_by_sector, x='Year', y='Energy consumption in GWh', color='Consuming sector')
fig.show()

The visualization reveals:

- Transport sector (pink at the top) is the largest consumer of energy throughout the period, representing roughly 30-35% of total consumption
- Households (peach/tan in the middle) form the second largest consuming sector
- Commercial and public services (green) and industry (blue at bottom) also constitute significant portions of energy consumption
- Most sectors show relatively stable consumption patterns until around 2010, after which there's a slight downward trend visible across sectors
- Agriculture and forestry, fishing, and "not elsewhere specified" sectors represent smaller proportions of energy consumption

We then calculate and display the per capita energy consumption (in kWh) for each European Union country.

In [7]:
def calculate_energy_consumption_per_capita(df):
    # 1 GWh = 1,000,000 kWh, so multiplying by 1e6 expresses the result in kWh per person
    df['energy_consumption_PC_kwh'] = df['energy_consumption_gwh'] / df['population'] * 1_000_000
    return df

dataset_by_country = nrg_bal.data.groupby(['TIME_PERIOD', 'geo'])['OBS_VALUE'].sum().reset_index()

consumption_pc_by_country = dataset_by_country.merge(demo_pjan.data, on=['TIME_PERIOD', 'geo'], how='inner')

consumption_pc_by_country.rename(columns={'OBS_VALUE_x': 'energy_consumption_gwh', 'OBS_VALUE_y': 'population'}, inplace=True)

consumption_pc_by_country = calculate_energy_consumption_per_capita(consumption_pc_by_country)

fig = px.line(consumption_pc_by_country, x='TIME_PERIOD', y='energy_consumption_PC_kwh', color='geo')
fig.show()

Key observations:

- Luxembourg (LU, green line) shows significantly higher per capita energy consumption than the others, peaking around 2005 at nearly 100,000 kWh per person before declining substantially to around 50,000 kWh by 2022
- Malta (MT, orange line) shows the lowest consumption, with 12,000 kWh per capita
- Most other EU countries cluster in the 15,000-40,000 kWh range
- There's a general downward trend in per capita energy consumption across most countries after 2005-2010
- A particularly noticeable decline occurs after 2020, likely due to COVID-19 impacts
- Countries show varying patterns of efficiency improvement, with Luxembourg showing the most dramatic reduction
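
The per capita figures depend on two steps that are easy to get wrong: aligning the population reference year with the energy year, and converting GWh to kWh. The sketch below replays both on invented numbers for a fictional country `XX`. It assumes, as the year shift applied when loading `demo_pjan` suggests, that the population recorded on 1 January of year N+1 is used for year N. The function `per_capita_kwh` is hypothetical and only mirrors the logic of the cell above.

```python
import pandas as pd

KWH_PER_GWH = 1_000_000  # 1 GWh = 1,000,000 kWh

# Invented toy data, for illustration only
energy = pd.DataFrame({
    "TIME_PERIOD": [2021, 2022],
    "geo": ["XX", "XX"],
    "OBS_VALUE": [200.0, 135.0],  # GWh consumed during the year
})
population = pd.DataFrame({
    "TIME_PERIOD": [2022, 2023],  # population counted on 1 January
    "geo": ["XX", "XX"],
    "OBS_VALUE": [10_000, 9_000],
})


def per_capita_kwh(energy_df, population_df):
    """Shift population back one year, merge, and convert GWh to kWh per person."""
    pop = population_df.copy()
    pop["TIME_PERIOD"] = pop["TIME_PERIOD"] - 1  # assumption: 1 January of N+1 stands for year N
    merged = energy_df.merge(pop, on=["TIME_PERIOD", "geo"], how="inner",
                             suffixes=("_gwh", "_pop"))
    merged["kwh_per_capita"] = merged["OBS_VALUE_gwh"] / merged["OBS_VALUE_pop"] * KWH_PER_GWH
    return merged[["TIME_PERIOD", "geo", "kwh_per_capita"]]


result = per_capita_kwh(energy, population)
print(result)
```

Working on a copy of the population table avoids shifting the year twice if the cell is re-run, which is a risk when the shift is applied in place.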

Finally, we generate a Sankey diagram to see which sector consumes which type of energy in 2022.

In [8]:
df = nrg_bal.data.merge(demo_pjan.data, on=['TIME_PERIOD', 'geo'], how='inner')
df.rename(columns={'OBS_VALUE_x': 'energy_consumption_gwh', 'OBS_VALUE_y': 'population'}, inplace=True)
df = df[(df['TIME_PERIOD'] == 2022)]

sources = df['siec_label'].unique()
targets = df['nrg_bal_label'].unique()

labels = []
source = []
target = []
value = []

index = 0
index_dict = {}

for s in sources:
    labels.append(s)
    index_dict[s] = index
    index += 1

    for t in targets:
        if t not in index_dict:
            labels.append(t)
            index_dict[t] = index
            index += 1
        source.append(index_dict[s])
        target.append(index_dict[t])
        value.append(df[(df['siec_label'] == s) & (df['nrg_bal_label'] == t)]['energy_consumption_gwh'].sum())

fig = go.Figure(data=[go.Sankey(
    node=dict(
        pad=15,
        thickness=20,
        line=dict(color="black", width=0.5),
        label=labels,
        color="blue"
    ),
    link=dict(
        source=source,
        target=target,
        value=value
    ))])

fig.update_layout(title_text="Source of energy by sectors", font_size=10)
fig.show()

The chart shows a Sankey diagram illustrating the flow of energy from different sources (left side) to various consuming sectors (right side).

Key observations:

- Oil and petroleum products (bottom left) flow primarily to the transport sector, representing the largest single energy flow in the diagram
- Electricity (top left) distributes across multiple sectors, with significant flows to commercial/public services, industry, and households
- Natural gas shows major flows to households and industry sectors
- Renewables and biofuels connect to multiple sectors but appear to have smaller overall volumes
- Households receive energy from multiple sources, particularly electricity, natural gas, and heat
- The industry sector draws significantly from electricity and natural gas

This visualization effectively demonstrates which energy sources power specific sectors of the economy, highlighting the oil-transport relationship and the versatility of electricity and natural gas in serving multiple sectors. It provides a comprehensive view of the energy system's structure and interdependencies that complements the time-series data shown in previous charts.
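
The `source`, `target` and `value` lists that feed the Sankey diagram can also be derived with a single `groupby` instead of nested loops. The sketch below shows one way to do this on invented toy data. The function `build_sankey_links` is hypothetical, and it assumes that no label appears both as an energy source and as a consuming sector. Unlike the loop above, it only emits links for pairs that occur in the data, so pairs with no observations are left out rather than added with a value of zero.

```python
import pandas as pd

# Invented toy data, for illustration only
toy = pd.DataFrame({
    "siec_label": ["Oil", "Oil", "Gas", "Gas", "Oil"],
    "nrg_bal_label": ["Transport", "Households", "Households", "Industry", "Transport"],
    "energy_consumption_gwh": [50.0, 5.0, 20.0, 15.0, 10.0],
})


def build_sankey_links(df, source_col, target_col, value_col):
    """Aggregate flows per (source, target) pair and map labels to node indices."""
    flows = df.groupby([source_col, target_col], as_index=False)[value_col].sum()
    # Source nodes first, then target nodes (assumes the two label sets are disjoint)
    labels = list(flows[source_col].unique()) + list(flows[target_col].unique())
    position = {label: i for i, label in enumerate(labels)}
    return {
        "labels": labels,
        "source": [position[s] for s in flows[source_col]],
        "target": [position[t] for t in flows[target_col]],
        "value": flows[value_col].tolist(),
    }


links = build_sankey_links(toy, "siec_label", "nrg_bal_label", "energy_consumption_gwh")
print(links)
```

The resulting dictionary could be passed to `go.Sankey` as `node=dict(label=links["labels"])` and `link=dict(source=links["source"], target=links["target"], value=links["value"])`.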