# Examine heat-only generation capacity data.

This notebook examines the raw data available for heat-only generation capacity.
Essentially, it's groundwork for eventually converting the data into a form usable by WP5.

In [None]:
## Activate (and set up) the required Julia environment

using Pkg # Julia package manager.
Pkg.activate(@__DIR__) # Activate the Julia environment in the folder this file is in (namely the `Project.toml`)
Pkg.instantiate() # Download and install the necessary dependencies.

# Load dependencies
using CSV
using XLSX
using DataFrames
using StatsPlots

## Reading and examining the capacity data

The Excel file isn't the easiest format to read,
as cells are merged left and right to make it visually clearer.
Also, there are a number of hidden columns that sub-divide
the capacity into different capacity and age classes.
However, we're only really interested in the total capacities,
and I expect the capacity and age class information is even
scarcer than the total capacity.

In [None]:
## Read the excel file.

raw_capacity_data = DataFrame(
    XLSX.readtable(
        joinpath("input-data", "ENER-C2-2014-641", "WP2_DataAnnex_3_DistrictHeating_ForPublication_201603.xlsx"),
        "Technology - Stock",
        "A:D";
        first_row = 8,
        header = false,
        column_labels = [
            :code,
            :country,
            :technology,
            :total_installed_thermal_capacity_MW_th
        ]
    )
)
raw_capacity_data

In [None]:
## Check the set of technologies

Set(raw_capacity_data.technology)

# Fewer than I was expecting, that's nice.

In [None]:
## Which countries do we have with any heat-only capacity data?

df = dropmissing(raw_capacity_data)
# Heat only
df_heat_only = filter(r -> !contains(r.technology, "CHP"), df)
df_heat_only = combine(
    groupby(df_heat_only, [:code, :country]),
    :total_installed_thermal_capacity_MW_th => sum
)
# CHP
df_chp = filter(r -> contains(r.technology, "CHP"), df)
df_chp = combine(
    groupby(df_chp, [:code, :country]),
    :total_installed_thermal_capacity_MW_th => sum
)
# Combined
df = leftjoin(df_chp, df_heat_only; on=[:code, :country], makeunique=true)
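
The join type above determines which countries survive into the combined table. As a minimal sketch on made-up toy data (the country codes, column names, and numbers below are purely illustrative, not taken from the tender data), `leftjoin` keeps only the keys present in the left table, whereas `outerjoin` would also keep countries that appear only on the right:

```julia
using DataFrames

# Toy stand-ins for `df_chp` and `df_heat_only`; all values are invented.
toy_chp = DataFrame(code = ["AA", "BB"], chp_MW = [10.0, 20.0])
toy_heat_only = DataFrame(code = ["BB", "CC"], heat_only_MW = [5.0, 7.0])

# `leftjoin` keeps every row of the left table, so "CC" is dropped.
left = leftjoin(toy_chp, toy_heat_only; on = :code)

# `outerjoin` keeps the keys of both tables, so "CC" is retained.
outer = outerjoin(toy_chp, toy_heat_only; on = :code)

# Unmatched cells are `missing`; replace them with zero, e.g. before stacking bars.
outer.chp_MW = coalesce.(outer.chp_MW, 0.0)
outer.heat_only_MW = coalesce.(outer.heat_only_MW, 0.0)
```

Whether a country with heat-only data but no CHP data actually exists in the tender file is not something this sketch can tell; it only shows what each join would do if one did.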

In [None]:
## Plot for visual inspection

sort!(df)
groupedbar(
    df.code,
    Matrix(df[!, end-1:end]),
    bar_position = :stack,
    xrotation = 90,
    label = ["CHP" "heat only"],
    title = "Heat generation capacity from EU HC Tender",
    ylabel = "Capacity [MW]"
)

So we have data for 31 countries, which is far better than I had hoped.
The data even seems to be at least partially on the right track,
as Finland has a total capacity of ~25 GW, which is somewhat close to national
district heating statistics.

However, there are clear outliers as well.
There's no way Sweden only has 139 MW of district heating capacity.

### District heating demand according to Hotmaps?

For comparison, let's check the yearly DH demands per country from Hotmaps.

In [None]:
## Plot district heating demand from Hotmaps for comparison

demand_data_raw = DataFrame(
    CSV.File(joinpath("input-data", "scen_current_building_demand", "data", "scen_current_building_demand.csv"))
)
df = demand_data_raw[.!ismissing.(demand_data_raw.Scenario), :]
df = filter(
    r -> r.Scenario == "current" && r.Year == 2015 && r.Fuel == "District heating",
    df
)
df = combine(
    groupby(df, [:NUTS0_code]),
    :Value => sum
)
sort!(df)
bar(
    df.NUTS0_code,
    df.Value_sum;
    xticks = (1:length(df.NUTS0_code), df.NUTS0_code),
    xrotation = 45,
    title = "DH demand from Hotmaps",
    ylabel = "DH final energy demand [GWh]"
)

### Heat-only generation from JRC-IDEES?

JRC-IDEES 2021 provides some estimates for heat-only generation for the EU27.
Let's see how that compares.

In [None]:
## Read JRC heat-only data

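# List the country folders, skipping entries with a file extension and the EU27 aggregate.
eu27 = readdir(joinpath("input-data", "JRC-IDEES-2021"))
filter!(s -> !contains(s, '.') && !contains(s, "EU27"), eu27)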
jrc_data_raw = DataFrame()
for country in eu27
    df = DataFrame(XLSX.readtable(
        joinpath("input-data", "JRC-IDEES-2021", country, "JRC-IDEES-2021_PowerGen_$(country).xlsx"),
        "DistHeat";
        first_row = 3, # Total gross distributed heat production (GWh) - Conventional thermal
        header = false,
        column_labels = vcat(["fuel"], string.(collect(2000:2021)))
    ))
    df.country .= country
    append!(jrc_data_raw, df)
end
jrc_data_raw

In [None]:
## Check fuels

Set(jrc_data_raw.fuel)

# This matches the length of the dataframe (23 fuels * 27 countries = 621 rows),
# so every country seems to have an identical data format.
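
The row count alone doesn't prove that every country lists the same fuels. A slightly stricter check, sketched here on toy data with invented countries and fuels, is to compare the set of fuels per country directly:

```julia
using DataFrames

# Toy stand-in for `jrc_data_raw`; countries and fuels are invented.
toy = DataFrame(
    country = ["AA", "AA", "BB", "BB"],
    fuel = ["Coal", "Natural gas", "Natural gas", "Coal"],
)

# Number of rows per country.
rows_per_country = combine(groupby(toy, :country), nrow => :n_rows)

# Set of fuels per country; `Ref` keeps each set as a single cell.
fuels_per_country = combine(
    groupby(toy, :country),
    :fuel => (f -> Ref(Set(f))) => :fuels,
)

# True if every country reports exactly the same fuels.
same_fuels = length(unique(fuels_per_country.fuels)) == 1
```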

In [None]:
## Omit totals rows and plot JRC stats

# Omit the total and subtotal rows, as including them would double-count production.
fuels_to_omit = [
    "Conventional thermal",
    "Residual fuel oil",
    "Other petroleum products",
    "Natural gas",
    "Biogas",
    "Solid biomass",
    "Renewable municipal waste",
    "Industrial wastes",
    "Non-renewable municipal waste"
]
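# We're only interested in the latest data.
# Copy the columns (`:` instead of `!`) so the in-place `filter!` below can't alter `jrc_data_raw`.
jrc_data = jrc_data_raw[:, [:country, :fuel, Symbol(2021)]]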
filter!(r -> !in(r.fuel, fuels_to_omit), jrc_data)
jrc_data = unstack(jrc_data, :fuel, Symbol(2021))
groupedbar(
    jrc_data.country,
    float.(Matrix(jrc_data[!, 2:end]));
    bar_position = :stack,
    legend_position = :outerright,
    xrotation = 90,
    size = (900, 400),
    label = permutedims(names(jrc_data)[2:end]),
    title = "Gross heat-only production",
    ylabel = "Heat [GWh/year]"
)
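
The cell above selects columns and then filters in place, so it's worth remembering how DataFrames.jl handles the underlying column vectors. In a minimal sketch on toy data (all names invented for illustration), `df[!, cols]` returns a data frame that shares its columns with the original, while `df[:, cols]` copies them:

```julia
using DataFrames

# Toy table with one ordinary row and one totals row; values are invented.
raw = DataFrame(fuel = ["Coal", "Total"], value = [1.0, 1.0])

shared = raw[!, [:fuel, :value]] # New data frame, same column vectors.
copied = raw[:, [:fuel, :value]] # New data frame, copied column vectors.

# Identity checks on the column vectors themselves.
shares_columns = shared.fuel === raw.fuel
copies_columns = copied.fuel !== raw.fuel

# In-place operations on the copy leave `raw` intact.
filter!(r -> r.fuel != "Total", copied)
```

Running an in-place `filter!` on the shared variant instead would shorten vectors that the original data frame still uses.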

### CHP generation from JRC-IDEES?

While we're at it, let's also take a look at the CHP production data,
since it's available in the same Excel files anyhow.
Capacity data also exists, but it's almost certainly given in electric power,
not thermal power.

In [None]:
## Read and process JRC-IDEES CHP heat production data.

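# List the country folders, skipping entries with a file extension and the EU27 aggregate.
eu27 = readdir(joinpath("input-data", "JRC-IDEES-2021"))
filter!(s -> !contains(s, '.') && !contains(s, "EU27"), eu27)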
jrc_chp_production_raw = DataFrame()
for country in eu27
    df = DataFrame(XLSX.readtable(
        joinpath("input-data", "JRC-IDEES-2021", country, "JRC-IDEES-2021_PowerGen_$(country).xlsx"),
        "Thermal_CHP";
        first_row = 75, # Total gross distributed heat production (GWh) - Coal
        header = false,
        column_labels = vcat(["fuel"], string.(collect(2000:2021)))
    ))
    df.country .= country
    append!(jrc_chp_production_raw, df)
end
jrc_chp_production_raw

In [None]:
## Check set of CHP fuels

Set(jrc_chp_production_raw.fuel)

# The length checks out with the raw data (10 fuels * 27 countries = 270 rows),
# so every country seems to have a full table of data.

In [None]:
## Plot JRC-stats

chp_plot_data = jrc_chp_production_raw[!, [:country, :fuel, Symbol(2021)]]
chp_plot_data = unstack(chp_plot_data, :fuel, Symbol(2021))
groupedbar(
    chp_plot_data.country,
    float.(Matrix(chp_plot_data[!, 2:end]));
    bar_position = :stack,
    legend_position = :outerright,
    xrotation = 90,
    size = (700, 400),
    label = permutedims(names(chp_plot_data)[2:end]),
    title = "CHP heat production",
    ylabel = "Heat [GWh/year]"
)

In [None]:
## Plot JRC CHP versus heat-only production.

heat_only_total_per_country = combine(
    groupby(
        stack(jrc_data, names(jrc_data)[2:end]),
        :country
    ),
    :value => sum
)
chp_total_per_country = combine(
    groupby(
        stack(chp_plot_data, names(chp_plot_data)[2:end]),
        :country
    ),
    :value => sum
)
jrc_comparison_data = leftjoin(
    heat_only_total_per_country,
    chp_total_per_country;
    on = :country,
    makeunique = true
)
groupedbar(
    jrc_comparison_data.country,
    float.(Matrix(jrc_comparison_data[!, 2:end]));
    bar_position = :stack,
    legend_position = :outerright,
    xrotation = 90,
    size = (700, 400),
    label = ["Heat-only" "CHP"],
    title = "JRC heat-only vs CHP heat production",
    ylabel = "Heat [GWh/year]"
)

## Conclusions

I doubt we can rely on this data without at least some backup.
The HC tender data coverage looks better than it likely is:
compared to the Hotmaps demands and the JRC-IDEES production figures,
the capacities don't seem feasible for every country.
Poland, Sweden, France, and Austria seem to be the worst cases.
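
One way to make "feasible" concrete is to look at the full-load hours implied by pairing a reported capacity with a yearly demand or production figure: if the capacity would have to run for more than the 8760 hours in a year, the two numbers can't both be right. A rough sketch, using invented placeholder values rather than numbers from the datasets above, and ignoring network losses:

```julia
# Hypothetical example values; replace with the joined capacity and demand data.
codes = ["AA", "BB"]
capacity_MW = [25_000.0, 150.0]
yearly_heat_GWh = [30_000.0, 50_000.0]

hours_per_year = 8760

# Implied full-load hours: yearly energy [MWh] divided by capacity [MW].
full_load_hours = 1000 .* yearly_heat_GWh ./ capacity_MW

# More full-load hours than hours in a year means the capacity is too small.
infeasible = full_load_hours .> hours_per_year
suspicious_codes = codes[infeasible]
```

The 8760-hour threshold is a hard physical bound only; a stricter, judgment-based cut-off would flag more countries.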


### The plan?

1. Estimate the worst-case ratio of peak useful district heating demand to yearly demand by country using the demand data timeseries.
2. Estimate the share of heat-only generation in district heating production using JRC-IDEES data, as well as map the to-be-modelled technologies to their respective fuel shares.
3. Use Hotmaps yearly DH demand and the above shares to estimate the existing heat-only capacity per country.

Again, we'll likely have to fill in missing data, e.g. by mapping it from neighbouring countries.
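
To make the plan concrete, the three steps could be combined roughly as in the sketch below. All names and numbers are hypothetical placeholders, the synthetic cosine profile merely stands in for a real hourly demand time series, and the calculation ignores network losses and any reserve margin.

```julia
# Step 1: ratio of peak hourly demand to yearly demand [1/h],
# here from a synthetic hourly profile standing in for a real time series.
hourly_demand_MW = 1 .+ cos.(2π .* (0:8759) ./ 8760)
peak_to_yearly_ratio = maximum(hourly_demand_MW) / sum(hourly_demand_MW)

# Step 2: share of heat-only generation in district heating production,
# from hypothetical yearly heat-only and CHP heat production [GWh].
heat_only_GWh = 500.0
chp_GWh = 500.0
heat_only_share = heat_only_GWh / (heat_only_GWh + chp_GWh)

# Step 3: scale a hypothetical yearly DH demand [GWh] to a peak load [MW]
# and attribute the heat-only share of it to heat-only capacity.
yearly_dh_demand_GWh = 8760.0
peak_demand_MW = 1000 * yearly_dh_demand_GWh * peak_to_yearly_ratio
heat_only_capacity_MW = heat_only_share * peak_demand_MW
```

Using the production share as a proxy for the capacity share is itself an assumption: if heat-only units mainly cover peak load, their share of capacity may well be higher than their share of yearly production.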
