# Processing _Danish Energy Agency Technology Catalogue_ heat generation data for Mopo WP5

The goal of this workbook is to process the DEA catalogue data into the format desired by Mopo WP5.
This will require some assumptions about how to aggregate the different technologies in the catalogue into the desired technologies in the input.
Furthermore, since the parameters in the catalogues don't match the desired parameters exactly, we need to do some conversions as well.


## Julia setup

In [None]:
# Activate (and set up) the required Julia environment

using Pkg # Julia package manager.
Pkg.activate(@__DIR__) # Activate the Julia environment in the folder this file is in (namely the `Project.toml`)
Pkg.instantiate() # Download and install the necessary dependencies.

# Load dependencies
using XLSX
using DataFrames
using Statistics
using CSV
using MopoHeatSectorDataProcessing

## Reading the data

In [None]:
# Read and preview data
catalogue_dir = joinpath("input-data", "dea-technology-catalogues") # Build paths portably instead of hard-coding Windows separators.
raw_dh_data = DataFrame(XLSX.readtable(joinpath(catalogue_dir, "technology_data_for_el_and_dh.xlsx"), "alldata_flat"))
raw_heat_data = DataFrame(XLSX.readtable(joinpath(catalogue_dir, "technology_data_heating_installations.xlsx"), "alldata_flat"))
raw_cc_data = DataFrame(XLSX.readtable(joinpath(catalogue_dir, "technology_data_for_carbon_capture_transport_storage.xlsx"), "alldata_flat"))
first(raw_dh_data, 3)

## Mapping DEA technologies to desired technologies

Next up, we'll need to map the technologies in the DEA catalogues to the desired technology categories in the Mopo data.
This should probably be done with some more thought, but for now, I'll settle for including every technology that I think might fit, and refine the assumptions later.

We'll begin by figuring out each category first, before putting the whole map together.
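
The category filters below all follow the same pattern: a case-insensitive substring match on the technology name, sometimes with exclusions. As a minimal sketch of that pattern, assuming a hypothetical helper `match_techs` and made-up technology names (neither is part of the catalogue or of `MopoHeatSectorDataProcessing`):

```julia
# Hypothetical helper, for illustration only: keep the names that contain
# every `required` keyword and none of the `excluded` ones (case-insensitive).
function match_techs(names; required, excluded=String[])
    return filter(names) do name
        lname = lowercase(name)
        all(kw -> contains(lname, kw), required) &&
            !any(kw -> contains(lname, kw), excluded)
    end
end

# Made-up technology names, not taken from the DEA catalogue.
example_techs = Set([
    "Example gas turbine, combined cycle",
    "Example gas turbine, open cycle",
    "Example biomass boiler",
])

example_gas_chp = match_techs(example_techs; required=["gas turbine"], excluded=["open"])
```

Keeping the keywords as explicit lists makes it easier to review later which catalogue entries each category picks up.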

In [None]:
# Set of all available EL/DH technologies.
techs = Set(raw_dh_data[:, :Technology]);

### Biomass CHP

In [None]:
# Set of Biomass CHP technologies.
bio_chp = filter(x -> contains(lowercase(x), "biomass chp"), techs)

### Gas CHP

Seems like all the `gas turbine` technologies in the DEA data have CHP capability,
with the exception of the `open cycle` ones!

In [None]:
gas_chp = filter(
    x -> contains(lowercase(x), "gas turbine") && !contains(lowercase(x), "open"),
    techs
)

### Coal CHP

Coal doesn't seem to be big in Denmark, but there's at least one data point.
Weirdly, it seems to be a "power plant" despite it having CHP coefficients?

In [None]:
coal_chp = filter(x -> contains(lowercase(x), "coal power"), techs)

### Nuclear CHP

The DEA catalogues don't have any parameters for nuclear, unfortunately. Skipping this for now.

### Fuelcell CHP

Easy `fuel cell` yields what we want, both with CHP coefficients.

In [None]:
fuelcell_chp = filter(x -> contains(lowercase(x), "fuel cell"), techs)

### Waste CHP

Easy `waste CHP` to avoid heat-only boilers (HOBs).

In [None]:
waste_chp = filter(x -> contains(lowercase(x), "waste chp"), techs)

### Bio boiler DH

Filtering biomass heat-only boilers is a bit more complicated, as we need to avoid CHP and refurbishment of fossil plants.

In [None]:
bio_boiler_dh = filter(x -> all(contains.(lowercase(x), ["boiler", "biomass"])) && !any(contains.(lowercase(x), ["chp", "rebuild"])), techs)

### Gas boiler DH

Simple `gas boiler` yields what we need.

In [None]:
gas_boiler_dh = filter(x -> contains(lowercase(x), "gas boiler"), techs)

### Oil boiler DH

Seems like the only `oil` unit is a gas turbine?
Not really applicable...

In [None]:
oil_boiler_dh = filter(x -> contains(lowercase(x), " oil"), techs)

### Coal boiler DH

No `coal` boiler data either it seems.

In [None]:
coal_boiler_dh = filter(x -> contains(lowercase(x), "coal"), techs)

### Elec boiler DH

In [None]:
elec_boiler_dh = filter(x -> contains(lowercase(x), "electric boiler"), techs)

### Heatpump DH

Filtering out only electric heat pumps with no special heat sources is a bit more complicated.
We need to avoid including `geothermal` and `industrial` heat sources, as well as `absorption` heat pumps since they essentially "amplify" heat.

In [None]:
heatpump_dh = filter(x -> contains(lowercase(x), "heat pump") && !any(contains.(lowercase(x), ["geothermal", "industrial", "absorption"])), techs)

### Solar DH

Easy `solar dh` to avoid PV.

In [None]:
solar_dh = filter(x -> contains(lowercase(x), "solar dh"), techs)

### Geothermal DH

Easy `geothermal`. However, whether this should be modelled as an electric technology is an open question.

In [None]:
geothermal_dh = filter(x -> contains(lowercase(x), "geothermal"), techs)

### Waste DH

Easy `waste boiler`.

In [None]:
waste_dh = filter(x -> contains(lowercase(x), "waste boiler"), techs)

## Mapping individual heating systems

Next up, let's map individual heating technologies to the desired ones.
First, we'll need to get the set of available techs.

In [None]:
# Set of all available individual heating technologies.
heat_techs = Set(raw_heat_data[:, :Technology]);

### Decentral gas boiler

In [None]:
gas_boiler = filter(x -> contains(lowercase(x), "gas boiler"), heat_techs)

### Decentral bio boiler

In [None]:
bio_boiler = filter(x -> contains(lowercase(x), "biomass"), heat_techs)

### Decentral oil boiler

In [None]:
oil_boiler = filter(x -> contains(lowercase(x), "oil boiler"), heat_techs)

### Decentral coal boiler

No `coal` boiler data it seems.

In [None]:
coal_boiler = filter(x -> contains(lowercase(x), "coal"), heat_techs)

### Decentral A2W heat pump

In [None]:
air_heatpump = filter(x -> contains(lowercase(x), "air-to-water"), heat_techs)

### Decentral G2W heat pump

In [None]:
ground_heatpump = filter(x -> contains(lowercase(x), "ground source"), heat_techs)

### Decentral solar heating

In [None]:
solar_heating = filter(x -> contains(lowercase(x), "solar"), heat_techs)

### Decentral electric heating

In [None]:
electric_heating = filter(x -> contains(lowercase(x), "electric"), heat_techs)

## Carbon capture technology mapping

Based on a brief review of the DEA carbon capture technology catalogue,
basically all post-combustion and oxy-fuel carbon capture solutions
are technically feasible for any fuel-combustion-based technologies.
In reality, the impacts on efficiency and costs would likely vary between technologies.
However, I see no point in making this unnecessarily difficult, since all of these are rough approximations anyhow.

In [None]:
# Set of all available carbon capture technologies.
cc_techs = Set(raw_cc_data[:, :Technology]);

In [None]:
combustion_cc = filter(
    x -> (
        (
            contains(lowercase(x), "combustion") ||
            contains(lowercase(x), "fuel")
        ) && !contains(lowercase(x), "cement")
    ),
    cc_techs
)

## Calculate the desired parameters

Next up, we'll need to try and calculate the required parameters for the technologies.
This will likely be a nightmare due to inconsistent parameter names, units, and coverage.
The list of parameters to calculate is:
- `CAPEX (EUR/MW)`: based on `Nominal investment (*total)` in the sources.
- `FOM (EUR/MW/year)`: based on `Fixed O&M (*total)` in the sources.
- `VOM (EUR/MWh)`: based on `Variable O&M (*total)` in the sources.
- `Lifetime (years)`: based on `Technical lifetime (years)` in the sources.
- `Conversion rate (output/input)`: based on `Electrical efficiency (net, annual average)` and `Heat efficiency (net, annual average)`.
- `CO2 Captured (tCO2/MWh)`: needs to be based on `CO2 capture rate, net (%)` somehow.
- `Fuel cost`: This is not included, as it needs to be a separate assumption?

I suppose the easiest way to attempt to do this is to gather values for these into the dataframes before fetching and averaging them across different mapped technologies. 
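
Two patterns recur in the cells below: falling back to an alternative parameter when the preferred one is missing, and scaling per-unit values to per-MW values. A minimal sketch of both, with made-up numbers and hypothetical variable names (the notebook itself does the fallback via explicit `missing_inds` indexing; `coalesce` is just a compact way to show the idea):

```julia
# Made-up values for illustration only; `missing` marks a parameter absent from a row.
capex_per_mw_e = [1.2, missing, 0.9]   # MEUR/MW_e
capex_per_mw_h = [missing, 0.6, 0.8]   # MEUR/MW_h

# Prefer the electricity-based value, fall back to the heat-based one.
capex = coalesce.(capex_per_mw_e, capex_per_mw_h)

# Per-unit costs scaled by unit capacity (hypothetical 10 kW unit):
# k€/unit divided by kW/unit gives k€/kW, which equals MEUR/MW.
investment_keur_per_unit = 8.0
capacity_kw = 10.0
capex_meur_per_mw = investment_keur_per_unit / capacity_kw

# €/unit/y divided by kW/unit gives €/kW/y; multiply by 1000 for EUR/MW/y.
fom_eur_per_unit_y = 250.0
fom_eur_per_mw_y = fom_eur_per_unit_y / capacity_kw * 1000
```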

In [None]:
describe(raw_dh_data)

In [None]:
## Calculate parameters for dh_data

# Preprocessing
cols = [:Technology, :par, :est, :year, :val] # Omit unnecessary cols
dh_data = deepcopy(raw_dh_data[!, cols])
dh_data.est = lowercase.(dh_data.est) # Avoid potential upper-lowercase conflicts.
dh_data.year = string.(dh_data.year) # Avoid year unstack issues?
filter!(r -> r.est == "ctrl", dh_data) # Filter out lower and upper estimates
filter!(r -> typeof(r.val) != String, dh_data) # Filter out string values.
unique!(dh_data)
dh_data = extrapolate_years( # Fix some data consistency issues when it comes to `year`.
    dh_data;
    extrapolation_years=[2015, 2020, 2025, 2030, 2040, 2050],
    combine=mean
)
dh_data = unstack(dh_data, :par, :val)

# CAPEX, use MEUR/MW_e when possible, MEUR/MW_h when not found.
dh_data[!, :CAPEX_MEUR_MW] = dh_data[!, Symbol("Nominal investment (*total) [MEUR/MW_e]")]
missing_inds = ismissing.(dh_data.CAPEX_MEUR_MW)
dh_data[missing_inds, :CAPEX_MEUR_MW] = dh_data[missing_inds, Symbol("Nominal investment (*total) [MEUR/MW_h]")]

# FOM, use EUR/MW_e/y when possible, EUR/MW_h/y when not found.
dh_data[!, :FOM_EUR_MW_y] = dh_data[!, Symbol("Fixed O&M (*total) [EUR/MW_e/y]")]
missing_inds = ismissing.(dh_data.FOM_EUR_MW_y)
dh_data[missing_inds, :FOM_EUR_MW_y] = dh_data[missing_inds, Symbol("Fixed O&M (*total) [EUR/MW_h/y]")]

# VOM, use EUR/MWh_e when possible, EUR/MWh_h when not found.
dh_data[!, :VOM_EUR_MWh] = dh_data[!, Symbol("Variable O&M (*total) [EUR/MWh_e]")]
missing_inds = ismissing.(dh_data.VOM_EUR_MWh)
dh_data[missing_inds, :VOM_EUR_MWh] = dh_data[missing_inds, Symbol("Variable O&M (*total) [EUR/MWh_h]")]

# Lifetime, this is a simple `Technical lifetime`
dh_data[!, :lifetime_y] = dh_data[!, Symbol("Technical lifetime [years]")]

# Conversion rate (output/input), first for elec. Use annual average before nameplate.
dh_data[!, :conversion_rate_elec_pu] = dh_data[!, Symbol("Electrical efficiency (net, annual average) []")]
missing_inds = ismissing.(dh_data.conversion_rate_elec_pu)
dh_data[missing_inds, :conversion_rate_elec_pu] = dh_data[missing_inds, Symbol("Electrical efficiency (net, name plate) []")]

# Conversion rate (output/input) for heat. Calculate based on electric conversion rate and Cb coefficient first, then supplement for heat only.
dh_data[!, :conversion_rate_heat_pu] = dh_data[!, :conversion_rate_elec_pu] ./ dh_data[!, Symbol("Cb coefficient (50°C/100°C)")]
missing_inds = ismissing.(dh_data.conversion_rate_heat_pu)
dh_data[missing_inds, :conversion_rate_heat_pu] = dh_data[missing_inds, :conversion_rate_elec_pu] ./ dh_data[missing_inds, Symbol("Cb coefficient (40°C/80°C)")]
missing_inds = ismissing.(dh_data.conversion_rate_heat_pu)
dh_data[missing_inds, :conversion_rate_heat_pu] = dh_data[missing_inds, Symbol("Heat efficiency (net, annual average) []")]
missing_inds = ismissing.(dh_data.conversion_rate_heat_pu)
dh_data[missing_inds, :conversion_rate_heat_pu] = dh_data[missing_inds, Symbol("Heat efficiency (net, name plate) []")]

# CO2 captured, zero for now, needs to be deduced from CC tech catalogue.
dh_data[!, :CO2_captured_pu] .= 0.0

# Currency is in 2020 Euros, based on `examine_generation_data.ipynb`
dh_data[!, :currency] .= "2020 EUR"

# Metadata is DEA technology catalogue
dh_data[!, :metadata] .= "Technology Data Catalogue for Electricity and district heating production - Updated 2024, Danish Energy Agency, 2024"

# Drop unnecessary parameters
param_cols = [:Technology, :est, :year]
params = [
    :CAPEX_MEUR_MW,
    :FOM_EUR_MW_y,
    :VOM_EUR_MWh,
    :lifetime_y,
    :conversion_rate_elec_pu,
    :conversion_rate_heat_pu,
    :CO2_captured_pu,
    :currency,
    :metadata
]
dh_data = dh_data[!, vcat(param_cols, params)]
describe(dh_data)

In [None]:
## Calculate the parameters for individual heating system data

# Preprocessing
cols = [:Technology, :par, :est, :year, :val] # Omit unnecessary cols
heat_data = deepcopy(raw_heat_data[!, cols])
heat_data.est = lowercase.(heat_data.est) # Avoid potential upper-lowercase conflicts.
heat_data.year = string.(heat_data.year) # Avoid year unstack issues?
filter!(r -> r.est == "ctrl", heat_data) # Filter out lower and upper estimates
unique!(heat_data)
filter!(r -> typeof(r.val) != String, heat_data) # Filter out some anomalous string data.
heat_data = extrapolate_years( # Fix some data consistency issues when it comes to `year`.
    heat_data;
    extrapolation_years=[2020, 2025, 2030, 2040, 2050],
    combine=mean
)
heat_data = unstack(heat_data, :par, :val; combine=mean)

# CAPEX, need to scale investments per unit to investments per MW.
heat_data[!, :CAPEX_MEUR_MW] = (
    heat_data[!, Symbol("Nominal investment (*total) [k€/unit, 2020]")] ./
    heat_data[!, Symbol("Heat production capacity for one unit [kW_h]")]
)

# FOM, Fixed costs EUR/unit/y scaled per unit to EUR/MW/y.
heat_data[!, :FOM_EUR_MW_y] = (
    heat_data[!, Symbol("Fixed O&M (*total) [€/unit/y, 2020]")] ./
    heat_data[!, Symbol("Heat production capacity for one unit [kW_h]")] .*
    1000 # Scaling from EUR/kW/y to EUR/MW/y
)

# VOM, primarily scaled based on the €/kWh values, missing values supplemented with €/MWh.
heat_data[!, :VOM_EUR_MWh] = heat_data[!, Symbol("Variable O&M (*total) [€/kWh, 2020]")] .* 1000
missing_inds = ismissing.(heat_data.VOM_EUR_MWh)
heat_data[missing_inds, :VOM_EUR_MWh] = heat_data[missing_inds, Symbol("Variable O&M (*total) [€/MWh_h, 2020]")]

# Lifetime
heat_data[!, :lifetime_y] = heat_data[!, Symbol("Technical economic lifetime [years]")]

# Conversion rate (output/input) for elec is non-existent.
heat_data[!, :conversion_rate_elec_pu] .= missing

# Conversion rate (output/input) for heat, simple annual average heat efficiency supplemented with other values if missing.
heat_data[!, :conversion_rate_heat_pu] = heat_data[!, Symbol("Heat efficiency (annual average, net) [p.u.]")]
for var in Symbol.([
    "Heat efficiency (annual average, net, total system, floor heating) [p.u.]",
    "Heat efficiency (annual average, net, floor heating) [p.u.]",
    "Heat efficiency (annual average, net, gas boiler heating) [p.u.]",
    "Heat efficiency (annual average, net, heat pump, radiator heating) [p.u.]",
    "Heat efficiency (annual average, net, hear pump, floor heating) [p.u.]"
])
    missing_inds = ismissing.(heat_data.conversion_rate_heat_pu)
    heat_data[missing_inds, :conversion_rate_heat_pu] = heat_data[missing_inds, var]
end

# CO2 captured, not relevant for individual heating.
heat_data[!, :CO2_captured_pu] .= 0.0

# Currency is in 2020 Euros, based on `examine_generation_data.ipynb`
heat_data[!, :currency] .= "2020 EUR"

# Metadata is DEA technology catalogue
heat_data[!, :metadata] .= "Technology catalogue for Individual Heating - Latest update April 2024, Danish Energy Agency, 2024"

# Drop unnecessary parameters
param_cols = [:Technology, :est, :year]
params = [
    :CAPEX_MEUR_MW,
    :FOM_EUR_MW_y,
    :VOM_EUR_MWh,
    :lifetime_y,
    :conversion_rate_elec_pu,
    :conversion_rate_heat_pu,
    :CO2_captured_pu,
    :currency,
    :metadata
]
heat_data = heat_data[!, vcat(param_cols, params)]
describe(heat_data)

In [None]:
## Calculate parameters for cc_data

# The CC technologies are a bit of a special case,
# as we need to only concern ourselves with very specific technologies
# and manually provide the assumed thermal capacities.
# Furthermore, these parameters are assumed to apply on top of underlying
# technology parameters, representing additional investments and costs.
# However, conversion rate impacts are multiplicative!
cols = [:Technology, :par, :est, :year, :val] # Omit unnecessary cols
cc_data = filter(
    r -> r.Technology in combustion_cc,
    raw_cc_data[!, cols]
)
cc_data.est = lowercase.(cc_data.est) # Avoid potential upper-lowercase conflicts.
filter!(r -> r.est == "est", cc_data) # Filter out lower and upper estimates
cc_data.est .= "ctrl" # Rename "est" to "ctrl" to align with dh and heat catalogues.
cc_data.year = string.(cc_data.year) # Avoid year unstack issues?
unique!(cc_data)
filter!(r -> typeof(r.val) != String, cc_data)
cc_data = extrapolate_years( # Fix some data consistency issues when it comes to `year`.
    cc_data;
    extrapolation_years=[2020, 2025, 2030, 2040, 2050],
    combine=mean
)
cc_data = unstack(cc_data, :par, :val; combine=mean)

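# Manually map thermal capacities
# NOTE: `combustion_cc` is a `Set`, so its iteration order is not guaranteed;
# this pairing assumes the order matches the capacities listed here.
thermal_capacity_map = Dict(zip(String.(combustion_cc), [500, 500, 100]))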
cc_data[!, :thermal_capacity_MW] = map(
    key -> thermal_capacity_map[key],
    cc_data.Technology
)

# CAPEX, needs to be calculated based on the assumed thermal capacity and other parameters, as well as account for different parameter spellings.
cc_data[!, :CAPEX_MEUR_MW] = (
    cc_data[!, Symbol("Typical total plant capacity [tCO2/hour] (CO2 output)")] ./
    cc_data.thermal_capacity_MW .*
    cc_data[!, Symbol("Specific investment [mill €/[tCO2/hour]] (CO2 output)")]
)
for var in Symbol.([ # Check for potential alternative spellings.
    #"Specific investment [mill €/(tCO2/hour)] (CO2 output)", # Does not occur in combustion_cc
    "Specific investment [mill € /(tCO2/hour)] (CO2 output)"
])
    missing_inds = ismissing.(cc_data.CAPEX_MEUR_MW)
    cc_data[missing_inds, :CAPEX_MEUR_MW] = (
        cc_data[missing_inds, Symbol("Typical total plant capacity [tCO2/hour] (CO2 output)")] ./
        cc_data[missing_inds, :thermal_capacity_MW] .*
        cc_data[missing_inds, var]
    )
end

# FOM, again needs to be calculated based on the assumed thermal capacity and account for various potential spellings.
cc_data[!, :FOM_EUR_MW_y] = (
    cc_data[!, Symbol("Typical total plant capacity [tCO2/hour] (CO2 output)")] ./
    cc_data.thermal_capacity_MW .*
    cc_data[!, Symbol("Fixed O&M [mill €/[tCO2/hour]] (CO2 output)")] .* # Apparently assumed 3% of CAPEX
    1e6 # Scaling from MEUR to EUR.
)
for var in Symbol.([ # Check for potential alternative spellings.
    #"Fixed O&M [mill €/(tCO2/hour)]", # Not found in combustion_cc
    "Fixed O&M [mill € /(tCO2/hour)] (CO2 output)",
    #"Fixed O&M [EUR/tCO2/year]", # Wrong scaling # Not found in combustion_cc
    #"Fixed O&M 2 t CO₂/h [mill €/(tCO2/hour)] (CO2 output)", # Not found in combustion_cc
    #"Fixed O&M [€/ton CO2, 2020]", # Not found in combustion_cc
    #"Fixed O&M [mill € /(tCO2/Hour)] (CO2 output)" # Not found in combustion_cc
])
    missing_inds = ismissing.(cc_data.FOM_EUR_MW_y)
    cc_data[missing_inds, :FOM_EUR_MW_y] = (
        cc_data[missing_inds, Symbol("Typical total plant capacity [tCO2/hour] (CO2 output)")] ./
        cc_data[missing_inds, :thermal_capacity_MW] .*
        cc_data[missing_inds, var] .* # Apparently assumed 3% of CAPEX
        1e6 # Scaling from MEUR to EUR.
    )
end

# VOM, account for typical capacity, seems like spelling is consistent for once.
cc_data[!, :VOM_EUR_MWh] = (
    cc_data[!, Symbol("Typical total plant capacity [tCO2/hour] (CO2 output)")] ./
    cc_data.thermal_capacity_MW .*
    cc_data[!, Symbol("Variable O&M [€/tCO2] (CO2 output)")]
)

# Lifetime not assumed to change due to CC
cc_data[!, :lifetime_y] .= 0.0

# Conversion rate (output/input), first for heat, as it is more straightforward.
cc_data[!, :conversion_rate_heat_pu] = 1 .- (
    cc_data[!, Symbol("Typical total plant capacity [tCO2/hour] (CO2 output)")] .*
    cc_data[!, Symbol("C1] Heat  input [MWh/tCO2] (CO2 output)")] ./
    cc_data.thermal_capacity_MW
)
for var in Symbol.([ # Handle alternative spellings
    "C1] Heat  input [MWh/t CO2] (CO2 output)",
    #"C1] Heat  input [MWh/tCO2] (CO2 output) " # Not found in combustion_cc
])
    missing_inds = ismissing.(cc_data.conversion_rate_heat_pu)
    cc_data[missing_inds, :conversion_rate_heat_pu] = 1 .- (
        cc_data[missing_inds, Symbol("Typical total plant capacity [tCO2/hour] (CO2 output)")] .*
        cc_data[missing_inds, var] ./
        cc_data[missing_inds, :thermal_capacity_MW]
    )
end

# Conversion rate (output/input) for elec, assumes identical drop to heat production.
cc_data[!, :conversion_rate_elec_pu] = cc_data[!, :conversion_rate_heat_pu]

# CO2 captured, again have to handle alternative parameter names and inconsistent scaling.
cc_data[!, :CO2_captured_pu] = cc_data[!, Symbol("A3] CO2 capture rate, net [%]")]
missing_inds = ismissing.(cc_data.CO2_captured_pu)
cc_data[missing_inds, :CO2_captured_pu] = cc_data[missing_inds, Symbol("Ax] CO2 capture rate, net [%]")]
scaled_inds = coalesce.(cc_data.CO2_captured_pu .> 1.0, false) # Some values are given in percents between 1-100, so handle via scaling. Missing values are left untouched.
cc_data[scaled_inds, :CO2_captured_pu] ./= 100

# Currency is in 2020 Euros, based on `examine_generation_data.ipynb`
cc_data[!, :currency] .= "2020 EUR"

# Metadata is DEA technology catalogue
cc_data[!, :metadata] .= "Technology catalogue for carbon capture, transport and storage - Updated April 2024, Danish Energy Agency, 2024"

# Drop unnecessary parameters
param_cols = [:Technology, :est, :year]
params = [
    :CAPEX_MEUR_MW,
    :FOM_EUR_MW_y,
    :VOM_EUR_MWh,
    :lifetime_y,
    :conversion_rate_elec_pu,
    :conversion_rate_heat_pu,
    :CO2_captured_pu,
    :currency,
    :metadata
]
cc_data = cc_data[!, vcat(param_cols, params)]
describe(cc_data)

## Technology mapping

Mapping the suitable technologies to their desired Mopo WP5 counterparts,
as well as the applicable carbon capture technologies.

Missing coal and oil boiler parameters have been mapped to either gas or oil,
whichever was available.
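
For the `+CC` variants, the carbon capture parameters are applied on top of the underlying technology: costs and capture rates are added, while conversion rates are multiplied. A minimal sketch of that combination, with made-up numbers and hypothetical field names:

```julia
# All numbers are made up for illustration.
base = (capex=1.0, fom=20_000.0, vom=2.0, conversion_heat=0.9, co2_captured=0.0)
cc = (capex=0.3, fom=9_000.0, vom=1.5, conversion_heat=0.8, co2_captured=0.9)

with_cc = (
    capex = base.capex + cc.capex,                               # Additive
    fom = base.fom + cc.fom,                                     # Additive
    vom = base.vom + cc.vom,                                     # Additive
    conversion_heat = base.conversion_heat * cc.conversion_heat, # Multiplicative!
    co2_captured = base.co2_captured + cc.co2_captured,          # Additive
)
```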

In [None]:
technology_mapping = Dict( # (Set of technologies, apply carbon capture)
    "bio-CHP" => (bio_chp, false),
    "bio-CHP+CC" => (bio_chp, true),
    "gas-CHP" => (gas_chp, false),
    "gas-CHP+CC" => (gas_chp, true),
    "coal-CHP" => (coal_chp, false),
    "coal-CHP+CC" => (coal_chp, true),
    "nuclear-CHP" => (Set(), false), # No data in DEA catalogue.
    "fuelcell-CHP" => (fuelcell_chp, false),
    "waste-CHP" => (waste_chp, false),
    "bio-boiler-DH" => (bio_boiler_dh, false),
    "bio-boiler-DH+CC" => (bio_boiler_dh, true),
    "gas-boiler-DH" => (gas_boiler_dh, false),
    "oil-boiler-DH" => (gas_boiler_dh, false), # No data in DEA catalogue, map to gas heat-only boiler.
    "coal-boiler-DH" => (gas_boiler_dh, false), # No data in DEA catalogue, map to gas heat-only boiler.
    "elec-boiler-DH" => (elec_boiler_dh, false),
    "heatpump-DH" => (heatpump_dh, false),
    "solar-DH" => (solar_dh, false),
    "geothermal-DH" => (geothermal_dh, false),
    "waste-DH" => (waste_dh, false),
    "gas-boiler" => (gas_boiler, false),
    "bio-boiler" => (bio_boiler, false),
    "oil-boiler" => (oil_boiler, false),
    "coal-boiler" => (oil_boiler, false), # No data in DEA catalogue, map to oil boiler parameters.
    "air-heatpump" => (air_heatpump, false),
    "ground-heatpump" => (ground_heatpump, false),
    "solar-heating" => (solar_heating, false),
    "electric-heating" => (electric_heating, false),
    "bio-microCHP" => (Set(), false), # No data in DEA catalogue.
    "gas-microCHP" => (Set(), false), # No data in DEA catalogue.
    "fuelcell-microCHP" => (Set(), false), # No data in DEA catalogue.
    "nuclear-SMR-heat" => (Set(), false) # No data in DEA catalogue.
);

## Calculate aggregated parameters based on the above mappings

With the DEA catalogue technologies mapped to the desired categories,
it is finally time to calculate the final parameter tables.
Unfortunately, not all technologies are available from the catalogue,
so I'll likely have to rely on the PyPSA `technology-data` repository
as a backup. In any case, it also serves as a comparison to check
if my processing is at least more or less correct.
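
One detail worth keeping in mind for the averaging: Julia's `mean` propagates `missing`, so a single missing value within a group makes the aggregated value `missing` as well. A small illustration with made-up numbers (using `skipmissing` would be an alternative, but it is an assumption here and not what the cells below do):

```julia
using Statistics

values_with_gap = [1.0, missing, 3.0] # Made-up values for one parameter across three technologies.

propagated = mean(values_with_gap)            # `missing`, since one value is missing.
skipped = mean(skipmissing(values_with_gap))  # Average of the available values only.
```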

In [None]:
## Calculate the average carbon capture technology parameters

cc_data_agg = stack(cc_data)
cc_data_agg = combine(
    groupby(cc_data_agg, [:year, :currency, :metadata, :variable]),
    :value => mean
)
cc_data_agg = unstack(cc_data_agg, :variable, :value_mean)
describe(cc_data_agg)

In [None]:
## Combine dh and heat data into a single dataset for processing, and join with cc params.

# Config
final_years = string.([2020, 2030, 2040, 2050])

# Data processing
full_data = filter(
    r -> r.year in final_years,
    vcat(dh_data, heat_data)
)
leftjoin!(
    full_data,
    cc_data_agg;
    on = [:year, :currency],
    makeunique = true
)
describe(full_data);

In [None]:
## Form the desired technology parameter table

cols = [
    :technology,
    :year,
    :CAPEX_MEUR_MW,
    :FOM_EUR_MW_y,
    :VOM_EUR_MWh,
    :lifetime_y,
    :conversion_rate_elec_pu,
    :conversion_rate_heat_pu,
    :CO2_captured_pu,
    :currency,
    :metadata
]
desired_data = DataFrame()
for (name, (techs, apply_cc)) in technology_mapping
    # Filter relevant technologies.
    df = filter(
        r -> r.Technology in techs,
        full_data
    )
    if isempty(df) # Skip the rest of the loop if df is empty.
        continue
    end
    # Calculate average properties
    df = unstack(
        combine(
            groupby(stack(df), [:year, :currency, :metadata, :metadata_1, :variable]),
            :value => mean
        ),
        :variable,
        :value_mean
    )
    if apply_cc
        df.CAPEX_MEUR_MW .+= df.CAPEX_MEUR_MW_1
        df.FOM_EUR_MW_y .+= df.FOM_EUR_MW_y_1
        df.VOM_EUR_MWh .+= df.VOM_EUR_MWh_1
        df.conversion_rate_elec_pu .*= df.conversion_rate_elec_pu_1 # Multiplicative!
        df.conversion_rate_heat_pu .*= df.conversion_rate_heat_pu_1 # Multiplicative!
        df.CO2_captured_pu .+= df.CO2_captured_pu_1
        df.metadata .*= ", " .* df.metadata_1 # String concat!
    end
    # Final formatting
    df.technology .= name
    append!(desired_data, df[!, cols])
end
describe(desired_data)

## Convert monetary units from 2020 EUR to 2025 EUR

For the purposes of Mopo, we need to convert the 2020 EUR into 2025 EUR.
This is done with the [harmonised index of consumer prices](https://ec.europa.eu/eurostat/cache/metadata/en/prc_hicp_esms.htm),
using the *annual average rate of change `RCH_A_AVG`* for the full *Euro Area* for the years 2020-2024.

[Link to the data source](https://doi.org/10.2908/PRC_HICP_AIND).
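
Written out, the conversion factor is the product of the annual changes (the symbols $f$ and $r_y$ are introduced here for illustration), using the rates listed in the next cell:

$$
f_{2020 \to 2025} = \prod_{y=2020}^{2024} (1 + r_y) = 1.003 \times 1.026 \times 1.084 \times 1.054 \times 1.024 \approx 1.204
$$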

In [None]:
## Convert 2020 EUR to 2025 EUR using HICP

# Copy desired data to avoid messing with things on multiple executions.
final_data = deepcopy(desired_data)

# Annual rate of change in consumer prices for the Euro area, see data source above.
hicp_rch_a_avg_2020_2024 = [0.3, 2.6, 8.4, 5.4, 2.4] ./ 100 
# Calculate total conversion factor by multiplicatively applying the changes.
eur_2020_to_2025_conversion_factor = prod(1 .+ hicp_rch_a_avg_2020_2024)

# Apply monetary conversion to the desired fields
for field in [
    :CAPEX_MEUR_MW,
    :FOM_EUR_MW_y,
    :VOM_EUR_MWh
]
    final_data[!, field] .*= eur_2020_to_2025_conversion_factor
end
# Set correct monetary unit and show
final_data.currency .= "2025 EUR"
describe(final_data)

## Inspect missing data

Unfortunately, the DEA catalogues aren't 100% consistent,
and there are some parameters missing that we'll likely have to fix.
First, many technologies are missing their 2040 values.
Fortunately, these can be interpolated from the 2030 and 2050 values.
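
As a rough sketch of such an interpolation, assuming a simple linear interpolation between the two neighbouring years (the helper `interpolate_year` and the numbers are hypothetical, and this is not necessarily how `extrapolate_years` works):

```julia
# Hypothetical helper: linear interpolation between two known (year, value) points.
function interpolate_year(year, (y0, v0), (y1, v1))
    return v0 + (v1 - v0) * (year - y0) / (y1 - y0)
end

# Made-up 2030 and 2050 values; the 2040 value lands halfway between them.
value_2040 = interpolate_year(2040, (2030, 1.0), (2050, 0.8))
```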

A more fundamental question is what to do with the technologies that
don't have any parameters at all, namely:
1. `oil-boiler-DH` and `coal-boiler-DH` for heat-only boilers. These exist in some countries, but whether we'll even have the existing capacities remains to be seen.
2. `nuclear-CHP` and `nuclear-SMR-heat` don't exist currently as far as I know, but are seen as potential future technologies of interest.
3. `bio-microCHP`, `gas-microCHP`, and `fuelcell-microCHP` are very hypothetical future technologies that I'm personally rather sceptical about.

The DH HOBs would be nice to have for the current system,
but it would seem like the PyPSA technology data catalogue doesn't have them either.
Furthermore, they are likely to become less relevant in the future,
when fossil fuels are expected to be phased out anyhow.
The nuclear options would be nice to have for 2040-2050,
as they might be used to supply fossil-free district heating.

In [None]:
## Check missing lifetimes, no longer an issue after omitting OCGTs from `Gas CHP`.

missing_inds = ismissing.(final_data.lifetime_y)
final_data[missing_inds, :]

In [None]:
## Check missing elec conversion rates

missing_inds = ismissing.(final_data.conversion_rate_elec_pu)
Set(final_data[missing_inds, :technology])

# Heat-only units, as it should be.

In [None]:
## Check missing heat conversion rates

missing_inds = ismissing.(final_data.conversion_rate_heat_pu)
final_data[missing_inds, [:technology, :year, :conversion_rate_heat_pu]]

# Coal CHP was missing 2050 Cb coefficients,
# but this is fixed by pre-processing extrapolation.
# Solar heating also lacks conversion rates,
# but these need to be treated differently in modelling anyways.

## Format and export technology parameters

In [None]:
## Format and export technology parameters

# Config
exportcols = [
    :technology,
    :year,
    :CAPEX_MEUR_MW,
    :FOM_EUR_MW_y,
    :VOM_EUR_MWh,
    :lifetime_y,
    :conversion_rate_elec_pu,
    :conversion_rate_heat_pu,
    :CO2_captured_pu,
    :currency,
    :metadata
]
roundcols = exportcols[3:9]
dgts = 2 # Number of digits for rounding floats

# Export
export_df = sort(deepcopy(final_data))
export_df[!, roundcols] = round.(export_df[!, roundcols]; digits=dgts)
mkpath("output") # Make sure the output folder exists before writing.
CSV.write(joinpath("output", "heating_technology_params.csv"), export_df)
