# Processing _ABM_, _Hotmaps_, and _JRC_ data to estimate the existing heat-only generation capacities

The goal of this notebook is to estimate the necessary heat-only boiler capacities
in both distributed heating and district heating systems.
The processing relies on three primary data sources:

1. The ABM heating demand time series give the estimated peak demand relative to the yearly demand, which is used for sizing the capacities.
2. The Hotmaps project results give the yearly heating demands per fuel, which are mapped to different heating technologies for distributed heating.
3. The JRC-IDEES yearly district-heat-only production is used to estimate the district heating system capacity distribution.

## Julia environment setup

In [3]:
## Julia environment setup

using Pkg
Pkg.activate(@__DIR__)
Pkg.instantiate()

using CSV
using DataFrames
using Dates
using Statistics
using Serialization

[32m[1m  Activating[22m[39m project at `c:\MopoHeatSectorDataProcessing`


## Read peak-to-yearly demand ratios calculated in `demand_scaling.ipynb`

In [25]:
## Read peak-to-yearly demand ratios

peak_path = "output/peak_to_yearly_demand_ratios_MW_GWh.csv"
peak_data = DataFrame(CSV.File(peak_path))
rows, cols = size(peak_data)
peak_data = stack(
    peak_data,
    3:cols;
    variable_name=:country,
    value_name=:peak_to_demand_ratio
)
describe(peak_data)

Row,variable,mean,min,median,max,nmissing,eltype
,Symbol,Union…,Any,Union…,Any,Int64,DataType
1,demand,,DHW,,heating,0,String7
2,unit,,MW/GWh,,MW/GWh,0,String7
3,country,,AT,,UK,0,String
4,peak_to_demand_ratio,0.896544,0.2004,0.6282,4.8857,0,Float64


## Read extended Hotmaps data from `demand_scaling.ipynb`

In [18]:
## Read extended Hotmaps data

hm_data = deserialize("hm_data.ser")
describe(hm_data)

Row,variable,mean,min,median,max,nmissing,eltype
,Symbol,Union…,Any,Union…,Any,Int64,DataType
1,Scenario,,ambitious,,current,0,String15
2,Supertype,,Nonresidential_private,,Residential,0,String31
3,Type,,cooling,,space heating,0,String31
4,Technology,,Coal boiler or stove,,heat pumps,0,String31
5,Fuel,,District heating,,solar thermal,0,String31
6,scenario_year,2030.22,2012,2030.0,2050,0,Int64
7,datetime,,2012-01-01 00:00,,2050-01-01 00:00,0,String31
8,country,,AT,,UK,0,String3
9,unit,,GWh,,GWh,0,String3
10,scenario_value,1876.6,0.0,98.5536,2.37712e5,0,Float64


## Examine heat pump COPs

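The Hotmaps data has separate values for `Electricity` and `ambient heat` as the `Fuel` for heat pumps. The implied COP is the total heat delivered (electricity plus ambient heat) divided by the electricity consumed. Let's see how these assumptions vary between countries and end-uses.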

In [19]:
## Examine the heat pump COPs based on their electricity and ambient heat consumption

df = hm_data[hm_data.Technology .== "heat pumps", :]
df = unstack(df, :Fuel, :scenario_value)
df = df[df.Electricity .> 0, :]
df.COP .= (df.Electricity + df[!, Symbol("ambient heat")]) ./ df.Electricity
#sort!(df, :COP; rev=true)
describe(df)

Row,variable,mean,min,median,max,nmissing,eltype
,Symbol,Union…,Any,Union…,Any,Int64,Type
1,Scenario,,ambitious,,current,0,String15
2,Supertype,,Nonresidential_private,,Residential,0,String31
3,Type,,hot water,,space heating,0,String31
4,Technology,,heat pumps,,heat pumps,0,String31
5,scenario_year,2030.42,2012,2030.0,2050,0,Int64
6,datetime,,2012-01-01 00:00,,2050-01-01 00:00,0,String31
7,country,,AT,,UK,0,String3
8,unit,,GWh,,GWh,0,String3
9,category,,nonres,,res,0,String
10,demand,,DHW,,heating,0,String


This is a bit alarming: the mean COP implied by the Hotmaps data is above six, which seems unrealistic. Furthermore, the maximum COP is infeasibly high and, oddly, seems to occur mostly for domestic hot water (DHW).

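Therefore, if I use the `Electricity` fuel as the basis for the assumed existing capacity, there's a risk that the modelled heat pump capacity is not enough to cover the demand in certain countries: an overly high COP means the electricity consumption is small relative to the heat actually delivered.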
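
To make the COP calculation concrete, here is a minimal sketch with made-up numbers. The function name `implied_cop` and the GWh values are assumptions for illustration only and are not taken from Hotmaps.

```julia
# Implied COP: total heat delivered divided by electricity consumed.
implied_cop(electricity, ambient_heat) = (electricity + ambient_heat) / electricity

# Hypothetical yearly values in GWh (NOT taken from the Hotmaps data).
electricity_GWh = 10.0
ambient_heat_GWh = 50.0

cop = implied_cop(electricity_GWh, ambient_heat_GWh)

# The heat delivered is the electricity consumption multiplied by the COP.
heat_output_GWh = electricity_GWh * cop

println("Implied COP: ", cop)
println("Heat delivered (GWh): ", heat_output_GWh)
```

With these inputs, only one sixth of the delivered heat shows up as `Electricity`, which is why sizing capacity from the electricity rows alone can undershoot the heat demand.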

In [20]:
# Let's check space heating and DHW separately:

df_cop_demand = unstack(df, :demand, :COP)
describe(df_cop_demand)

Row,variable,mean,min,median,max,nmissing,eltype
,Symbol,Union…,Any,Union…,Any,Int64,Type
1,Scenario,,ambitious,,current,0,String15
2,Supertype,,Nonresidential_private,,Residential,0,String31
3,Type,,hot water,,space heating,0,String31
4,Technology,,heat pumps,,heat pumps,0,String31
5,scenario_year,2030.42,2012,2030.0,2050,0,Int64
6,datetime,,2012-01-01 00:00,,2050-01-01 00:00,0,String31
7,country,,AT,,UK,0,String3
8,unit,,GWh,,GWh,0,String3
9,category,,nonres,,res,0,String
10,Electricity,441.207,7.33e-5,41.5911,21560.1,0,"Union{Missing, Float64}"


Surprisingly, the space heating COPs are lower than the DHW COPs, which shouldn't be the case. I'm not sure what's going on here with the Hotmaps data. Ultimately, it might be necessary to estimate the existing heat pump capacity based on the total heat (electricity plus ambient heat) and the COP assumed in the technology parameters, but we'll see.
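
One possible shape of that fallback is sketched below. This is not what the notebook currently does; all input values, including `assumed_cop`, are hypothetical placeholders rather than actual technology parameters.

```julia
# Hypothetical inputs (NOT from Hotmaps or the actual technology parameters).
electricity_GWh = 10.0       # yearly heat pump electricity consumption
ambient_heat_GWh = 50.0      # yearly ambient heat
peak_ratio_MW_per_GWh = 0.5  # peak-to-yearly demand ratio
assumed_cop = 3.0            # COP taken from assumed technology parameters

# Size the heat output capacity from the total heat, not from electricity alone.
total_heat_GWh = electricity_GWh + ambient_heat_GWh
heat_capacity_MW = total_heat_GWh * peak_ratio_MW_per_GWh

# Electric input capacity needed to deliver that heat at the assumed COP.
electric_capacity_MW = heat_capacity_MW / assumed_cop

# For comparison: the capacity obtained from the electricity rows only.
electricity_based_MW = electricity_GWh * peak_ratio_MW_per_GWh

println((heat_capacity_MW, electric_capacity_MW, electricity_based_MW))
```

In this made-up case the electricity-based estimate is half of the electric capacity implied by the assumed COP, which illustrates the undersizing risk described above.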

## Estimate generation capacities based on Hotmaps

We'll have to estimate the technology capacities by multiplying the yearly demands (GWh) by the estimated peak-to-yearly demand ratios (MW/GWh).
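
As a quick unit check, the sketch below applies the median ratio from the `peak_data` summary to a hypothetical yearly demand. The demand value is an assumption, and the full-load-hours figure is only an interpretation aid, not something the notebook computes.

```julia
yearly_demand_GWh = 100.0       # hypothetical yearly demand
peak_ratio_MW_per_GWh = 0.6282  # median ratio from the peak_data summary

# GWh * MW/GWh = MW
capacity_MW = yearly_demand_GWh * peak_ratio_MW_per_GWh

# Equivalent full-load hours: yearly energy in MWh divided by capacity in MW.
full_load_hours = yearly_demand_GWh * 1000 / capacity_MW

println("Capacity (MW): ", capacity_MW)
println("Full-load hours: ", full_load_hours)
```

A higher ratio means a peakier demand profile and therefore fewer equivalent full-load hours for the same yearly demand.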

In [30]:
## Calculate estimated peak capacities.

hm_capacity_data = leftjoin( # Combine estimated peak demand ratio data with yearly demands.
    hm_data,
    peak_data;
    on=[:country, :demand],
    makeunique=true,
)
rename!(hm_capacity_data, :unit_1 => :unit_ratio)
hm_capacity_data.capacity_MW = ( # Calculate the estimated capacity in MW
    hm_capacity_data.scenario_value
    .* hm_capacity_data.peak_to_demand_ratio
)
describe(hm_capacity_data)

Row,variable,mean,min,median,max,nmissing,eltype
,Symbol,Union…,Any,Union…,Any,Int64,Type
1,Scenario,,ambitious,,current,0,String15
2,Supertype,,Nonresidential_private,,Residential,0,String31
3,Type,,cooling,,space heating,0,String31
4,Technology,,Coal boiler or stove,,heat pumps,0,String31
5,Fuel,,District heating,,solar thermal,0,String31
6,scenario_year,2030.22,2012,2030.0,2050,0,Int64
7,datetime,,2012-01-01 00:00,,2050-01-01 00:00,0,String31
8,country,,AT,,UK,0,String3
9,unit,,GWh,,GWh,0,String3
10,scenario_value,1876.6,0.0,98.5536,2.37712e5,0,Float64


### Technology mapping

Map the estimated peak capacities to the desired technologies.

In [31]:
## Figure out heating system mappings from Hotmaps to the desired techs.

heat_techs = unique(hm_data.Technology)
gas_boiler = filter(x -> contains(lowercase(x), "gas boiler"), heat_techs)
bio_boiler = filter(x -> contains(lowercase(x), "biomass"), heat_techs)
oil_boiler = filter(x -> contains(lowercase(x), "oil boiler"), heat_techs)
air_heatpump = filter(x -> contains(lowercase(x), "heat pump"), heat_techs) # Data doesn't distinguish between different heat pumps.
ground_heatpump = filter(x -> contains(lowercase(x), "heat pump"), heat_techs) # Data doesn't distinguish between different heat pumps.
solar_heating = filter(x -> contains(lowercase(x), "solar"), heat_techs)
electric_heating = filter(x -> contains(lowercase(x), "electric"), heat_techs)
district_heating = filter(x -> contains(lowercase(x), "district heating"), heat_techs)

1-element Vector{String31}:
 "district heating"

In [32]:
# Map hm_data technologies to the desired heating techs.

distributed_tech_mapping = Dict( # (Set of technologies, assumed share)
    "gas-boiler" => (gas_boiler, 1.0),
    "bio-boiler" => (bio_boiler, 1.0),
    "oil-boiler" => (oil_boiler, 1.0),
    "air-heatpump" => (air_heatpump, 0.7), # Assumed 70% market share for A2WHPs
    "ground-heatpump" => (ground_heatpump, 0.3), # Assumed 30% market share for G2WHPs
    "solar-heating" => (solar_heating, 1.0),
    "electric-heating" => (electric_heating, 1.0),
    "district-heating" => (district_heating, 1.0)
)

Dict{String, Tuple{Vector{String31}, Float64}} with 8 entries:
  "air-heatpump"     => (["heat pumps"], 0.7)
  "bio-boiler"       => (["biomass boiler or stove"], 1.0)
  "ground-heatpump"  => (["heat pumps"], 0.3)
  "gas-boiler"       => (["Gas boiler or stove"], 1.0)
  "oil-boiler"       => (["Fuel oil boiler or stove"], 1.0)
  "solar-heating"    => (["Solar thermal collector"], 1.0)
  "district-heating" => (["district heating"], 1.0)
  "electric-heating" => (["electric direct heaters"], 1.0)
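
Because the same Hotmaps technology can feed several output technologies, it may be worth checking that the shares assigned to each source technology add up to one. The following is a minimal sketch on a shortened technology list; the helper `match_techs` and the `share_totals` dictionary are hypothetical names introduced here.

```julia
# A few technology names as they appear in the mapping output above.
heat_techs = ["Gas boiler or stove", "biomass boiler or stove", "heat pumps"]

# Case-insensitive keyword match, mirroring the notebook's filter calls.
match_techs(keyword) = filter(x -> contains(lowercase(x), keyword), heat_techs)

heatpumps = match_techs("heat pump")
mapping = Dict(
    "air-heatpump" => (heatpumps, 0.7),
    "ground-heatpump" => (heatpumps, 0.3),
    "gas-boiler" => (match_techs("gas boiler"), 1.0),
)

# Sum the shares assigned to each source technology.
share_totals = Dict{String,Float64}()
for (_, (techs, share)) in mapping
    for tech in techs
        share_totals[tech] = get(share_totals, tech, 0.0) + share
    end
end

# Each mapped source technology should be fully allocated.
@assert all(isapprox(total, 1.0) for total in values(share_totals))
```

If a share is changed without adjusting its counterpart, the final assertion fails instead of silently inflating or losing capacity.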

In [33]:
# Sum the capacities together

rename_cols = Dict(
    :Scenario => :scenario,
    :scenario_year => :scenario_year,
    :country => :country,
    :category => :building_category,
    :output_technology => :technology,
    :demand => :demand,
    :weighted_capacity_MW => :capacity,
    :capacity_unit => :unit,
    :demand_category => :demand_category,
)
heating_capacity_data = DataFrame()
for (name, (techs, weight)) in distributed_tech_mapping
    # Filter relevant technologies.
    df = filter(
        r -> r.Technology in techs && r.Fuel != "ambient heat", # Heat pumps have two fuel rows, omitting "ambient heat" since it's likely not what Alvaro wants.
        hm_capacity_data
    )
    if isempty(df) # Skip the rest of the loop if df is empty
        continue
    end
    # Calculate the weighted capacities
    df.output_technology .= name
    df.weighted_capacity_MW .= df.capacity_MW .* weight
    df.capacity_unit .= "MW"
    df.demand_category .= (name == "district-heating" ? "district heating" : "distributed heating")
    # Final formatting
    df = df[!, collect(keys(rename_cols))] # Drop unused columns
    rename!(df, rename_cols)
    country_cols = Symbol.(unique(df.country))
    df = stack( # Avoid nonresidential private vs nonresidential public duplicate row hassle by unstack-stack summing.
        unstack(
            df,
            :country,
            :capacity;
            combine=sum
        ),
        country_cols;
        variable_name=:country,
        value_name=:capacity
    )
    append!(heating_capacity_data, df)
end
describe(heating_capacity_data)

Row,variable,mean,min,median,max,nmissing,eltype
,Symbol,Union…,Any,Union…,Any,Int64,Type
1,technology,,air-heatpump,,solar-heating,0,String
2,demand_category,,distributed heating,,district heating,0,String
3,scenario_year,2030.22,2012,2030.0,2050,0,Int64
4,unit,,MW,,MW,0,String
5,scenario,,ambitious,,current,0,String15
6,demand,,DHW,,heating,0,String
7,building_category,,nonres,,res,0,String
8,country,,AT,,UK,0,String
9,capacity,1773.7,0.0,80.1006,1.49141e5,0,"Union{Missing, Float64}"
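
The unstack-stack round trip above sums rows that share the same keys. A possibly simpler alternative, sketched here on made-up data, is `groupby` followed by `combine`. The example rows and the `key_cols` name are assumptions for illustration.

```julia
using DataFrames

# Hypothetical rows: two nonresidential subtypes collapse into one "nonres" category.
df = DataFrame(
    country = ["AT", "AT", "FI"],
    building_category = ["nonres", "nonres", "res"],
    technology = ["gas-boiler", "gas-boiler", "gas-boiler"],
    capacity = [10.0, 5.0, 7.0],
)

# Sum capacities over rows that share the same key columns.
key_cols = [:country, :building_category, :technology]
summed = combine(groupby(df, key_cols), :capacity => sum => :capacity)

println(summed)
```

One difference to keep in mind: unlike the unstack-stack approach, this does not create rows with `missing` capacity for key combinations that are absent for a given country.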


## Export capacity data

This section is currently a work in progress.

In [None]:
## Export assumed existing capacity data

# Config
dgts = 2 # Number of digits when rounding exports.
export_cols = [
    :scenario,
    :scenario_year,
    :country,
    :building_category,
    :demand_category,
    :demand,
    :technology,
    :unit,
    :capacity
]

# Export table
capacity_export = heating_capacity_data[:, export_cols]
capacity_export.capacity = round.(capacity_export.capacity; digits=dgts)
capacity_export = sort!(unstack(
    capacity_export,
    :country,
    :capacity,
    combine=sum # Avoid nonresidential private vs nonresidential public duplicate row hassle.
))
CSV.write("output/scenario_estimated_existing_capacities_MW.csv", capacity_export)