# Examining _Danish Energy Agency Technology Catalogue_ heat generation data for Mopo WP5

This Jupyter Notebook contains code for examining the Danish Energy Agency data sheets.
The main objective is to see if we can get everything, or at least most, of what we need.


## Julia setup

The processing is done using the Julia `XLSX` and `DataFrames` packages,
so we need to do a bit of setup at the beginning.

In [None]:
## Activate (and set up) the required Julia environment

using Pkg # Julia package manager.
Pkg.activate(@__DIR__) # Activate the Julia environment in the folder this file is in (namely the `Project.toml`)
Pkg.instantiate() # Download and install the necessary dependencies.

# Load dependencies
using XLSX
using DataFrames

## Reading and examining the Excel data

Next, we'll read the raw Excel datasheets into `DataFrames` for easy processing and organization.
Fortunately, the newer DEA technology catalogues contain the `alldata_flat` sheet,
making them easy to read programmatically.

In [None]:
raw_dh_data = DataFrame(XLSX.readtable(joinpath("input-data", "dea-technology-catalogues", "technology_data_for_el_and_dh.xlsx"), "alldata_flat"))

In [None]:
raw_heat_data = DataFrame(XLSX.readtable(joinpath("input-data", "dea-technology-catalogues", "technology_data_heating_installations.xlsx"), "alldata_flat"))

In [None]:
raw_cc_data = DataFrame(XLSX.readtable(joinpath("input-data", "dea-technology-catalogues", "technology_data_for_carbon_capture_transport_storage.xlsx"), "alldata_flat"))

Now that we have all the data in a relatively easy-to-digest format, we can check what exactly it contains.
First, I just want to check if the `cat`, `priceyear`, `est`, and `year` columns are consistent between the three datasets.
Otherwise, we'll need special treatment.

First, let's check the parameter categories `cat`:

In [None]:
dh_cat = Set(raw_dh_data[:,"cat"])

In [None]:
heat_cat = Set(raw_heat_data[:,"cat"])

In [None]:
cc_cat = Set(raw_cc_data[:,"cat"])

In [None]:
union(dh_cat, heat_cat, cc_cat)

Well, I guess this is what could be expected. The `Electric regulation ability` category has trailing whitespaces in its name,
doesn't match the `Regulation ability` from the electricity and district heating data,
and the `Technology-specific data` category is missing the hyphen in the individual heat plants data.
The carbon capture catalogue is especially bad, with both `Technical data` and `Energy/technical data`,
as well as misspelled `Technology specific data` with trailing whitespaces.
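
A minimal sketch of how these category names could be harmonized before any further processing. The helper name `normalize_cat` and the choice of canonical names are my own assumptions; the catalogues don't define which spelling is the "right" one.

```julia
# Hypothetical helper: collapse the `cat` spelling variants noted above.
# The alias table is an assumption about which categories are meant to match.
function normalize_cat(cat::AbstractString)
    key = lowercase(strip(cat))        # drop surrounding whitespace, ignore case
    key = replace(key, "-" => " ")     # "technology-specific" vs "technology specific"
    aliases = Dict(
        "electric regulation ability" => "regulation ability",
        "energy/technical data" => "technical data",
    )
    return get(aliases, key, key)      # fall back to the cleaned name itself
end

# Both spellings end up under the same key.
normalize_cat("Technology-specific data") == normalize_cat("Technology specific data  ")
```

Applied to the real data, this would be something like `normalize_cat.(raw_cc_data[:, "cat"])`, assuming the column contains no `missing` entries.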

Let's check `priceyear` next:

In [None]:
dh_priceyear = Set(raw_dh_data[:,"priceyear"])

In [None]:
heat_priceyear = Set(raw_heat_data[:,"priceyear"])

In [None]:
cc_priceyear = Set(raw_cc_data[:,"priceyear"])

In [None]:
union(dh_priceyear, heat_priceyear, cc_priceyear)

Well, at least the price years are consistent.

What about `est`?

In [None]:
dh_est = Set(raw_dh_data[:,"est"])

In [None]:
heat_est = Set(raw_heat_data[:,"est"])

In [None]:
cc_est = Set(raw_cc_data[:,"est"])

In [None]:
union(dh_est, heat_est, cc_est)

The electricity and district heating catalogue and the individual heating catalogue agree,
but again the carbon capture catalogue messes things up with mixed use of capitalization.
Also, there's a new `Est` value, which I'm guessing is denoting an especially bad estimate.
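
Capitalization clashes like this are easy to miss by eye, so here is a rough sketch for flagging them automatically. `spelling_variants` is a hypothetical helper, and the example values are made up for illustration rather than copied from the catalogues.

```julia
# Hypothetical helper: group raw values by their lowercase, stripped form,
# and keep only the groups with more than one raw spelling.
function spelling_variants(values)
    variants = Dict{String,Set{String}}()
    for v in values
        v isa AbstractString || continue          # skip `missing` and non-string entries
        key = lowercase(strip(v))
        push!(get!(variants, key, Set{String}()), v)
    end
    return filter(kv -> length(kv.second) > 1, variants)
end

# Made-up example values, only "ctrl" has two different spellings here.
spelling_variants(["ctrl", "Ctrl", "lower", missing])
```

Running this on `union(dh_est, heat_est, cc_est)` should list exactly the values that differ only by case or whitespace.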

`year`?

In [None]:
dh_year = Set(raw_dh_data[:,"year"])

In [None]:
heat_year = Set(raw_heat_data[:,"year"])

In [None]:
cc_year = Set(raw_cc_data[:,"year"])

In [None]:
union(dh_year, heat_year, cc_year)

Ok, not terrible. The 2015 data is missing from the individual heating plants and carbon capture,
but at least the other years are there and are consistent.
Fortunately, we shouldn't need the 2015 data anyhow.

Finally, we're interested in the parameter names, as they should tell us everything we can hope to find.
Unfortunately, this is the set with by far the most unique entries,
and the risk of not every similar parameter having an identical or consistent name is at its greatest.

In [None]:
dh_par = Set(raw_dh_data[:,"par"])

In [None]:
heat_par = Set(raw_heat_data[:,"par"])

In [None]:
cc_par = Set(raw_cc_data[:,"par"])

In [None]:
union(dh_par, heat_par, cc_par)

Ok, there are quite a lot of parameters, especially in the electricity and district heating and carbon capture data sheets.
Worryingly, there doesn't seem to be that much overlap between the datasets,
so we'll have to hope that everything we need can be found where we need it.

In [None]:
intersect(dh_par, heat_par)

In [None]:
intersect(dh_par, cc_par)

Well that's unfortunate.
The few parameter names that are common between the datasets are not that important.
Furthermore, the individual heat and carbon capture catalogues have no names in common,
although fortunately this isn't that big of a deal.
Seems like we'll have to do this the hard way, and search for relevant parameters using keywords instead,
and map them between the sets.
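
For completeness, all pairwise overlaps (including the individual heat vs carbon capture pair) could be computed in one go. The sketch below uses small made-up stand-ins for `dh_par`, `heat_par`, and `cc_par`, and `pairwise_overlap` is a hypothetical helper name.

```julia
# Toy stand-ins for the real parameter sets; the names are illustrative only.
toy_sets = Dict(
    "dh" => Set(["Technical lifetime [years]", "Cb coefficient"]),
    "heat" => Set(["Technical lifetime [years]", "Unit size"]),
    "cc" => Set(["CO2 capture rate"]),
)

# Intersect every pair of catalogues exactly once.
function pairwise_overlap(sets::Dict{String,Set{String}})
    labels = sort(collect(keys(sets)))
    return Dict(
        (a, b) => intersect(sets[a], sets[b])
        for (i, a) in enumerate(labels) for b in labels[i+1:end]
    )
end

overlaps = pairwise_overlap(toy_sets)
isempty(overlaps[("cc", "heat")])   # no shared names in this toy example
```

With the real data, the input would be `Dict("dh" => dh_par, "heat" => heat_par, "cc" => cc_par)`, provided the sets are first converted to `Set{String}`.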


## Finding the desired parameters

The goal of this exercise is to find values for the `CAPEX (EUR/MW)`,
`FOM (EUR/MW/y)`, `VOM (EUR/MWh)`, `Lifetime`, `Conversion Rate (output/input)`,
`CO2 Captured (ton CO2/MWh)`, and `Fuel Cost` for the WP5 input data excel.
Since the parameter names don't seem to line up,
we'll have to search for them "by hand".
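
Since the same keyword search gets repeated for every catalogue, it could be wrapped in a small helper. This is only a sketch: `find_pars` is a hypothetical name, and the toy parameter names below are placeholders, not actual catalogue entries.

```julia
# Hypothetical helper: case-insensitive keyword search over several parameter sets.
function find_pars(keyword, par_sets)
    needle = lowercase(keyword)
    return Dict(
        label => filter(p -> occursin(needle, lowercase(p)), pars)
        for (label, pars) in par_sets
    )
end

# Placeholder parameter names for illustration.
toy_pars = Dict(
    "dh" => Set(["Nominal investment (*total) [MEUR/MW_e]", "Technical lifetime [years]"]),
    "heat" => Set(["Specific investment [kEUR/unit]", "Technical lifetime [years]"]),
)

find_pars("Investment", toy_pars)
```

The individual cells below do the same thing one catalogue at a time, which is handy when the keyword needs to differ between catalogues.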

Let's start with `CAPEX`. Seems like `"investment"` yields relevant parameters from all three catalogues:

In [None]:
filter(x -> occursin("investment (*total)", lowercase(x)), dh_par)

In [None]:
filter(x -> occursin("investment", lowercase(x)), heat_par)

In [None]:
filter(x -> occursin("investment", lowercase(x)), cc_par)

Seems like `Nominal investment (*total)` is the closest thing, which I think corresponds to CAPEX anyhow.
The units are different between the two technology catalogues, though,
so we'll need to do some conversions from `kEUR/unit` to `MEUR/MW`.
Regardless, CAPEX seems to be viable.
Adding carbon-capture-related investment costs on top of generation capacity investments might also get tricky,
as the two are using different units.
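
The `kEUR/unit` to `MEUR/MW` conversion itself is simple arithmetic once the unit capacity is known. The sketch below assumes the capacity is available in kW per unit, and the numbers in the example are made up.

```julia
# kEUR/unit divided by kW/unit gives kEUR/kW, which is numerically the same as MEUR/MW.
meur_per_mw(cost_keur_per_unit, unit_capacity_kw) = cost_keur_per_unit / unit_capacity_kw

# The WP5 target unit is EUR/MW, so scale MEUR up by one million.
eur_per_mw(cost_keur_per_unit, unit_capacity_kw) =
    1e6 * meur_per_mw(cost_keur_per_unit, unit_capacity_kw)

# Made-up example: an 8 kEUR unit with 10 kW of heat capacity.
meur_per_mw(8.0, 10.0)   # 0.8 MEUR/MW
```

The same per-capacity scaling would apply to any other per-unit cost in the individual heating data.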

Next up, we're interested in `FOM (EUR/MW/y)` costs, so let's see what we can find.

In [None]:
filter(x -> occursin("fixed", lowercase(x)), dh_par)

In [None]:
filter(x -> occursin("fixed", lowercase(x)), heat_par)

In [None]:
filter(x -> occursin("fixed", lowercase(x)), cc_par)

In [None]:
filter(x -> occursin("fixed o&m", lowercase(x)), cc_par)

Seems like this shouldn't be a massive problem,
although we'll have to be careful whether electricity or heat capacity is the relevant one.
The carbon capture catalogue unfortunately doesn't clearly state `(*total)` costs,
but we can perhaps assume that the `"Fixed O&M [EUR/tCO2/year]"` is close?

What about `VOM (EUR/MWh)`?

In [None]:
filter(x -> occursin("variable", lowercase(x)), dh_par)

In [None]:
filter(x -> occursin("variable", lowercase(x)), heat_par)

In [None]:
filter(x -> occursin("variable", lowercase(x)), cc_par)

Again, we have relevant parameters available,
but we'll have to be a bit careful about the units,
especially `EUR/MWh_h` vs `EUR/MWh_e`.
I'm not even sure what `EUR/MWh_i` means, although the `i` possibly stands for input.
Carbon capture costs are quite interesting, since they exclude energy costs.
Fortunately, I think we can safely assume that any CC combined power/heat plant would use its own production for its CC.

What about `Lifetime`?

In [None]:
filter(x -> occursin("lifetime", lowercase(x)), dh_par)

In [None]:
filter(x -> occursin("lifetime", lowercase(x)), heat_par)

In [None]:
filter(x -> occursin("lifetime", lowercase(x)), cc_par)

Slightly different parameter names between the catalogues,
but should contain what we need.
The question becomes, do CC system lifetimes affect power plant lifetimes?
I suppose ideally they would have to be treated as different systems,
with different lifetimes, but that's a modelling choice.

`Conversion Rate (output/input)` could be a problem?

In [None]:
filter(x -> occursin("efficiency", lowercase(x)), dh_par)

In [None]:
filter(x -> occursin("efficiency", lowercase(x)), heat_par)

In [None]:
filter(x -> occursin("efficiency", lowercase(x)), cc_par)

In [None]:
filter(x -> occursin("heat  input", lowercase(x)), cc_par)

Well, we have "efficiencies",
which should be straightforward for individual heating plants at least.
However, parameters for CHP plants could prove to be challenging.
We don't have anything resembling an efficiency for carbon capture, though.
However, there are `Heat input [MWh/tCO2]` and `Electricity input [MWh/tCO2]`,
which I guess could be used to deduce how much the efficiency of the underlying plant suffers from self-consumption of generation for carbon capture?
Not sure if this is worth it, though.
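
If we do go down this route, one possible approach is sketched below. It assumes the plant covers the capture energy demand entirely from its own output, and that the amount of captured CO2 per MWh of fuel is already known. All names and numbers are hypothetical.

```julia
# Sketch: subtract the capture energy demand from the plant's own output.
# Inputs: efficiencies as fractions, captured CO2 in tCO2 per MWh of fuel,
# and capture energy inputs in MWh per tCO2.
function net_efficiencies(; el_eff, heat_eff, captured_per_mwh_fuel, el_input, heat_input)
    return (
        el = el_eff - captured_per_mwh_fuel * el_input,
        heat = heat_eff - captured_per_mwh_fuel * heat_input,
    )
end

# Made-up example values, not taken from the catalogues.
net = net_efficiencies(
    el_eff = 0.40, heat_eff = 0.50,
    captured_per_mwh_fuel = 0.18, el_input = 0.1, heat_input = 0.7,
)
```

This ignores any heat recovery from the capture process, so it likely overestimates the penalty on the heat side.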

What about Cb coefficients for CHP plants? These are required for calculating CHP plant fuel-to-heat ratios.
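
As a reminder of why Cb matters, here is a small sketch. It assumes Cb is the electricity-to-heat output ratio in back-pressure mode and that the electrical efficiency refers to the same operating mode. The function names and example numbers are made up.

```julia
# With Cb = electricity output / heat output, heat output = electricity output / Cb.
heat_efficiency(el_eff, cb) = el_eff / cb

# Fuel needed per unit of heat produced is the inverse of the heat efficiency.
fuel_per_heat(el_eff, cb) = cb / el_eff

# Made-up example: 30 % electrical efficiency and Cb = 0.6.
heat_efficiency(0.3, 0.6)
```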

In [None]:
filter(x -> occursin("cb", lowercase(x)), dh_par)

`CO2 Captured (ton CO2/MWh)`?
Seems like neither the electricity and district heating nor the individual heat plants have any carbon capture technologies associated with them in the catalogues.
There's a separate catalogue for this,
so I'm assuming we'll need to pull parameters from there?

In [None]:
filter(x -> occursin("capacity", lowercase(x)), cc_par)

In [None]:
filter(x -> occursin("capture", lowercase(x)), cc_par)

The easier parameter to work with would likely be the `CO2 capture rate, net [%]`,
although to get `CO2 Captured (ton CO2/MWh)` we'd need to know the CO2 content per MWh of input fuel.
Regardless, this will likely require some work.
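
The arithmetic would look roughly like this. The fuel emission factor is not in the catalogues and has to come from elsewhere, so the value used below is purely illustrative, as are the function names.

```julia
# Captured CO2 per MWh of fuel: fuel emission factor [tCO2/MWh_fuel] times capture rate [%].
captured_per_mwh_fuel(emission_factor, capture_rate_pct) =
    emission_factor * capture_rate_pct / 100

# Captured CO2 per MWh of output: divide by the conversion efficiency (as a fraction).
captured_per_mwh_output(emission_factor, capture_rate_pct, efficiency) =
    captured_per_mwh_fuel(emission_factor, capture_rate_pct) / efficiency

# Illustrative numbers only: 0.2 tCO2/MWh_fuel, 90 % capture rate, 45 % efficiency.
captured_per_mwh_output(0.2, 90.0, 0.45)
```

Whether the WP5 input data expects the value per MWh of fuel or per MWh of output still needs to be checked.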

Finally, what about `Fuel Cost`?

In [None]:
filter(x -> occursin("cost", lowercase(x)), dh_par)

In [None]:
filter(x -> occursin("cost", lowercase(x)), heat_par)

In [None]:
filter(x -> occursin("cost", lowercase(x)), cc_par)

`fuel` doesn't yield any relevant parameters that aren't also found by `cost`.
As could be expected, the technology catalogues don't really deal with them,
as fuel costs can be seen as entirely independent of the technologies themselves.
Furthermore, I think these are irrelevant for endogenously modelled energy carriers.

### Individual heat data specific stuff

For the individual heating systems, most of the parameters are annoyingly given
per unit instead of per capacity.
Thus, we'll need to extract the unit size from the data.

Seems like `Heat production capacity for one unit [KW_h]` is our ticket.

In [None]:
filter(x -> occursin("unit", lowercase(x)), heat_par)

## Conclusions

Overall, the _DEA Technology Catalogues_ are pretty comprehensive,
but don't contain all of the parameters we want in the exact same format we want.
Thus, we'll have to make some assumptions, do some processing, etc. to get what we want.
In any case, it's likely we'll have to revise some parameters or assumptions
to get all the technologies uniform,
so doing things programmatically might've been the right call.

- `CAPEX (EUR/MW)`: `Nominal investment (*total)` likely to be the best candidate, although requires some tweaking to arrive at the desired `EUR/MW` values for individual heating systems and carbon capture tech.

- `FOM (EUR/MW/year)`: `Fixed O&M (*total) [EUR/MW_x/year]` seems easiest, although I'll need to figure out per what output we want these costs. Options for `x` seem to include `e` presumably for _electricity_, `h` presumably for _heat_, and `i` for _input_. Individual heating systems and carbon capture additions need some conversions to force into the desired format.

- `VOM (EUR/MWh)`: `Variable O&M (*total) [EUR/MWh_x]`, although carbon capture needs to be accounted for based on the produced `tCO2`.

- `Lifetime`: `Technical lifetime [years]` and `Technical economic lifetime [years]` should contain what we need. Straightforward?

- `Conversion Rate (output/input)`: `Electrical efficiency (net, annual average)` and `Heat efficiency (net, annual average)` should get us most of what we need, although individual heating seems to need assumptions about the heat distribution system. Accounting for carbon capture is more challenging, though. We could maybe deduce some impacts based on the `Electricity input [MWh/tCO2]` and `Heat input [MWh/tCO2]` parameters, but that likely gets tricky.

- `CO2 Captured (tCO2/MWh)`: `CO2 capture rate, net [%]` is likely the easiest avenue, although it needs some assumptions about the CO2 content in the input fuel.

- `Fuel Cost`: The catalogues understandably don't consider fuel costs, as they don't really have anything to do with the technologies. These we need to obtain somewhere else, and likely need to be coordinated between different components.
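
The list above could be captured as a simple lookup table for later processing. This is a sketch only: the keywords are the ones used in the searches in this notebook, `nothing` marks a target the catalogues don't cover, and `missing_targets` is a hypothetical helper.

```julia
# WP5 target => search keyword (lowercase), based on the searches in this notebook.
wp5_keywords = Dict(
    "CAPEX (EUR/MW)" => "nominal investment",
    "FOM (EUR/MW/y)" => "fixed o&m",
    "VOM (EUR/MWh)" => "variable o&m",
    "Lifetime" => "lifetime",
    "Conversion Rate (output/input)" => "efficiency",
    "CO2 Captured (ton CO2/MWh)" => "capture rate",
    "Fuel Cost" => nothing,   # not available in the technology catalogues
)

# Hypothetical helper: list the targets with no matching parameter name in `pars`.
function missing_targets(pars; keywords = wp5_keywords)
    lowered = lowercase.(collect(pars))
    return sort([
        target for (target, kw) in keywords
        if kw === nothing || !any(p -> occursin(kw, p), lowered)
    ])
end

# Placeholder parameter names for illustration.
missing_targets(["Nominal investment (*total) [MEUR/MW_e]", "Technical lifetime [years]"])
```

Calling `missing_targets(dh_par)`, `missing_targets(heat_par)`, and `missing_targets(cc_par)` would then show at a glance which targets each catalogue can't provide by keyword alone.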