Add industry module #340

Merged
Changes from 5 commits
28 commits
0cf664e
Create industry module file structure
Apr 8, 2024
b995777
Initial setup for steel/iron industry
Apr 8, 2024
65ebb01
added formatting of steel and iron industry
Apr 9, 2024
c04b284
Finish the addition of the steel industry processing.
Apr 10, 2024
80e6e6d
Minor fix: correct config[input] usage in rule: steel industry
Apr 10, 2024
dc4240e
Minor PR fixes
Apr 10, 2024
4151c5e
PR fixes: passed scrap steel share to the cnf. Fixed year range.
Apr 11, 2024
5c60a7f
Update modules/industry/src/steel_industry.py
brynpickering Apr 11, 2024
1f892bf
Add documentation of the industry module
Apr 11, 2024
fd0aa3c
minor corrections and initial schema file
irm-codebase Apr 11, 2024
7e4f565
Add validation for the industry module configuration.
irm-codebase Apr 11, 2024
3a647f3
Remove changes to Snakefile
irm-codebase Apr 11, 2024
f40ccf1
Merge remote-tracking branch 'origin/develop' into add-industry-module
irm-codebase Apr 11, 2024
2820d7b
Fix reintroduction of Snakefile merge issue
irm-codebase Apr 11, 2024
fb4844e
PR fixes: change folder names
irm-codebase Apr 11, 2024
fa3f456
Add schema validation for the industry module
irm-codebase Apr 12, 2024
632f92c
Fix conflict with schema validation in GitHub.
irm-codebase Apr 12, 2024
3f3048f
Merge remote-tracking branch 'origin/add-jrc-idees-industry-processin…
irm-codebase Apr 12, 2024
87dfea5
Integrate JRC processing script, removed raw_data
Apr 12, 2024
7e1160f
xarray processing for steel up to specific demand
irm-codebase Apr 18, 2024
80ea10a
Generalized JRC functions.
irm-codebase Apr 19, 2024
931ff81
Added formatting xarray support
irm-codebase Apr 20, 2024
c83c40f
Final xarray updates. Steel output is now .nc
irm-codebase Apr 22, 2024
d518419
Merge remote-tracking branch 'origin/develop' into add-industry-module
irm-codebase May 7, 2024
626b5d1
Updated to newest JRC processing version.
irm-codebase May 8, 2024
e49651d
Improved naming and assertion tests. Updated type annotations.
irm-codebase May 13, 2024
21d0195
Fix default.env merge conflict (no longer used in industry module)
irm-codebase May 14, 2024
09e708c
Merge branch 'develop' into add-industry-module
irm-codebase May 14, 2024
2 changes: 2 additions & 0 deletions .gitignore
@@ -21,3 +21,5 @@ __pycache__
# Snakemake
.snakemake/
dag.pdf
**/out/*
**/tmp/*
4 changes: 4 additions & 0 deletions CHANGELOG.md
@@ -2,6 +2,10 @@

## 1.2.0 (unpublished)

### Added (models)

* **ADD** industry module and steel industry energy demand processing. NOT CONNECTED TO THE MAIN WORKFLOW. Industry sectors pending: chemical, "other".

### Added (models)
* **ADD** fully-electrified heat demand (#284).

9 changes: 8 additions & 1 deletion Snakefile
@@ -6,6 +6,14 @@ from snakemake.utils import validate, min_version, makedirs
configfile: "config/default.yaml"
validate(config, "config/schema.yaml")

# Include modules
configfile: "modules/industry/config.yaml"
module module_industry:
    snakefile: "modules/industry/industry.smk"
    config: config["industry"]
use rule * from module_industry as module_industry_*
#

root_dir = config["root-directory"] + "/" if config["root-directory"] not in ["", "."] else ""
__version__ = open(f"{root_dir}VERSION").readlines()[0].strip()
test_dir = f"{root_dir}tests/"
@@ -76,7 +84,6 @@ rule all:
        "build/models/continental/summary-of-potentials.nc",
        "build/models/continental/summary-of-potentials.csv"


rule all_tests:
    message: "Generate euro-calliope pre-built models and run all tests."
    input:
2 changes: 1 addition & 1 deletion envs/shell.yaml
@@ -2,6 +2,6 @@ name: shell
channels:
    - conda-forge
dependencies:
    - curl=7.76.0
    - curl=8.6.0
    - unzip=6.0
    - rsync=3.2.3
5 changes: 5 additions & 0 deletions lib/eurocalliopelib/utils.py
@@ -68,3 +68,8 @@ def tj_to_twh(array):
def gwh_to_tj(array):
    """Convert GWh to TJ"""
    return array * 3.6


def tj_to_ktoe(array):
    """Convert TJ to ktoe"""
    return array * 23.88e-3
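
The conversion factor above can be cross-checked with a few lines of arithmetic. This is a minimal sketch, assuming the IEA convention of 1 toe = 41.868 GJ (an assumption, the PR does not state which definition it uses); the helper `gwh_to_ktoe` is hypothetical and not part of `eurocalliopelib`. The same arithmetic appears to reproduce the hydrogen and H-DRI constants defined in `steel_industry.py`.

```python
# Assumption: 1 toe = 41.868 GJ (IEA convention), so 1 ktoe = 41.868 TJ.
TJ_PER_KTOE = 41.868
# Exact: 1 GWh = 3.6 TJ (the factor used by gwh_to_tj in the diff).
TJ_PER_GWH = 3.6


def tj_to_ktoe(value):
    """Convert TJ to ktoe using the assumed toe definition."""
    return value / TJ_PER_KTOE


def gwh_to_ktoe(value):
    """Hypothetical helper: convert GWh to ktoe via TJ."""
    return tj_to_ktoe(value * TJ_PER_GWH)


# Factor used by tj_to_ktoe in the diff (23.88e-3).
factor = tj_to_ktoe(1.0)

# Hydrogen LHV: 0.0333 TWh/kt = 33.3 GWh/kt.
h2_lhv_ktoe_per_kt = gwh_to_ktoe(33.3)

# H-DRI electricity use: 135 kWh/t = 135 MWh/kt = 0.135 GWh/kt.
hdri_ktoe_per_kt = gwh_to_ktoe(0.135)

print(round(factor, 5), round(h2_lhv_ktoe_per_kt, 3), round(hdri_ktoe_per_kt, 4))
```

Under that assumption, the rounded values agree with `23.88e-3`, `H2_LHV_KTOE = 2.863` and `HDRI_CONSUMPTION = 0.0116`.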
3 changes: 3 additions & 0 deletions modules/industry/README.md
@@ -0,0 +1,3 @@
# About

Placeholder text for the industry module documentation.
10 changes: 10 additions & 0 deletions modules/industry/config.yaml
Member

You should have a schema for this config.

Contributor Author

We've created issue #347 to tackle this.

A schema is a great idea, but the module's configuration might change too much while we port features. We'd prefer to do it once the chemical and "other" sectors are ported, to ensure the configuration file is mature enough.

Member

This config must go into the metadata that we create for model builds. See the build_metadata rule.

Contributor Author

> This config must go into the metadata that we create for model builds. See the build_metadata rule.

The general config already includes our module's configuration. So it is already in the metadata :)

# Include modules
configfile: "modules/industry/config.yaml"
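
Until the schema discussed in this thread exists, a lightweight key check could catch an incomplete module configuration early. The following is a minimal sketch, assuming the key layout of `modules/industry/config.yaml` as it appears in this PR; the function `check_industry_config` and the `REQUIRED_KEYS` table are hypothetical and not part of the PR or of Snakemake.

```python
# Hypothetical helper, not part of the PR. The key names mirror
# modules/industry/config.yaml as shown in this diff.
REQUIRED_KEYS = {
    "inputs": ["path-energy-balances", "path-cat-names", "path-carrier-names"],
    "params": ["year-range"],
}


def check_industry_config(config: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for section, keys in REQUIRED_KEYS.items():
        section_cfg = config.get(section)
        if not isinstance(section_cfg, dict):
            problems.append(f"missing or empty section: {section}")
            continue
        for key in keys:
            if key not in section_cfg:
                problems.append(f"missing key: {section}.{key}")
    params = config.get("params")
    years = params.get("year-range", []) if isinstance(params, dict) else []
    if not all(isinstance(year, int) for year in years):
        problems.append("params.year-range must contain only integers")
    return problems


# The "industry" section of the config, written as a plain dict.
example_config = {
    "inputs": {
        "path-energy-balances": "build/data/annual-energy-balances.csv",
        "path-cat-names": "config/energy-balances/energy-balance-category-names.csv",
        "path-carrier-names": "config/energy-balances/energy-balance-carrier-names.csv",
    },
    "outputs": None,
    "params": {"year-range": list(range(2000, 2019))},
    "setup": {"main-out-path": "data/industry"},
}

problems = check_industry_config(example_config)
print(problems or "config looks complete")
```

A proper schema validated through Snakemake would replace a check like this once the configuration has stabilised.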

@@ -0,0 +1,10 @@
industry:
    inputs:
        path-energy-balances: build/data/annual-energy-balances.csv
        path-cat-names: config/energy-balances/energy-balance-category-names.csv
        path-carrier-names: config/energy-balances/energy-balance-carrier-names.csv
    outputs:
    params:
        year-range: [2000, 2001, 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, 2018]
    setup:
        main-out-path: data/industry  # Path to use in case output files must be created at the EC level.
14 changes: 14 additions & 0 deletions modules/industry/env_industry.yaml
@@ -0,0 +1,14 @@
name: module-industry
channels:
    - conda-forge
    - bioconda
dependencies:
    - python=3.9
    - ipdb=0.13.13
    - numpy=1.20.2
    - pandas=1.2.3
    - pycountry=18.12.8
    - snakemake-minimal=7.26.0
    - pip:
        - styleframe==4.2
        - -e ./lib
57 changes: 57 additions & 0 deletions modules/industry/industry.smk
@@ -0,0 +1,57 @@
SRC_PATH = "src"
TMP_PATH = "modules/industry/tmp"
OUT_PATH = "modules/industry/out"
DATA_PATH = "modules/industry/raw_data"
# Snakemake searches for conda environments at the same level as this file
CONDA_PATH = "./env_industry.yaml"

# TODO:
# jrc_idees_processed* files in the raw_data folder should ideally be produced by a rule instead

# Ensure rules are defined in order.
# Otherwise commands like "rules.rulename.output" won't work!
rule steel_industry:
    message: "Calculate energy demand for the 'Iron and steel' sector in JRC-IDEES."
    conda: CONDA_PATH
    params:
        year_range = config["params"]["year-range"]
    input:
        path_energy_balances = config["inputs"]["path-energy-balances"],
        path_cat_names = config["inputs"]["path-cat-names"],
        path_carrier_names = config["inputs"]["path-carrier-names"],
        path_jrc_energy = f"{DATA_PATH}/jrc_idees_processed_energy.csv.gz",
        path_jrc_production = f"{DATA_PATH}/jrc_idees_processed_production.csv.gz",
    output:
        path_output = f"{TMP_PATH}/annual_demand_steel.csv"
    script: f"{SRC_PATH}/steel_industry.py"

rule chemical_industry:
    message: "."
    conda: CONDA_PATH
    params:
    input:
    output:
    script: f"{SRC_PATH}/chemicals.py"

rule other_industry:
    message: "."
    conda: CONDA_PATH
    params:
    input:
    output: f"{TMP_PATH}/other_industry.csv"
    script: f"{SRC_PATH}/other_industry.py"

# rule combine_and_scale:
# message: "."
# conda: CONDA_PATH
# params:
# input:
# output:
# script:

# rule verify:
# message: "."
# params:
# input:
# output:
# script:
Binary file not shown.
Binary file not shown.
Empty file.
Empty file.
225 changes: 225 additions & 0 deletions modules/industry/src/steel_industry.py
@@ -0,0 +1,225 @@
import eurocalliopelib.utils as ec_utils
import pandas as pd
from utils import formatting
Member

Importing your own library functions in Snakemake requires careful consideration. Mainly, you want to make sure that changes in the library code trigger a re-execution of the rule. Currently, they don't.

Would it make sense to move all reusable code to the euro-calliope-lib? That would circumvent the problem. Currently the lib lives in the main repo, but we are considering pulling it into its own repo.

Contributor Author

I think this is a strong argument to use wrappers instead...

Moving these utils to another location (outside the module) sort of defeats the purpose of modularisation in the first place...

from utils import jrc_idees_parser as jrc

CAT_NAME_STEEL = "Iron and steel"

H2_LHV_KTOE = 2.863 # 0.0333 TWh/kt LHV -> 2.863ktoe/kt
RECYCLED_STEEL = 0.5 # 50% H-DRI Iron, 50% scrap steel
H2_TO_STEEL = (1 - RECYCLED_STEEL) * 0.05 # 0.05t_h2/t_steel in H-DRI
HDRI_CONSUMPTION = 0.0116 # H-DRI: 135kWh_e/t = 0.0116ktoe/kt


def get_steel_demand_df(
    year_range: list,
    path_energy_balances: str,
    path_cat_names: str,
    path_carrier_names: str,
    path_jrc_energy: str,
    path_jrc_production: str,
    path_output: str = "",
) -> pd.DataFrame:
    """Execute the data processing pipeline for the "Iron and steel" sub-sector.

    Args:
        year_range (list): years to include and interpolate.
        path_energy_balances (str): country energy balances (usually from eurostat).
        path_cat_names (str): eurostat category mapping file.
        path_carrier_names (str): eurostat carrier name mapping file.
        path_jrc_energy (str): jrc country-specific industrial energy demand file.
        path_jrc_production (str): jrc country-specific industrial production file.
        path_output (str): location of steel demand output file.

    Returns:
        pd.DataFrame: dataframe with steel demand per country.
    """
    # -------------------------------------------------------------------------
    # Prepare data files
    # -------------------------------------------------------------------------
    energy_balances_df = pd.read_csv(
        path_energy_balances, index_col=[0, 1, 2, 3, 4], squeeze=True
    )
    cat_names_df = pd.read_csv(path_cat_names, header=0, index_col=0)
    carrier_names_df = pd.read_csv(path_carrier_names, header=0, index_col=0)
    energy_df = pd.read_csv(path_jrc_energy, index_col=[0, 1, 2, 3, 4, 5, 6])
    prod_df = pd.read_csv(path_jrc_production, index_col=[0, 1, 2, 3])
    # Ensure dataframes only have data specific to this industry
    cat_names_df = cat_names_df[cat_names_df["jrc_idees"] == CAT_NAME_STEEL]
    energy_df = energy_df.xs(CAT_NAME_STEEL, level="cat_name", drop_level=False)
    prod_df = prod_df.xs(CAT_NAME_STEEL, level="cat_name", drop_level=False)

    # -------------------------------------------------------------------------
    # Process data
    # -------------------------------------------------------------------------
    steel_energy_consumption = process_steel_energy_consumption(energy_df, prod_df)

    # -------------------------------------------------------------------------
    # Format the final output
    # -------------------------------------------------------------------------
    steel_energy_consumption.columns = steel_energy_consumption.columns.astype(
        int
    ).rename("year")
    filled_consumption_df = formatting.fill_missing_data(
        energy_balances_df,
        cat_names_df,
        carrier_names_df,
        steel_energy_consumption,
        year_range,
    )

    units = filled_consumption_df.index.get_level_values("unit")
    filled_consumption_df.loc[units == "ktoe"] = filled_consumption_df.loc[
        units == "ktoe"
    ].apply(ec_utils.ktoe_to_twh)
    filled_consumption_df = filled_consumption_df.rename({"ktoe": "twh"}, level="unit")
    filled_consumption_df.index = filled_consumption_df.index.set_names(
        "subsector", level="cat_name"
    )
    filled_consumption_df = filled_consumption_df.stack()

    if path_output:
        filled_consumption_df.reorder_levels(formatting.LEVEL_ORDER).to_csv(path_output)

    return filled_consumption_df


def process_steel_energy_consumption(
    jrc_energy_df: pd.DataFrame, jrc_prod_df: pd.DataFrame
) -> pd.DataFrame:
    """Processing of steel energy demand for different carriers.

    Calculates energy consumption in the iron and steel industry based on expected
    change in processes to avoid fossil feedstocks. All process specific energy consumption
    (energy/t_steel) is based on the Electric Arc process (EAF), except sintering, which
    will be required for iron ore processed using H-DRI, but is not required by EAF.

    This function does the following:
        1. Finds all the specific consumption values by getting
            a. process energy demand / produced steel => specific demand
            b. process electrical demand / electrical consumption => electrical efficiency
            c. specific demand / electrical efficiency => specific electricity consumption
        2. Gets total process specific electricity consumption by adding specific consumptions
            for direct electric processes, EAF, H-DRI, smelting, sintering, refining, and finishing
        3. Gets specific hydrogen consumption for all countries that will process iron ore
        4. Gets specific space heat demand based on demand associated with EAF plants
        5. Gets total demand for electricity, hydrogen, and space heat by multiplying specific
            demand by total steel production (by both EAF and BF-BOF routes).

    Args:
        jrc_energy_df (pd.DataFrame): jrc country-specific steel energy demand.
        jrc_prod_df (pd.DataFrame): jrc country-specific steel production.

    Returns:
        pd.DataFrame: processed dataframe with the expected steel energy consumption.
    """

    # sintering/pelletising
    sintering_specific_consumption = jrc.get_specific_electricity_consumption(
        "Integrated steelworks",
        "Steel: Sinter/Pellet making",
        jrc_energy_df,
        jrc_prod_df,
    )
    # smelters
    eaf_smelting_specific_consumption = jrc.get_specific_electricity_consumption(
        "Electric arc", "Steel: Smelters", jrc_energy_df, jrc_prod_df
    )
    # EAF
    eaf_specific_consumption = jrc.get_specific_electricity_consumption(
        "Electric arc", "Steel: Electric arc", jrc_energy_df, jrc_prod_df
    )
    # Rolling & refining
    refining_specific_consumption = jrc.get_specific_electricity_consumption(
        "Electric arc",
        "Steel: Furnaces, Refining and Rolling",
        jrc_energy_df,
        jrc_prod_df,
    )
    finishing_specific_consumption = jrc.get_specific_electricity_consumption(
        "Electric arc", "Steel: Products finishing", jrc_energy_df, jrc_prod_df
    )
    # Auxiliaries (lighting, motors, etc.)
    auxiliary_specific_consumption = jrc.get_auxiliary_electricity_consumption(
        "Electric arc", jrc_energy_df, jrc_prod_df
    )

    # Total electricity consumption
    # If the country produces steel from iron ore (assuming 50% recycling):
    # sintering/pelletizing * iron_ore_% + smelting * recycled_steel_% + H-DRI + EAF + refining/rolling + finishing + auxiliaries
    # If the country only recycles steel:
    # smelting + EAF + refining/rolling + finishing + auxiliaries

    total_specific_consumption = (
        sintering_specific_consumption.mul(1 - RECYCLED_STEEL)
        .add(
            eaf_smelting_specific_consumption
            # if no sintering, this country/year recycles 100% of steel
            .where(sintering_specific_consumption == 0)
            # if there is sintering, update smelting consumption to equal our assumed 2050 recycling rate
            # and add weighted H-DRI consumption to process the remaining iron ore
            .fillna(
                eaf_smelting_specific_consumption.mul(RECYCLED_STEEL).add(
                    HDRI_CONSUMPTION
                )
            )
        )
        .add(eaf_specific_consumption)
        .add(refining_specific_consumption)
        .add(finishing_specific_consumption)
        .add(auxiliary_specific_consumption)
    )
    # In case our model now says a country does produce steel,
    # we give them the average of energy consumption of all other countries
    total_specific_consumption = (
        total_specific_consumption.where(total_specific_consumption > 0)
        .fillna(total_specific_consumption.mean())
        .assign(carrier="electricity")
        .set_index("carrier", append=True)
    )

    # Hydrogen consumption for H-DRI, only for those country/year combinations that handle iron ore
    # and don't recycle all their steel
    h2_specific_consumption = H2_LHV_KTOE * H2_TO_STEEL
    total_specific_h2_consumption = (
        total_specific_consumption.where(sintering_specific_consumption > 0)
        .fillna(0)
        .where(lambda x: x == 0)
        .fillna(h2_specific_consumption)
        .rename(index={"electricity": "hydrogen"})
    )
    total_specific_consumption = total_specific_consumption.append(
        total_specific_h2_consumption
    )

    # Space heat
    space_heat_specific_demand = (
        jrc_energy_df.xs(("demand", "Electric arc", "Low enthalpy heat"))
        .div(jrc_prod_df.xs("Electric arc").droplevel("unit"))
        .assign(carrier="space_heat")
        .set_index("carrier", append=True)
        .sum(level=total_specific_consumption.index.names)
        .rename(index={"ktoe": "ktoe/kt"})
    )
    total_specific_consumption = total_specific_consumption.append(
        space_heat_specific_demand
    )

    steel_consumption = total_specific_consumption.mul(
        jrc_prod_df.xs("Iron and steel", level="cat_name").sum(level="country_code"),
        level="country_code",
    ).rename(index={"ktoe/kt": "ktoe"})

    return steel_consumption


if __name__ == "__main__":
    get_steel_demand_df(
        year_range=snakemake.params.year_range,
        path_energy_balances=snakemake.input.path_energy_balances,
        path_cat_names=snakemake.input.path_cat_names,
        path_carrier_names=snakemake.input.path_carrier_names,
        path_jrc_energy=snakemake.input.path_jrc_energy,
        path_jrc_production=snakemake.input.path_jrc_production,
        path_output=snakemake.output.path_output,
    )
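
The `where`/`fillna` chain that blends smelting and H-DRI consumption in `process_steel_energy_consumption` can be hard to follow on the full multi-indexed data. Below is a minimal sketch on a toy `pandas.Series`, assuming invented country codes and consumption values (none of the numbers come from JRC-IDEES); only the constants are copied from the script. The hydrogen step is written as a simplified single `where` call rather than the chained version used in the PR.

```python
import pandas as pd

# Constants copied from steel_industry.py
RECYCLED_STEEL = 0.5
HDRI_CONSUMPTION = 0.0116  # ktoe/kt
H2_LHV_KTOE = 2.863  # ktoe/kt
H2_TO_STEEL = (1 - RECYCLED_STEEL) * 0.05

# Toy specific consumption values (ktoe/kt), invented for illustration.
# "AAA" processes iron ore (sintering > 0); "BBB" only recycles steel.
sintering = pd.Series({"AAA": 0.02, "BBB": 0.0})
smelting = pd.Series({"AAA": 0.04, "BBB": 0.04})

# Keep smelting unchanged where there is no sintering; elsewhere replace it
# with the recycled share of smelting plus the H-DRI electricity demand.
smelting_term = smelting.where(sintering == 0).fillna(
    smelting.mul(RECYCLED_STEEL).add(HDRI_CONSUMPTION)
)

# Sintering only applies to the share of steel made from iron ore.
partial_total = sintering.mul(1 - RECYCLED_STEEL).add(smelting_term)

# Simplified hydrogen step: a constant demand where iron ore is processed,
# zero where all steel is recycled.
hydrogen = pd.Series(H2_LHV_KTOE * H2_TO_STEEL, index=sintering.index).where(
    sintering > 0, 0.0
)

print(partial_total)
print(hydrogen)
```

In this toy case, the ore-processing country ends up with half of its sintering and smelting demand plus the H-DRI term, while the recycling-only country keeps its full smelting demand and has no hydrogen demand.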