Merge pull request #340 from irm-codebase/add-industry-module

Add industry module
calliope-project · May 14, 2024 · d189682
2 parents d62b8ce + 09e708c
commit d189682

Showing 16 changed files with 635 additions and 16 deletions.
diff --git a/.github/workflows/schemavalidation.yaml b/.github/workflows/schemavalidation.yaml
@@ -31,3 +31,6 @@ jobs:
               run: python ./tests/validate_schema.py ./tests/resources/schema.yaml --config ./tests/resources/test.yaml
             - name: Validate test schema itself
               run: python ./tests/validate_schema.py ./tests/resources/schema.yaml
+            # Modules
+            - name: Validate industry config
+              run: python ./tests/validate_schema.py ./modules/industry/schema.yaml --config ./modules/industry/config.yaml
diff --git a/.gitignore b/.gitignore
@@ -21,3 +21,5 @@ __pycache__
 # Snakemake
 .snakemake/
 dag.pdf
+**/out/*
+**/tmp/*
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -4,6 +4,8 @@
 
 ### Added (models)
 
+* **ADD** industry module and steel industry energy demand processing. NOT CONNECTED TO THE MAIN WORKFLOW. Industry sectors pending: chemical, "other". (Fixes #308, #310, #347, #345 and #346)
+
 * **ADD** Spatial resolution that aligns with the regions defined by the [e-Highway 2050 project](https://cordis.europa.eu/project/id/308908/reporting) (`ehighways`) (#370).
 
 * **ADD** fully-electrified heat demand (#284).
@@ -24,7 +26,7 @@
 * **ADD** configuration option to build model timeseries data over multiple years, using `first-year` and `final-year` temporal scopes. Available years are 2010-2016 at time of implementing functionality (#152).
 * **ADD** nuclear technology capacity allocation workflow which uses the configuration parameter `nuclear-capacity-scenario` to select whether today's capacities define limits in the model definition ("current") or whether ranges set bounds on future capacity (by linking to a configuration CSV file) (#78).
 * **ADD** a Snakemake rule that generates a .csv and .nc file that provide a summary of the potentials (= per-tech constraints) for each technology and location (#250).
 * **ADD** ability to run on Apple silicon devices (#263).
     * Updated geo packages from gdal 3.2 -> 3.3.
 * **ADD** re-execution triggers based on config and env changes (#264).
 * **ADD** continuous integration test of all conda environments on both ARM macOS and Linux (#369).

diff --git a/Snakefile b/Snakefile
@@ -6,6 +6,18 @@ from snakemake.utils import validate, min_version, makedirs
 configfile: "config/default.yaml"
 validate(config, "config/schema.yaml")
 
+# >>>>>> Include modules >>>>>>
+# Industry
+configfile: "modules/industry/config.yaml"
+validate(config, "modules/industry/schema.yaml")
+
+module module_industry:
+    snakefile: "modules/industry/industry.smk"
+    config: config["industry"]
+use rule * from module_industry as module_industry_*
+# <<<<<< Include modules <<<<<<
+
+
 root_dir = config["root-directory"] + "/" if config["root-directory"] not in ["", "."] else ""
 __version__ = open(f"{root_dir}VERSION").readlines()[0].strip()
 test_dir = f"{root_dir}tests/"
@@ -80,7 +92,6 @@ rule all:
             file=["example-model.yaml", "build-metadata.yaml", "summary-of-potentials.nc", "summary-of-potentials.csv"]
         )
 
-
 rule all_tests:
     message: "Generate euro-calliope pre-built models and run all tests."
     input:

diff --git a/lib/eurocalliopelib/utils.py b/lib/eurocalliopelib/utils.py
@@ -167,3 +167,8 @@ def pj_to_twh(array):
 def tj_to_twh(array):
     """Convert TJ to TWh"""
     return pj_to_twh(array) / 1000
+
+
+def tj_to_ktoe(array):
+    """Convert TJ to ktoe"""
+    return array * 23.88e-3
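Review note: the factor `23.88e-3` can be cross-checked against the common convention 1 toe = 41.868 GJ, i.e. 1 ktoe = 41.868 TJ. The sketch below copies `tj_to_ktoe` from the hunk above; `TJ_PER_KTOE` and `relative_error` are names introduced here for illustration only and are not part of the PR.

```python
# Cross-check of the TJ -> ktoe factor used in tj_to_ktoe.
# Assumption: 1 toe = 41.868 GJ, so 1 ktoe = 41.868 TJ.
TJ_PER_KTOE = 41.868


def tj_to_ktoe(array):
    """Convert TJ to ktoe (copied from the diff above)."""
    return array * 23.88e-3


def relative_error(value_tj):
    """Relative deviation of the rounded factor from the exact conversion."""
    exact = value_tj / TJ_PER_KTOE
    return abs(tj_to_ktoe(value_tj) - exact) / exact


if __name__ == "__main__":
    # Print the deviation introduced by rounding the factor to 23.88e-3.
    print(f"{relative_error(1000.0):.2e}")
```

The rounded factor stays within 0.1 % of the exact value, which is likely acceptable for annual national totals; a more precise constant would only matter if results are compared digit by digit against ktoe figures published elsewhere.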
diff --git a/modules/industry/README.md b/modules/industry/README.md
@@ -0,0 +1,21 @@
+# About
+
+This module retrieves and processes industrial energy demand data.
+
+Basic information:
+
+- Main data sources: JRC-IDEES and Eurostat
+- Spatial resolution: national (limited by JRC-IDEES)
+- Temporal resolution: annual (aggregated)
+
+Some industries, such as iron and steel, are handled as separate 'sub-modules'. This lets you configure the decarbonisation level of each of these industries independently. All remaining industries are grouped into 'other industries'.
+
+For workflow users (i.e. non-developers), `config.yaml` is the main file to look at.
+It sets parameters such as the years included in the data processing, assumptions such as the share of recycled steel, and the industry sectors to include (e.g. iron and steel).
+The `config.yaml` file is structured as follows:
+
+- inputs - the data sources needed from other modules in euro-calliope (for now, this module depends on other modules and is not stand-alone).
+- outputs - the outputs of the whole industry module, usually files, which may be passed on to other modules in euro-calliope.
+- params - the parameters that affect the calculations and results of this module.
+  By changing parameter values, users can tailor the workflow to their needs.
+- setup - anything that concerns the general data pipeline of the module.
diff --git a/modules/industry/config.yaml b/modules/industry/config.yaml
@@ -0,0 +1,13 @@
+industry:
+    inputs:
+        path-energy-balances: build/data/annual-energy-balances.csv
+        path-cat-names: config/energy-balances/energy-balance-category-names.csv
+        path-carrier-names: config/energy-balances/energy-balance-carrier-names.csv
+        path-jrc-industry-energy: build/data/jrc-idees/industry/processed-energy.nc
+        path-jrc-industry-production: build/data/jrc-idees/industry/processed-production.nc
+    outputs:
+        placeholder-out1:
+        placeholder-out2:
+    params:
+        steel:
+            recycled-steel-share: 0.5  # share (0-1) of recycled scrap steel for H-DRI
diff --git a/modules/industry/env_industry.yaml b/modules/industry/env_industry.yaml
@@ -0,0 +1,18 @@
+name: module-industry
+channels:
+    - conda-forge
+    - bioconda
+dependencies:
+    - python=3.11
+    - ipdb=0.13.13
+    - numpy=1.23
+    - pandas=1.5.3
+    - pycountry=18.12.8
+    - snakemake-minimal=8.10.7
+    - netcdf4=1.6.5
+    - xarray=2022.9.0
+    - bottleneck=1.3.8
+    - pip=21.0.1
+    - pip:
+        - styleframe==4.2
+        - -e ./lib
diff --git a/modules/industry/industry.smk b/modules/industry/industry.smk
@@ -0,0 +1,56 @@
+# Paths dependent on main Snakefile
+MODULE_PATH = "modules/industry"
+BUILD_PATH = f"{MODULE_PATH}/build"
+DATA_PATH = f"{MODULE_PATH}/raw_data"
+
+# Paths relative to this snakefile (snakemake behaviour is inconsistent)
+SCRIPT_PATH = "scripts"  # scripts are called relative to this file
+CONDA_PATH = "./env_industry.yaml"
+
+# Ensure rules are defined in order.
+# Otherwise commands like "rules.rulename.output" won't work!
+rule steel_industry:
+    message: "Calculate energy demand for the 'Iron and steel' sector in JRC-IDEES."
+    conda: CONDA_PATH
+    params:
+        config_steel = config["params"]["steel"]
+    input:
+        path_energy_balances = config["inputs"]["path-energy-balances"],
+        path_cat_names = config["inputs"]["path-cat-names"],
+        path_carrier_names = config["inputs"]["path-carrier-names"],
+        path_jrc_industry_energy = config["inputs"]["path-jrc-industry-energy"],
+        path_jrc_industry_production = config["inputs"]["path-jrc-industry-production"],
+    output:
+        path_output = f"{BUILD_PATH}/annual_demand_steel.nc"
+    script: f"{SCRIPT_PATH}/steel_industry.py"
+
+rule chemical_industry:
+    message: "."
+    conda: CONDA_PATH
+    params:
+    input:
+    output:
+    script: f"{SCRIPT_PATH}/chemical_industry.py"
+
+rule other_industry:
+    message: "."
+    conda: CONDA_PATH
+    params:
+    input:
+    output: f"{BUILD_PATH}/other_industry.csv"
+    script: f"{SCRIPT_PATH}/other_industry.py"
+
+# rule combine_and_scale:
+#     message: "."
+#     conda: CONDA_PATH
+#     params:
+#     input:
+#     output:
+#     script:
+
+# rule verify:
+#     message: "."
+#     params:
+#     input:
+#     output:
+#     script:
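Review note: the rules in `industry.smk` index `config["inputs"]` and `config["params"]` directly, without an `industry` prefix, because the main Snakefile passes `config["industry"]` to the module. A minimal plain-Python sketch of that scoping follows; it assumes the global config ends up holding the `industry` section next to the main workflow's keys, and `global_config` and `module_config` are illustrative names, not Snakemake objects.

```python
# Sketch of how the module-scoped config is seen inside industry.smk.
# Assumption: the global config holds the `industry` section from
# modules/industry/config.yaml alongside the main workflow's own keys.
global_config = {
    "root-directory": ".",  # stand-in for the main workflow's settings
    "industry": {
        "inputs": {"path-energy-balances": "build/data/annual-energy-balances.csv"},
        "params": {"steel": {"recycled-steel-share": 0.5}},
    },
}

# `config: config["industry"]` hands only this sub-dict to the module.
module_config = global_config["industry"]

# Inside the module, lookups therefore start below the `industry` key.
config_steel = module_config["params"]["steel"]
print(config_steel)
```

Keeping every module under its own top-level key is what lets `schema.yaml` set `additionalProperties: false` on the `industry` section without rejecting the main workflow's settings.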
diff --git a/modules/industry/schema.yaml b/modules/industry/schema.yaml
@@ -0,0 +1,55 @@
+$schema: https://json-schema.org/draft/2020-12/schema
+type: object
+additionalProperties: true
+properties:
+    industry:
+        description: Module subsection to ensure no name conflicts are possible between modules.
+        type: object
+        additionalProperties: false
+        properties:
+            inputs:
+                type: object
+                additionalProperties: false
+                description: Inputs are paths to prerequisite files.
+                properties:
+                    path-energy-balances:
+                        type: string
+                        description: |
+                            Annual energy balance file.
+                            Columns [cat_code,carrier_code,unit,country,year,value].
+                    path-cat-names:
+                        type: string
+                        description: |
+                            Category mapping file.
+                            Columns [cat_code,top_cat,sub_cat_contribution,sub_cat_1,sub_cat_2,jrc_idees].
+                    path-carrier-names:
+                        type: string
+                        description: |
+                            Carrier mapping file.
+                            Columns [carrier_code,carrier_name,hh_carrier_name,com_carrier_name,ind_carrier_name,oth_carrier_name].
+                    path-jrc-industry-energy:
+                        type: string
+                        description: |
+                            JRC processed industry energy demand .nc file.
+                    path-jrc-industry-production:
+                        type: string
+                        description: |
+                            JRC processed industrial production .nc file.
+            outputs:
+                type: object
+                description: Outputs are paths for the files produced by the module.
+            params:
+                type: object
+                additionalProperties: false
+                description: Parameters allow users to configure module behaviour.
+                properties:
+                    steel:
+                        type: object
+                        additionalProperties: false
+                        description: "Parameters specific to the 'Iron and steel' industry category."
+                        properties:
+                            recycled-steel-share:
+                                type: number
+                                description: "Share of recycled metal in the H-DRI steel process."
+                                minimum: 0
+                                maximum: 1
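Review note: the `steel` sub-schema combines `additionalProperties: false` with a bounded `number`. The sketch below mirrors those two constraints by hand to show which configs would be rejected. It is not the validation that Snakemake's `validate` performs, and `check_steel_params` is a hypothetical helper that does not exist in the PR.

```python
# Hand-rolled mirror of two schema constraints, for illustration only.
# Assumption: `check_steel_params` is a hypothetical helper, not part of the PR.
ALLOWED_STEEL_KEYS = {"recycled-steel-share"}


def check_steel_params(steel):
    """Return a list of violations of the `steel` sub-schema."""
    errors = []
    # additionalProperties: false -> unknown keys are rejected
    for key in sorted(set(steel) - ALLOWED_STEEL_KEYS):
        errors.append(f"unexpected key: {key}")
    share = steel.get("recycled-steel-share")
    if share is not None:
        # type: number (booleans do not count), minimum: 0, maximum: 1
        if isinstance(share, bool) or not isinstance(share, (int, float)):
            errors.append("recycled-steel-share must be a number")
        elif not 0 <= share <= 1:
            errors.append("recycled-steel-share must be within [0, 1]")
    return errors


if __name__ == "__main__":
    print(check_steel_params({"recycled-steel-share": 0.5}))
    print(check_steel_params({"recycled-steel-share": 50}))
```

A value such as `50`, which a user might enter when reading the parameter as a percentage, fails the `maximum: 1` bound, so the schema catches that misreading at validation time.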
diff --git a/modules/industry/scripts/chemical_industry.py b/modules/industry/scripts/chemical_industry.py
diff --git a/modules/industry/scripts/other_industry.py b/modules/industry/scripts/other_industry.py