Add "other" industry demand #355

tud-mchen6 · 2024-04-11T13:50:11Z

Fixes #309.

Adds "other" industry demand to the industry module.

#340 is a prerequisite for this PR.

Checklist

Any checks which are not relevant to the PR can be pre-checked by the PR creator. All others should be checked by the reviewer. You can add extra checklist items here if required by the PR.

CHANGELOG updated
Minimal workflow tests pass
Tests added to cover contribution (not relevant)
Documentation updated (not relevant)
Configuration schema updated (not relevant)

…en/add-chemical-industry

…into add-other-industry-demand

…ther-industry-demand

brynpickering

@tud-mchen6 could you confirm that the only change in this PR compared to #340 is the chemicals_industry.py and other_industry.py files?

modules/industry/src/chemicals_industry.py

brynpickering

Just realised that the PR title mentions only "other", so feel free to ignore my "chemical_industry.py" comments.

brynpickering · 2024-04-12T12:42:43Z

modules/industry/src/other_industry.py

+    jrc_prod_df = jrc_prod_df.drop(specific_industries, level="cat_name")
+
+    # -------------------------------------------------------------------------
+    # Process data


I would split out these main commented steps into separate functions

brynpickering · 2024-04-12T12:43:29Z

modules/industry/src/other_industry.py

+    # if it can be met by electricity (exclusively or otherwise),
+    # then it's an end-use electricity demand
+    electrical_consumption = (
+        jrc.get_carrier_demand("Electricity", demand, jrc_energy_df)


For each of the named carriers ("Electricity", "Natural gas (incl.biogas)", etc.), I'd move them to constants at the start of the file so they're easier to see grouped together.

Moved these to the configuration. Fully flexible now!
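
As a rough illustration of the two options discussed in this thread, the sketch below groups the carrier names in one place and lets a configuration mapping override them. The key name `end-use-carriers` and the helper `carriers_from_config` are hypothetical and not taken from the PR; only the two carrier strings come from the review comment.

```python
# Hypothetical sketch: carrier names grouped in one place instead of being
# scattered through the processing code as string literals.
DEFAULT_CARRIERS = ("Electricity", "Natural gas (incl.biogas)")


def carriers_from_config(params: dict) -> tuple:
    """Return the carriers to extract, preferring the configuration.

    "end-use-carriers" is an assumed key name, not the PR's actual schema.
    """
    return tuple(params.get("end-use-carriers", DEFAULT_CARRIERS))


print(carriers_from_config({}))
print(carriers_from_config({"end-use-carriers": ["Electricity"]}))
```

Keeping a default next to the lookup means the script still runs if the key is absent, while the configuration remains the single place to change the carrier list.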

brynpickering · 2024-04-12T12:44:05Z

modules/industry/src/other_industry.py

+        "subsector", level="cat_name"
+    )
+    all_other_consumption_filled = all_other_consumption_filled.stack()
+    breakpoint()


leftover from debugging

tud-mchen6 · 2024-04-16T07:44:58Z

> @tud-mchen6 could you confirm that the only change in this PR compared to #340 is the chemicals_industry.py and other_industry.py files?

Compared to the commit of #340 at the same point in time, the only change is indeed that other_industry.py is added, along with the relevant rule in the industry.smk file.

The #340 branch has since developed considerably (e.g. integrating the JRC data processing script), so merging these branches will take some effort. However, #340 never developed the chemical industry or "other" industry functionality, so functionality-wise they are still separate.

irm-codebase · 2024-04-27T10:28:30Z

This PR now integrates the JRC module code and xarray processing.
Also, I've added some quality-of-life processing for "other industries":

You can now "turn off" the processing of specific sectors via the configuration. All non-specific sectors will be parsed through the "other" rule.
You can now select which carriers to extract at a final energy level and at an end-use level.
You can select the method of extraction. For now only "priority" is available, which reflects how SCEC does it.

brynpickering · 2024-05-14T15:35:55Z

@irm-codebase ready to rebase onto develop now that #340 is merged in!

…-demand

irm-codebase

@brynpickering ready for review!

CHE pre-processing was leading to some funky issues, so I've added comments on how I dealt with it.

irm-codebase · 2024-05-15T15:35:48Z

modules/industry/scripts/utils/filling.py

@@ -90,8 +81,11 @@ def fill_missing_countries_years(
    _to_fill = _to_fill.bfill(dim="year")
    all_filled = _to_fill.ffill(dim="year")

-    all_filled = jrc.ensure_standard_coordinates(all_filled)
-    all_filled = all_filled.assign_attrs(units="twh")
+    # TODO: CHE has no values for "Wood and wood products" and "Transport Equipment".


CHE was triggering assert failures due to some missing data. For now I am just assuming those values are 0 (same as SCEC).

This is mostly because the CHE processing above this module does not seem to provide data for these sectors, meaning they are filled with nan in all years, so none of our filling methods work.

Let me know what you think.

Yeah, we don't have much choice on that. I assume they are in "other industry" in the CHE data so we can't extract them.
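
To make the failure mode concrete, here is a minimal pure-Python sketch of why backward and forward filling along the year axis cannot help when a sector is missing in every year, and how a constant fallback of 0 resolves it. It does not use the module's xarray calls; the function names and sample values are invented for illustration.

```python
import math


def ffill(values):
    """Forward-fill: replace each NaN with the last valid value before it."""
    filled, last = [], math.nan
    for v in values:
        if not math.isnan(v):
            last = v
        filled.append(last)
    return filled


def bfill(values):
    """Backward-fill: replace each NaN with the next valid value after it."""
    return ffill(values[::-1])[::-1]


def fill_years(values, fallback=0.0):
    """Fill gaps along the year axis, then fall back to a constant."""
    filled = ffill(bfill(values))
    return [fallback if math.isnan(v) else v for v in filled]


# Illustrative numbers only, one entry per year.
partly_missing = [math.nan, 2.0, math.nan, 4.0, math.nan]
all_missing = [math.nan, math.nan, math.nan]

print(fill_years(partly_missing))
print(fill_years(all_missing))
```

With at least one valid year, the gaps are filled from neighbouring years. With none, both fills leave the series untouched, so only the fallback produces a usable value.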

brynpickering

Looks good, just some minor changes to make. I'll go through it again tomorrow by running the code and checking the results at different points. Would be good if a data consistency check existed somewhere, i.e. that no energy demand is lost / added.
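
One possible shape for such a consistency check is sketched below, assuming demand at each stage can be reduced to a mapping of labels to values in the same unit. The function name `check_demand_conserved`, the output labels, and the numbers are hypothetical and do not come from the PR.

```python
import math


def check_demand_conserved(before, after, rel_tol=1e-9):
    """Raise if total demand differs between two processing stages.

    `before` and `after` map a label to demand in the same unit. Categories
    may be regrouped between stages, but the totals must match.
    """
    total_before = sum(before.values())
    total_after = sum(after.values())
    if not math.isclose(total_before, total_after, rel_tol=rel_tol):
        raise ValueError(
            f"Demand not conserved: {total_before} before vs {total_after} after"
        )
    return total_after


# Illustrative numbers only.
raw = {"Wood and wood products": 1.5, "Transport Equipment": 2.5}
processed = {"electricity": 3.0, "other carriers": 1.0}
print(check_demand_conserved(raw, processed))
```

A relative tolerance is used rather than exact equality because regrouping and unit conversion usually introduce small floating-point differences.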

brynpickering · 2024-05-29T16:30:32Z

modules/industry/config.yaml

@@ -9,5 +9,10 @@ industry:
        placeholder-out1:
        placeholder-out2:
    params:
+        specific-industries: ["Iron and steel", "Chemicals Industry"]


I'd rename this param. "specific" isn't very descriptive and "industries" might be better as "subsectors". subsectors-to-decarbonise, subsectors-to-electrify, electrified-subsectors, ... ? I don't like any of those particularly but maybe they can trigger a better idea 😅

separated-subsectors, subsectors-to-process-individually?

I changed it to non-generic-categories and changed other to generic-config.
Hopefully this makes processing clearer.

brynpickering · 2024-05-29T16:35:35Z

modules/industry/industry.smk

-    output:
-        path_output = f"{BUILD_PATH}/annual_demand_steel.nc"
-    script: f"{SCRIPT_PATH}/steel_industry.py"
+if "Iron and steel" in config["params"]["specific-industries"]:


Is this the best way to add this conditionality? @timtroendle maybe you can comment.

The other approach would be to use a conditional list as inputs in a later rule, e.g.:

rule merge_industry_demands:
    input:
        specific_industries = expand(
            f"{BUILD_PATH}/annual_demand_{{subsector}}.nc",
            subsector=subsector_translator(config["params"]["specific-industries"]),
        )

where subsector_translator is a helper function to map e.g. "Iron and steel" to steel.
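
A minimal sketch of what the `subsector_translator` helper could look like, written as plain Python so it can be tried outside Snakemake. The mapping table is an assumption: only the steel pairing is mentioned in this thread, the chemicals stem is a guess, and `BUILD_PATH` is given a placeholder value here.

```python
BUILD_PATH = "build"  # placeholder; the real value comes from the workflow

# Assumed mapping from subsector names to file-name stems. Only the
# "Iron and steel" -> "steel" pair is mentioned in the review; the chemicals
# entry is a guess added for illustration.
SUBSECTOR_STEMS = {
    "Iron and steel": "steel",
    "Chemicals Industry": "chemicals",
}


def subsector_translator(subsectors):
    """Map configured subsector names to the stems used in output files."""
    return [SUBSECTOR_STEMS[name] for name in subsectors]


def specific_industry_outputs(params):
    """Build the list of per-subsector demand files for a merge step."""
    stems = subsector_translator(params["specific-industries"])
    return [f"{BUILD_PATH}/annual_demand_{stem}.nc" for stem in stems]


params = {"specific-industries": ["Iron and steel", "Chemicals Industry"]}
print(specific_industry_outputs(params))
```

Because the list is derived from the configuration, removing a subsector from `specific-industries` removes its file from the merge inputs without any `if` blocks around the rules.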

Not sure I have something useful to say as I am not fully aware of the context. I wonder: Why would steel industry be excluded? Is there a use-case for that? If not, then there is no conditionality needed.

If it is not a "specific" industry then it gets automatically lumped in as "other". So it's possible either to overhaul the steel industry to decarbonise feedstocks, or to just pipe all demands through without converting any processes.

> Is this the best way to add this conditionality? @timtroendle maybe you can comment.
>
> The other approach would be to use a conditional list as inputs in a later rule, e.g.:
>
> rule merge_industry_demands:
>     input:
>         specific_industries = expand(
>             f"{BUILD_PATH}/annual_demand_{{subsector}}.nc",
>             subsector=subsector_translator(config["params"]["specific-industries"]),
>         )
>
> where subsector_translator is a helper function to map e.g. "Iron and steel" to steel.

I like this approach because it's pretty easy to follow and does not add much complexity.
For now I'll keep it as-is since the merging step needs development (SCEC also has a scaling step there).

modules/industry/industry.smk

brynpickering · 2024-05-29T16:37:27Z

modules/industry/industry.smk

    conda: CONDA_PATH
    params:
+        config_params = config["params"],


This is a good moment to separate out params. Pass specific-industries and other to the script individually so that, e.g., changes to steel params don't re-trigger this rule.
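
The effect can be illustrated outside Snakemake with a small sketch. Snakemake can treat changed params as a reason to re-run a rule, so the comparison below stands in for that check. The `steel` entry and the contents of `other` are hypothetical, and the helper names are invented for illustration.

```python
import copy


def whole_params(config):
    """Everything under "params", as the rule currently passes it."""
    return config["params"]


def selected_params(config):
    """Only the entries the "other industry" script reads."""
    params = config["params"]
    return {key: params[key] for key in ("specific-industries", "other")}


base = {
    "params": {
        "specific-industries": ["Iron and steel", "Chemicals Industry"],
        "other": {"method": "priority"},   # hypothetical contents
        "steel": {"recycling-rate": 0.5},  # hypothetical steel-only setting
    }
}
changed = copy.deepcopy(base)
changed["params"]["steel"]["recycling-rate"] = 0.6

print(whole_params(base) == whole_params(changed))        # False
print(selected_params(base) == selected_params(changed))  # True
```

Passing the whole dictionary makes the rule's params differ after a steel-only edit, while passing just the two relevant entries leaves them unchanged.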

brynpickering · 2024-05-29T16:38:32Z

modules/industry/industry.smk

    input:
-    output: f"{BUILD_PATH}/other_industry.csv"
+        path_energy_balances = config["inputs"]["path-energy-balances"],


Since inputs are all paths, I tend to prefer not prepending with path_ here and path- in the config.

I'd like to keep path to avoid confusion between variables holding data and those holding strings.
A bit more explicit.

but if all inputs hold strings...? How about config["input-paths"]["energy-balances"] etc?

brynpickering · 2024-05-29T16:44:30Z