Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add industry module #340

Merged

Conversation

irm-codebase
Copy link
Contributor

@irm-codebase irm-codebase commented Apr 10, 2024

Fixes #308, #310, #347, #345 and #346.

This pull request includes the "skeleton" of the industry module, and includes the steel industry sub-workflow as an example.

Checklist

Any checks which are not relevant to the PR can be pre-checked by the PR creator. All others should be checked by the reviewer. You can add extra checklist items here if required by the PR.

  • CHANGELOG updated
  • Minimal workflow tests pass
  • Steel rule done

@irm-codebase irm-codebase removed the request for review from timtroendle April 10, 2024 10:28
Copy link
Member

@brynpickering brynpickering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments. Will need @timtroendle to comment on the utils files. We have eurocalliopelib at the moment to store utility functions. Should these go there? Or does the modularisation require a different approach?

modules/industry/config.yaml Outdated Show resolved Hide resolved
modules/industry/industry.smk Outdated Show resolved Hide resolved
modules/industry/industry.smk Outdated Show resolved Hide resolved
modules/industry/src/steel_industry.py Outdated Show resolved Hide resolved
modules/industry/src/steel_industry.py Outdated Show resolved Hide resolved
modules/industry/src/utils/formatting.py Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should have a schema for this config.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've created issue #347 to tackle this.

A schema is a great idea, but the module's configuration might change too much while we port features. We'd prefer to do it once the chemical and "other" sectors are ported, to ensure the configuration file is mature enough.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This config must go into the metadata that we create for model builds. See the build_metadata rule.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This config must go into the metadata that we create for model builds. See the build_metadata rule.

The general config already includes our module's configuration. So it is already in the metadata :)

# Include modules
configfile: "modules/industry/config.yaml"

Co-authored by: Meijun Chen (tud-mchen6)
@brynpickering brynpickering self-requested a review April 11, 2024 08:21
Copy link
Member

@timtroendle timtroendle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an excited pull request, thanks!

In my review, I mainly focussed on the structure assuming @brynpickering has checked the data processing logic. There are a few consistency related issues that need addressing.

In addition, this change is not adding a feature and it's not show-casing the entire integration and no tests. Would it be possible to refactor the change so it's a feature? What I mean is: can we build this into the model? I understand you derive annual steal demand, so from there you could quite easily create flat steel demand profiles for all locations and put those into the model. Only then would we see the entire integration pipeline: model feature, documentation, and tests.

CHANGELOG.md Outdated Show resolved Hide resolved
Snakefile Outdated Show resolved Hide resolved
modules/industry/README.md Outdated Show resolved Hide resolved
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This config must go into the metadata that we create for model builds. See the build_metadata rule.

modules/industry/config.yaml Show resolved Hide resolved
modules/industry/industry.smk Outdated Show resolved Hide resolved
modules/industry/industry.smk Outdated Show resolved Hide resolved
modules/industry/industry.smk Show resolved Hide resolved

import eurocalliopelib.utils as ec_utils
import pandas as pd
from utils import formatting
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Importing your own library functions in Snakemake requires careful consideration. Mainly, you want to make sure that changes in the library code trigger a re-execution of the rule. Currently, they don`t.

Would it make sense to move all re-usable code to the euro-calliope-lib? That will circumvent the problem. Currently the lib lives in the main repo, but we are considering to pull that into it's own repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a strong argument to use wrappers instead...

Moving these utils to another location (outside the module) sort of defeats the purpose of modularisation in the first place...

modules/industry/src/utils/formatting.py Outdated Show resolved Hide resolved
@timtroendle
Copy link
Member

@irm-codebase and @tud-mchen6 maybe we can discuss the design briefly after the group meeting today? @brynpickering may be useful to include in the discussion.

@irm-codebase
Copy link
Contributor Author

@irm-codebase and @tud-mchen6 maybe we can discuss the design briefly after the group meeting today? @brynpickering may be useful to include in the discussion.

That would be good! No problem on my side 👍

@tud-mchen6 tud-mchen6 mentioned this pull request Apr 11, 2024
5 tasks
@irm-codebase
Copy link
Contributor Author

irm-codebase commented Apr 22, 2024

Relatively big update to this PR to fix #346.

This was necessary because #354 made our pandas code go at a snail's pace. Moving to xarray was more efficient than trying to debug the cause (plus, the module is quite fast now). All results have been tested using checksums, and I can confirm they are the same.

@irm-codebase irm-codebase added the Industry Industrial energy demand label Apr 23, 2024
@brynpickering
Copy link
Member

@irm-codebase if you can rebase this onto current develop then I'll review

@irm-codebase
Copy link
Contributor Author

irm-codebase commented May 8, 2024

@brynpickering updated to the newest JRC version (checksum passed, no difference 👍 )

I had to modify a couple of lines at the JRC level because the production dataset did not assign names to the variable. I also passed the attributes down to the variable level. Hope that's OK.

Copy link
Member

@brynpickering brynpickering left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, we're close to having this ready to merge in!

scripts/jrc-idees/industry.py Outdated Show resolved Hide resolved
CHANGELOG.md Show resolved Hide resolved
modules/industry/README.md Outdated Show resolved Hide resolved
modules/industry/README.md Outdated Show resolved Hide resolved
modules/industry/README.md Outdated Show resolved Hide resolved
modules/industry/scripts/utils/jrc_idees_parser.py Outdated Show resolved Hide resolved
modules/industry/scripts/utils/formatting.py Outdated Show resolved Hide resolved
modules/industry/scripts/utils/formatting.py Outdated Show resolved Hide resolved
modules/industry/scripts/steel_industry.py Show resolved Hide resolved
modules/industry/scripts/utils/formatting.py Outdated Show resolved Hide resolved
@irm-codebase
Copy link
Contributor Author

@brynpickering added most of your suggestions, with the exception of the gap filling test.

Let me know if you find anything else during review. Otherwise, I'll merge this in and update the "chemical" and "other" cases.

@brynpickering
Copy link
Member

One thing that is possibly missing is documentation in the main euro-calliope docs about these new modules. But for now I think we can ignore that as we may well move this to a different repo before the next release...

@timtroendle
Copy link
Member

I would add documentation now. It's now in this repo and we may or may not move it somewhere else in the future. Like the model, the documentation is quite modular anyway, isn't it? So we could easily move the industry bit out of the documentation, too, should we decide to pull out this module into its own repo. We may as well decide to keep the documentation in the main repo and simply mention the optional module (one aspect we haven't discussed at all so far).

@irm-codebase
Copy link
Contributor Author

@timtroendle I'd say that adding full documentation now is only additional overhead: this is only part of the module (we are still missing other industry and chemical industry), plus disaggregation, testing and scaling steps. Also, we have already added pre-eliminary documentation explaining usage (see the README.md).

Besides, I agree with Bryn: moving module subworkflows into a separate repo makes a lot of sense. Still, I created issue #381 to make sure we stay on point with this.

For now, let's finish this PR once and for all, please 🙏🙏🙏?

@timtroendle
Copy link
Member

Yeah fair point. Sorry, didn't want to be annoying, it's just easy to forget these things.

envs/default.yaml Outdated Show resolved Hide resolved
@brynpickering brynpickering merged commit d189682 into calliope-project:develop May 14, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Industry Industrial energy demand
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Define and create interfaces for the integration of an industry sub-module
4 participants