Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update NUTS processing #382

Merged
merged 2 commits into from
May 29, 2024
Merged

Conversation

brynpickering
Copy link
Member

@brynpickering brynpickering commented May 14, 2024

Fixes #380.

In the process I have cleaned up the NUTS processing script. It now has less going on for the same result and has type hinting and more docstrings.

Checklist

Any checks which are not relevant to the PR can be pre-checked by the PR creator. All others should be checked by the reviewer. You can add extra checklist items here if required by the PR.

  • CHANGELOG updated
  • Minimal workflow tests pass
  • Tests added to cover contribution
  • Documentation updated
  • Configuration schema updated

@brynpickering brynpickering marked this pull request as draft May 14, 2024 11:02
@brynpickering brynpickering force-pushed the feature-update-nuts-processing branch from 5b63c02 to 6f8d7a0 Compare May 14, 2024 11:11
@brynpickering brynpickering marked this pull request as ready for review May 14, 2024 11:12
Copy link
Member

@timtroendle timtroendle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. There are a few comprehension questions and minor improvement suggestions.

@@ -9,7 +9,7 @@ data-sources:
hydro-basins: https://www.dropbox.com/sh/hmpwobbz9qixxpe/AADeU9iCgMd3ZO1KgrFmfWu6a/HydroBASINS/standard/eu/hybas_eu_lev07_v1c.zip?dl=1
hydro-stations: https://zenodo.org/record/5215920/files/energy-modelling-toolkit/hydro-power-database-v10.zip?download=1
load: https://data.open-power-system-data.org/time_series/2019-06-05/time_series_60min_stacked.csv
nuts: https://ec.europa.eu/eurostat/cache/GISCO/geodatafiles/NUTS_{nuts_year}_01M_SH.zip
nuts: https://ec.europa.eu/eurostat/cache/GISCO/distribution/v2/nuts/shp/NUTS_RG_01M_{nuts_year}_4326.shp.zip
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do these have the same resolution? If they were more or less precise compared with the previous file, this could lead to different results.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two more comments here:

  • Only now do I see the devilish superfluous whitespace. Had I seen this before, of course I would have rejected the pull request right away.
  • Expecting the {nuts_year} wildcard in the schema could be a good idea. We already do that with other data sources.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Yes, they are the same resolution (01M).
  2. SHAME
  3. will do.


ISO alpha2: alpha2
ISO alpha2 with EU codes: alpha2_eu
ISO alpha3: alpha3
name: country name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This must be inverted: country name: name

@@ -23,11 +23,12 @@ def eu_country_code_to_iso3(eu_country_code):

def convert_country_code(input_country, output="alpha3"):
"""
Converts input country code or name into either a 2- or 3-letter code.
Converts input country code or name into a 2- or 3-letter code or a normalised name.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe: normalised pycountry name


def normalise(path_to_nuts, path_to_output, crs, study_area, all_countries, schema):
"""Normalises raw NUTS data.
LOGGER = logging.getLogger(__name__)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the logger work? It clearly doesn't log to a file, right? Is console logging the default?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't really work at the moment. If you set the log level to warning I think it will log to stdout, otherwise it's effectively silent. It's a placeholder for sorting out logging properly :P

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated it to actually log to the console. Could be an option to set the logging level using a config param or to log to a file given by the rule. For now it simply emulates what the former print statements were doing.


Args:
path_to_nuts (str): Path to downloaded NUTS data.
path_to_output (str): Path to save output GeoPackage.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"path to save geopackage TO", right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"path to which geopackage will be saved" I suppose is the most grammatically correct...

},
crs=gdf_nuts.crs,
).to_crs(crs)
gdf_nuts_clean = _fix_country_names(gdf_nuts_clean)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to avoid storing and restoring these data in different variables, you could "pd.pipe" them.

Remove:
* countries not in study country codes
* entire shapes is not in study area bounding box
* parts of multipolygons that are not in the study area (keeping the rest)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe change for understandability:

parts of multipolygons that are not in the study area (e.g. remove islands and keep the rest)

@brynpickering brynpickering merged commit 263935e into develop May 29, 2024
4 checks passed
@brynpickering brynpickering deleted the feature-update-nuts-processing branch May 29, 2024 16:20
jnnr pushed a commit to jnnr/euro-calliope that referenced this pull request Aug 27, 2024
…-update-nuts-processing

Update NUTS processing
jnnr pushed a commit to jnnr/euro-calliope that referenced this pull request Aug 27, 2024
…-update-nuts-processing

Update NUTS processing
jnnr pushed a commit to jnnr/euro-calliope that referenced this pull request Sep 3, 2024
…-update-nuts-processing

Update NUTS processing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update NUTS link to enable download of different NUTS years
2 participants