Potential bug in network parameter, num_parallel or length #862

Open
pz-max opened this issue Sep 12, 2023 · 17 comments
Labels
bug Something isn't working

Comments

@pz-max
Member

pz-max commented Sep 12, 2023

We thought the OSM data is doing quite well in EU looking at:

Reported by Tom B:
PyPSA-Eur (entsoe) is MUCH closer to official statistics (which here include Turkey, unlike PyPSA-Eur): https://eepublicdownloads.entsoe.eu/clean-documents/Publications/Statistics/Factsheet/entsoe_sfs2021_web.pdf

PyPSA-Eur's 159,000 km for 380 kV is much closer to the official 186,000 km than OSM's 290,000 km.
Ditto for 220+300 kV: our 125,000 km is much closer to the official 131,000 km than OSM's 930,000 km.
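
As a quick check of the figures quoted above, the ratio of each model total to the official total can be computed directly. This is only a sketch: the dictionary layout and names are illustrative, and it assumes all figures are in km.

```python
# Line lengths as quoted above (assumed to be km; "official" = ENTSO-E factsheet).
lengths_km = {
    "380 kV": {"official": 186_000, "pypsa_eur": 159_000, "osm": 290_000},
    "220+300 kV": {"official": 131_000, "pypsa_eur": 125_000, "osm": 930_000},
}


def ratio_to_official(row):
    """Return each model's total divided by the official total."""
    return {k: v / row["official"] for k, v in row.items() if k != "official"}


ratios = {level: ratio_to_official(row) for level, row in lengths_km.items()}

for level, r in ratios.items():
    print(f"{level}: PyPSA-Eur x{r['pypsa_eur']:.2f}, OSM x{r['osm']:.2f}")
```

With these numbers OSM overshoots the official total by a factor of about 1.6 at 380 kV and about 7.1 at 220+300 kV, while PyPSA-Eur stays within roughly 15% in both classes.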

TODO:
Two possible reasons for big OSM error:

  • false num_parallel data in OSM
  • false num_parallel calculations/assumptions when building the OSM network

@pz-max added the bug label on Sep 12, 2023
@ekatef
Collaborator

ekatef commented Sep 13, 2023

Thanks for sharing @pz-max!

Agree that the result is quite surprising. When doing grid validation for Central and Western Asia, we found some discrepancies, but they were never as high as a factor of more than eight...

My feeling is that it would be good:

  1. to look in more detail into the discrepancies found for elec.nc, trying to provide additional insights by voltage class and country;
  2. to compare the OSM-extracted line lengths after cleaning with the ENTSO-E-extracted lengths.
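
A minimal sketch of the two steps above: aggregate line length per country and voltage class for each data set, then take the ratio. The records, the field names (chosen to mirror the PyPSA columns v_nom, length and num_parallel) and the use of length * num_parallel as "circuit km" are assumptions for illustration, not the project's actual validation code.

```python
from collections import defaultdict

# Hypothetical records standing in for rows of each network's lines table.
# v_nom is in kV, length in km; all values are invented for illustration.
osm_lines = [
    {"country": "AT", "v_nom": 220, "length": 40.0, "num_parallel": 2},
    {"country": "AT", "v_nom": 380, "length": 75.0, "num_parallel": 1},
    {"country": "MK", "v_nom": 220, "length": 30.0, "num_parallel": 1},
]
entsoe_lines = [
    {"country": "AT", "v_nom": 220, "length": 38.0, "num_parallel": 2},
    {"country": "AT", "v_nom": 380, "length": 70.0, "num_parallel": 1},
    {"country": "MK", "v_nom": 220, "length": 29.0, "num_parallel": 1},
]


def circuit_km(lines):
    """Sum length * num_parallel per (country, voltage) group."""
    totals = defaultdict(float)
    for line in lines:
        key = (line["country"], line["v_nom"])
        totals[key] += line["length"] * line["num_parallel"]
    return dict(totals)


def compare(a, b):
    """Ratio a/b for every group present in both inputs."""
    ta, tb = circuit_km(a), circuit_km(b)
    return {key: ta[key] / tb[key] for key in ta.keys() & tb.keys()}


ratios = compare(osm_lines, entsoe_lines)
for key in sorted(ratios):
    print(key, round(ratios[key], 3))
```

If the line tables are held in pandas DataFrames, the same grouping is a groupby over the country and v_nom columns followed by a sum.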

@pz-max
Member Author

pz-max commented Sep 14, 2023

One extra point:
3. Maybe the wrong CRS was set in the PyPSA-Earth config...
So elec.nc might be wrong to begin with.
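
To illustrate why the CRS matters for line lengths, here is a small sketch comparing a great-circle (haversine) distance with a naive planar length computed from lon/lat degrees. It is not how PyPSA-Earth computes lengths; the function names and the constant of 111 km per degree are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance between two lon/lat points in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def naive_km(lon1, lat1, lon2, lat2, km_per_degree=111.0):
    """Planar length in degrees times a constant; ignores latitude."""
    return math.hypot(lon2 - lon1, lat2 - lat1) * km_per_degree


# An east-west segment spanning one degree of longitude at two latitudes.
for lat in (0.0, 48.0):
    geodesic = haversine_km(16.0, lat, 17.0, lat)
    planar = naive_km(16.0, lat, 17.0, lat)
    print(f"lat {lat}: geodesic {geodesic:.1f} km, naive {planar:.1f} km")
```

At the equator both values are about 111 km, but at 48° N the geodesic length is only about 74 km, so a length taken in an unsuitable CRS can be off by tens of percent for east-west lines in Europe.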

@GbotemiB
Contributor

GbotemiB commented Sep 16, 2023

Here is a follow-up on @ekatef's suggestions.

Here is a voltage comparison plot between OSM and ENTSOE
image

Using log transformation on the y-axis to scale the data
image

Here is also a more detailed country stat comparison for TW/km
image

Using log transformation on the y-axis to scale the data
image

One thing I noticed is that the country code LU is missing data in the ENTSO-E network.

@pz-max
Member Author

pz-max commented Sep 16, 2023

Thanks @GbotemiB. Regarding units:

  • let's only look at kilometers [km] for each voltage level

@ekatef
Collaborator

ekatef commented Sep 16, 2023

@GbotemiB Amazing plots! 🤩
An interesting catch for Luxembourg. It seems from the ENTSO-E Factsheet that data for LU are included in the ENTSO-E data. Not sure why we don't have data for it in the PyPSA-Eur elec.nc. Good to be aware of this.

@pz-max agree that a comparison of line lengths would be highly interesting. My feeling is that it would be great if we could provide Emmanuel with clean_osm_data. Actually, I hope that the inter-comparison results would also look nicer for lengths :D

@GbotemiB
Contributor

@pz-max @ekatef
Here is the plot considering kilometers for each voltage level
image

Here is the same plot for better understanding
image

image

@pz-max
Member Author

pz-max commented Sep 18, 2023

@GbotemiB can you create a repo with the notebook such that we can review the code easily? (Meaning the data needs to be uploaded somewhere for the notebook too)

@ekatef
Collaborator

ekatef commented Sep 18, 2023

@GbotemiB Ouch... I'd say, the result is quite surprising 😄 Great to have cross-comparison

Agree with @pz-max that your comparison work would be a great contribution to the documentation repo. Would you mind forking it and creating a PR with your notebook?

@davide-f
Member

I personally believe that we should also review how the raw OSM data are converted in the cleaning phase.
Some data filling is done there that should be verified.

AT and MK may be good test cases given the errors.
I may try to share the entire resources folder for those countries, which may support the investigation.
Do you agree?

@ekatef
Collaborator

ekatef commented Sep 21, 2023

> I personally believe that we should also review how the raw OSM data are converted in the cleaning phase. Some data filling is done there that should be verified.
>
> AT and MK may be good test cases given the errors. I may try to share the entire resources folder for those countries, which may support the investigation. Do you agree?

@davide-f my feeling is that it would be perfect :) There are some validations of the cleaned OSM data, but I am not sure anybody has looked into the effects of the cleaning procedure itself.

@pz-max @GbotemiB What is your opinion on this?

@davide-f
Member

> > I personally believe that we should also review how the raw OSM data are converted in the cleaning phase. Some data filling is done there that should be verified.
> > AT and MK may be good test cases given the errors. I may try to share the entire resources folder for those countries, which may support the investigation. Do you agree?
>
> @davide-f my feeling is that it would be perfect :) There are some validations of the cleaned OSM data, but I am not sure anybody has looked into the effects of the cleaning procedure itself.
>
> @pz-max @GbotemiB What is your opinion on this?

Here you can find selected folders of "resources" for the selected countries:
https://drive.google.com/drive/folders/18dV790r11hHKIwpbyDBaxMV4XbhBFQde?usp=drive_link

In particular, it contains folders shapes, osm and base_network that should be all that's needed.
The config file is also included

@ekatef
Collaborator

ekatef commented Sep 21, 2023

> Here you can find selected folders of "resources" for the selected countries: https://drive.google.com/drive/folders/18dV790r11hHKIwpbyDBaxMV4XbhBFQde?usp=drive_link
>
> In particular, it contains folders shapes, osm and base_network that should be all that's needed. The config file is also included

@davide-f Fantastic, thank you very much! 😄

@GbotemiB
Contributor

GbotemiB commented Sep 25, 2023

@pz-max @ekatef @davide-f

Here is a comparison between osm-raw, osm-clean and entsoe data for AT

image
image
image

Here are the corresponding plots for MK
image
image
image

@ekatef
Collaborator

ekatef commented Sep 25, 2023

@GbotemiB amazing result! 🎉 🎉 🎉

My feeling is that the line length for ENTSO-E is lower than for the OSM data due to the coastline paradox. Not sure yet what exactly the implications are for power flow calculations.

Actually, openinframap also gives a complicated picture consistent with the OSM map you provided. The eastern part of Austria as an example:

image

@GbotemiB
Contributor

After doing a bit of cleaning on the data with @ekatef:

  • The 300 kV class in osm-clean AT has only one instance, which might be an error, so we decided to drop it.
  • We decided to eliminate the separate voltage class at 400 kV and just convert it into >380kV.
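
The two cleaning steps above could be sketched as follows. This is an illustration only: it assumes that "convert it into >380kV" means treating 400 kV lines as part of the 380 kV class, and the min_count threshold and sample voltages are hypothetical.

```python
from collections import Counter

# Hypothetical cleaned-line voltages in kV; values invented for illustration.
voltages_kv = [220, 220, 380, 400, 380, 300, 220, 400]


def merge_400_into_380(v_kv):
    """Treat 400 kV lines as part of the 380 kV class (assumed reading)."""
    return 380 if v_kv == 400 else v_kv


def drop_rare_classes(values, min_count=2):
    """Remove voltage classes that occur fewer than min_count times."""
    counts = Counter(values)
    return [v for v in values if counts[v] >= min_count]


merged = [merge_400_into_380(v) for v in voltages_kv]
cleaned = drop_rare_classes(merged)
print(Counter(cleaned))
```

In this toy input the single 300 kV entry is dropped and the two 400 kV entries are counted with the 380 kV class.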

The results:
AT
image
image

MK
image
image

@davide-f
Member

davide-f commented Sep 26, 2023

The numbers do not look bad to me at all!
For these cases, I think that we are definitely within tolerance (orange vs green) for MK, and AT is similar.

To test if the issue is the spatial resolution, the geometries may be simplified using simplify (https://wichita.ogs.ou.edu/OpenLayers-2.12/examples/simplify-linestring.html) with a tolerance similar to that of ENTSO-E, to see if the numbers match better.

Moreover, as a second comparison, it may be good to check the total TW by voltage, calculated as: s_nom line , for lines beyond 10 km.
Note: the s_nom of each line already accounts for num_parallel, so there is no need to multiply s_nom * num_parallel; otherwise we double count the parallel conductors.
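
A minimal sketch of the second comparison, assuming line records with the PyPSA-style fields v_nom (kV), length (km) and s_nom (MVA). The records are invented, and the code sums s_nom as it is, without multiplying by num_parallel, in line with the note above.

```python
from collections import defaultdict

# Hypothetical line records; s_nom is in MVA and is assumed to already
# include the effect of num_parallel, as stated in the note above.
lines = [
    {"v_nom": 220, "length": 5.0, "s_nom": 500.0, "num_parallel": 1},
    {"v_nom": 220, "length": 40.0, "s_nom": 1000.0, "num_parallel": 2},
    {"v_nom": 380, "length": 75.0, "s_nom": 1700.0, "num_parallel": 1},
]


def total_s_nom_by_voltage(lines, min_length_km=10.0):
    """Sum s_nom per voltage level over lines longer than min_length_km."""
    totals = defaultdict(float)
    for line in lines:
        if line["length"] > min_length_km:
            totals[line["v_nom"]] += line["s_nom"]  # no * num_parallel here
    return dict(totals)


totals_mva = total_s_nom_by_voltage(lines)
# Convert MVA to TVA (roughly TW) for reporting.
totals_tw = {v: s / 1e6 for v, s in totals_mva.items()}
print(totals_mva)
```

Multiplying the second record by its num_parallel of 2 would report 2000 MVA instead of 1000 MVA at 220 kV, which is the double counting the note warns about.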

@ekatef
Collaborator

ekatef commented Sep 26, 2023

> The numbers do not look bad to me at all! For these cases, I think that we are definitely within tolerance (orange vs green) for MK, and AT is similar.
>
> To test if the issue is the spatial resolution, the geometries may be simplified using simplify (https://wichita.ogs.ou.edu/OpenLayers-2.12/examples/simplify-linestring.html) with a tolerance similar to that of ENTSO-E, to see if the numbers match better.
>
> Moreover, as a second comparison, it may be good to check the total TW by voltage, calculated as: s_nom line , for lines beyond 10 km. Note: the s_nom of each line already accounts for num_parallel, so there is no need to multiply s_nom * num_parallel; otherwise we double count the parallel conductors.

Thanks a lot @davide-f!

My feeling is that the hint about simplification may be very helpful 🙂 It looks like Douglas–Peucker is exactly what we need here. @GbotemiB this algorithm is available in geopandas as the simplify method. Agree with Davide that it is a great idea to apply GeoSeries.simplify() to the cleaned OSM line geometries and see how it impacts the comparison result.
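
To show the effect being discussed, here is a plain-Python sketch of Douglas–Peucker applied to a zigzag polyline, comparing the length before and after simplification. It is a stand-alone illustration rather than the geopandas implementation; the route coordinates and the tolerance are invented, and planar coordinates in km are assumed.

```python
import math


def point_segment_distance(p, a, b):
    """Planar distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Recursively drop vertices lying within tolerance of the chord."""
    if len(points) < 3:
        return list(points)
    dists = [point_segment_distance(p, points[0], points[-1])
             for p in points[1:-1]]
    idx = max(range(len(dists)), key=dists.__getitem__) + 1
    if dists[idx - 1] <= tolerance:
        return [points[0], points[-1]]
    left = douglas_peucker(points[: idx + 1], tolerance)
    right = douglas_peucker(points[idx:], tolerance)
    return left[:-1] + right  # avoid duplicating the split vertex


def length(points):
    """Total planar length of a polyline."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


# A zigzag route in km: small detours around a straight 10 km corridor.
route = [(float(x), 0.3 if x % 2 else 0.0) for x in range(11)]
simplified = douglas_peucker(route, tolerance=0.5)
print(round(length(route), 2), round(length(simplified), 2))
```

Here the detailed route measures about 10.44 km and the simplified one exactly 10 km, so a more detailed geometry yields a longer line. The tolerance has to be given in the same units as the coordinates, so the geometries should be in a projected CRS before simplifying by a distance in metres or km.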
