Features/#165 centralize map zensus to boundaries #180

Merged
gplssm merged 8 commits into dev from features/#165-centralize-map-zensus-to-boundaries
Mar 29, 2021

@ClaraBuettner commented Mar 24, 2021

This branch adds a table that maps zensus data to vg250 municipalities and nuts3 regions.
The table is used by the society prognosis and by population_in_municipalities.
@gplssm Since it wasn't easy for me to adjust your SQLAlchemy code, I rewrote it with (geo)pandas, which was much easier for me and also needed fewer lines of code. Could you please take a look and tell me if you are fine with this?

As I also posted in #165, I decided to map all zensus cells with a population to municipalities.
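
A minimal sketch of what such a mapping could look like in geopandas, not the code from this branch. It assumes that cells are represented by their centroids, that "with a population" means `population > 0`, and that a cell belongs to the municipality whose polygon contains its centroid. The column names `cell_id`, `gem_id` and `population`, the function name, and the toy geometries are all hypothetical. The `predicate` keyword requires geopandas 0.10 or newer (older releases called it `op`).

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical toy data: two square "municipalities" (no CRS set, plain coordinates).
municipalities = gpd.GeoDataFrame(
    {"gem_id": [1, 2]},
    geometry=[box(0, 0, 10, 10), box(10, 0, 20, 10)],
)

# Hypothetical census cells, each represented by its centroid.
cells = gpd.GeoDataFrame(
    {"cell_id": [101, 102, 103, 104], "population": [5, 0, 12, 7]},
    geometry=[Point(2, 2), Point(5, 5), Point(15, 3), Point(12, 8)],
)


def map_cells_to_municipalities(cells, municipalities):
    """Return one row per populated cell with the municipality containing it."""
    # Assumption: only cells with a population greater than zero are mapped.
    populated = cells[cells["population"] > 0]
    # Spatial join: keep cells whose centroid lies within a municipality polygon.
    joined = gpd.sjoin(populated, municipalities, how="inner", predicate="within")
    return joined[["cell_id", "gem_id", "population"]].reset_index(drop=True)


mapping = map_cells_to_municipalities(cells, municipalities)

# Once the mapping exists, downstream steps can aggregate without a new spatial join.
population_per_municipality = mapping.groupby("gem_id")["population"].sum()
print(mapping)
print(population_per_municipality)
```

Building the mapping once and reusing it means the spatial join is not repeated in every step that needs population per municipality.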

Fixes #165

@ClaraBuettner added the 🏗️ integration label (Integrating a data processing step) Mar 24, 2021
@ClaraBuettner self-assigned this Mar 24, 2021
@ClaraBuettner

All 'Tests, code style & coverage' checks failed. Looking at the log, though, I don't think my changes caused this. It looks more like the failure comes from the file egon-data.pid-...yml having been moved to the current working directory in #159.

@gplssm left a comment

Looks good!

I tested locally and had a quick look at vg250_gem_population, which is now created via the mapping table. Seems plausible!

Thanks for making these changes!

Regarding the implementation in geopandas rather than in PostGIS: that's totally fine! Geopandas made significant performance improvements recently (in 2019, I guess) by using pygeos directly. Geopandas should now be as fast as PostGIS, or even faster.

Since the issue about tests is already described in #183, that's fine and not our problem here.

Regarding the comment on the implementation in #165 (comment): I'm generally fine with that. It might lead to a different number of inhabitants depending on which table is used. If you query vg250_gem_population, you will probably get a higher value than if you sum over all census cells whose centroid is located inside Germany (namely destatis_zensus_population_per_ha_inside_germany). But this is really a matter of how the data is used subsequently.
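
To make the difference concrete, here is a small arithmetic sketch with made-up numbers. It assumes a border cell whose centroid lies just outside Germany but which the mapping table still assigns to a municipality. The field names and values are hypothetical and are not taken from the actual tables.

```python
# Hypothetical cells: cell 3 sits on the border, with its centroid outside Germany,
# but is still assigned to municipality "A" by the mapping table (assumption).
cells = [
    {"cell_id": 1, "population": 40, "centroid_inside_germany": True, "gem_id": "A"},
    {"cell_id": 2, "population": 25, "centroid_inside_germany": True, "gem_id": "A"},
    {"cell_id": 3, "population": 10, "centroid_inside_germany": False, "gem_id": "A"},
]

# Total as seen through the mapping table: every populated cell with a municipality.
total_via_mapping = sum(c["population"] for c in cells if c["gem_id"] is not None)

# Total when only cells with their centroid inside Germany are counted.
total_inside_germany = sum(
    c["population"] for c in cells if c["centroid_inside_germany"]
)

print(total_via_mapping, total_inside_germany)
```

In this toy case the municipality-based total exceeds the centroid-based total by exactly the population of the border cell.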

I'll add a note to the CHANGELOG and then merge.

@gplssm merged commit 9d1ddd2 into dev Mar 29, 2021
@gplssm deleted the features/#165-centralize-map-zensus-to-boundaries branch March 29, 2021 10:55
Development

Successfully merging this pull request may close these issues.

Map zensus cells to bounaries only once