ENH: support categorical variables in area_interpolate #135

Merged
sjsrey merged 1 commit into pysal:master on Feb 27, 2021

martinfleis
Member

Hi,

we needed to transfer categorical data (land cover) from one set of polygons to another, so I took the existing area_interpolate and added support for categorical variables. For each target polygon, it measures the ratio of its area covered by each unique category. See the example below.

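import geopandas
from libpysal.examples import load_example
from tobler.area_weighted import area_interpolate

# Load the two Sacramento example datasets (source and target geometries).
sac1 = load_example("Sacramento1")
sac2 = load_example("Sacramento2")
sac1 = geopandas.read_file(sac1.get_path("sacramentot2.shp"))
sac2 = geopandas.read_file(sac2.get_path("SacramentoMSA2.shp"))

# Assign a dummy categorical column by cycling through the categories.
categories = ["cat", "dog", "donkey", "wombat", "capybara"]
sac1["animal"] = (categories * ((len(sac1) // len(categories)) + 1))[: len(sac1)]

# Interpolate the categorical variable from the source onto the target polygons.
res = area_interpolate(
    source_df=sac1,
    target_df=sac2,
    categorical_variables=["animal"],
)
print(res.head())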

   animal_cat  animal_dog  animal_donkey  animal_wombat  animal_capybara  \
0    0.431909    0.000000       0.000000       0.000062         0.000630   
1    0.069708    0.000000       0.000000       0.000000         0.000000   
2    0.630183    0.000000       0.000000       0.000000         0.354106   
3    0.462047    0.378258       0.158367       0.000597         0.000000   
4    0.992120    0.000000       0.000000       0.000000         0.006820   

                                            geometry  
0  POLYGON ((-120.14554 39.22748, -120.14743 39.2...  
1  POLYGON ((-120.37896 39.31638, -120.37917 39.3...  
2  POLYGON ((-120.60887 39.31545, -120.58559 39.3...  
3  POLYGON ((-120.03947 39.23825, -120.03950 39.2...  
4  POLYGON ((-120.65622 39.30815, -120.65456 39.3... 
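
To make the ratio concrete, here is a minimal pure-Python sketch of the idea, not the code in this PR. It assumes the ratio is the overlap area of each category divided by the area of the target polygon, and it uses axis-aligned rectangles instead of real geometries so that it runs without geopandas. The helpers rect_area, intersection_area and categorical_ratios are hypothetical names used only for this illustration.

```python
from collections import defaultdict

# Illustration only: a "polygon" here is an axis-aligned rectangle given as
# (xmin, ymin, xmax, ymax). Real data would use proper polygon intersections.


def rect_area(rect):
    """Area of a rectangle; degenerate or inverted rectangles have area 0."""
    xmin, ymin, xmax, ymax = rect
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def intersection_area(a, b):
    """Area shared by two axis-aligned rectangles (0 if they do not overlap)."""
    overlap = (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))
    return rect_area(overlap)


def categorical_ratios(sources, target):
    """Share of the target's area covered by each category.

    sources: list of (rectangle, category) pairs
    target:  a single rectangle
    Assumption: the ratio is relative to the target's own area.
    """
    target_area = rect_area(target)
    covered = defaultdict(float)
    for rect, category in sources:
        covered[category] += intersection_area(rect, target)
    return {category: area / target_area for category, area in covered.items()}


sources = [
    ((0, 0, 1, 2), "cat"),         # overlaps the target on a 1 x 1 patch
    ((1, 0, 3, 2), "dog"),         # overlaps the target on a 2 x 1 patch
    ((10, 10, 11, 11), "donkey"),  # does not touch the target
]
target = (0, 0, 4, 1)              # area 4, one unit of it is not covered

ratios = categorical_ratios(sources, target)
print(ratios)
```

Under that assumption, the values in a row add up to less than 1 whenever part of the target polygon is not covered by any source polygon, and a category that never overlaps the target gets 0.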

@darribas and I thought it would be good to add it to tobler. Since it reuses a lot of the existing machinery, I just added a categorical_variables keyword to area_interpolate, but I can turn it into an independent function if that is preferable.

@sjsrey self-assigned this Feb 25, 2021
@sjsrey self-requested a review February 25, 2021 18:35
Member

@sjsrey left a comment

Great addition.

@sjsrey merged commit 32c8525 into pysal:master Feb 27, 2021
@martinfleis deleted the categorical branch March 5, 2021 09:27