Plot choropleth with consistent legend and bins #1019

Closed
tommylees112 opened this issue Jun 20, 2019 · 10 comments

Comments

@tommylees112

tommylees112 commented Jun 20, 2019

How do I set a consistent colorscheme for three axes in the same figure?

The following should be a wholly reproducible example to run the code and get the same figure I have posted below.

Get the shapefile data from the Office for National Statistics. Run this in a terminal as a bash file / commands.

wget --output-document 'LA_authorities_boundaries.zip' 'https://opendata.arcgis.com/datasets/8edafbe3276d4b56aec60991cbddda50_1.zip?outSR=%7B%22latestWkid%22%3A27700%2C%22wkid%22%3A27700%7D&session=850489311.1553456889'

mkdir LA_authorities_boundaries
cd LA_authorities_boundaries
unzip ../LA_authorities_boundaries.zip

The python code that reads the shapefile and creates a dummy GeoDataFrame for reproducing the behaviour.

import geopandas as gpd
import numpy as np  # needed for np.random.normal below
import pandas as pd
import matplotlib.pyplot as plt

gdf = gpd.read_file(
    'LA_authorities_boundaries/Local_Authority_Districts_December_2015_Full_Extent_Boundaries_in_Great_Britain.shp'
)

# 380 values
df = pd.DataFrame([])
df['AREA_CODE'] = gdf.lad15cd.values
df['central_pop'] = np.random.normal(30, 15, size=(len(gdf.lad15cd.values)))
df['low_pop'] = np.random.normal(10, 15, size=(len(gdf.lad15cd.values)))
df['high_pop'] = np.random.normal(50, 15, size=(len(gdf.lad15cd.values)))

Join the shapefile from ONS and create a geopandas.GeoDataFrame

def join_df_to_shp(pd_df, gpd_gdf):
    """Attach each district's geometry to pd_df and return a GeoDataFrame."""
    df_ = pd.merge(pd_df, gpd_gdf[['lad15cd','geometry']], left_on='AREA_CODE', right_on='lad15cd', how='left')

    # DROP rows with no matching geometry (the NI counties)
    df_ = df_.dropna(subset=['geometry'])

    # convert back to a geopandas object (for ease of plotting etc.),
    # reusing the CRS of the source shapefile instead of hard-coding epsg:4326
    # (the download URL requests the data in EPSG:27700)
    gdf_ = gpd.GeoDataFrame(df_, crs=gpd_gdf.crs, geometry='geometry')
    # remove the extra area_code column joined from gdf
    gdf_ = gdf_.drop(columns='lad15cd')

    return gdf_

pop_gdf = join_df_to_shp(df, gdf)

Make the plots

fig,(ax1,ax2,ax3,) = plt.subplots(1,3,figsize=(15,6))

pop_gdf.plot(
    column='low_pop', ax=ax1, legend=True,  scheme='quantiles', cmap='OrRd',
)
pop_gdf.plot(
    column='central_pop', ax=ax2, legend=True, scheme='quantiles', cmap='OrRd',
)
pop_gdf.plot(
    column='high_pop', ax=ax3, legend=True,  scheme='quantiles', cmap='OrRd',
)
for ax in (ax1,ax2,ax3,):
    ax.axis('off')

[image: the figure produced by the code above]

I want all three ax objects to share the same bins (preferably the central_pop scenario quantiles) so that the legend is consistent for the whole figure.

This way I should see darker colors (more red) in the far right ax showing the high_pop scenario.

How can I set the colorscheme bins for the whole figure / each of the ax objects?

The simplest way I can see this working is either
a) Provide a set of bins to the geopandas.plot() function
b) extract the colorscheme / bins from one ax and apply it to another.

@knaaptime

knaaptime commented Jun 20, 2019

Under the hood, geopandas uses mapclassify, and the easiest way to achieve what you want would be to just use it directly:

import geopandas as gpd
import numpy as np  # needed for np.random.normal below
import pandas as pd
import matplotlib.pyplot as plt
from mapclassify import Quantiles, User_Defined

# Note you can read directly from the URL
gdf = gpd.read_file('https://opendata.arcgis.com/datasets/8edafbe3276d4b56aec60991cbddda50_1.zip?outSR=%7B%22latestWkid%22%3A27700%2C%22wkid%22%3A27700%7D&session=850489311.1553456889'
)

# 380 values
df = pd.DataFrame([])
df['AREA_CODE'] = gdf.lad15cd.values
df['central_pop'] = np.random.normal(30, 15, size=(len(gdf.lad15cd.values)))
df['low_pop'] = np.random.normal(10, 15, size=(len(gdf.lad15cd.values)))
df['high_pop'] = np.random.normal(50, 15, size=(len(gdf.lad15cd.values)))

def join_df_to_shp(pd_df, gpd_gdf):
    """Attach each district's geometry to pd_df and return a GeoDataFrame."""
    df_ = pd.merge(pd_df, gpd_gdf[['lad15cd','geometry']], left_on='AREA_CODE', right_on='lad15cd', how='left')

    # DROP rows with no matching geometry (the NI counties)
    df_ = df_.dropna(subset=['geometry'])

    # convert back to a geopandas object (for ease of plotting etc.),
    # reusing the CRS of the source shapefile instead of hard-coding epsg:4326
    # (the download URL requests the data in EPSG:27700)
    gdf_ = gpd.GeoDataFrame(df_, crs=gpd_gdf.crs, geometry='geometry')
    # remove the extra area_code column joined from gdf
    gdf_ = gdf_.drop(columns='lad15cd')

    return gdf_

pop_gdf = join_df_to_shp(df, gdf)

fig,(ax1,ax2,ax3,) = plt.subplots(1,3,figsize=(15,6))

# define your bins
bins = Quantiles(pop_gdf['central_pop'], 5).bins

# create a new column with the discretized values and plot that col
# repeat for each view; the values are taken from pop_gdf itself so the
# class ids always line up with the rows being plotted
pop_gdf.assign(cl=User_Defined(pop_gdf['low_pop'].dropna(), bins).yb).plot(
    column='cl', ax=ax1, cmap='OrRd'
)
pop_gdf.assign(cl=User_Defined(pop_gdf['central_pop'].dropna(), bins).yb).plot(
    column='cl', ax=ax2, cmap='OrRd',
)
pop_gdf.assign(cl=User_Defined(pop_gdf['high_pop'].dropna(), bins).yb).plot(
    column='cl', ax=ax3, cmap='OrRd',
)
for ax in (ax1,ax2,ax3,):
    ax.axis('off')
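
For anyone unsure what the shared-bin step is doing: the idea is to compute the class edges once, from central_pop, and then look up every scenario's values against those same edges. Below is a minimal standard-library sketch of that idea, so it runs without geopandas or mapclassify. The names `classify` and `mean_class` are made up for illustration, and `statistics.quantiles` will not necessarily return the same edges as mapclassify's Quantiles (which also keeps the maximum as a final edge), so treat this as an illustration of the concept rather than a description of mapclassify's internals.

```python
import bisect
import random
import statistics

random.seed(0)

# Dummy scenarios, mirroring the random columns used in this thread.
central = [random.gauss(30, 15) for _ in range(380)]
low = [random.gauss(10, 15) for _ in range(380)]
high = [random.gauss(50, 15) for _ in range(380)]

# Quintile cut points computed from the central scenario ONLY (4 inner edges).
edges = statistics.quantiles(central, n=5)


def classify(values, edges):
    """Return a class index (0..len(edges)) per value; upper edges are inclusive."""
    return [bisect.bisect_left(edges, v) for v in values]


def mean_class(values):
    """Average class index of a scenario when binned with the shared edges."""
    return statistics.mean(classify(values, edges))


# Because all three scenarios use the same edges, the class ids are comparable:
# the low scenario sits mostly in the lower classes, the high one in the upper.
print(mean_class(low), mean_class(central), mean_class(high))
```

Plotting the class ids with the same colormap on every axis is then what keeps the colours consistent across the figure.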

image

@tommylees112
Author

tommylees112 commented Jun 21, 2019

That's great, thank you. If you'll forgive me, I have two questions about your plots.

  1. I didn't get the lovely colorbar you had as a legend
  2. I need the colorbar to be discrete with the bin labels

(btw I have just looked at your research profile. That is some amazing work that you have done!)

@knaaptime

knaaptime commented Jun 21, 2019

Ah, sorry about that. I included legend=True in the last plot, which shows the colorbar. If you need the other style of legend, I think I would just change the middle plot back to the original,

i.e.

pop_gdf.plot(
    column='central_pop', ax=ax2, scheme='quantiles', cmap='OrRd', legend=True,
    legend_kwds={XXX}  # XXX is a placeholder for your own legend options
)

And if you play around with the legend location in the legend_kwds argument, you can probably get it to sit on the far right side of all three plots.

and thanks for the kind words about my work! :)

@knaaptime

fig,(ax1,ax2,ax3,) = plt.subplots(1,3,figsize=(15,6))

bins = Quantiles(pop_gdf['central_pop'], 5).bins


pop_gdf.assign(cl=User_Defined(pop_gdf['low_pop'].dropna(), bins).yb).plot(
    column='cl', ax=ax1, cmap='OrRd'
)
pop_gdf.plot(
    'central_pop', scheme='quantiles', ax=ax2, cmap='OrRd', legend=True, cax=ax3,
    legend_kwds=dict(loc='upper right', bbox_to_anchor=(3.5, 0.75), title="Legend\n", frameon=False)
)
pop_gdf.assign(cl=User_Defined(pop_gdf['high_pop'].dropna(), bins).yb).plot(
    column='cl', ax=ax3, cmap='OrRd', legend=False
)
for ax in (ax1,ax2,ax3,):
    ax.axis('off')

image

@martinfleis
Member

Thank you, @tommylees112, for your question, and you, @knaaptime, for the precise answer. Issue resolved, closing.

@robroc

robroc commented Oct 3, 2019

Is there a way to force the legend in a map with a binned scheme to be in a colorbar style instead of circles with labels? And to label only the min and max ends? I'm sure there's a way with matplotlib, but if you have an example handy it would be a huge help.

@raphmu86

raphmu86 commented Aug 5, 2020

@robroc did you find a solution to your question? I'd be interested in saving some time here too! Many thanks!

@ShouravBR

@raphmu86 I'm not sure if this answers your question.

If the column is numerical, it will show up as a colorbar.
To get discrete colors instead of a continuous gradient, pass a custom colormap to the plot function.
To modify the ticks, pass parameters through legend_kwds or set vmin/vmax.

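# colname is the name of a numerical column in gdf
gdf.plot(column=colname,
         cmap=plt.get_cmap('Blues', 10),  # 10 discrete colours instead of a gradient
         vmin=0, vmax=1,
         legend=True,  # legend_kwds only takes effect when the legend is drawn
         legend_kwds={'label': 'Coverage', 'ticks': np.arange(0, 1.1, 0.2)})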

image

@robroc

robroc commented Sep 23, 2020

@ShouravBR Does this work if you pass something into scheme? I know the colorbar is default without it.

@ShouravBR

@robroc No, the colorbar does not show. Uneven ticks from the scheme make it hard to manually create a colorbar.
image
image

gdf.plot('X',scheme='quantiles', legend=True, cmap='Blues')
gdf.plot('X', legend=True, cmap=plt.get_cmap('Blues',5),
         legend_kwds={'ticks': [3,21,43.20,54,72.4,99]})  
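
One possible workaround for the uneven ticks, sketched with plain matplotlib only: a `BoundaryNorm` maps each bin to a single colour regardless of how wide the bin is, and the colorbar then draws equal-sized blocks labelled with the actual edges. The edges below are simply the tick values from the snippet above, not edges computed by any scheme. Passing the same `cmap` and `norm` on to `gdf.plot(...)` is an assumption that has not been checked here against any particular geopandas version.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import BoundaryNorm

# Uneven bin edges, copied from the ticks in the snippet above.
edges = [3, 21, 43.2, 54, 72.4, 99]

# One colour per bin: 6 edges give 5 bins.
cmap = plt.get_cmap("Blues", len(edges) - 1)
norm = BoundaryNorm(edges, cmap.N)

# Draw a standalone colorbar; each bin gets a block of equal width,
# with tick labels at the real (unevenly spaced) edges.
fig, cax = plt.subplots(figsize=(6, 1))
cbar = fig.colorbar(
    ScalarMappable(norm=norm, cmap=cmap),
    cax=cax,
    orientation="horizontal",
    ticks=edges,
)
cbar.set_label("X")
fig.savefig("discrete_colorbar.png", bbox_inches="tight")
```

The norm can be checked by hand: `norm(10)` falls in the first bin and `norm(60)` in the fourth, so values are coloured by bin membership rather than by their position on a linear scale.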
