# Gender proportions in census 2021 data
I'm so excited to have had a first glance at the new census data that was released in late June 2022. Here, I will plot gender proportions to see which areas have more women than men and vice versa.

DATA:
- [geo data](https://geoportal.statistics.gov.uk/datasets/ons::local-authority-districts-december-2021-gb-bfc/explore?location=55.174283%2C-3.854058%2C6.64): boundaries of the local authority districts in Great Britain (December 2021)
- [population data](https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/datasets/populationandhouseholdestimatesenglandandwalescensus2021)

Map plotting libraries considered:
- [plotly](https://plotly.com/python/mapbox-county-choropleth/)
- [folium](https://towardsdatascience.com/making-3-easy-maps-with-python-fb7dfb1036)
- [bokeh](https://towardsdatascience.com/a-complete-guide-to-an-interactive-geographical-map-using-python-f4c5197e23e0)

I tried to use the plotly mapping library but my geojson file was too big and it crashed. Folium was able to cope.
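One possible way to shrink a large GeoJSON file, not used in this notebook, is to reduce the precision of its coordinates before plotting. The sketch below uses only the standard library; the function names `round_coords` and `shrink_geojson` and the sample polygon are made up for illustration, and whether this would have been enough to stop plotly crashing is an open question.

```python
import json


def round_coords(coords, ndigits=4):
    """Recursively round every number in a nested GeoJSON coordinate array."""
    if isinstance(coords, (int, float)):
        return round(coords, ndigits)
    return [round_coords(c, ndigits) for c in coords]


def shrink_geojson(feature_collection, ndigits=4):
    """Return a copy of a FeatureCollection with rounded coordinates."""
    features = []
    for feature in feature_collection["features"]:
        geometry = dict(feature["geometry"])
        geometry["coordinates"] = round_coords(geometry["coordinates"], ndigits)
        features.append({**feature, "geometry": geometry})
    return {**feature_collection, "features": features}


# Made-up polygon standing in for one district boundary.
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"LAD21CD": "E06000001"},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[
                    [-1.2247012, 54.6261098],
                    [-1.2249345, 54.6262201],
                    [-1.2251876, 54.6259954],
                    [-1.2247012, 54.6261098],
                ]],
            },
        }
    ],
}

# Four decimal places of a degree is roughly 11 m of latitude.
smaller = shrink_geojson(sample, ndigits=4)
print(len(json.dumps(sample)), "->", len(json.dumps(smaller)), "characters")
```

Rounding only trims digits; it does not remove vertices, so a geometry simplification step would be a separate and more aggressive option.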

Some useful resources for customising folium maps:
- [Choropleth map](https://medium.com/analytics-vidhya/create-and-visualize-choropleth-map-with-folium-269d3fd12fa0)
- [Folium and choropleth map: from zero to pro](https://towardsdatascience.com/folium-and-choropleth-map-from-zero-to-pro-6127f9e68564)
- [Leaflet 1.6.0 reference: tooltip](https://leafletjs.com/reference-1.6.0#tooltip)
- [Map visualization with folium](https://medium.com/datasciencearth/map-visualization-with-folium-d1403771717)
- [nbviewer gist](https://nbviewer.org/gist/talbertc-usgs/18f8901fc98f109f2b71156cf3ac81cd)

In [1]:
import pandas as pd
import geopandas as gpd

import folium

## Read census and geo data from ONS

In [2]:
census_data = pd.read_excel('./data/census2021firstresultsenglandwales1.xlsx', sheet_name='P01')
# Use the row at position 5 as the column names (the rows above it are not data).
census_data.columns = census_data.iloc[5]
# Drop incomplete rows and the header row itself, then tidy the area code column name.
census_data = census_data.dropna().drop(5).rename({'Area code [note 2]': 'Area code'}, axis=1)

# Gender imbalance calculations
census_data['more_men_than_women'] = census_data.Males - census_data.Females  # negative: more women
census_data['perc_men'] = census_data.Males / census_data['All persons'] * 100
census_data['men_over_supply'] = census_data.perc_men - 50  # percentage points above an even split
census_data['perc_women'] = 100 - census_data['perc_men']
census_data.head()

5,Area code,Area name,All persons,Females,Males,more_men_than_women,perc_men,men_over_supply,perc_women
6,K04000001,England and Wales,59597300,30420100,29177200,-1242900,48.957251,-1.042749,51.042749
7,E92000001,England,56489800,28833500,27656300,-1177200,48.958042,-1.041958,51.041958
8,E12000001,North East,2647100,1353800,1293300,-60500,48.85724,-1.14276,51.14276
9,E06000047,County Durham,522100,266800,255300,-11500,48.898678,-1.101322,51.101322
10,E06000005,Darlington,107800,55100,52700,-2400,48.886827,-1.113173,51.113173
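As a quick sanity check of the derived columns, the County Durham row can be reproduced by hand. This is only a plain-Python sketch: the variable names mirror the column names and the three counts are copied from the table above.

```python
# County Durham counts copied from the table above.
all_persons = 522100
females = 266800
males = 255300

more_men_than_women = males - females   # negative means more women than men
perc_men = males / all_persons * 100
men_over_supply = perc_men - 50         # percentage points above an even split
perc_women = 100 - perc_men

print(more_men_than_women,
      round(perc_men, 6),
      round(men_over_supply, 6),
      round(perc_women, 6))
```

Note that `perc_women` is derived as `100 - perc_men` rather than from the `Females` column. Because the published counts are rounded, `Females + Males` need not equal `All persons` exactly (the Rutland row further down shows 20000 + 21100 against 41000), so the two approaches can differ slightly.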


In [3]:
#Read geo data
geo_info = gpd.read_file('./data/Local_Authority_Districts_(December_2021)_GB_BFC.geojson')
geo_info.head()

Unnamed: 0,OBJECTID,LAD21CD,LAD21NM,LAD21NMW,BNG_E,BNG_N,LONG,LAT,GlobalID,SHAPE_Length,SHAPE_Area,geometry
0,1,E06000001,Hartlepool,,447160,531474,-1.27018,54.67614,{CB7275CE-D16E-45F7-8E7D-33032FB9DF9D},0.89986,0.013057,"MULTIPOLYGON (((-1.22470 54.62611, -1.22493 54..."
1,2,E06000002,Middlesbrough,,451141,516887,-1.21099,54.54467,{6598062E-357C-4E8D-B117-5D6DE45F75B7},0.565731,0.007484,"MULTIPOLYGON (((-1.27720 54.54784, -1.27721 54..."
2,3,E06000003,Redcar and Cleveland,,464361,519597,-1.00608,54.56752,{B23F8B9B-4D88-4C21-80D6-8041E2910EF2},1.272659,0.034046,"MULTIPOLYGON (((-1.20098 54.57763, -1.20030 54..."
3,4,E06000004,Stockton-on-Tees,,444940,518183,-1.30664,54.556911,{64624AEF-6611-4E3C-BBA8-870B0B889E1D},1.523316,0.028478,"MULTIPOLYGON (((-1.27211 54.55337, -1.27213 54..."
4,5,E06000005,Darlington,,428029,515648,-1.56835,54.535339,{310B13B3-F45F-452B-88F1-DDD48C1992F5},1.334472,0.027434,"MULTIPOLYGON (((-1.63768 54.61714, -1.63767 54..."


In [8]:
census_data[census_data.perc_women<50]

5,Area code,Area name,All persons,Females,Males,more_men_than_women,perc_men,men_over_supply,perc_women
42,E08000006,Salford,269900,134400,135500,1100,50.203779,0.203779,49.796221
76,E07000166,Richmondshire,49700,24300,25400,1100,51.10664,1.10664,48.89336
96,E06000017,Rutland,41000,20000,21100,1100,51.463415,1.463415,48.536585
109,E07000130,Charnwood,183900,91900,92000,100,50.027189,0.027189,49.972811
141,E07000196,South Staffordshire,110500,55200,55300,100,50.045249,0.045249,49.954751
169,E06000032,Luton,225300,112400,112900,500,50.110963,0.110963,49.889037
174,E07000008,Cambridge,145700,72700,73000,300,50.102951,0.102951,49.897049
216,E07000245,West Suffolk [note 6],179800,89800,90000,200,50.055617,0.055617,49.944383
220,E09000001,City of London,8600,3800,4800,1000,55.813953,5.813953,44.186047
230,E09000030,Tower Hamlets,310300,154500,155800,1300,50.209475,0.209475,49.790525
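
Some labels in the spreadsheet carry footnote markers, such as `West Suffolk [note 6]` above and the `Area code [note 2]` column that was renamed earlier. A small helper could strip them in one go; this is a sketch, `strip_note_markers` is a made-up name, and it assumes every marker has the form `[note N]`.

```python
import re

# Matches a footnote marker such as " [note 6]", including the space before it.
NOTE_MARKER = re.compile(r"\s*\[note \d+\]")


def strip_note_markers(label):
    """Remove '[note N]' markers from a label and trim surrounding whitespace."""
    return NOTE_MARKER.sub("", label).strip()


print(strip_note_markers("West Suffolk [note 6]"))
print(strip_note_markers("Area code [note 2]"))
print(strip_note_markers("Tower Hamlets"))
```

With pandas, the same function could be applied to a column with `census_data['Area name'].map(strip_note_markers)`.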


In [9]:
geo_info[geo_info.LAD21CD.isin(census_data[census_data.perc_women<50]['Area code'])]

Unnamed: 0,OBJECTID,LAD21CD,LAD21NM,LAD21NMW,BNG_E,BNG_N,LONG,LAT,GlobalID,SHAPE_Length,SHAPE_Area,geometry
16,17,E06000017,Rutland,,492992,308655,-0.6263,52.667648,{322A8B38-DB0F-4810-9D07-D2097DFE5EED},1.529882,0.052324,"MULTIPOLYGON (((-0.60944 52.75973, -0.60909 52..."
29,30,E06000032,Luton,,508606,222559,-0.42319,51.891022,{79C0AA4F-AB98-439A-A448-F82A0B9730FE},0.422333,0.005663,"MULTIPOLYGON (((-0.43109 51.92693, -0.43081 51..."
42,43,E06000045,Southampton,,442303,113700,-1.39952,50.9212,{AEA3BB30-EB49-4D63-9793-12AE168495E2},0.715943,0.006381,"MULTIPOLYGON (((-1.47704 50.92865, -1.47695 50..."
59,60,E07000008,Cambridge,,545420,257901,0.126436,52.200169,{3FFFAE5D-47F8-492B-BC28-C1ADED16CF7B},0.447291,0.005351,"MULTIPOLYGON (((0.16542 52.23446, 0.16623 52.2..."
151,152,E07000130,Charnwood,,458365,316155,-1.13694,52.739899,{D0A5019E-F2EB-45D7-B89A-F0F216824E7A},1.399341,0.037149,"MULTIPOLYGON (((-1.07444 52.82473, -1.07432 52..."
174,175,E07000166,Richmondshire,,401039,495786,-1.98552,54.357609,{0CA86635-DEB0-4350-9D61-BE2A9E7B31EE},2.821845,0.182375,"MULTIPOLYGON (((-1.69862 54.53610, -1.69834 54..."
197,198,E07000196,South Staffordshire,,389625,311037,-2.15495,52.696918,{96B44E34-E307-42B1-90DD-89AA0166BFDE},2.202045,0.054095,"MULTIPOLYGON (((-2.09488 52.78527, -2.09518 52..."
238,239,E07000245,West Suffolk,,580944,271124,0.652769,52.308418,{169C0812-ACD8-4F1C-BB1A-B7B485D71BE1},3.328625,0.136235,"MULTIPOLYGON (((0.66724 52.46230, 0.66736 52.4..."
245,246,E08000006,Salford,,374556,398128,-2.38485,53.479271,{0D29D9D0-51E5-4B07-8E87-BF6273DEAB39},0.79954,0.013167,"MULTIPOLYGON (((-2.43715 53.54225, -2.43650 53..."
276,277,E09000001,City of London,,532382,181358,-0.09351,51.51564,{4C6BDE7B-2DCE-41EF-A5BE-FF3E001D2052},0.115155,0.000374,"MULTIPOLYGON (((-0.10415 51.50860, -0.10416 51..."
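
The `isin` filter checks the join in one direction only. Since the geo file covers Great Britain while the census table covers England and Wales, it may be worth listing the codes that have no partner on either side. The sketch below uses plain sets; `unmatched_codes` is a made-up helper and the two code sets are deliberately shortened samples, not the full columns.

```python
def unmatched_codes(left, right):
    """Return the sorted codes that appear in `left` but not in `right`."""
    return sorted(set(left) - set(right))


# Shortened, illustrative samples of area codes (not the full columns).
census_codes = {"E06000047", "E06000005", "E09000001"}
geo_codes = {"E06000001", "E06000005", "E09000001"}

print("in census, not in geo:", unmatched_codes(census_codes, geo_codes))
print("in geo, not in census:", unmatched_codes(geo_codes, census_codes))
```

On the real data the inputs would be `census_data['Area code']` and `geo_info.LAD21CD`.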


## Plot map

In [17]:

m = folium.Map(location=[48, -0.09], zoom_start=5) #'Mapbox Bright',"Cartodb Positron"
folium.Choropleth(
    geo_data=geo_info,
    data=census_data,
    columns=["Area code", "perc_women"],
    key_on="feature.properties.LAD21CD",
    fill_color="RdPu",
    #fill_opacity=0.5,
    #line_opacity=0.2,
    legend_name="% of women in the population",
    nan_fill_color="black",
    nan_fill_opacity=1,
).add_to(m)

# Marker styling: https://www.python-graph-gallery.com/312-add-markers-on-folium-map
# The 56% in the popup is the share of men in the City of London (perc_men).
folium.Marker(
    location=[51.515640, -0.093510],
    popup='City of London (56%)',
    icon=folium.Icon(color='purple', prefix='fa', icon='male'),
    # alternative: icon=folium.DivIcon(html=f"""<div style="font-family: courier new; color: pink">City of London (44%)</div>""")
).add_to(m)
#folium.Marker(location=[52.667648,-0.62630], popup = 'Rutland (51.5%)',icon=folium.Icon(color='purple', prefix='fa',icon='male')).add_to(m)

m.save('map.html')
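
The popup texts above are typed in by hand. They could instead be built from the table values so that they stay in sync with the data. This is a sketch: `popup_label` is a made-up helper, and the percentages are the `perc_men` values copied from the earlier output.

```python
def popup_label(area_name, percentage, decimals=0):
    """Format a marker popup such as 'City of London (56%)'."""
    return f"{area_name} ({percentage:.{decimals}f}%)"


# perc_men values copied from the table of areas with more men than women.
print(popup_label("City of London", 55.813953))
print(popup_label("Rutland", 51.463415, decimals=1))
```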