# STA 220 Data & Web Technologies for Data Analysis

### Lecture 14, 2/20/25, Cartography

### Today's topics
- Cartography
    - Choropleth maps

## Plotting Spatial Information

The __folium__ package uses the Leaflet JavaScript library to make interactive maps.

The function to create a map is `folium.Map()`. The function's parameters control the position, style, and initial zoom of the map.

If you want to change the size of the map, you first need to create a `folium.Figure()`, and then add the map to the figure with `.add_child()`.

In [7]:
import folium

# Make a map.
m = folium.Map(location = [38.54132868466938, -121.75125428735232], zoom_start = 30)
# Davis: 38.5449, -121.7405

In [8]:
type(m)

folium.folium.Map

In [9]:
# optional: set up a Figure to control the size of the map
fig = folium.Figure(width = 700, height = 400)
fig.add_child(m)


# fig.save("MY_MAP.html")

Recall the Health inspections API. 

In [10]:
import requests 
import time

df = list()
for letter in 'abcdefghijklmnopqrstuvwxyz':
    time.sleep(0.05)
    r=requests.post('https://yoloeco.envisionconnect.com/api/pressAgentClient/searchFacilities', 
                    params = {'PressAgentOid': 'c08cb189-894c-4c8c-b595-a5ef010226b4'}, 
                    data = {'FacilityName': letter})
    df.extend(r.json())

In [11]:
import pandas as pd
import numpy as np

pd.DataFrame(df)[['FacilityName', 'Address', 'CityStateZip']]

Unnamed: 0,FacilityName,Address,CityStateZip
0,AGGIE LIQUOR,507 L ST,DAVIS CA 95616
1,ARIANA FOOD MARKET,1638 W CAPITOL AVE A STE,WEST SACRAMENTO CA 95691
2,A&B LIQUOR,2328 W CAPITOL AVE,WEST SACRAMENTO CA 95691
3,AY! JALISCO TAQUERIA #1,966 SACRAMENTO Ave,WEST SACRAMENTO CA 95691
4,ALI BABA RESTAURANT,220 3RD ST,DAVIS CA 95616
...,...,...,...
861,ZUMA POKE,730 3RD ST,DAVIS CA 95616
862,ZUMAPOKE & LUSH ICE,YOLO COUNTY,WITH IN YOLO COUNTY CA 95695
863,ZUMAPOKE & LUSH ICE - CATERING,730 3RD ST,DAVIS CA 95616
864,ZIMCUISINE,YOLO COUNTY,WITH IN YOLO COUNTY CA 95695


The process of obtaining latitude-longitude pairs from addresses is called geocoding. In the homework, you are using Nominatim to geocode ports.
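As a reminder of what a geocoding request looks like, here is a minimal sketch that uses only the standard library. It assumes Nominatim's public search endpoint and its JSON response format, in which `lat` and `lon` arrive as strings. The helper names (`build_query`, `parse_first_hit`, `geocode`), the user agent string, and the sample response are made up for illustration. The network call is defined but not executed here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

NOMINATIM_URL = "https://nominatim.openstreetmap.org/search"

def build_query(address):
    """Return the full search URL for a free-form address."""
    return NOMINATIM_URL + "?" + urlencode({"q": address, "format": "json", "limit": 1})

def parse_first_hit(results):
    """Return (lat, lng) of the first hit as floats, or None if there is no hit."""
    if not results:
        return None
    return float(results[0]["lat"]), float(results[0]["lon"])

def geocode(address, user_agent="sta220-lecture-example"):
    """Query Nominatim over the network (hypothetical helper, not called here)."""
    request = Request(build_query(address), headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        return parse_first_hit(json.load(response))

# Offline demonstration with a made-up response in Nominatim's format.
sample = [{"lat": "38.5449", "lon": "-121.7405", "display_name": "made-up sample"}]
print(parse_first_hit(sample))
print(parse_first_hit([]))
```

Addresses that cannot be geocoded come back as an empty list, which is one way missing coordinates end up in a data set.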

In [12]:
food=pd.read_feather("../data/food.feather")[['FacilityName', 'lat','lng']]
food.head()

Unnamed: 0,FacilityName,lat,lng
0,AGGIE LIQUOR,38.548803,-121.734964
1,ARIANA FOOD MARKET,38.580577,-121.529824
2,ARTEAGA'S SUPERMARKET INC,38.590215,-121.525425
3,AY! JALISCO TAQUERIA #1,38.589293,-121.524593
4,ALI BABA RESTAURANT,38.543602,-121.746331


Unlike most of the plotting packages we used before, __folium__ does not automatically handle missing values. So in order to make our map, we first need to remove the missing values from our dataset.

In [13]:
food=food.loc[food.lat.notna() & food.lng.notna()]

In [14]:
m = folium.Map(location = [38.5449, -121.7405], zoom_start = 11, tiles="Cartodb Positron")

cols = ["FacilityName", "lat", "lng"]
for name, lat, lng in food[cols].itertuples(index = False):
    popup = folium.Popup(name, parse_html = True)
    circle = folium.Circle([float(lat), float(lng)], color = "red", radius = 10, 
                           popup = popup)
    m.add_child(circle)
    
fig = folium.Figure(width = 700, height = 400)
fig.add_child(m)

Folium can also be used to create choropleth maps. Choropleth maps are similar to heat maps, except that the units of display are (usually) political entities. They were first introduced in France in the 19th century to color _départements_, which are administrative units of roughly equal size.

<div>
    <center>
<img src="https://upload.wikimedia.org/wikipedia/commons/3/38/Carte_figurative_de_l%27instruction_populaire_de_la_France.jpg" width="300"/>
</center>
    </div>

The preceding example about the proportion of the literate population is a textbook example of a choropleth map for unclassed data: the color gradient ranges continuously from low to high.

In [15]:
state_geo = requests.get(
    "https://raw.githubusercontent.com/python-visualization/folium-example-data/main/us_states.json"
).json()
state_data = pd.read_csv(
    "https://raw.githubusercontent.com/python-visualization/folium-example-data/main/us_unemployment_oct_2012.csv"
)

`state_geo` is a parsed [GeoJSON](https://geojson.org/) file. GeoJSON identifies a region by listing the vertices of its polygon as coordinate pairs. Note that GeoJSON stores each pair as longitude first, then latitude. We will look at the structure of a real GeoJSON file later.
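As a preview, the following hand-written sketch shows the general shape of a GeoJSON `FeatureCollection`. The region, its `id`, and its coordinates are made up. The helper `ring_to_latlng` is hypothetical and only illustrates that GeoJSON vertices are `[lng, lat]`, while `folium.Map(location=...)` expects `[lat, lng]`.

```python
# A hand-written GeoJSON FeatureCollection with one made-up rectangular region.
square = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "id": "XX",  # hypothetical region id
            "properties": {"name": "Example region"},
            "geometry": {
                "type": "Polygon",
                # One outer ring; the first and last vertex coincide to close it.
                "coordinates": [[
                    [-122.0, 38.0], [-121.0, 38.0], [-121.0, 39.0],
                    [-122.0, 39.0], [-122.0, 38.0],
                ]],
            },
        }
    ],
}

def ring_to_latlng(ring):
    """Swap GeoJSON [lng, lat] vertices into [lat, lng] order."""
    return [[lat, lng] for lng, lat in ring]

feature = square["features"][0]
outer_ring = feature["geometry"]["coordinates"][0]
print(feature["id"], ring_to_latlng(outer_ring)[0])
```

In the choropleth call, `key_on="feature.id"` refers to the `id` entry of each such feature.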

In [16]:
m = folium.Map(location=[48, -102], zoom_start=3)

folium.Choropleth(
    geo_data=state_geo, # takes GeoJSON
    name="choropleth",
    data=state_data,
    columns=["State", "Unemployment"],
    key_on="feature.id",
    fill_color="YlGn",
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name="Unemployment Rate (%)",
).add_to(m)

folium.LayerControl().add_to(m)

m

Classed maps color political entities by categorical features. The following example shows the party of the winner in each constituency in the 2019 United Kingdom general election.

<div>
    <center>
<img src="https://upload.wikimedia.org/wikipedia/commons/e/e2/2019UKElectionMap.svg" width="400"/>
</center>
</div>

This map has two problems: a) a lot of additional information is missing (e.g., how well did the Conservative party (blue) do in Scotland?), and b) even though the British electoral system is _winner takes all_, the constituencies are displayed in different sizes, so the larger (rural) constituencies visually inflate the success of the Conservative party.

Issue b) is alleviated by variants of this kind: 

<div>
    <center>
<img src="https://miro.medium.com/v2/resize:fit:1400/1*hfA55y_xlYTs5v3k-_AxCA.png" width="200"/>
</center>
</div>

This is one example of preferring regular shapes over accurate constituency boundaries. All constituencies have the same size, as each corresponds to one seat in parliament. Such maps convey a more truthful message about the election results than maps in which constituencies scale with area.

As for issue a), classed choropleth maps are generally unsuitable for displaying both range and class. However, one could try:

<div>
<img src="https://upload.wikimedia.org/wikipedia/commons/a/a6/2019_General_Election_Results.png" width="350"/>
</div>

We can scrape the election results from Wikipedia. Some data processing is in order.

In [17]:
elections = pd.read_html('https://en.wikipedia.org/wiki/Results_of_the_2019_United_Kingdom_general_election') 

In [18]:
england = elections[0]
england = england.drop(england.columns[1:6], axis=1)
england.columns = [i[1] for i in england.columns.to_flat_index()]
england = england.rename(columns = {'Party': 'Winner','Lab[b][c]': 'Lab'})
england = england.loc[england.Constituency.notna() 
                          & (england.Constituency != 'Total for all constituencies')]
england = england[['Constituency', 'Winner', 'Con', 'Lab', 'LD', 'Grn', 'Total']]
england.head() 

Unnamed: 0,Constituency,Winner,Con,Lab,LD,Grn,Total
1,Aldershot,Con,27980,11282,6920,1750,47932
2,Aldridge-Brownhills,Con,27850,8014,2371,771,39342
3,Altrincham and Sale West,Con,26311,20172,6036,1566,54863
4,Amber Valley,Con,29096,12210,2873,1388,45567
5,Arundel and South Downs,Con,35566,9722,13045,2519,61408


In [19]:
scotland = elections[2]
scotland = scotland.drop(scotland.columns[1:4], axis=1)
scotland.columns = [i[1] for i in scotland.columns.to_flat_index()]
scotland = scotland.rename(columns = {'Party': 'Winner','Lab[b]': 'Lab'})
scotland = scotland.loc[scotland.Constituency.notna() 
                          & (scotland.Constituency != 'Total for all constituencies')]
scotland = scotland[['Constituency', 'Winner', 'Con', 'Lab', 'LD', 'Grn', 'Total']]
scotland.head() 

Unnamed: 0,Constituency,Winner,Con,Lab,LD,Grn,Total
1,Aberdeen North,SNP,7535,4939,2846,880.0,37413
2,Aberdeen South,SNP,16398,3834,5018,,45638
3,Airdrie and Shotts,SNP,7011,12728,1419,685.0,39772
4,Angus,SNP,17421,2051,2482,,43170
5,Argyll and Bute,SNP,16930,3248,6832,,48050


In [20]:
wales = elections[3]
wales = wales.drop(wales.columns[1:5], axis=1)
wales.columns = [i[1] for i in wales.columns.to_flat_index()]
wales = wales.rename(columns = {'Party': 'Winner','Lab[b]': 'Lab'})
wales = wales.loc[wales.Constituency.notna() 
                          & (wales.Constituency != 'Total for all constituencies')]
wales = wales[['Constituency', 'Winner', 'Con', 'Lab', 'LD', 'Grn', 'Total']]
wales.head()

Unnamed: 0,Constituency,Winner,Con,Lab,LD,Grn,Total
1,Aberavon,Lab,6518,17008,1072.0,450.0,31598
2,Aberconwy,Con,14687,12653,1821.0,,31865
3,Alyn and Deeside,Lab,18058,18271,2548.0,,43008
4,Arfon,PC,4428,10353,,,29074
5,Blaenau Gwent,Lab,5749,14862,1285.0,386.0,30219


In [21]:
election = pd.concat([england, scotland, wales]).set_index('Constituency').fillna(0)

winner = election['Winner']
election = election.drop(['Winner'], axis = 1)
for col in election.columns:
    election[col] = election[col].astype(int) / election['Total'].astype(int)
election = election.drop('Total', axis = 1)

election.head()

Constituency,Con,Lab,LD,Grn
Aldershot,0.583744,0.235375,0.144371,0.03651
Aldridge-Brownhills,0.707895,0.203701,0.060266,0.019597
Altrincham and Sale West,0.479576,0.367679,0.11002,0.028544
Amber Valley,0.638532,0.267957,0.06305,0.030461
Arundel and South Downs,0.579175,0.158318,0.212432,0.041021


In [22]:
winner.head()

Constituency
Aldershot                   Con
Aldridge-Brownhills         Con
Altrincham and Sale West    Con
Amber Valley                Con
Arundel and South Downs     Con
Name: Winner, dtype: object

The geographical information on the constituencies is available online as GeoJSON. For GeoJSON see [here](https://geojson.org/) and [here](https://en.wikipedia.org/wiki/GeoJSON). We now have to match the election data to the GeoJSON by constituency name.

In [23]:
boundaries=requests.get('https://github.com/martinjc/UK-GeoJSON/blob/master/json/electoral/gb/wpc.json?raw=true').json()

In [26]:
boundaries["features"][0] # first Constituency. 

{'type': 'Feature',
 'properties': {'PCON13CD': 'E14000530',
  'PCON13CDO': 'A01',
  'PCON13NM': 'Aldershot'},
 'geometry': {'type': 'Polygon',
  'coordinates': [[[-0.804767102029057, 51.245066430314324],
    [-0.80649323571884, 51.25343606205937],
    [-0.806147172256213, 51.25811210674302],
    [-0.804418034017871, 51.261458465802434],
    [-0.804766323857203, 51.270355354742726],
    [-0.803275379760085, 51.272291442640885],
    [-0.805196412397399, 51.27438916587343],
    [-0.806421234750183, 51.27508866361058],
    [-0.807541363875476, 51.27529250949861],
    [-0.807216934463124, 51.27569655334894],
    [-0.790309708323525, 51.28219715201912],
    [-0.792116591219439, 51.2832822820266],
    [-0.796410164857486, 51.283921775061465],
    [-0.798006015473954, 51.284584707063765],
    [-0.798058038389331, 51.28495392223091],
    [-0.79596601969096, 51.286814504331616],
    [-0.797579510920269, 51.287958706658266],
    [-0.800134062052264, 51.28892909051291],
    [-0.80098927845147, 51

Since most libraries can only retrieve information stored at the first level of a feature, we modify the GeoJSON to make the names easily accessible.

In [27]:
for feature in boundaries['features']:
    feature['PCON13NM'] = feature['properties']['PCON13NM']

In [28]:
for b in boundaries["features"]: 
    print(b['PCON13NM'])

Aldershot
Aldridge-Brownhills
Altrincham and Sale West
Amber Valley
Arundel and South Downs
Ashfield
Ashford
Ashton-under-Lyne
Aylesbury
Banbury
Barking
Barnsley Central
Barnsley East
Barrow and Furness
Basildon and Billericay
Basingstoke
Bassetlaw
Bath
Batley and Spen
Battersea
Beaconsfield
Beckenham
Bedford
Bermondsey and Old Southwark
Berwick-upon-Tweed
Bethnal Green and Bow
Beverley and Holderness
Bexhill and Battle
Bexleyheath and Crayford
Birkenhead
Birmingham, Edgbaston
Birmingham, Erdington
Birmingham, Hall Green
Birmingham, Hodge Hill
Birmingham, Ladywood
Birmingham, Northfield
Birmingham, Perry Barr
Birmingham, Selly Oak
Birmingham, Yardley
Bishop Auckland
Blackburn
Blackley and Broughton
Blackpool North and Cleveleys
Blackpool South
Blaydon
Blyth Valley
Bognor Regis and Littlehampton
Bolsover
Bolton North East
Bolton South East
Bolton West
Bootle
Boston and Skegness
Bosworth
Bournemouth East
Bournemouth West
Bracknell
Bradford East
Bradford South
Bradford West
Braintree
Bren

Some constituencies in our election data have names with non-ASCII characters (e.g., 'Ynys Môn'). They will not be matched correctly.

In [29]:
election.index

Index(['Aldershot', 'Aldridge-Brownhills', 'Altrincham and Sale West',
       'Amber Valley', 'Arundel and South Downs', 'Ashfield', 'Ashford',
       'Ashton-under-Lyne', 'Aylesbury', 'Banbury',
       ...
       'Pontypridd', 'Preseli Pembrokeshire', 'Rhondda', 'Swansea East',
       'Swansea West', 'Torfaen', 'Vale of Clwyd', 'Vale of Glamorgan',
       'Wrexham', 'Ynys Môn'],
      dtype='object', name='Constituency', length=632)

In [30]:
import re
from unidecode import unidecode

standardize = lambda x: unidecode(re.sub(',', '', x))
election.index = [standardize(i) for i in election.index]

election.index

Index(['Aldershot', 'Aldridge-Brownhills', 'Altrincham and Sale West',
       'Amber Valley', 'Arundel and South Downs', 'Ashfield', 'Ashford',
       'Ashton-under-Lyne', 'Aylesbury', 'Banbury',
       ...
       'Pontypridd', 'Preseli Pembrokeshire', 'Rhondda', 'Swansea East',
       'Swansea West', 'Torfaen', 'Vale of Clwyd', 'Vale of Glamorgan',
       'Wrexham', 'Ynys Mon'],
      dtype='object', length=632)

In [31]:
election.index[508] # given as Weston-Super-Mare in boundaries! 

'Weston-super-Mare'

In [32]:
election = election.rename(index = {'Weston-super-Mare': 'Weston-Super-Mare'})

Any remaining mismatches between the data and the GeoJSON file that contains the polygons will have to be dealt with later.
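One way to find such mismatches is to compare the two sets of names after normalization. The sketch below is not the lecture's method: it uses the standard library's `unicodedata` in place of `unidecode`, and it additionally lowercases the names, which is an assumption made here to catch cases like 'Weston-super-Mare'. The two name lists are made-up excerpts standing in for `election.index` and the GeoJSON names.

```python
import re
import unicodedata

def normalize(name):
    """Drop commas, strip accents, and lowercase a constituency name."""
    no_commas = re.sub(",", "", name)
    decomposed = unicodedata.normalize("NFKD", no_commas)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return ascii_only.lower()

# Made-up excerpts standing in for election.index and the GeoJSON names.
election_names = ["Aldershot", "Ynys Môn", "Weston-super-Mare", "Birmingham Edgbaston"]
geojson_names = ["Aldershot", "Ynys Mon", "Weston-Super-Mare", "Birmingham, Edgbaston", "Ashford"]

election_keys = {normalize(n) for n in election_names}
geojson_keys = {normalize(n) for n in geojson_names}

print(sorted(geojson_keys - election_keys))  # polygons without election data
print(sorted(election_keys - geojson_keys))  # election rows without a polygon
```

Names that appear in only one of the two differences are the ones that still need a manual fix.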

We want to color the map according to how well each party did in each constituency.

In [37]:
election = dict(election)
election['Grn']['Aldershot'] 

0.03651005591254277

Let's assign each party a color. `branca.colormap.LinearColormap` creates a linear interpolation between two colors.

In [38]:
import branca.colormap as cmp

# One white-to-party-color gradient per party, scaled to that party's best vote share.
colors = {party: cmp.LinearColormap(['white', color], vmin=0, vmax=max(election[party]))
          for party, color in zip(election.keys(),
                                  ['#3a85d6', '#ed4224', '#e8ca54', '#6cbd6c'])}

In [42]:
colors['Con']

In [43]:
election['Con']['Aldershot']

0.5837436368188267

In [45]:
colors['Con'](0.5)

'#7fb0e5ff'
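To see what a linear interpolation between two colors means, here is a small standard-library sketch. It is not branca's implementation (for instance, it omits the trailing alpha channel `ff` seen in the output above). The function names `hex_to_rgb` and `interpolate` are made up for illustration.

```python
def hex_to_rgb(color):
    """Convert a hex color such as '#3a85d6' to an (r, g, b) tuple."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))

def interpolate(start, end, value, vmin=0.0, vmax=1.0):
    """Mix start and end colors according to where value lies in [vmin, vmax]."""
    # Relative position of value, clipped to [0, 1].
    t = min(max((value - vmin) / (vmax - vmin), 0.0), 1.0)
    # Interpolate each of the three channels separately.
    channels = (round(a + t * (b - a))
                for a, b in zip(hex_to_rgb(start), hex_to_rgb(end)))
    return "#" + "".join(f"{c:02x}" for c in channels)

print(interpolate("#ffffff", "#3a85d6", 0.0))
print(interpolate("#ffffff", "#3a85d6", 1.0))
print(interpolate("#000000", "#646464", 0.5))
```

A vote share of zero maps to white, and the party's best vote share (`vmax`) maps to the full party color.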

The custom coloring `get_color` takes the constituency name from the GeoJSON, removes commas (to deal with another mismatch: 'Birmingham, Edgbaston' to 'Birmingham Edgbaston') and, if data is available for that polygon, colors it according to the vote share.  

In [46]:
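def get_color(feature, party):
    # Constituency name from the GeoJSON, with commas removed to match the election data.
    value = re.sub(',', '', feature['PCON13NM'])

    # Look up the vote share; None if there is no election data for this polygon.
    share = election[party].get(value)
    if share is None:
        return '#808080'  # neutral gray for unmatched constituencies

    return colors[party](share)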

In [47]:
get_color(boundaries['features'][0], 'Lab') # [0] for the first constituency

'#facbc2ff'

Let's create a map. We set `tiles` to `None` to remove the standard OpenStreetMap tiles.

We can use some default [tiles](https://python-visualization.github.io/folium/latest/user_guide/raster_layers/tiles.html) or use custom ones. 

I chose one of [Esri](https://server.arcgisonline.com/arcgis/rest/services)'s open maps with a monochrome background.

In [60]:
import folium 
m = folium.Map(location=[52, 0.0], zoom_start=7, 
               width=800, height=500, 
               tiles = None)
folium.TileLayer(
    tiles='https://server.arcgisonline.com/ArcGIS/rest/services/World_Terrain_Base/MapServer/tile/{z}/{y}/{x}',
    attr='Esri',
    name='Esri Satellite', overlay=True, control=False
).add_to(m)

<folium.raster_layers.TileLayer at 0x7fe9f0c14880>

In [61]:
m

Now, I would like to add each party's performance to the map sequentially, one layer per party. Note that we pass `get_color` to the `style_function` argument. The additional parameters govern the boundary lines and the opacity, and `overlay=False` ensures that each layer gets a radio button rather than a checkbox, so only one party is shown at a time.

In [1]:
for i in ['Con', 'Lab', 'LD', 'Grn']: 
    fg = folium.FeatureGroup(name=i, overlay=False)

    folium.GeoJson(
            boundaries,
            style_function=lambda feature, party=i: {
                "fillColor": get_color(feature, party),
                "color": "gray",
                "weight": 1,
                "dashArray": "1",
                "fillOpacity": 1,
            }, popup=folium.GeoJsonPopup(fields=["PCON13NM"], aliases = ['Constituency'])
        ).add_to(fg)
    
    fg.add_to(m)
    

folium.LayerControl(collapsed=False).add_to(m)

m
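The `party=i` default argument in the lambda above is needed because Python closures look up loop variables when they are called, not when they are defined, and a style function can be called after the loop has finished. The following standard-library sketch, with made-up list names `late` and `bound`, shows the difference.

```python
parties = ["Con", "Lab", "LD", "Grn"]

# Late binding: every lambda looks up `party` when it is called,
# i.e. after the loop has finished, so all of them see the last value.
late = [lambda: party for party in parties]

# Default argument: the current value is stored in each lambda at creation time.
bound = [lambda party=party: party for party in parties]

print([f() for f in late])
print([f() for f in bound])
```

Without the default argument, all four layers could end up colored by the last party in the list.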

Even though this map does not use regular shapes to map each constituency, we learn, e.g., that the Tories do better in rural areas, while Labour underperforms there. With notable exceptions, the LibDems are stronger in the rural south.

In [58]:
m.save("./source/british2019_election.html")

In [59]:
!open ./source/british2019_election.html