## Simple geocoding with geopy and ipyleaflet

This notebook gives a very quick introduction to using geopy to find geospatial data from textual descriptions and then plotting it on a map using ipyleaflet. We're using collection data in CSV format from the University of Melbourne Archives. This series of photographs is from the Commercial Travellers' Association.

In [1]:
from geopy.geocoders import ArcGIS
import pandas as pd

data = pd.read_csv('CTA PHOTOGRAPHS.csv', index_col='Identifier')
data.head()

Identifier,Level of description,Title,Date,Description,Location,Format,Rights,Contributor,Collection identifier,Collection title,Published online?,MulMultiMediaRefLocal_1,URL
1979.0162.02178,Item,Dagg's Falls,c. 1929-1933,"Black and white photograph, two men standing a...","Dagg's Falls waterfall, QLD, Australia",Photograph,This image is out of copyright. It is provided...,,1979.0162,Commercial Travellers' Association,Yes,92663,http://gallery.its.unimelb.edu.au/umblumaic/im...
1979.0162.02179,Item,View from Eagle Heights Tambourine Mountain,21 August 1926,"Black and white photograph, view from Eagle He...","Eagle Heights Tambourine Mountain, QLD, Australia",Photograph,This image is out of copyright. It is provided...,,1979.0162,Commercial Travellers' Association,Yes,92664,http://gallery.its.unimelb.edu.au/umblumaic/im...
1979.0162.02180,Item,"View from Picnic Point Main Range, Toowoomba S...",21 August 1926,"Black and white photograph, view of mountain s...","Picnic Point, Toowoomba, QLD, Australia",Photograph,This image is out of copyright. It is provided...,,1979.0162,Commercial Travellers' Association,Yes,92665,http://gallery.its.unimelb.edu.au/umblumaic/im...
1979.0162.02181,Item,Carrington Falls near Herberton Cairns Distric...,15 September 1930,"Black and white photograph, man standing midwa...","Carrington Falls, Herberton, QLD, Australia",Photograph,This image is out of copyright. It is provided...,,1979.0162,Commercial Travellers' Association,Yes,92666,http://gallery.its.unimelb.edu.au/umblumaic/im...
1979.0162.02182,Item,Olsen's Harp Caves 16 Miles NW from Rockhampto...,1929-1933,"Black and white photograph, picture of stalagm...","Olsen's Harp Caves, QLD, Australia",Photograph,This image is out of copyright. It is provided...,,1979.0162,Commercial Travellers' Association,Yes,92667,http://gallery.its.unimelb.edu.au/umblumaic/im...


Note that I have manually extracted location text into a separate column for the first 50 records. You may want to try automating this using named entity recognition, regular expressions or another method.
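
As a starting point for the regular-expression route, here is a minimal sketch. It assumes the place name sits at the start of the title, optionally after "View from" or "View of", and that anything from a distance phrase such as "16 Miles NW from ..." onwards can be dropped. The helper name `guess_location`, both patterns and the default region `QLD, Australia` are my own guesses based on the five titles shown above, so expect to adjust them for the rest of the series.

```python
import re

# Hypothetical patterns, guessed from the sample titles above.
VIEW_PREFIX = re.compile(r"^View (?:from|of) ", re.IGNORECASE)
DISTANCE_SUFFIX = re.compile(r"\s+\d+\s+miles?\b.*$", re.IGNORECASE)


def guess_location(title, region="QLD, Australia"):
    """Turn a photograph title into a rough location string for geocoding."""
    # Drop a leading "View from " / "View of ".
    place = VIEW_PREFIX.sub("", title.strip())
    # Drop trailing distance descriptions such as " 16 Miles NW from ...".
    place = DISTANCE_SUFFIX.sub("", place)
    # The region is an assumption: the sample records are all in Queensland.
    return f"{place}, {region}"
```

You could apply it with `data['Title'].apply(guess_location)` and compare the result against the manually extracted `Location` column to see where the heuristic falls over.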

Next we're going to use geopy to look up location data and store the results in a separate column. I'm using the ArcGIS service, but geopy wraps many other geocoding services.

In [2]:
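geolocator = ArcGIS()

def find_coords(text):
    # Only geocode real values; missing locations (NaN/None) return None.
    if pd.notna(text):
        return geolocator.geocode(text)
    return None

# Each result is a geopy Location (or None if nothing was found).
data['coords'] = data['Location'].apply(find_coords)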
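
Several photographs can share the same location text, so it may be worth looking each distinct string up only once. The sketch below wraps any geocode callable in a cache. The name `make_cached_lookup` is hypothetical and the wrapper is not part of geopy; it only assumes the callable takes a string and returns a result.

```python
from functools import lru_cache


def make_cached_lookup(geocode):
    """Wrap a geocode callable so each distinct location string is looked up once."""

    @lru_cache(maxsize=None)
    def cached(text):
        return geocode(text)

    def lookup(text):
        # Skip missing values (None, NaN) and empty strings.
        if not isinstance(text, str) or not text.strip():
            return None
        # Normalise surrounding whitespace so "A" and "A " share one cache entry.
        return cached(text.strip())

    return lookup
```

With the geolocator above, this could be used as `find_coords = make_cached_lookup(geolocator.geocode)` followed by the same `apply` call.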

Now that we have our location data, we can put it on a map. We're also going to add popup messages with basic metadata and a link back to the UMA catalogue record.

In [3]:
from ipyleaflet import Map, basemaps, basemap_to_tiles, Marker, Popup
from ipywidgets import HTML


def make_popup(ident, row):
    message = HTML()
    thumb = f"https://gallery.its.unimelb.edu.au/umblumaic/imu.php?request=multimedia&irn={row['MulMultiMediaRefLocal_1']}&bestfit=yes&width=200" 
    message.value = f"""<table style="float:right">
    <tr><td><table>
    <tr><td style="padding:5px">Identifier</td><td>{ident}</td></tr>
    <tr><td style="padding:5px">Title</td><td>{row['Title']}</td></tr>
    <tr><td style="padding:5px">Date</td><td>{row['Date']}</td></tr></table></td>
    <td style="padding:5px"><a href="{row['URL']}" target="_blank"><img src="{thumb}" width="200"/></a></td>
    </tr>
    </table>"""
    return message


c = geolocator.geocode('Australia')
m = Map(basemap=basemap_to_tiles(basemaps.CartoDB.Positron), zoom=4, center=(c.latitude, c.longitude))

for ident, row in data.iterrows():
    if row['coords'] is not None:
        marker = Marker(
            location=(row['coords'].latitude, row['coords'].longitude),
            draggable=False, title=ident+': '+row['Title'])
        marker.popup = make_popup(ident, row)
        m.add_layer(marker)
m

Map(center=[-25.70993156999998, 134.48403119800003], controls=(ZoomControl(options=['position', 'zoom_in_text'…

Not bad! We do have some dodgy locations in there, and there are other metadata elements, such as dates, that we've done nothing with. Have a think about what this metadata enables and what its major drawbacks are in terms of reuse. Note also that if multiple images share the same location, their markers sit on top of each other and we won't see them all. So another approach we can take is to construct a gallery popup for each location.

In [31]:
from xml.etree import ElementTree as ET
import math

def make_itemtable_popup(rows):
    message = HTML()
    num_cols = math.ceil(math.sqrt(len(rows)))
    table = ET.Element('table', style=f"width: {100*num_cols}px")
    t_row = ET.Element('tr')
    table.append(t_row)
    col = 0
    for row in rows:
        col += 1
        if col > num_cols:
            t_row = ET.Element('tr')
            table.append(t_row)
            col = 1
        t_cell = ET.Element('td', style="padding:5px")
        t_row.append(t_cell)
        link = ET.Element('a', href=row['URL'], target="_blank")
        t_cell.append(link)
        thumb = f"https://gallery.its.unimelb.edu.au/umblumaic/imu.php?request=multimedia&irn={row['MulMultiMediaRefLocal_1']}&bestfit=yes&width=200" 
        image = ET.Element('img', src=thumb, alt=str(row['Title']))
        link.append(image)
    message.value = ET.tostring(table, encoding='utf8', method='html').decode()
    return message

places = {}
for ident, row in data.iterrows():
    if row['coords'] is not None:
        coords = (row['coords'].latitude, row['coords'].longitude, row['coords'].address)
        # Group rows that share the same geocoded point and address.
        places.setdefault(coords, []).append(row)
    

m2 = Map(basemap=basemap_to_tiles(basemaps.CartoDB.Positron), zoom=4, center=(c.latitude, c.longitude))

for coords, rows in places.items():
    marker = Marker(location=coords[:2], draggable=False, title=coords[2])
    marker.popup = make_itemtable_popup(rows)
    m2.add_layer(marker)
m2

Map(center=[-25.70993156999998, 134.48403119800003], controls=(ZoomControl(options=['position', 'zoom_in_text'…
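
One cheap way to hunt down the dodgy locations mentioned earlier is to flag any geocoded point that falls outside a rough bounding box for Australia. This is only a sketch: the box values are my own approximation (not something from the dataset or geopy), the function names are hypothetical, and a point inside the box can still be the wrong place.

```python
# Rough bounding box for Australia, in decimal degrees.
# These numbers are an approximation chosen for this sketch.
AUS_BOUNDS = {"south": -44.0, "north": -10.0, "west": 112.0, "east": 154.0}


def in_bounds(lat, lon, bounds=AUS_BOUNDS):
    """Return True if the point lies inside the bounding box."""
    return (bounds["south"] <= lat <= bounds["north"]
            and bounds["west"] <= lon <= bounds["east"])


def flag_suspect(places, bounds=AUS_BOUNDS):
    """Return the (lat, lon, address) keys that fall outside the box."""
    return [key for key in places if not in_bounds(key[0], key[1], bounds)]
```

Calling `flag_suspect(places)` on the dictionary built above gives a list of keys to review by hand, and `places[key]` shows which records produced each one.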