Using geocoders in Python
=============

A lot of the geographic functionality that we've seen in Google Maps, or OpenStreetMap, can be got at programmatically! This can be really useful for getting information about a place, even when you don't want to show it on a map.

Today we will look at the GeoPy library, which gives us a consistent interface to a bunch of services including Google maps, GeoNames, Bing, Baidu, OpenStreetMap / Nominatim, and others.

Go to a command / terminal window and type

    pip install geopy
    
or

    pip3 install geopy
    
if you are on Linux.

In [1]:
import geopy

Place text search
------------

You have some place name, and you want to find out where it is. We can do this using a variety of services, and [Geopy's documentation](http://geopy.readthedocs.org/en/1.10.0/) tells us how. First we will use Nominatim, which is the database behind OpenStreetMap.

In [2]:
from geopy.geocoders import Nominatim
geocoder = Nominatim()
location = geocoder.geocode("Constantinople")
print(location.raw)
print(location.latitude, location.longitude)
print("Address is %s" % location.address)

{'licence': 'Data © OpenStreetMap contributors, ODbL 1.0. http://www.openstreetmap.org/copyright', 'class': 'place', 'icon': 'https://nominatim.openstreetmap.org/images/mapicons/poi_place_city.p.20.png', 'place_id': '18731084', 'lat': '41.0096334', 'type': 'city', 'boundingbox': ['40.8496334', '41.1696334', '28.8051646', '29.1251646'], 'osm_type': 'node', 'importance': 0.73489371582034, 'osm_id': '1882099475', 'lon': '28.9651646', 'display_name': 'İstanbul, Fatih, İstanbul, Marmara Bölgesi, Türkiye'}
41.0096334 28.9651646
Address is İstanbul, Fatih, İstanbul, Marmara Bölgesi, Türkiye


If we also pass the parameter `exactly_one` and set it to `False`, then we will get every place that matches the search term, instead of just the most likely place.

In [3]:
locationlist = geocoder.geocode("Zurich", exactly_one=False)
print([x.address for x in locationlist])

['Zürich, Bezirk Zürich, Zürich, Schweiz, Suisse, Svizzera, Svizra', 'Zürich, Schweiz, Suisse, Svizzera, Svizra', 'Zurich, CA 168, Big Pine, Inyo County, California, 93513, United States of America', 'Zurich, Súdwest-Fryslân, Friesland, Nederland', 'Zurich, Friesland, Nederland, 8751, Nederland', 'Zurich, Arcadia Town, Wayne County, New York, United States of America', 'Zurich, Blaine County, Montana, United States of America', 'Zurich, Ontario, Canada', 'Zurich, Rooks County, Kansas, United States of America', 'Zurich, Ontario, Canada']


That funny-looking thing in the print statement above is a *list comprehension*. It is a short and sweet way of saying this:

In [4]:
addresses = []
for x in locationlist:
    addresses.append(x.address)
print(addresses)

['Zürich, Bezirk Zürich, Zürich, Schweiz, Suisse, Svizzera, Svizra', 'Zürich, Schweiz, Suisse, Svizzera, Svizra', 'Zurich, CA 168, Big Pine, Inyo County, California, 93513, United States of America', 'Zurich, Súdwest-Fryslân, Friesland, Nederland', 'Zurich, Friesland, Nederland, 8751, Nederland', 'Zurich, Arcadia Town, Wayne County, New York, United States of America', 'Zurich, Blaine County, Montana, United States of America', 'Zurich, Ontario, Canada', 'Zurich, Rooks County, Kansas, United States of America', 'Zurich, Ontario, Canada']


Another geocoding service that is often useful is called GeoNames. It is a database that collects the names, in different languages and different times, of various places. So if you know what a place was called in Hungarian in the seventeenth century but Google Maps has never heard of it, you can try Geonames, which might tell you its current Romanian name!

In [5]:
from geopy.geocoders import GeoNames
geocoder = GeoNames(username="aurum")
location = geocoder.geocode("Edessa", exactly_one=False)
for x in location:
    print("Found a location with data: %s" % x.raw)


Found a location with data: {'countryName': 'Turkey', 'fcode': 'PPLA', 'population': 449549, 'fclName': 'city, village,...', 'lng': '38.79392', 'name': 'Sanliurfa', 'adminCode1': '63', 'fcl': 'P', 'fcodeName': 'seat of a first-order administrative division', 'toponymName': 'Şanlıurfa', 'geonameId': 298333, 'countryId': '298795', 'countryCode': 'TR', 'adminName1': 'Şanlıurfa', 'lat': '37.16708'}
Found a location with data: {'countryName': 'Greece', 'fcode': 'PPLA2', 'population': 18669, 'fclName': 'city, village,...', 'lng': '22.04751', 'name': 'Edessa', 'adminCode1': 'ESYE12', 'fcl': 'P', 'fcodeName': 'seat of a second-order administrative division', 'toponymName': 'Édessa', 'geonameId': 736357, 'countryId': '390903', 'countryCode': 'GR', 'adminName1': 'Central Macedonia', 'lat': '40.8026'}
Found a location with data: {'countryName': 'Turkey', 'fcode': 'ADM1', 'population': 1801980, 'fclName': 'country, state, region,...', 'lng': '38.78174', 'name': 'Şanlıurfa', 'adminCode1': '63', 'fc

Of course Google itself is also an option. Google geolocation requires an API key, which is essentially a special password that is associated with your user account, to use its web API. This is because Google sets limits to how much you can use it for free. The limits should plenty for a normal person's use, though.

For the time being, you can use the API key that I have put in ILIAS as long as you are on the network of the Uni Bern. If you are going to do your own work with Maps, though, then you should go to http://developers.google.com/ and sign up as a developer. You'll then need to create a project, and in that project go to 'Google APIs' -> "Google Maps Geocoding API" and enable it. You'll then need to go to 'Credentials' to make your API key. If you need further help with the options, then talk to me!

In [6]:
google_api_key = 'AIzaSyCb3M6BOdWOEqpvOMfOyntu-lZN28oCvBY'

from geopy.geocoders import GoogleV3
geocoder = GoogleV3(api_key=google_api_key)
location = geocoder.geocode("Zürich")
print(location.raw)

[{'geometry': {'bounds': {'northeast': {'lng': 8.6253701, 'lat': 47.43468}, 'southwest': {'lng': 8.448059899999999, 'lat': 47.32023}}, 'location_type': 'APPROXIMATE', 'location': {'lng': 8.541694, 'lat': 47.3768866}, 'viewport': {'northeast': {'lng': 8.6253701, 'lat': 47.43468}, 'southwest': {'lng': 8.448059899999999, 'lat': 47.32023}}}, 'partial_match': True, 'types': ['locality', 'political'], 'formatted_address': 'Zürich, Switzerland', 'place_id': 'ChIJGaK-SZcLkEcRA9wf5_GNbuY', 'address_components': [{'types': ['locality', 'political'], 'short_name': 'Zürich', 'long_name': 'Zürich'}, {'types': ['administrative_area_level_2', 'political'], 'short_name': 'Zürich District', 'long_name': 'Zürich District'}, {'types': ['administrative_area_level_1', 'political'], 'short_name': 'ZH', 'long_name': 'Zurich'}, {'types': ['country', 'political'], 'short_name': 'CH', 'long_name': 'Switzerland'}]}]


Using an API directly
------------

The Geopy library is very useful in that it lets you do the same job using a variety of services. However, it doesn't handle everything that a particular service might offer. Google has something else, called the "Places API", that gives information not just about typical geographic locations but also about businesses, monuments, and so on. Here is an example of how this is used - if you want to do this with your own API key, you will need to enable the 'Google Places API Web Service' as well, in the online developer console!



In [7]:
import requests

places_search_url = 'https://maps.googleapis.com/maps/api/place/textsearch/json'
search_params = {
    'query': 'Länggass Stübli',
    'key': google_api_key,
    'language': 'en'
    }

r = requests.get( places_search_url, params=search_params)
search_result = r.json()   # See what we got
print(search_result)

{'results': [{'geometry': {'location': {'lng': 7.429128900000002, 'lat': 46.9544466}}, 'reference': 'CoQBewAAADtdoB9of6jT52hJ4QxG5gUW1K6Ids5ZmaIV3OiRHowmTCyY9WltWS6T54XhT16C-UNSVYzOAhi-LuWQjmuClfjVtqecFJ9GUz1ntbKrOnK7Hw0WEhmhRsehrgbiMpjfI78sdf99y7gSt0Orp-0WxTBnflsGtrHbGdH1Swqx2WgmEhAXKMfmICEt-z8EmQYzaV9WGhR9tNDE0qZeKC_psb0mA0a6mLWlaQ', 'photos': [{'width': 250, 'height': 251, 'photo_reference': 'CmRdAAAAAfJUeYFsdWl98W9vlF84vAmi6Be_fgkep6ZnpBJR4lsOBR_HM8drcaCbX3akGecdxWKADHlvCg-r58Hhtmg0dXJyWM6_7PmPXZzxPeb8whbPpx7narh8jFADTQ5bWztzEhCV39mOqXaUqsfidnCLPTaQGhRF1hBtn_DLeB8IqANW3tznWlYFpQ', 'html_attributions': ['<a href="https://maps.google.com/maps/contrib/103880176915790775728/photos">Restaurant Länggass Stübli da Massimo</a>']}], 'name': 'Restaurant Länggass Stübli da Massimo', 'types': ['restaurant', 'food', 'point_of_interest', 'establishment'], 'formatted_address': 'Muesmattstrasse 46, 3012 Bern, Switzerland', 'icon': 'https://maps.gstatic.com/mapfiles/place_api/icons/restaurant-71.pn

Once we have successfully looked up a place on Google, we will have an ID for it. This is Google's way of distinguishing between places of the same name, so that we know we have the right one. We can use that ID to get information about a place we have already looked up, but for this we have to leave Geopy behind and use the API directly!

In [8]:
places_details_url = 'https://maps.googleapis.com/maps/api/place/details/json'
detail_params = {
    'key': google_api_key,
    'placeid': search_result['results'][0]['place_id'],
    'language': 'en'
}

r = requests.get( places_details_url, params=detail_params)
r.json()  # See what we got

{'html_attributions': [],
 'result': {'address_components': [{'long_name': '46',
    'short_name': '46',
    'types': ['street_number']},
   {'long_name': 'Muesmattstrasse',
    'short_name': 'Muesmattstrasse',
    'types': ['route']},
   {'long_name': 'Muesmatt',
    'short_name': 'Muesmatt',
    'types': ['sublocality_level_2', 'sublocality', 'political']},
   {'long_name': 'Länggasse-Felsenau',
    'short_name': 'Länggasse-Felsenau',
    'types': ['sublocality_level_1', 'sublocality', 'political']},
   {'long_name': 'Bern',
    'short_name': 'Bern',
    'types': ['locality', 'political']},
   {'long_name': 'Bern',
    'short_name': 'Bern',
    'types': ['administrative_area_level_2', 'political']},
   {'long_name': 'Bern',
    'short_name': 'BE',
    'types': ['administrative_area_level_1', 'political']},
   {'long_name': 'Switzerland',
    'short_name': 'CH',
    'types': ['country', 'political']},
   {'long_name': '3012', 'short_name': '3012', 'types': ['postal_code']}],
  'adr_ad

Now let's look up a series of places! We'll store our results in places_found, for each place that we find.

Exporting our data to CSV
--------

One thing we can do with the Places API is to look up a bunch of places, get their latitude and longitude or their canonical names, and put those into a big spreadsheet for use elsewhere (or even for importing into Google Maps to make a map!)

Let's look up a bunch of place names so that we can put their information into a CSV file that we will make. We have four search terms, and for each one we'll see if we get a result; if we do, it will go into the `places_found` dictionary that we will use below.

In [9]:
places_to_lookup = ['Moskva', 'Venice', 'Rosslyn Chapel', 'Cantabrigia']
places_found = {}

geocoder = GoogleV3(api_key=google_api_key)
for p in places_to_lookup:
    myresult = geocoder.geocode(p)
    if p is not None:
        print("Found information for %s" % p)
        places_found[p] = myresult

Found information for Moskva
Found information for Venice
Found information for Rosslyn Chapel
Found information for Cantabrigia


The easiest way to make something like a spreadsheet in a computer program is to use CSV, which stands for *comma separated values*. That is what we used earlier to get our UK fat supply data into our map. Python has a built-in module for this, and we use it like this to make a CSV file.

In [10]:
import csv

f = open('myplaces.csv', 'w', newline='', encoding='utf-8')
writer = csv.writer(f)
# First, write our column headers!
writer.writerow(['Place name', 'Address', 'Latitude', 'Longitude'])


39

Now we have an open file called 'myplaces.csv', and we have written one row to it. If you were to close the filehandle now and look at the file, you would see that it looks like this:

    Place name,Address,ID,Latitude,Longitude
    
But we won't close the file yet, because we want to write each of our places into its row.

In [11]:
for p in places_found.keys():
    location = places_found[p]
    writer.writerow([p, location.address, location.latitude, location.longitude])

f.close()  # Always close what you open, if you didn't use 'with'!
    

Now we can make sure the file is there and has what we expect!

In [12]:
with open('myplaces.csv', encoding='utf-8') as f:
    data = f.read()
    
print(data)

Place name,Address,Latitude,Longitude
Venice,"Venice, Italy",45.4408474,12.3155151
Cantabrigia,"Cambridge, Cambridge, UK",52.205337,0.121817
Moskva,"Moscow, Russia",55.755826,37.6173
Rosslyn Chapel,"Rosslyn Chapel, Chapel Loan, Roslin, Midlothian EH25 9PU, UK",55.8553785,-3.1601938



And now you can find that file on your computer (it should be in the same folder as this notebook) and use it for processing anywhere you like, including putting back into Google Maps if you so choose.