<a href="https://colab.research.google.com/github/AndrewZou/Coursera_Capstone/blob/master/Nominatim_geocode_example.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Geocoding addresses in Python with Geopy

This notebook demonstrates a simple Python workflow for geocoding addresses and mapping the results.

### Install the geocoding and mapping libraries

- [geopy](https://geopy.readthedocs.io/), which provides a consistent interface to different geocoding APIs
- [folium](https://github.com/python-visualization/folium), a Python library for making interactive maps

In [1]:
# Install geopy, the geocoding library
!pip install geopy



In [2]:
# Install folium
!pip install folium




### Import the libraries we just installed

In [0]:
from geopy.geocoders import Nominatim
import folium

### Create a geocoder that uses the [OpenStreetMap Nominatim API](https://wiki.openstreetmap.org/wiki/Nominatim/FAQ)

In [0]:
geolocator = Nominatim(timeout=10, user_agent="dlab.berkeley.edu-workshop")

Test the geocoder with a single address:

In [5]:
location = geolocator.geocode('Barrows Hall, Berkeley CA')
location

Location(Barrows Hall, Eshleman Road, Southside, Berkeley, Alameda County, California, 94720, USA, (37.87010715, -122.257926969474, 0.0))

### Upload a CSV file of addresses to geocode

The next cell shows one way to upload a local file for use in a Google Colaboratory Jupyter notebook.
- You can download the file [here](https://raw.githubusercontent.com/dlab-geo/geocoding/master/address_data/SFLandmarks.csv). The cells below assume the uploaded file is named `geo.csv`.

In [7]:
from google.colab import files
uploaded = files.upload()

Saving geo.csv to geo.csv


### Import two more libraries to work with our file

- The `io` module lets us wrap the uploaded file contents in a text stream that pandas can read as a CSV file.
- The [pandas](https://pandas.pydata.org/) library is widely used for working with tabular data.

In [9]:
import io
import pandas as pd

df = pd.read_csv(io.StringIO(uploaded["geo.csv"].decode('utf-8')))
print(df)

   ID          Landmark           City State
0   1      Union Square  San Francisco    CA
1   2        Coit Tower  San Francisco    CA
2   3  Golden Gate Park  San Francisco    CA
3   4        Twin Peaks  San Francisco    CA


### Geocode the addresses in the pandas DataFrame

The next cell does the following:

- Iterates over all rows in the dataframe `df`.
- Joins the values in the columns `Landmark`, `City`, and `State` into one string.
- Submits that string as the address to be geocoded.
- Saves results to the `geocodes` list object.

In [0]:
geocodes = [geolocator.geocode(', '.join([df['Landmark'][i], df['City'][i], df['State'][i]]))
            for i in range(len(df))]
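
The list comprehension above sends its requests back to back. The public Nominatim service asks for at most one request per second, so for larger files it helps to pause between lookups. Here is a minimal sketch of the same steps as a plain loop, under some assumptions: `build_address` and `geocode_rows` are hypothetical helper names (not part of geopy), each row is a plain dictionary, and `geocode` is any callable that accepts an address string, such as `geolocator.geocode`.

```python
import time


def build_address(row, columns=("Landmark", "City", "State")):
    """Join the chosen columns of one row into a single address string."""
    return ", ".join(str(row[c]) for c in columns)


def geocode_rows(rows, geocode, delay_seconds=1.0):
    """Geocode each row in turn, pausing between consecutive requests.

    `geocode` is any callable that takes an address string,
    for example `geolocator.geocode` (an assumption of this sketch).
    """
    results = []
    for i, row in enumerate(rows):
        if i > 0:
            time.sleep(delay_seconds)  # wait before every request after the first
        results.append(geocode(build_address(row)))
    return results
```

With the dataframe from this notebook, a call might look like `geocodes = geocode_rows(df.to_dict("records"), geolocator.geocode)`.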


Take a look at the output.

In [11]:
geocodes

[Location(Union Square, SF, California, USA, (37.7879363, -122.40751740318, 0.0)),
 Location(Coit Tower, Telegraph Hill Boulevard, Telegraph Hill, SF, California, 94113, USA, (37.80237905, -122.405834354613, 0.0)),
 Location(Golden Gate Park, Richmond District, SF, California, 94118-4504, USA, (37.7693681, -122.482183711177, 0.0)),
 Location(Twin Peaks, Christmas Tree Point Road, Cole Valley, SF, California, 94114-1818, USA, (37.75464, -122.44648, 0.0))]

Add the `latitude` and `longitude` values from the `geocodes` list to the `df` dataframe as new columns:

In [12]:
df['lat'] = [g.latitude for g in geocodes]
df['lon'] = [g.longitude for g in geocodes]
df

   ID          Landmark           City State        lat          lon
0   1      Union Square  San Francisco    CA  37.787936 -122.407517
1   2        Coit Tower  San Francisco    CA  37.802379 -122.405834
2   3  Golden Gate Park  San Francisco    CA  37.769368 -122.482184
3   4        Twin Peaks  San Francisco    CA  37.754640 -122.446480
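
The list comprehensions above read `g.latitude` and `g.longitude` from every result. geopy's `geocode` returns `None` when it finds no match, and in that case the attribute access raises an `AttributeError`. A small sketch of a more defensive version follows; `extract_lat_lon` is a hypothetical helper name, not part of geopy.

```python
def extract_lat_lon(geocodes):
    """Return parallel latitude and longitude lists.

    Entries that are None (failed lookups) produce None in both lists.
    """
    lats = [g.latitude if g is not None else None for g in geocodes]
    lons = [g.longitude if g is not None else None for g in geocodes]
    return lats, lons
```

In this notebook it could be used as `df['lat'], df['lon'] = extract_lat_lon(geocodes)`; rows with missing coordinates can then be dropped or inspected before mapping.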


### Map the output

First, make an empty map centered on San Francisco. Tip: don't name your map **map**, because that would shadow Python's built-in `map` function.

In [13]:
mapa = folium.Map(location=(37.754640, -122.446480), zoom_start=12)
display(mapa)
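
The map above is centered on hard-coded coordinates. One alternative is to derive the center from the geocoded points themselves. The sketch below averages the latitudes and longitudes, which is a reasonable approximation for points within a single city; `map_center` is a hypothetical helper name, and the coordinates are the values from the table earlier in this notebook.

```python
def map_center(points):
    """Return the mean (lat, lon) of a non-empty list of (lat, lon) pairs."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return sum(lats) / len(lats), sum(lons) / len(lons)


landmarks = [
    (37.787936, -122.407517),  # Union Square
    (37.802379, -122.405834),  # Coit Tower
    (37.769368, -122.482184),  # Golden Gate Park
    (37.754640, -122.446480),  # Twin Peaks
]
center = map_center(landmarks)
```

The result can be passed as the `location` argument of `folium.Map`, for example `folium.Map(location=map_center(list(zip(df['lat'], df['lon']))), zoom_start=12)`.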

### Add the geocoded locations to the map

In [14]:
for index, row in df.iterrows():
  folium.Marker(location=(row['lat'], row['lon']), popup=row['Landmark']).add_to(mapa)

display(mapa)

Click on the markers above to view the contents of each popup.

## Next steps

You can use this basic workflow to geocode named places, zip codes, or addresses, depending on what the API you use supports. See the `geopy` documentation for the available geocoding APIs. Then carefully read the documentation for the API you choose, since most services require an API key and impose usage limits.
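
Because of those usage limits, individual requests can fail with a timeout or a quota error. A common way to cope is to retry a few times with an increasing delay. The sketch below shows the idea with a hypothetical helper, `geocode_with_retry`, which is not part of geopy. It catches every `Exception` for brevity; real code would more likely catch only the service errors it expects, such as those defined in `geopy.exc`.

```python
import time


def geocode_with_retry(geocode, address, max_attempts=3, base_delay=1.0):
    """Call `geocode(address)`, retrying with exponential backoff on errors.

    Returns the first successful result, or re-raises the last error
    once `max_attempts` calls have failed.
    """
    for attempt in range(max_attempts):
        try:
            return geocode(address)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * 2 ** attempt)  # wait 1x, 2x, 4x, ... base_delay
```

For example, `geocode_with_retry(geolocator.geocode, 'Barrows Hall, Berkeley CA')` would make up to three attempts before giving up.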

Last updated on October 23, 2018 by Patty Frontiera, pfrontiera@berkeley.edu

In [23]:
geocodes_ = [geolocator.geocode('North York, Ontario')]

geocodes_

[Location(North York, Willowdale, North York, Toronto, Ontario, M2N 5P7, Canada, (43.7708175, -79.4132998, 0.0))]

In [18]:
!pip install geocoder

Collecting geocoder
  Downloading https://files.pythonhosted.org/packages/4f/6b/13166c909ad2f2d76b929a4227c952630ebaf0d729f6317eb09cbceccbab/geocoder-1.38.1-py2.py3-none-any.whl (98kB)
Collecting ratelim (from geocoder)
  Downloading https://files.pythonhosted.org/packages/f2/98/7e6d147fd16a10a5f821db6e25f192265d6ecca3d82957a4fdd592cad49c/ratelim-0.1.6-py2.py3-none-any.whl
Installing collected packages: ratelim, geocoder
Successfully installed geocoder-1.38.1 ratelim-0.1.6


In [19]:
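import geocoder

postal_code = 'M5G'
lat_lng_coords = None
max_attempts = 5  # cap the retries so the cell cannot loop forever

# retry until coordinates come back or the attempts run out;
# geocoder.google may return nothing, for example without a valid API key
for attempt in range(max_attempts):
  g = geocoder.google('{}, Toronto, Ontario'.format(postal_code))
  lat_lng_coords = g.latlng
  if lat_lng_coords:
    break

if lat_lng_coords:
  latitude, longitude = lat_lng_coords
else:
  print('No coordinates returned for {}'.format(postal_code))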