## Geocoding addresses in Python with geopy
This notebook demonstrates a simple Python geocoding and mapping workflow.

###  Install geocoding and mapping libraries that we will use.

- [geopy](https://geopy.readthedocs.io/), for a consistent interface to different geocoding APIs
- [folium](https://github.com/python-visualization/folium), a Python library for making interactive maps

In [1]:
# Geocoding library


In [2]:
!pip install geopy
!pip install folium

Collecting geopy
  Downloading geopy-2.1.0-py3-none-any.whl (112 kB)
Collecting geographiclib<2,>=1.49
  Downloading geographiclib-1.50-py3-none-any.whl (38 kB)
Installing collected packages: geographiclib, geopy
Successfully installed geographiclib-1.50 geopy-2.1.0


## Import the libraries

In [11]:
import pandas as pd
from geopy.geocoders import Nominatim
import folium
import requests

## Create a geocoder that uses the OpenStreetMap Nominatim API

In [9]:
geolocator = Nominatim(timeout=10, user_agent = "denironyx@gmail.com")

## Test the geocoder with one address
location = geolocator.geocode('Barrows Hall, Berkeley CA')
location

Location(Social Sciences Building, Eshleman Road, Southside, Berkeley, Alameda County, California, 94720-1076, United States, (37.87010975, -122.25792906636786, 0.0))

## Download a CSV file of addresses to geocode
Fetch the data to geocode with `requests`.

In [6]:
url = "https://raw.githubusercontent.com/dlab-geo/geocoding/master/address_data/SFLandmarks.csv"
r = requests.get(url, allow_redirects=True)
open('SFLandmarks.csv', 'wb').write(r.content)

156
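The download cell above writes whatever the server returns, even an error page. A slightly more defensive variant could look like the sketch below. The helper name `download_file` and its `get` parameter are illustrative assumptions, not part of the original notebook; the sketch checks the HTTP status with `raise_for_status()` and closes the file with a `with` block.

```python
def download_file(url, dest, get=None):
    """Download `url` to `dest` and return the number of bytes written.

    `get` is a hypothetical hook: any callable that behaves like
    `requests.get`. It defaults to `requests.get` and exists only so
    the helper can be exercised without network access.
    """
    if get is None:
        import requests  # imported lazily so a stubbed `get` needs no install
        get = requests.get
    response = get(url, allow_redirects=True)
    response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
    with open(dest, "wb") as f:  # the with-block closes the file handle
        return f.write(response.content)
```

With the `url` defined above, `download_file(url, 'SFLandmarks.csv')` would stand in for the two-line download.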

In [7]:
ls

 Volume in drive C is Windows
 Volume Serial Number is 5CCB-DD86

 Directory of c:\Users\Dee\root\Projects\personal_real_projects\geocoding\geocoderpy

06/11/2021  01:43 PM    <DIR>          .
06/11/2021  01:43 PM    <DIR>          ..
06/11/2021  01:08 PM    <DIR>          Include
06/11/2021  01:08 PM    <DIR>          Lib
06/11/2021  01:45 PM             4,978 nominatim_geocode.ipynb
06/11/2021  01:08 PM                75 pyvenv.cfg
06/11/2021  01:09 PM    <DIR>          Scripts
06/11/2021  01:43 PM               156 SFLandmarks.csv
               3 File(s)          5,209 bytes
               5 Dir(s)  33,762,004,992 bytes free


### Read in the file with `pandas`

In [8]:
df = pd.read_csv('SFLandmarks.csv')
print(df)

   ID          Landmark           City State
0   1      Union Square  San Francisco    CA
1   2        Coit Tower  San Francisco    CA
2   3  Golden Gate Park  San Francisco    CA
3   4        Twin Peaks  San Francisco    CA


### Geocode the addresses in the pandas DataFrame

The next cell does the following:

- Iterates over all rows in the dataframe `df`.
- Joins the values in the columns `Landmark`, `City`, and `State` into one string (the full address)
- Submits the string we just created as the address to be geocoded
- Saves results to the `geocodes` list object.

In [11]:
geocodes = [geolocator.geocode(', '.join([df['Landmark'][i], df['City'][i], df['State'][i]])) for i in range(len(df))]
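The one-line comprehension sends requests as fast as the loop can run. Nominatim's public usage policy asks for no more than one request per second, so for longer files it may be worth spacing the calls out. The sketch below repeats the same steps (iterate, join the columns, geocode, collect) with a pause between requests. The function name `geocode_rows` and the `delay_seconds` parameter are assumptions made for illustration.

```python
import time

def geocode_rows(rows, geocode, delay_seconds=1.0):
    """Geocode each row dict, pausing between requests.

    `geocode` is any callable that takes an address string, for
    example `geolocator.geocode` from the cell above.
    """
    results = []
    for i, row in enumerate(rows):
        if i > 0:
            time.sleep(delay_seconds)  # wait before every request after the first
        # Join the three columns into one full address string.
        address = ", ".join([row["Landmark"], row["City"], row["State"]])
        results.append(geocode(address))
    return results
```

A call such as `geocodes = geocode_rows(df.to_dict('records'), geolocator.geocode)` would then replace the comprehension. geopy also ships a ready-made `RateLimiter` in `geopy.extra.rate_limiter` that serves the same purpose.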

Geocoding output

In [12]:
geocodes

[Location(Union Square, San Francisco, San Francisco City and County, San Francisco, California, United States, (37.7879363, -122.40751740318035, 0.0)),
 Location(Coit Tower, Telegraph Hill Boulevard, Telegraph Hill, San Francisco, San Francisco City and County, San Francisco, California, 94113, United States, (37.80237905, -122.40583435461313, 0.0)),
 Location(Golden Gate Park, San Francisco City and County, San Francisco, California, 94118-4504, United States, (37.769368099999994, -122.48218371117709, 0.0)),
 Location(Twin Peaks, c, Christmas Tree Point Road, San Francisco, San Francisco City and County, San Francisco, California, 94114-1818, United States, (37.75464, -122.44648, 0.0))]

Add the `latitude` and `longitude` values from the `geocodes` list to the `df` DataFrame as new columns.

In [13]:
df['lat'] = [g.latitude for g in geocodes]
df['lon'] = [g.longitude for g in geocodes]
df

Unnamed: 0,ID,Landmark,City,State,lat,lon
0,1,Union Square,San Francisco,CA,37.787936,-122.407517
1,2,Coit Tower,San Francisco,CA,37.802379,-122.405834
2,3,Golden Gate Park,San Francisco,CA,37.769368,-122.482184
3,4,Twin Peaks,San Francisco,CA,37.75464,-122.44648
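All four landmarks were matched here, but `geocode()` returns `None` when it finds no match, and `g.latitude` would then raise an `AttributeError`. A minimal sketch of a more tolerant extraction step follows; the helper name `split_coordinates` is hypothetical.

```python
def split_coordinates(geocodes):
    """Return parallel lists of latitudes and longitudes.

    Entries that are None (no match found) yield None for both values,
    so the lists stay aligned with the rows of the DataFrame.
    """
    lats = [g.latitude if g is not None else None for g in geocodes]
    lons = [g.longitude if g is not None else None for g in geocodes]
    return lats, lons
```

It could be used as `df['lat'], df['lon'] = split_coordinates(geocodes)`, leaving unmatched rows with missing coordinates instead of stopping the notebook.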


## Map the output
Make a map centered on San Francisco, then add a marker for each geocoded landmark.

In [18]:
map1 = folium.Map(location=(37.754640, -122.446480), zoom_start=12)
for index,row in df.iterrows():
    # Add the geocoded locations to the map
    folium.Marker(location=(row['lat'], row['lon']), popup=row['Landmark']).add_to(map1)

display(map1)
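The map center above is typed in by hand (it is the Twin Peaks coordinate). As a sketch of an alternative, the center can be derived from the geocoded points themselves. The helper name `map_center` is an assumption, and a plain average is only reasonable for points that sit close together, as they do within one city.

```python
def map_center(points):
    """Return the mean (lat, lon) of an iterable of (lat, lon) pairs."""
    points = list(points)
    if not points:
        raise ValueError("need at least one point to compute a center")
    lat = sum(p[0] for p in points) / len(points)
    lon = sum(p[1] for p in points) / len(points)
    return lat, lon
```

The result can be passed straight to the map, for example `folium.Map(location=map_center(zip(df['lat'], df['lon'])), zoom_start=12)`. Outside a notebook, `map1.save('landmarks.html')` writes the map to a standalone HTML file (the filename is just an example).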

## Next steps - Geocode street addresses

You can use this basic workflow to geocode named places, zip codes, or addresses, depending on what the API you use supports. See the `geopy` documentation for the available geocoding APIs, then carefully read the documentation for the service you choose, as these services typically require API keys and have usage limits.
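If the service you pick needs an API key, it is safer to keep the key out of the notebook itself. One common pattern, sketched here, reads it from an environment variable. The variable name `GEOCODER_API_KEY` and the helper `load_api_key` are made-up placeholders.

```python
import os

def load_api_key(env_var="GEOCODER_API_KEY"):
    """Read an API key from the environment.

    The variable name is a made-up placeholder; use whatever name
    suits your project.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"set the {env_var} environment variable first")
    return key
```

The returned key would then be passed to the geocoder class you chose. The exact parameter name differs between services, so check the `geopy` documentation for that class.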

Next, fetch some street address data.

In [21]:
url = 'https://raw.githubusercontent.com/dlab-geo/geocoding/master/address_data/oak_liquor_stores.csv'
r = requests.get(url, allow_redirects=True)
open('oak_liquor_stores.csv', 'wb').write(r.content)

1163

In [26]:
df = pd.read_csv('oak_liquor_stores.csv')
df

Unnamed: 0,id,name,street,city,state,zip,type
0,1,Wah Fay Liquors,2101 8th Ave,Oakland,CA,94606,p
1,2,Vision Liquor,1615 Macarthur Blvd,Oakland,CA,94602,p
2,3,Souza's Liquors,394 12th St,Oakland,CA,94607,p
3,4,Tk Liquors,1500 23th Ave,Oakland,CA,94606,p
4,5,Quadriga Wines Inc,6193 Ridgemont Dr,Oakland,CA,94619,p
5,6,Bev Mo,525 Embarcadero W,Oakland,CA,94607,c
6,7,Fairfax Liquor,5403 Foothill Blvd,Oakland,CA,94601,p
7,8,Saleen Market,1200 78th Ave,Oakland,CA,94621,m
8,9,Park Liquors,828 Franklin St,Oakland,CA,94607,p
9,10,Los Camellos,5913 International Blvd,Oakland,CA,94621,p


In [27]:
df['address'] = (df['street'] + ', ' + df['city'] + ', ' + df['state']).str.strip()
df

Unnamed: 0,id,name,street,city,state,zip,type,address
0,1,Wah Fay Liquors,2101 8th Ave,Oakland,CA,94606,p,"2101 8th Ave, Oakland, CA"
1,2,Vision Liquor,1615 Macarthur Blvd,Oakland,CA,94602,p,"1615 Macarthur Blvd, Oakland, CA"
2,3,Souza's Liquors,394 12th St,Oakland,CA,94607,p,"394 12th St, Oakland, CA"
3,4,Tk Liquors,1500 23th Ave,Oakland,CA,94606,p,"1500 23th Ave, Oakland, CA"
4,5,Quadriga Wines Inc,6193 Ridgemont Dr,Oakland,CA,94619,p,"6193 Ridgemont Dr, Oakland, CA"
5,6,Bev Mo,525 Embarcadero W,Oakland,CA,94607,c,"525 Embarcadero W, Oakland, CA"
6,7,Fairfax Liquor,5403 Foothill Blvd,Oakland,CA,94601,p,"5403 Foothill Blvd, Oakland, CA"
7,8,Saleen Market,1200 78th Ave,Oakland,CA,94621,m,"1200 78th Ave, Oakland, CA"
8,9,Park Liquors,828 Franklin St,Oakland,CA,94607,p,"828 Franklin St, Oakland, CA"
9,10,Los Camellos,5913 International Blvd,Oakland,CA,94621,p,"5913 International Blvd, Oakland, CA"
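Note that `.str.strip()` only trims the ends of the joined string, so stray whitespace inside a component (for example a trailing space after the street name) survives. A minimal sketch of a stricter join follows; the helper name `build_address` is hypothetical.

```python
import re

def build_address(*parts):
    """Join address components with ', ' after tidying whitespace."""
    # Collapse runs of whitespace inside each part, then trim its ends.
    cleaned = [re.sub(r"\s+", " ", str(p)).strip() for p in parts]
    return ", ".join(p for p in cleaned if p)  # skip empty components
```

It could be applied row by row with `df['address'] = [build_address(s, c, st) for s, c, st in zip(df['street'], df['city'], df['state'])]`. Missing values (NaN) are not handled by this sketch and would need to be filled or dropped first.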


In [31]:
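def geocode_my_address(address):
    """Return (lon, lat) for an address, or (None, None) if geocoding fails."""
    print('geocoding:', address)
    try:
        x = geolocator.geocode(address)
        # geocode() returns None when no match is found, so x.longitude
        # raises AttributeError and control falls through to the handler.
        return x.longitude, x.latitude
    except Exception:
        print("Check address:", address, "there is a problem with it")
        return None, None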

In [32]:
df['lon'], df['lat'] = zip(*df['address'].apply(geocode_my_address))

geocoding: 2101 8th Ave, Oakland, CA
geocoding: 1615 Macarthur Blvd, Oakland, CA
geocoding: 394 12th St, Oakland, CA
geocoding: 1500 23th Ave, Oakland, CA
Check address: 1500 23th Ave, Oakland, CA there is a problem with it
geocoding: 6193 Ridgemont Dr, Oakland, CA
geocoding: 525 Embarcadero W,  Oakland, CA
geocoding: 5403 Foothill Blvd, Oakland, CA
geocoding: 1200 78th Ave, Oakland, CA
geocoding: 828 Franklin St, Oakland, CA
geocoding: 5913 International Blvd, Oakland, CA
geocoding: 3210 Harrison St, Oakland, CA
geocoding: 1460 7th St, Oakland, CA
geocoding: 1333 Peralta St, Oakland, CA
geocoding: 3710 Telegraph Ave, Oakland, CA
geocoding: 3293 Lakeshore Ave, Oakland, CA
geocoding: 1647 8th St, Oakland, CA
geocoding: 3849 Martin Luther King Jr Way, Oakland, CA
Check address: 3849 Martin Luther King Jr Way, Oakland, CA there is a problem with it
geocoding: 3900 Grand Ave, Oakland, CA
geocoding: 7305 Edgewater Dr #D, Oakland, CA
geocoding: 350 E 18th St, Oakland, CA
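One of the two failures, `1500 23th Ave`, contains an invalid ordinal (`23th` rather than `23rd`), which may be why no match was found. This is a guess, not something the log confirms. As a sketch of one possible cleanup step, the hypothetical helper `fix_ordinals` below rewrites numeric ordinals with the correct English suffix before the address is sent to the geocoder.

```python
import re

def ordinal_suffix(n):
    """Return the English ordinal suffix for a non-negative integer."""
    if 10 <= n % 100 <= 20:
        return "th"  # 11th, 12th, 13th and the rest of the teens
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")

def fix_ordinals(address):
    """Rewrite tokens such as '23th' with the correct suffix ('23rd')."""
    def repl(match):
        number = match.group(1)
        return number + ordinal_suffix(int(number))
    return re.sub(r"\b(\d+)(st|nd|rd|th)\b", repl, address, flags=re.IGNORECASE)
```

Addresses whose ordinals are already correct pass through unchanged, so the function can be applied to the whole `address` column, for example with `df['address'].apply(fix_ordinals)`.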


In [33]:
df

Unnamed: 0,id,name,street,city,state,zip,type,address,lon,lat
0,1,Wah Fay Liquors,2101 8th Ave,Oakland,CA,94606,p,"2101 8th Ave, Oakland, CA",-122.244874,37.798335
1,2,Vision Liquor,1615 Macarthur Blvd,Oakland,CA,94602,p,"1615 Macarthur Blvd, Oakland, CA",-122.223741,37.800272
2,3,Souza's Liquors,394 12th St,Oakland,CA,94607,p,"394 12th St, Oakland, CA",-122.270427,37.802578
3,4,Tk Liquors,1500 23th Ave,Oakland,CA,94606,p,"1500 23th Ave, Oakland, CA",,
4,5,Quadriga Wines Inc,6193 Ridgemont Dr,Oakland,CA,94619,p,"6193 Ridgemont Dr, Oakland, CA",-122.167198,37.784349
5,6,Bev Mo,525 Embarcadero W,Oakland,CA,94607,c,"525 Embarcadero W, Oakland, CA",-122.278732,37.79593
6,7,Fairfax Liquor,5403 Foothill Blvd,Oakland,CA,94601,p,"5403 Foothill Blvd, Oakland, CA",-122.208973,37.773331
7,8,Saleen Market,1200 78th Ave,Oakland,CA,94621,m,"1200 78th Ave, Oakland, CA",-122.186324,37.755535
8,9,Park Liquors,828 Franklin St,Oakland,CA,94607,p,"828 Franklin St, Oakland, CA",-122.27202,37.800102
9,10,Los Camellos,5913 International Blvd,Oakland,CA,94621,p,"5913 International Blvd, Oakland, CA",-122.214515,37.772124


Map it out

In [39]:
# Drop rows that could not be geocoded (missing lat/lon)
df = df.dropna(subset=['lat', 'lon'])
df

Unnamed: 0,id,name,street,city,state,zip,type,address,lon,lat
0,1,Wah Fay Liquors,2101 8th Ave,Oakland,CA,94606,p,"2101 8th Ave, Oakland, CA",-122.244874,37.798335
1,2,Vision Liquor,1615 Macarthur Blvd,Oakland,CA,94602,p,"1615 Macarthur Blvd, Oakland, CA",-122.223741,37.800272
2,3,Souza's Liquors,394 12th St,Oakland,CA,94607,p,"394 12th St, Oakland, CA",-122.270427,37.802578
4,5,Quadriga Wines Inc,6193 Ridgemont Dr,Oakland,CA,94619,p,"6193 Ridgemont Dr, Oakland, CA",-122.167198,37.784349
5,6,Bev Mo,525 Embarcadero W,Oakland,CA,94607,c,"525 Embarcadero W, Oakland, CA",-122.278732,37.79593
6,7,Fairfax Liquor,5403 Foothill Blvd,Oakland,CA,94601,p,"5403 Foothill Blvd, Oakland, CA",-122.208973,37.773331
7,8,Saleen Market,1200 78th Ave,Oakland,CA,94621,m,"1200 78th Ave, Oakland, CA",-122.186324,37.755535
8,9,Park Liquors,828 Franklin St,Oakland,CA,94607,p,"828 Franklin St, Oakland, CA",-122.27202,37.800102
9,10,Los Camellos,5913 International Blvd,Oakland,CA,94621,p,"5913 International Blvd, Oakland, CA",-122.214515,37.772124
10,11,Vernon Market,3210 Harrison St,Oakland,CA,94611,m,"3210 Harrison St, Oakland, CA",-122.255811,37.817983


In [38]:
map1 = folium.Map(location=(37.8044557, -122.2713563), zoom_start=12)
for index, row in df.iterrows():
    folium.Marker(location=(row['lat'], row['lon']), popup=row['name']).add_to(map1)

display(map1)