## Universal Geocoder Sample Use


The Universal Geocoder performs a point-in-polygon inclusion test: it checks whether the input coordinates fall within each of the reference polygons. The same transformation can also be done with a spatial join tool in ArcGIS.
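
To make the point-in-polygon test concrete, here is a minimal ray-casting sketch in plain Python. It is not the geocoder's implementation (which relies on geopandas); the function name `point_in_polygon` and the rectangle `box` are made up for illustration, and the rectangle is not a real boundary.

```python
def point_in_polygon(lon, lat, ring):
    """Return True if (lon, lat) lies inside the polygon ring.

    ring is a list of (lon, lat) vertices; the last vertex is implicitly
    joined back to the first. Points exactly on an edge may go either way.
    """
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that horizontal line
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            # Count only crossings to the right of the point
            if lon < x_cross:
                inside = not inside
    return inside


# Hypothetical rectangle (illustrative only, not a real district boundary)
box = [(-122.40, 47.70), (-122.30, 47.70), (-122.30, 47.75), (-122.40, 47.75)]

print(point_in_polygon(-122.350, 47.728, box))  # point inside the rectangle
print(point_in_polygon(-122.200, 47.728, box))  # point east of the rectangle
```

An odd number of edge crossings means the point is inside. A real geocoder also needs a spatial index so that each point is only tested against nearby polygons, which is presumably why rtree and spatialindex are among the dependencies.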

Zip code, block group, and council district shapefiles were downloaded from the King County GIS data portal: https://www5.kingcounty.gov/gisdataportal/

Informal neighborhoods were downloaded from: https://data.seattle.gov/dataset/Neighborhoods/2mbt-aqqx

Urban villages were downloaded from: https://data.seattle.gov/dataset/Urban-Villages/ugw3-tp9e

This geocoder requires geopandas and its dependencies. I had to run pip install geopandas, pip install rtree, and brew install spatialindex.

In [4]:
import init
import geocoder
import pandas as pd

geo = geocoder.Geocoder()

### Point Location Example

Inputs: lat/lon point in string/tuple format, pickle name (optional)

Outputs: dataframe of geocoded information
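
Because the point can be passed as a string or a tuple, it can help to normalize and range-check it before geocoding. The sketch below is a hypothetical helper, `parse_point`, and is not part of the geocoder module. It assumes the string form is "lat, lon", optionally wrapped in parentheses.

```python
def parse_point(point):
    """Normalize a 'lat, lon' string or a (lat, lon) pair to two floats."""
    if isinstance(point, str):
        # Accept "47.728, -122.350" as well as "(47.728, -122.350)"
        parts = point.strip().strip("()").split(",")
    else:
        parts = list(point)
    if len(parts) != 2:
        raise ValueError(f"expected two values (lat, lon), got {point!r}")
    lat, lon = (float(p) for p in parts)
    # A range check catches the common mistake of passing (lon, lat)
    if not (-90 <= lat <= 90 and -180 <= lon <= 180):
        raise ValueError(f"coordinates out of range: {point!r}")
    return lat, lon


print(parse_point("47.728, -122.350"))
print(parse_point((47.728, -122.350)))
```

For Seattle coordinates the range check rejects swapped input, since a longitude near -122 is not a valid latitude.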

In [5]:
df_point_location = geo.geocode_point((47.728, -122.350))
print(df_point_location)

      lat      lon                geometry
0  47.728  -122.35  POINT (-122.35 47.728)
geography     lat     lon   block_group neighborhood_long neighborhood_short  \
0          47.728 -122.35  530330004011   NO BROADER TERM        Bitter Lake   

geography seattle_city_council_district urban_village zipcode  
0                                  SCC5          None   98133  


### Batch Location Example with Pandas
Inputs: input pandas DataFrame, pickle name (optional). The columns must include 'lat' and 'lon', and the input DataFrame must not include a column named 'geography'.

Outputs: dataframe of geocoded information
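
The column requirements above can be checked before calling the geocoder. This is a minimal sketch with a hypothetical helper, `check_input_columns`, which is not part of the geocoder module. It takes any iterable of column names, so it works with a DataFrame's `columns` attribute.

```python
def check_input_columns(columns):
    """Raise ValueError if the column names break the batch input rules."""
    names = list(columns)
    # 'lat' and 'lon' are both required
    missing = [c for c in ("lat", "lon") if c not in names]
    if missing:
        raise ValueError(f"missing required column(s): {missing}")
    # 'geography' is reserved and must not appear in the input
    if "geography" in names:
        raise ValueError("input must not contain a column named 'geography'")
    return True


print(check_input_columns(["name", "lat", "lon"]))
```

With pandas, the call would look like `check_input_columns(df_input.columns)`.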


In [19]:
input_file = "./data/raw/sample_locations2.csv"
df_input = pd.read_csv(input_file)

df_batch_location = geo.geocode_df(df_input)
print(df_batch_location.head())


geography        lat         lon            block_group   block_group  \
0          47.608315 -122.317345            12th Avenue  530330086002   
1          47.603145 -122.306682   23rd & Union-Jackson  530330087003   
2          47.582350 -122.386420                Admiral  530330096002   
3          47.696854 -122.345977  Aurora-Licton Springs  530330018003   
4          47.670593 -122.382603                Ballard  530330047004   

geography      neighborhood_long neighborhood_long     neighborhood_short  \
0                    12th Avenue          DOWNTOWN            12th Avenue   
1           23rd & Union-Jackson      CENTRAL AREA   23rd & Union-Jackson   
2                        Admiral      WEST SEATTLE                Admiral   
3          Aurora-Licton Springs   NO BROADER TERM  Aurora-Licton Springs   
4                        Ballard           BALLARD                Ballard   

geography neighborhood_short seattle_city_council_district  \
0                 First Hill        

### Large file performance test

In [21]:
# Create a large file (100,000 rows) to test performance

input_file = "./data/raw/sample_locations2.csv"
df_input = pd.read_csv(input_file)
# Stack 2,301 copies of the sample (the original plus 2,300 repeats) in one
# concat call; concatenating inside a loop re-copies the growing frame each time
df_input_large = pd.concat([df_input] * 2301, ignore_index=True)

# start timer
import time
start = time.time()

# geocode
df_batch_location_large = geo.geocode_df(df_input_large)
print(df_batch_location_large.head())

# end timer
end = time.time()

print(end - start, "seconds")

geography        lat         lon            block_group   block_group  \
0          47.608315 -122.317345            12th Avenue  530330086002   
1          47.603145 -122.306682   23rd & Union-Jackson  530330087003   
2          47.582350 -122.386420                Admiral  530330096002   
3          47.696854 -122.345977  Aurora-Licton Springs  530330018003   
4          47.670593 -122.382603                Ballard  530330047004   

geography      neighborhood_long neighborhood_long     neighborhood_short  \
0                    12th Avenue          DOWNTOWN            12th Avenue   
1           23rd & Union-Jackson      CENTRAL AREA   23rd & Union-Jackson   
2                        Admiral      WEST SEATTLE                Admiral   
3          Aurora-Licton Springs   NO BROADER TERM  Aurora-Licton Springs   
4                        Ballard           BALLARD                Ballard   

geography neighborhood_short seattle_city_council_district  \
0                 First Hill        