# Geocoding

You can use Python to turn addresses (for example, "141 Neff Annex, Missouri School of Journalism, Columbia, MO 65211") into latitude/longitude coordinates -- useful if your analysis involves geography or you want to build an interactive map for an online package.

We have a CSV of payday lenders in Illinois. We're going to use a Python module called [`geopy`](https://geopy.readthedocs.io/en/latest/) to turn each lender's address into latitude/longitude coordinates, then tack those onto the data as new columns and write the result to a new file.
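
Before pointing anything at Google, it can help to see the "add new columns and write a new file" pattern on its own. The snippet below is a minimal sketch, not the notebook's actual data: the two sample rows, the `NAME` and `CITY` headers, and the placeholder coordinates of `0.0` are all made up for illustration, and `io.StringIO` stands in for real files.

```python
import csv
import io

# Hypothetical two-row sample standing in for the real CSV file
sample_in = io.StringIO(
    'NAME,CITY\n'
    'Lender A,Springfield\n'
    'Lender B,Chicago\n'
)
sample_out = io.StringIO()

reader = csv.DictReader(sample_in)

# The output headers are the original headers plus the new columns
headers = reader.fieldnames + ['LAT_Y', 'LONG_X']
writer = csv.DictWriter(sample_out, fieldnames=headers)
writer.writeheader()

for row in reader:
    # Placeholder values; a real geocoder would supply these
    row['LAT_Y'] = 0.0
    row['LONG_X'] = 0.0
    writer.writerow(row)

result = sample_out.getvalue()
print(result)
```

Each row comes out of the `DictReader` as a dictionary, so adding a column is just adding a key, as long as that key also appears in the `fieldnames` list handed to the `DictWriter`.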

In [None]:
# import the Google geocoder from geopy
# import Python's csv and time libraries
from geopy.geocoders import GoogleV3
import csv
import time

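# Make a geolocator object
# Set the `timeout` keyword argument to 5 (seconds)
# (Google requires an API key for geocoding requests; if you have one, pass it in with the `api_key` keyword argument)
geolocator = GoogleV3(timeout=5)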

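# in a `with` block, open the file to read from and the file to write to
# (`newline=''` is what the csv module's documentation recommends, so you don't get blank lines between rows on Windows)
with open('../data/payday_lenders.csv', 'r', newline='') as address_file_in, open('payday_lenders_geocoded.csv', 'w', newline='') as geocoded_file_out: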

    # make a DictReader object
    reader = csv.DictReader(address_file_in)
    
    # define your list of headers
    headers = ['NAME', 'DBA', 'STADDR', 'STADDR2', 'CITY', 'STATE', 'ZIP', 'MATCH_ADDR', 'LAT_Y', 'LONG_X']
    
    # make a DictWriter object, passing the list of headers to the `fieldnames` keyword argument
    writer = csv.DictWriter(geocoded_file_out, fieldnames=headers)
    
    # write the header row
    writer.writeheader()

    # start for loop here
    for row in reader:
        # We're going to put an if/else here to prevent the whole class from launching a
        # volley of 500 requests at Google. Let's get the first five (row 1 is the header).
        # (`line_num` is an attribute of the DictReader object -- it keeps track of what line number you're on)
        
        if reader.line_num <= 6:

            # Put the address in a Google-recognizable string: ADDRESS, CITY, STATE ZIP            
            addr = (row['STADDR'] + ' ' + row['STADDR2']).strip() + ', ' + row['CITY'] + ', ' + row['STATE'] + ' ' + row['ZIP']
            
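            # Geocode that string (`geocode` returns None when there's no match)
            location = geolocator.geocode(addr)
            
            # Plug results from the geocoder right back into the same row of data with new keys
            # values: the returned latitude, longitude and address Google matched on.
            # If nothing matched, leave the new columns blank instead of crashing.
            row['LAT_Y'] = location.latitude if location else ''
            row['LONG_X'] = location.longitude if location else ''
            row['MATCH_ADDR'] = location.address if location else ''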
            
            # Write the modified row to our new csv.
            writer.writerow(row)
            
            # To keep tabs on what's happening, get a printed message with address and line.
            print('Attempted geocode of ' + addr + ', row ' + str(reader.line_num))
            
            # Before we do all of this with the next row, pause for two seconds.
            time.sleep(2)
        else:
            break

# The `with` block closes both files for us; print a message when everything is done.
print('All done!')
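
geopy's `geocode` method returns `None` when it can't find a match, so it's worth checking how your code behaves on a miss before you spend real requests. The snippet below is a minimal sketch of one way to do that offline. `FakeLocation`, `FakeGeocoder`, `build_address` and `add_coordinates` are hypothetical names made up for this example, not part of geopy, and the sample addresses and coordinates are invented.

```python
from collections import namedtuple

# Hypothetical stand-in for the location object a geocoder returns
FakeLocation = namedtuple('FakeLocation', ['address', 'latitude', 'longitude'])


class FakeGeocoder:
    """Hypothetical stub: knows one address, returns None for anything else."""

    def geocode(self, addr):
        if addr == '100 Main St, Springfield, IL 62701':
            return FakeLocation('100 Main St, Springfield, IL 62701, USA', 39.8, -89.65)
        return None


def build_address(row):
    """Join the address fields into one string: ADDRESS, CITY, STATE ZIP."""
    street = (row['STADDR'] + ' ' + row['STADDR2']).strip()
    return street + ', ' + row['CITY'] + ', ' + row['STATE'] + ' ' + row['ZIP']


def add_coordinates(row, geocoder):
    """Add the three new columns to the row, leaving them blank on a miss."""
    location = geocoder.geocode(build_address(row))
    row['LAT_Y'] = location.latitude if location else ''
    row['LONG_X'] = location.longitude if location else ''
    row['MATCH_ADDR'] = location.address if location else ''
    return row


# One address the stub knows, and one it doesn't
hit = add_coordinates(
    {'STADDR': '100 Main St', 'STADDR2': '', 'CITY': 'Springfield', 'STATE': 'IL', 'ZIP': '62701'},
    FakeGeocoder(),
)
miss = add_coordinates(
    {'STADDR': '1 Nowhere Rd', 'STADDR2': 'Suite 2', 'CITY': 'Chicago', 'STATE': 'IL', 'ZIP': '60601'},
    FakeGeocoder(),
)
print(hit['LAT_Y'], hit['LONG_X'])
print(repr(miss['LAT_Y']))
```

Because `add_coordinates` only needs an object with a `geocode` method, you could pass it the real `geolocator` once the logic checks out.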