## Geocoding And Reverse Geocoding

Geocoding (converting a physical address or location into latitude/longitude) and reverse geocoding (converting a latitude/longitude pair into a physical address or location) are common tasks when working with geo-data.

Python offers several packages that simplify both tasks. In the tutorial below, I use pygeocoder, a wrapper for Google's geo-API, to both geocode and reverse geocode.

## Preliminaries

First we load the packages used in the script. Specifically, I am loading pygeocoder for its geo-functionality, pandas for its dataframe structures, and numpy for its missing value (np.nan) functionality.

In [26]:
# Load packages
from pygeocoder import Geocoder
import pandas as pd
import numpy as np


In [9]:
#!pip install pygeocoder

Successfully installed pygeocoder-1.2.5


### Create some simulated geo data

Geo-data comes in a wide variety of forms. In this case we have a Python dictionary of five latitude and longitude strings, with the two coordinates in each pair separated by a comma.

In [12]:
# Create a dictionary of raw data
data = {'Site 1': '31.336968, -109.560959',
        'Site 2': '31.347745, -108.229963',
        'Site 3': '32.277621, -107.734724',
        'Site 4': '31.655494, -106.420484',
        'Site 5': '30.295053, -104.014528'}

Let us turn the dictionary of simulated data into a dataframe.

In [13]:
# Convert the dictionary into a pandas dataframe
df = pd.DataFrame.from_dict(data, orient='index')

In [14]:
# View the dataframe
df

Unnamed: 0,0
Site 1,"31.336968, -109.560959"
Site 2,"31.347745, -108.229963"
Site 3,"32.277621, -107.734724"
Site 4,"31.655494, -106.420484"
Site 5,"30.295053, -104.014528"


In [24]:
df[0][0]

'31.336968, -109.560959'
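
The lookup above reads the first value by position even though the rows are labelled 'Site 1' to 'Site 5'. As far as I know, recent pandas versions deprecate that kind of integer fallback on a non-integer index, so a more explicit form may be safer. The sketch below rebuilds a two-row version of the data under the assumed name `sites` and reads the same value by position and by label.

```python
import pandas as pd

# Hypothetical two-row copy of the tutorial data, for illustration only
sites = pd.DataFrame.from_dict(
    {'Site 1': '31.336968, -109.560959',
     'Site 2': '31.347745, -108.229963'},
    orient='index')

# Select the first row of column 0 by position
first_by_position = sites[0].iloc[0]

# Select the same value by its row label and column label
first_by_label = sites.loc['Site 1', 0]

print(first_by_position)
print(first_by_label)
```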

You can see now that we have a dataframe with five rows, each containing a string of latitude and longitude. Before we can work with the data, we’ll need to:

1. separate the strings into latitude and longitude 

2. convert them into floats. 

The loop below does just that.

In [18]:
# Create two lists to hold the loop results
lat = []
lon = []

# For each row in the column of coordinate strings,
for row in df[0]:
    # Try to,
    try:
        # Split the row by comma and convert both parts to floats:
        # everything before the comma is the latitude,
        # everything after the comma is the longitude
        parts = row.split(',')
        row_lat, row_lon = float(parts[0]), float(parts[1])
    # But if the row cannot be split or converted,
    except (AttributeError, IndexError, ValueError):
        # use missing values for both latitude and longitude
        row_lat, row_lon = np.nan, np.nan
    # Append the results so lat and lon stay the same length
    lat.append(row_lat)
    lon.append(row_lon)

# Create two new columns from lat and lon
df['latitude'] = lat
df['longitude'] = lon
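
The same split-and-convert step can also be written without an explicit loop. The following is a minimal sketch, not the method used in this tutorial: it uses the pandas string accessor and `pd.to_numeric`, and works on a small stand-in dataframe named `sketch_df` (an assumed name) that includes one deliberately malformed row to show how bad values become missing values.

```python
import pandas as pd

# Hypothetical stand-in data; 'Site 3' is deliberately malformed
sketch_df = pd.DataFrame.from_dict(
    {'Site 1': '31.336968, -109.560959',
     'Site 2': '31.347745, -108.229963',
     'Site 3': 'not a coordinate'},
    orient='index')

# Split each string at the first comma into two columns (0 and 1)
parts = sketch_df[0].str.split(',', n=1, expand=True)

# Strip surrounding whitespace and convert to floats;
# anything that cannot be parsed becomes NaN
sketch_df['latitude'] = pd.to_numeric(parts[0].str.strip(), errors='coerce')
sketch_df['longitude'] = pd.to_numeric(parts[1].str.strip(), errors='coerce')

print(sketch_df)
```

One difference from the loop: a row with a valid number but no comma would keep its latitude and get a missing longitude, rather than having both set to missing.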


Let's take a look at what we have.

In [19]:
# View the dataframe
df

Unnamed: 0,0,latitude,longitude
Site 1,"31.336968, -109.560959",31.336968,-109.560959
Site 2,"31.347745, -108.229963",31.347745,-108.229963
Site 3,"32.277621, -107.734724",32.277621,-107.734724
Site 4,"31.655494, -106.420484",31.655494,-106.420484
Site 5,"30.295053, -104.014528",30.295053,-104.014528


Great. This is exactly what we want to see: one column of floats for latitude and one column of floats for longitude.

## Reverse Geocoding

To reverse geocode, we feed a specific latitude and longitude pair, in this case the first row (position 0), into pygeocoder's reverse_geocode function.

In [30]:
# Convert longitude and latitude to a location

# results = Geocoder.reverse_geocode(df['latitude'][0], df['longitude'][0])
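
To reverse geocode every row rather than only the first, one option is to pass each coordinate pair through a lookup function. The sketch below is an illustration under stated assumptions: `fake_reverse_geocode` and `add_locations` are hypothetical names, and the stand-in lookup just formats the coordinates so the example runs without network access or an API key. In real use the lookup would wrap a call such as `Geocoder.reverse_geocode(lat, lon)`.

```python
import pandas as pd


def fake_reverse_geocode(lat, lon):
    """Hypothetical stand-in for a real reverse-geocoding call."""
    return f"place near ({lat:.2f}, {lon:.2f})"


def add_locations(frame, lookup):
    """Return a copy of frame with a 'location' column from lookup(lat, lon)."""
    out = frame.copy()
    out['location'] = [
        # Skip rows where either coordinate is missing
        lookup(lat, lon) if pd.notna(lat) and pd.notna(lon) else None
        for lat, lon in zip(out['latitude'], out['longitude'])
    ]
    return out


# Two rows of the tutorial's coordinates, rebuilt here so the sketch is self-contained
coords = pd.DataFrame(
    {'latitude': [31.336968, 30.295053],
     'longitude': [-109.560959, -104.014528]},
    index=['Site 1', 'Site 5'])

located = add_locations(coords, fake_reverse_geocode)
print(located['location'])
```

Passing the lookup in as an argument keeps the looping logic separate from the geocoding service, so the same helper can be exercised with a stub before any real requests are made.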