- Geocoding: converting a physical address or location into latitude/longitude.
- Reverse geocoding: converting a latitude/longitude pair into a physical address or location.

Python offers a number of packages that make these tasks easy. Here I'm using pygeocoder, a wrapper for Google's geocoding API, to demonstrate how geocoding and reverse geocoding work.

# Geocoding

In [1]:
# Load packages:
# pygeocoder for its geo-functionality
# pandas for dataframe structures
# numpy for dealing with missing values (np.nan)
from pygeocoder import Geocoder
import pandas as pd
import numpy as np

### Create some simulated geo data

Geo data often comes as latitude and longitude strings, with the two coordinates in each pair separated by a comma. For demonstration, prepare a dictionary of five latitude and longitude strings.

In [2]:
# Create a dictionary of raw data
data = {'Site 1': '31.336968, -109.560959',
        'Site 2': '31.347745, -108.229963',
        'Site 3': '32.277621, -107.734724',
        'Site 4': '31.655494, -106.420484',
        'Site 5': '30.295053, -104.014528'}

In [3]:
# Convert the dictionary of raw data into a pandas dataframe
df = pd.DataFrame.from_dict(data, orient='index')

In [4]:
# View the dataframe
df

Unnamed: 0,0
Site 5,"30.295053, -104.014528"
Site 4,"31.655494, -106.420484"
Site 3,"32.277621, -107.734724"
Site 2,"31.347745, -108.229963"
Site 1,"31.336968, -109.560959"


As shown above, we have a dataframe with five rows, each containing a single string that holds both latitude and longitude. Before we can work with the data, we'll need to:
- separate each string into latitude and longitude
- convert them into floats

In [5]:
# Create two empty lists to hold the loop results
lat = []
lon = []

# For each row in the column of raw strings,
for row in df[0]:
    try:
        # Split the row at the comma and convert both parts to floats
        # before appending, so lat and lon always stay the same length
        parts = row.split(',')
        row_lat, row_lon = float(parts[0]), float(parts[1])
    # If the row cannot be parsed,
    except (AttributeError, IndexError, ValueError):
        # record a missing value for both coordinates
        row_lat, row_lon = np.nan, np.nan
    # Append everything before the comma to lat
    lat.append(row_lat)
    # Append everything after the comma to lon
    lon.append(row_lon)
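
The same split-and-convert step can also be written without an explicit loop. The following is a minimal sketch rather than the method used in this notebook; it assumes the raw strings sit in a column labeled 0, as above, and the names `df_alt` and `coords` are introduced here only for illustration.

```python
import pandas as pd

# Rebuild the example data so this block runs on its own
data = {'Site 1': '31.336968, -109.560959',
        'Site 2': '31.347745, -108.229963',
        'Site 3': '32.277621, -107.734724',
        'Site 4': '31.655494, -106.420484',
        'Site 5': '30.295053, -104.014528'}
df_alt = pd.DataFrame.from_dict(data, orient='index')

# Split each string at the first comma into two columns (0 and 1)
coords = df_alt[0].str.split(',', n=1, expand=True)

# Strip stray whitespace and convert to floats;
# errors='coerce' turns anything unparseable into NaN
df_alt['latitude'] = pd.to_numeric(coords[0].str.strip(), errors='coerce')
df_alt['longitude'] = pd.to_numeric(coords[1].str.strip(), errors='coerce')
```

One difference worth noting: here each half is converted independently, so a row with a valid latitude but a malformed longitude keeps its latitude instead of losing both values.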

In [6]:
# Create two new columns from lat and lon
df['latitude'] = lat
df['longitude'] = lon

In [7]:
df

Unnamed: 0,0,latitude,longitude
Site 5,"30.295053, -104.014528",30.295053,-104.014528
Site 4,"31.655494, -106.420484",31.655494,-106.420484
Site 3,"32.277621, -107.734724",32.277621,-107.734724
Site 2,"31.347745, -108.229963",31.347745,-108.229963
Site 1,"31.336968, -109.560959",31.336968,-109.560959
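
Once the coordinates are floats, a quick sanity check can catch values that parsed but cannot be real coordinates: latitude must lie between -90 and 90, and longitude between -180 and 180. The sketch below uses a small made-up dataframe (`check`, with one deliberately invalid row and one missing row) purely for illustration; it is not part of the site data above.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one good row, one out-of-range latitude, one missing row
check = pd.DataFrame({'latitude': [31.336968, 95.0, np.nan],
                      'longitude': [-109.560959, -104.014528, np.nan]},
                     index=['good', 'bad latitude', 'missing'])

# between() is inclusive on both ends and returns False for NaN
valid = (check['latitude'].between(-90, 90)
         & check['longitude'].between(-180, 180))

# Keep only the rows whose coordinates fall inside the valid ranges
clean = check[valid]
```

Only the row labeled `good` survives the filter, since missing values fail the range test as well.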


# Reverse Geocoding