![PyData_logo](./static/pydata-logo-madrid-2016.png)

# Remove Before Flight
## Analyzing Flight Safety Data with Python

### Getting missing latitude and longitude values with the Google Geocoding API

###### https://developers.google.com/maps/documentation/geocoding/intro?hl=es#geocoding 

In [1]:
import pandas as pd
import numpy as np

%matplotlib notebook
import matplotlib.pyplot as plt

In [2]:
# Some configuration options:
pd.set_option('display.max_columns', 70)
pd.set_option('display.max_rows', 50)

In [3]:
# Read the missing locations
need_location = pd.read_csv('../data/need_location.csv', index_col='Unnamed: 0')
need_location.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 4307 entries, 46 to 76878
Data columns (total 5 columns):
ev_city       4285 non-null object
ev_country    4170 non-null object
ev_state      2936 non-null object
latitude      0 non-null float64
longitude     1 non-null object
dtypes: float64(1), object(4)
memory usage: 201.9+ KB


In [4]:
need_location.tail()

Unnamed: 0,ev_city,ev_country,ev_state,latitude,longitude
76812,Banbury,UK,,,
76836,VERMILLION 255B,,GM,,
76846,Kigali,RW,,,
76850,JUNEAU,USA,AK,,
76878,Cozumel,MX,,,


In [5]:
from utils.geoloc import GoogleMapsClient

In [6]:
# Read the API key from a file.
# Get your own key at https://developers.google.com/maps/documentation/geocoding/get-api-key
with open('./key.txt') as f:
    key = f.readline().strip()

In [7]:
Session = GoogleMapsClient(key=key)

index = need_location.index
cities = need_location['ev_city'].values
states = need_location['ev_state'].values
countries = need_location['ev_country'].values
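
The `utils.geoloc` module is not shown in this notebook, so the following is only a minimal sketch of what a lookup against the Geocoding API could involve, not the actual `GoogleMapsClient` implementation. The helper names `build_geocode_url` and `parse_lat_lon` are hypothetical, and the sample response is hand-written to mimic the shape of the API's JSON output. No network request is made.

```python
from urllib.parse import urlencode

# Endpoint of the Google Geocoding API (JSON output).
GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def build_geocode_url(address, key):
    """Return the request URL for a free-text address (hypothetical helper)."""
    return GEOCODE_URL + "?" + urlencode({"address": address, "key": key})


def parse_lat_lon(response):
    """Extract [lat, lon] from a decoded JSON response, or [] if nothing matched."""
    if response.get("status") != "OK" or not response.get("results"):
        return []
    location = response["results"][0]["geometry"]["location"]
    return [location["lat"], location["lng"]]


# Hand-written sample shaped like a Geocoding API response (not real output).
sample = {
    "status": "OK",
    "results": [{"geometry": {"location": {"lat": 58.3, "lng": -134.4}}}],
}
url = build_geocode_url("JUNEAU, AK, USA", key="YOUR_API_KEY")
lat_lon = parse_lat_lon(sample)
no_match = parse_lat_lon({"status": "ZERO_RESULTS", "results": []})
```

Returning an empty list for a failed lookup mirrors the way the loop further down checks the result before writing coordinates.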

As the number of requests is limited to 2500 per day, the DataFrame needs to be split:

In [8]:
need_location.iloc[0:2000]

Unnamed: 0,ev_city,ev_country,ev_state,latitude,longitude
46,NIGHTMUTE,USA,AK,,
50,UTICA,USA,NY,,
56,Toronto,CA,,,
92,Sydney,AS,,,
116,GRAND JUNCTION,USA,CO,,
120,JUNEAU,USA,AK,,
129,YUBA CITY,USA,CA,,
149,Sidney,USA,MT,,
153,Lake in the Woods,CA,ON,,
163,KEMMERER,USA,WY,,


In [9]:
need_location.iloc[2000:]

Unnamed: 0,ev_city,ev_country,ev_state,latitude,longitude
35161,Miami,USA,FL,,
35187,CEDAR RAPIDS,USA,IA,,
35202,BURBANK,USA,CA,,
35212,BEDFORD PARK,USA,IL,,
35238,GARDNER,USA,KS,,
35240,FORT MYERS,USA,FL,,
35244,BETHEL,USA,AK,,
35250,"Naha, Okinawa",JA,,,
35272,Godensee,AU,,,
35275,MANILA,RP,,,


In [8]:
# Part 2: rows 2000 to the end
for ii in range(2000, len(need_location)):
    location = Session.get_lat_lon_from_city_country(cities[ii], countries[ii], states[ii])
    # An empty list means no location was found for this row
    if location:
        lat, lon = location[0], location[1]
        # DataFrame.set_value was removed in pandas 1.0; .at sets a single cell
        need_location.at[index[ii], 'latitude'] = lat
        need_location.at[index[ii], 'longitude'] = lon

need_location.iloc[2000:].to_csv('../data/have_location_part2.csv')

wrong request: Kansai, JA
wrong request: Sucre, FN, BL
wrong request: Evora, FN, PO
wrong request: Googlong, NSW, AS
wrong request: Bosnia, BK
wrong request: TREASURE CAY   , BF
wrong request: W. CAMERON 617 , GM
wrong request: VERMILION 250  , GM
wrong request: TREASURE CAY   , BF
wrong request: DUNCAN TOWN    , BF
wrong request: EUCEMBUNE      , AS
wrong request: SHIP SHOAL 225 , GM
wrong request: TUCUTI , PM
wrong request: Makung Island, TW
wrong request: Rumbek, SU
wrong request: EC 270, GM, USA
wrong request: GULF-OF-MEXICO , GM
wrong request: GULF OF MEXICO , GM
wrong request: GULF OF MEXICO , GM
wrong request: STELLA MARIS   , BF
wrong request: Goma, CF
wrong request: SOTO CANO, HO
wrong request: PERDE BAY      , AK, USA
wrong request: RANCHO SANTA YN, MX
wrong request: Bourke , Australia, AS
wrong request: MIXCOLAJA      , GT
wrong request: Neuhausen ob Eck, GE
wrong request: Subic Bay, RP
wrong request: Main Pass 61A, GM, USA
wrong request: SHIP SHOAL 290K, GM
wrong request: V
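
Several of the failed queries above contain padded city names (for example `TREASURE CAY   `), and many rows have no state at all. Whether that is what makes those requests fail is not established here. As one possible clean-up step before building a query, missing parts can be dropped and whitespace trimmed. `is_missing` and `build_query` are hypothetical helpers, and the `city, state, country` order is an assumption based on the log lines.

```python
import math


def is_missing(value):
    """True for None, NaN, or a blank string (how empty CSV cells typically appear)."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and not value.strip()


def build_query(city, state, country):
    """Join the available parts into 'city, state, country', trimming whitespace."""
    parts = [str(p).strip() for p in (city, state, country) if not is_missing(p)]
    return ", ".join(parts)


padded = build_query("TREASURE CAY   ", float("nan"), "BF")
full = build_query("PERDE BAY      ", "AK", "USA")
```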

In [10]:
need_location.iloc[2000:]

Unnamed: 0,ev_city,ev_country,ev_state,latitude,longitude
35161,Miami,USA,FL,25.761680,-80.1918
35187,CEDAR RAPIDS,USA,IA,41.977880,-91.6656
35202,BURBANK,USA,CA,34.180839,-118.309
35212,BEDFORD PARK,USA,IL,41.764172,-87.7813
35238,GARDNER,USA,KS,38.810841,-94.9272
35240,FORT MYERS,USA,FL,26.640628,-81.8723
35244,BETHEL,USA,AK,60.792222,-161.756
35250,"Naha, Okinawa",JA,,26.195134,127.67
35272,Godensee,AU,,48.143000,11.4249
35275,MANILA,RP,,14.599512,120.984


In [11]:
need_location.iloc[2000:].to_csv('../data/have_location_part2.csv')

In [35]:
# Notebook style
from IPython.display import HTML
css_file = './static/style.css'
with open(css_file) as f:
    css = f.read()
HTML(css)