## Google Geocoding

* Now that everyone has an API Key (Congratulations!), it is time to start using it!

* You can utilize the Google Maps Geocoding API to turn addresses into latitudinal and longitudinal coordinates.

  * This process of converting an address to coordinates is called **geocoding**.

  * Since many APIs only understand locations formatted in terms of latitude/longitude, geocoding will be very valuable in translating addresses into data that APIs - like the Google Places API - can understand.

    * Google's API is not free and if credit card information is provided, Google will charge past a certain usage point. **It is a good idea to avoid pushing your API key to github by using adding the `config.py` to their `.gitignore` file or using environment variables.**

In [281]:
# Dependencies
import requests
import json
import pandas as pd

from pprint import pprint

file = pd.read_csv("winemag-data-130k-v2.csv")

file.head()


Unnamed: 0.1,Unnamed: 0,country,description,designation,points,price,province,region_1,region_2,taster_name,taster_twitter_handle,title,variety,winery
0,0,Italy,"Aromas include tropical fruit, broom, brimston...",Vulkà Bianco,87,,Sicily & Sardinia,Etna,,Kerin O’Keefe,@kerinokeefe,Nicosia 2013 Vulkà Bianco (Etna),White Blend,Nicosia
1,1,Portugal,"This is ripe and fruity, a wine that is smooth...",Avidagos,87,15.0,Douro,,,Roger Voss,@vossroger,Quinta dos Avidagos 2011 Avidagos Red (Douro),Portuguese Red,Quinta dos Avidagos
2,2,US,"Tart and snappy, the flavors of lime flesh and...",,87,14.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Rainstorm 2013 Pinot Gris (Willamette Valley),Pinot Gris,Rainstorm
3,3,US,"Pineapple rind, lemon pith and orange blossom ...",Reserve Late Harvest,87,13.0,Michigan,Lake Michigan Shore,,Alexander Peartree,,St. Julian 2013 Reserve Late Harvest Riesling ...,Riesling,St. Julian
4,4,US,"Much like the regular bottling from 2012, this...",Vintner's Reserve Wild Child Block,87,65.0,Oregon,Willamette Valley,Willamette Valley,Paul Gregutt,@paulgwine,Sweet Cheeks 2012 Vintner's Reserve Wild Child...,Pinot Noir,Sweet Cheeks


In [282]:
test0 = "G7 2012 The 7th Generation Gran Reserva Estate Bottled Cabernet Sauvignon (Loncomilla Valley)"
test2 = "Alta Colina 2012 Old 900 Syrah (Paso Robles)"
test3 = "Prospect 772 2014 Stepping Stones Grenache Blanc (Santa Barbara County)"
test4 = "5 Mini Series of 424"
test5 = "Best wine ever!"

testing = []

testing.append(test0)
testing.append(test2)
testing.append(test3)
testing.append(test4)
testing.append(test5)

testing


['G7 2012 The 7th Generation Gran Reserva Estate Bottled Cabernet Sauvignon (Loncomilla Valley)',
 'Alta Colina 2012 Old 900 Syrah (Paso Robles)',
 'Prospect 772 2014 Stepping Stones Grenache Blanc (Santa Barbara County)',
 '5 Mini Series of 424',
 'Best wine ever!']

In [283]:
list = [10, 9, 20, 30]

for x in list:
    print(x)

10
9
20
30


In [284]:
list = [10, 9, 20, 30]
new_list = []

#any(x > 19 for x in list)

if (any(x > 19 for x in list)) == True:
    numbers = [i for i in list if i >19]
    for x in range (0, len(numbers)):
        new_list.append(numbers[x])

new_list


[20, 30]

In [285]:
def hasNumbers(inputString):
    return any(char.isdigit() for char in inputString) 

In [286]:
testing

['G7 2012 The 7th Generation Gran Reserva Estate Bottled Cabernet Sauvignon (Loncomilla Valley)',
 'Alta Colina 2012 Old 900 Syrah (Paso Robles)',
 'Prospect 772 2014 Stepping Stones Grenache Blanc (Santa Barbara County)',
 '5 Mini Series of 424',
 'Best wine ever!']

In [287]:
testing_list = []

for string in testing:
    internal_list = []
    internal_list2 = []
    numbers = []
    for x in string.split():
        internal_list.append(x)
    if hasNumbers(internal_list) == True:
        for y in internal_list:
            if y.isdigit():
                internal_list2.append(int(y))
        numbers = [i for i in internal_list2 if i > 1900]
        if len(numbers) == 1:
            testing_list.append(numbers[0])
        elif len(numbers) == 0:
            testing_list.append("N/A")
    if hasNumbers(internal_list) == False:
        testing_list.append("N/A")
    
    
    print(internal_list)
    print(internal_list2)
    print(numbers)

    print (string)
    counter = counter + 1

print(counter)
print(testing_list)

['G7', '2012', 'The', '7th', 'Generation', 'Gran', 'Reserva', 'Estate', 'Bottled', 'Cabernet', 'Sauvignon', '(Loncomilla', 'Valley)']
[2012]
[2012]
G7 2012 The 7th Generation Gran Reserva Estate Bottled Cabernet Sauvignon (Loncomilla Valley)
['Alta', 'Colina', '2012', 'Old', '900', 'Syrah', '(Paso', 'Robles)']
[2012, 900]
[2012]
Alta Colina 2012 Old 900 Syrah (Paso Robles)
['Prospect', '772', '2014', 'Stepping', 'Stones', 'Grenache', 'Blanc', '(Santa', 'Barbara', 'County)']
[772, 2014]
[2014]
Prospect 772 2014 Stepping Stones Grenache Blanc (Santa Barbara County)
['5', 'Mini', 'Series', 'of', '424']
[5, 424]
[]
5 Mini Series of 424
['Best', 'wine', 'ever!']
[]
[]
Best wine ever!
129976
[2012, 2012, 2014, 'N/A', 'N/A']


In [294]:
title_list = [title for title in file["title"]]
year_list = []

title_list[0]

'Nicosia 2013 Vulkà Bianco  (Etna)'

In [293]:
counter = 0

for title in title_list:
    title_strings_list = []
    title_numbers_list = []
    year_numbers = []
    for x in string.split():
        title_strings_list.append(x)
    if hasNumbers(title_strings_list) == True:
        for y in title_strings_list:
            if y.isdigit():
                title_numbers_list.append(int(y))
        year_numbers = [i for i in internal_list2 if i > 1900]
        if len(year_numbers) == 1:
            year_list.append(numbers[0])
        elif len(year_numbers) == 0:
            year_list.append("N/A")
    if hasNumbers(title_strings_list) == False:
        year_list.append("N/A")
    counter = counter + 1

print(title_strings_list)
print(title_numbers_list)
print(year_numbers)

['Best', 'wine', 'ever!']
[]
[]


'N/A'

In [186]:
testing_list = []

for string in testing:
    for x in string.split():
        if x.isdigit():
            internal_list.append(int(x))
            
            
    #if (any (x > 1900 for x in internal_list)) == True:
        #numbers = [i for i in internal_list if i >1900]
        #for x in range (0, len(numbers)):
            #testing_list.append(numbers[x])

    
print(internal_list)
print(testing_list)
            #if (any (x > 1900 for x in internal_list)) == True:
                #testing_list.append(x)
            #elif (any (x > 1900 for x in internal_list)) == False:
                #testing_list.append("N/A")

#testing_list

            

[2012, 2012, 900, 772, 2014, 5, 424]
[]


In [124]:
testing_list =[]

def hasNumbers(inputString):
    return any(char.isdigit() for char in inputString) 

for string in testing:
    #internal_check = []
    
    #if hasNumbers(string) == False:
        #testing_list.append("N/A")
        
        
    if hasNumbers(string) == True:
        for x in string.split():
            if x.isdigit():
                internal_check.append(int(x))
                
        if (any (y > 1900) for y in internal_check) == False:
            testing_list.append("N/A")

        if (any (y > 1900) for y in internal_check) == True:
            if y in interal_check > 1900 in internal_check:
                testing_list.append(y)


internal_check


[2012,
 2012,
 900,
 772,
 2014,
 5,
 424,
 2012,
 2012,
 900,
 772,
 2014,
 5,
 424,
 2012,
 2012,
 900,
 772,
 2014,
 5,
 424,
 2012,
 2012,
 900,
 772,
 2014,
 5,
 424]

In [127]:
testing

testing_list =[]

def hasNumbers(inputString):
    return any(char.isdigit() for char in inputString) 

for string in testing:
    if hasNumbers(string) == True:
        for x in string.split():
            if x.isdigit() and int(x) > 1950:
                testing_list.append(int(x))
                break
            #else: 
                #year_list.append("N/A")
    else:
            testing_list.append("N/A")
            
testing_list

[2012, 2012, 2014, 'N/A']

In [123]:
##This is for Parsing the Year out of Title

title_list = [title for title in file["title"]]
#print(year_list[0])
#print(len(title_list))

year_list = []
test_list = []

def hasNumbers(inputString):
    return any(char.isdigit() for char in inputString) 

for string in title_list:
    if hasNumbers(string) == True:
        for x in string.split():
            if x.isdigit() and int(x) > 1950:
                year_list.append(int(x))
                break
            #else: 
                #year_list.append("N/A")
    else:
            test_list.append("N/A")


print(len(title_list))
print(len(year_list))
print(len(test_list))

total = len(year_list) + len(test_list)
print(total)

129971
125336
4275
129611


In [None]:
for string in title_list:
    for x in string.split():
        if x.isdigit():
            year_list.append(int(x))



def hasNumbers(inputString):
    return any(char.isdigit() for char in inputString) 

hasNumbers("St. Julian 2013 Reserve Late Harvest Riesling (Lake Michigan Shore)")
hasNumbers("Collet NV Brut Ros√©  (Champagne)")
            
#test = ["Nicosia 2013 Vulk√† Bianco  (Etna)", "Quinta dos Avidagos 2011 Avidagos Red (Douro)", 
        #"Rainstorm 2013 Pinot Gris (Willamette Valley)", 
        #"St. Julian 2013 Reserve Late Harvest Riesling (Lake Michigan Shore)"]
#for string in test:
    #for s in string.split():
        #if s.isdigit():
            #list.append(int(s))
        
#list

#print("test")

In [10]:
# Target location

for winery in file["winery name"]:

    target_location = winery
    
    # Build the endpoint URL
    target_url = ('https://maps.googleapis.com/maps/api/geocode/json?'
    'address={0}&key={1}').format(target_location, gkey)
    
    print(target_url)

NameError: name 'gkey' is not defined

In [1]:
target_location = win
    
    
    
# Build the endpoint URL
target_url = ('https://maps.googleapis.com/maps/api/geocode/json?'
    'address={0}&key={1}').format(target_location, gkey)

ModuleNotFoundError: No module named 'config'

* **Reminder**: printing the url will also expose your API key. While it is useful for demonstration purposes here, it should be avoided in projects and homework.

* This may be the first time you have seen string substitutions. I will explain.

In [10]:
# Run a request to endpoint and convert result to json
geo_data = requests.get(target_url).json()

# Pretty print the json
pprint(geo_data)

{'results': [{'address_components': [{'long_name': '436',
                                      'short_name': '436',
                                      'types': ['street_number']},
                                     {'long_name': 'Saint Helena Highway',
                                      'short_name': 'St Helena Hwy',
                                      'types': ['route']},
                                     {'long_name': 'Saint Helena',
                                      'short_name': 'St Helena',
                                      'types': ['locality', 'political']},
                                     {'long_name': 'Napa County',
                                      'short_name': 'Napa County',
                                      'types': ['administrative_area_level_2',
                                                'political']},
                                     {'long_name': 'California',
                                      'short_name': 'CA',
        

In [11]:
# Print the json (using json.dumps)
print(json.dumps(geo_data, indent=4, sort_keys=True))

{
    "results": [
        {
            "address_components": [
                {
                    "long_name": "436",
                    "short_name": "436",
                    "types": [
                        "street_number"
                    ]
                },
                {
                    "long_name": "Saint Helena Highway",
                    "short_name": "St Helena Hwy",
                    "types": [
                        "route"
                    ]
                },
                {
                    "long_name": "Saint Helena",
                    "short_name": "St Helena",
                    "types": [
                        "locality",
                        "political"
                    ]
                },
                {
                    "long_name": "Napa County",
                    "short_name": "Napa County",
                    "types": [
                        "administrative_area_level_2",
                        "political"

In [12]:
# Extract latitude and longitude
lat = geo_data["results"][0]["geometry"]["location"]["lat"]
lng = geo_data["results"][0]["geometry"]["location"]["lng"]

# Print the latitude and longitude
print('''
    Winery: {0}
    Latitude: {1}
    Longitude: {2}
    '''.format(target_location, lat, lng))


    City: Heitz Winery
    Latitude: 38.4906878
    Longitude: -122.4508595
    


* Now that we have finished, feel free to visit the [Google Maps Geocoding API](https://developers.google.com/maps/documentation/geocoding/start) documentation page and how the code created is effectively the same as what's expressed in the documentation.

  * It's easy to be intimidated by code documentation but with a little practice it becomes simple to comprehend!