# Google ID

Detailed information about an entry in the Google Places API (e.g. user rating, price level, geographic location) is requested through an identifier assigned by Google. Every listed place therefore needs at least one 'place_id'.

Unfortunately, Google charges for its API service. Since my free trial ended in December, I am not able to share my API key with you. If you sign up at [Google Cloud](https://cloud.google.com/?hl=de), you will receive a 90-day trial. Be aware that this requires your credit card information.

## 'Place Search' API Request

The 'Place Search' request takes an input and returns general information about a place.

Possible input types are:

1) 'phonenumber' (string in international format, prefixed by a plus sign)
2) 'textquery' (string)

The Restaurants dataset contains each restaurant's name and phone number, so both input types can be used to acquire the 'place_id'. I decided to try both ways to make sure I get the right results.
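Because the 'phonenumber' input type expects the international format, numbers stored in a national notation would need converting before the request. The helper below is only a minimal sketch of that step: the name `to_international` and the default country code `49` (Germany) are assumptions that are not part of the original notebook, and the rules do not cover every notation (for example `+49 (0)30 ...`).

```python
import re


def to_international(raw: str, country_code: str = "49") -> str:
    """Return the number as +<country><national>, or "" if it has no digits.

    Assumption: numbers without a '+' or '00' prefix are national numbers
    of the country given by `country_code`.
    """
    raw = raw.strip()
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if not digits:
        return ""
    if raw.startswith("+"):
        return "+" + digits  # already international
    if digits.startswith("00"):
        return "+" + digits[2:]  # 00 is the international call prefix
    if digits.startswith("0"):
        return "+" + country_code + digits[1:]  # drop the trunk prefix 0
    return "+" + country_code + digits
```

Applied to the dataset, this could be used as `restaurants_df['fon'].map(to_international)` before building the phone book.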

### 1) Using each restaurant's phone number

The request requires a list with the phone number of each restaurant:

In [None]:
import csv
import googlemaps
import pandas as pd

# Import the original CSV file with the relevant information
data = pd.read_csv(r'data/restaurants.csv', converters={'fon': str}, delimiter=";")
restaurants_df = pd.DataFrame(data, columns=[
    'unique_id',
    'name',
    'strasse_nr',
    'art',
    'lieferung',
    'fon'])

# Only restaurants shall be included
restaurants_df = restaurants_df.drop(restaurants_df[restaurants_df.art != "Gastronomie (Café, Restaurant, Imbiss, Lebensmittelhandlung, usw.)"].index)

# Creating a phone book
fon_book = restaurants_df['fon']

The 'googlemaps' module is the Python client for connecting to the Google Maps API services:

In [None]:
# API specifications
api_key = input("Enter your Google-API-Key here: ")
gmaps = googlemaps.Client(key=api_key)

# Creating a dictionary with layout {'phonenumber': 'place_id'}
fon_id = {}

# Executing the API request
for fon in fon_book:
    if fon == "":
        continue  # skip restaurants without a phone number
    places_result = gmaps.find_place(input=fon, input_type="phonenumber")
    fon_id[fon] = places_result['candidates']

# Export dictionary as CSV file (newline='' avoids blank rows on Windows)
with open('data/fon_id.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    for key, value in fon_id.items():
        writer.writerow([key, value])

Executing the request took about 1 minute. I received the following CSV file:

In [None]:
with open('data/fon_id.csv', 'r') as csv_file:
    for line in csv_file:
        print(line)

Unfortunately, a 'place_id' was not found for every phone number. One possible reason is that the number given in the survey differs from the number stored on Google. I was also surprised that several 'place_id's were found for some phone numbers. In these cases, the owners might run several businesses under the same number.
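To see how many numbers fall into each of these cases, the candidates can be flattened into plain lists of 'place_id' strings and then labelled. The following is a sketch under assumptions: `lookup_place_ids` and `classify` are hypothetical helper names, `client` is expected to be a `googlemaps.Client` (or any object with a compatible `find_place` method), and restricting the response with `fields=["place_id"]` is my addition rather than what the notebook does.

```python
def lookup_place_ids(client, queries, input_type):
    """Map each non-empty query to the list of place_id strings found for it."""
    results = {}
    for query in queries:
        if not query:
            continue  # nothing to search for
        response = client.find_place(
            input=query, input_type=input_type, fields=["place_id"]
        )
        # 'candidates' is a list of dicts such as {'place_id': '...'}
        results[query] = [c["place_id"] for c in response.get("candidates", [])]
    return results


def classify(place_ids):
    """Label a result list as 'missing', 'unique' or 'ambiguous'."""
    if not place_ids:
        return "missing"
    return "unique" if len(place_ids) == 1 else "ambiguous"
```

With the phone book from above, `lookup_place_ids(gmaps, fon_book, "phonenumber")` would return a dictionary whose values can be counted per label, for example with `collections.Counter`.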

### 2) Using each restaurant's name

The request requires a list with the name of each restaurant. To get more precise results, I decided to include the address (i.e. street and house number) as well. Otherwise, generic names (e.g. "McDonald's") would be resolved based on the location of the requester's IP address:

In [None]:
# Creating new column with request text
restaurants_df['name_street'] = restaurants_df['name'] + " " + restaurants_df['strasse_nr']
name_book_df = restaurants_df[['name', 'name_street']]

# API specifications
api_key = input("Enter your API-Key here: ")
gmaps = googlemaps.Client(key=api_key)

# Creating a dictionary with layout {'name': 'place_id'}
name_id = {}

# Executing the request
for index, restaurant in name_book_df.iterrows():
    name_street = restaurant['name_street']
    name = restaurant['name']
    places_result = gmaps.find_place(input=name_street, input_type='textquery')
    name_id[name] = places_result['candidates']

# Export dictionary as CSV file (newline='' avoids blank rows on Windows)
with open('data/name_id.csv', 'w', newline='') as csv_file:
    writer = csv.writer(csv_file)
    for key, value in name_id.items():
        writer.writerow([key, value])

Executing the request took about 5 minutes. I received the following CSV file:

In [None]:
with open('data/name_id.csv', 'r') as csv_file:
    for line in csv_file:
        print(line)

Using the restaurant's name returned better results than the phone number. However, a 'place_id' was still not found for every restaurant, and some places still returned multiple IDs.
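One option that might reduce ambiguous text-query results is the `location_bias` argument of `find_place`, which accepts strings such as `ipbias`, `point:lat,lng` or `circle:radius@lat,lng`. I have not tried this on the dataset, so the snippet is a sketch only; `circle_bias` is a hypothetical helper and the coordinates in the usage comment are placeholders.

```python
def circle_bias(lat, lng, radius_m):
    """Build a 'circle:radius@lat,lng' string for the location_bias argument."""
    return f"circle:{radius_m}@{lat},{lng}"


# Hypothetical usage (placeholder coordinates, not taken from the dataset):
# gmaps.find_place(input=name_street, input_type="textquery",
#                  location_bias=circle_bias(52.52, 13.405, 5000))
```

A bias only ranks nearby results higher. It does not strictly restrict the search area, so multiple candidates may still be returned.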

## Appending the Place IDs

The results of both requests are now appended to the Restaurants data:

### 1) Received via phone number

In [None]:
# Load fon_id.csv as dataframe
fon_id_df = pd.read_csv(r'data/fon_id.csv', delimiter=',', dtype=str,
                        header=None, names=['fon', 'fon_place_id'])

# Dissolve nested place_id (cleaning the dataframe that is merged below)
fon_id_df['fon_place_id'] = fon_id_df['fon_place_id'].str.replace("{'place_id': ", "", regex=False)
fon_id_df['fon_place_id'] = fon_id_df['fon_place_id'].str.replace("}", "", regex=False)

# Merge dataframes based on the shared 'fon' column
restaurants_fon_id_df = pd.merge(restaurants_df, fon_id_df, on='fon')
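The string replacements leave the list brackets and quotes in the column. As an alternative sketch, the stored text can be parsed back into Python objects with `ast.literal_eval`, which gives a real list of IDs per row. The helper name `parse_candidates` is an assumption, and it expects cells in the form written by the export step (e.g. `[{'place_id': '...'}]`).

```python
import ast


def parse_candidates(cell):
    """Turn a cell such as "[{'place_id': 'abc'}]" into ['abc']."""
    if not isinstance(cell, str) or not cell.strip():
        return []  # missing value (e.g. NaN) or empty cell
    return [candidate["place_id"] for candidate in ast.literal_eval(cell)]
```

It would be applied to the raw column before any replacement, for example `fon_id_df['fon_place_id'].apply(parse_candidates)`.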

The 'place_id' received through the phone number query is now appended to the Restaurants dataframe. Since there are a lot of missing values, it is necessary to add the results received through the name as well:

### 2) Received via name

In [None]:
# Load name_id.csv as dataframe
name_id_df = pd.read_csv(r'data/name_id.csv', delimiter=',', encoding='cp1252',
                         dtype=str, header=None, names=['name', 'name_place_id'])

# Dissolve nested place_id (cleaning the dataframe that is merged below)
name_id_df['name_place_id'] = name_id_df['name_place_id'].str.replace("{'place_id': ", "", regex=False)
name_id_df['name_place_id'] = name_id_df['name_place_id'].str.replace("}", "", regex=False)

# Merge dataframes based on the shared 'name' column
restaurants_id_df = pd.merge(restaurants_fon_id_df, name_id_df, on='name')
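Note that `pd.merge` performs an inner join by default, so rows without a matching key in the second dataframe are dropped from the result. The toy example below, with made-up names and IDs, illustrates the difference to `how="left"`, which would keep every restaurant and leave the ID empty where no match exists.

```python
import pandas as pd

# Made-up sample data: restaurant B has no phone number
restaurants = pd.DataFrame({"name": ["A", "B", "C"], "fon": ["+491", "", "+493"]})
fon_ids = pd.DataFrame({"fon": ["+491", "+493"], "fon_place_id": ["id_a", "id_c"]})

inner = pd.merge(restaurants, fon_ids, on="fon")             # default how="inner"
left = pd.merge(restaurants, fon_ids, on="fon", how="left")  # keeps all restaurants

print(len(inner), len(left))
```

Whether the inner or the left join is the right choice depends on whether restaurants without any match should remain in the final table.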

The modified Restaurants dataframe with the Google 'place_id' looks like this:

In [None]:
display(restaurants_id_df)

The dataframe is exported as a CSV file and as a pickle file:

In [None]:
restaurants_id_df.to_csv(r'data/restaurants_with_google_id.csv', sep=';', index=False)
restaurants_id_df.to_pickle(r'data/restaurants_with_google_id.pkl')