# Data Collection: Google Places API

## Introduction

**Google Places API**  
To collect neighborhood commercial information, I will use the [Google Places API](https://developers.google.com/places/web-service/intro). The API offers several services; this project uses [Place Search](https://developers.google.com/places/web-service/search).

**Result limits**
- Results per query: 20
- Max results per location: 60
- If more than 20 results are needed, pass the value of `next_page_token` to the `pagetoken` parameter of a new search to retrieve the next set of results.
- [Reference](https://developers.google.com/places/web-service/search)
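
The paging rule above can be sketched as a small loop. This is a minimal sketch, not the method used later in this notebook: `collect_pages` and its `fetch_page` argument are hypothetical names, and `fetch_page` is assumed to be any callable that takes a page token (or `None` for the first page) and returns a response dictionary with a `results` list and an optional `next_page_token`.

```python
import time


def collect_pages(fetch_page, max_pages=3, wait_seconds=2.0):
    """Follow next_page_token until it runs out or max_pages is reached.

    fetch_page is a hypothetical callable: it takes a page token
    (None for the first page) and returns a dict shaped like a
    Place Search response.
    """
    results = []
    token = None
    for page_number in range(max_pages):
        if page_number > 0:
            # A fresh token needs a short delay before it becomes valid
            time.sleep(wait_seconds)
        response = fetch_page(token)
        results.extend(response.get('results', []))
        token = response.get('next_page_token')
        if not token:
            break
    return results
```

With the Python client, the callable could plausibly be `lambda token: client.places('restaurant near 10001', page_token=token)`, where the query string is only an illustration. Three pages of 20 results give the 60-result ceiling listed above.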

**Python Client for Google Maps Services**  
Google Places API has a [Python Client Library](https://github.com/googlemaps/google-maps-services-python), which I will use in this project. 
- [Installation instructions](https://github.com/googlemaps/google-maps-services-python#installation)
- [Library documentation](https://developers.google.com/places/web-service/client-library)
- [Example of using library](https://github.com/googlemaps/google-maps-services-python/blob/81640b0a76fb741f228996f260a05c6e4a2cb27c/googlemaps/test/test_places.py)
- [Method source code](https://github.com/googlemaps/google-maps-services-python/blob/81640b0a76fb741f228996f260a05c6e4a2cb27c/googlemaps/places.py#L101)

## Import Libraries

In [150]:
import pandas as pd
#import requests
#import datetime as dt
#import re
import time
from pathlib import Path

# Import Python Client Library for Google Maps Services
import googlemaps

## Fetch API key

- Create a Google Places API key [here](https://developers.google.com/places/web-service/intro).
- Save the API key in a text file named `google_api.txt`, inside a directory at `~/api_keys`.

In [154]:
with open(str(Path.home() / 'api_keys/google_api.txt')) as file:
    API_KEY = file.read().replace('\n', '')
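
A slightly more defensive way to read the key is sketched below. The helper name `read_api_key` is hypothetical and not part of this notebook; it strips all surrounding whitespace, not only newlines, and fails early if the file is empty.

```python
from pathlib import Path


def read_api_key(path):
    """Return the key stored in a text file, without surrounding whitespace."""
    key = Path(path).read_text().strip()
    if not key:
        # Fail early instead of sending requests with an empty key
        raise ValueError(f'No API key found in {path}')
    return key
```

It could be called as `read_api_key(Path.home() / 'api_keys/google_api.txt')`.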

## Instantiate client object

In [9]:
# Create a client that sends at most 2 queries per second
client = googlemaps.Client(key=API_KEY, queries_per_second=2)

## Create a list of keywords

I will use text search with a list of keywords and a list of ZIP codes to fetch businesses. The keyword list is not exhaustive; its purpose is to sample a few common types of businesses, and the selection is arbitrary.

In [136]:
LIST_OF_KEYWORDS = ['stores', 'restaurant', 'coffee shops']
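
To make the search inputs concrete, the sketch below builds one text query per keyword and ZIP code pair, in the `'<keyword> near <zipcode>'` form used by the request functions in this notebook. `build_queries` is a hypothetical helper, and the ZIP codes in the usage line are illustrative values only.

```python
from itertools import product


def build_queries(keywords, zipcodes):
    """Return one text query per (zipcode, keyword) pair, zipcode-major."""
    return [
        f'{keyword} near {zipcode}'
        for zipcode, keyword in product(zipcodes, keywords)
    ]


# Illustrative inputs: 2 keywords x 2 ZIP codes -> 4 queries
queries = build_queries(['stores', 'restaurant'], ['10001', '11201'])
```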

## Import ZCTA data

In [117]:
zcta = pd.read_csv('../data/zcta_nyc.csv')

In [141]:
list_of_zcta = list(zcta['zcta'].values)

## Define functions for requests

In [101]:
# Define a function to send multiple requests and fetch up to 60 results
# def search_place(keyword, zipcode):
#     results = []
    
#     def _get_next_results(next_page_token=None):
#         json_result = client.places(f'{keyword} near {zipcode}', page_token=next_page_token)
#         results.extend(json_result['results'])
#         return json_result.get('next_page_token')
    
#     next_page_token = _get_next_results()
#     while next_page_token:
#         # Must wait for the next_page_token to become valid
#         # See https://github.com/googlemaps/google-maps-services-python/issues/145
#         time.sleep(2)
#         next_page_token = _get_next_results(next_page_token=next_page_token)
    
#     df_businesses = pd.DataFrame(results)
#     df_businesses['searched_keyword'] = keyword
#     df_businesses['searched_zipcode'] = zipcode
#     return df_businesses

In [105]:
def search_place(keyword, zipcode):
    '''
    Takes one keyword and one zip code.
    Sends a single request.
    Returns a DataFrame of up to 20 results.
    '''
    json_result = client.places(f'{keyword} near {zipcode}')
    df_businesses = pd.DataFrame(json_result['results'])
    
    # Add columns searched_keyword and searched_zipcode to the dataframe
    df_businesses['searched_keyword'] = keyword
    df_businesses['searched_zipcode'] = zipcode
    return df_businesses

In [107]:
def search_all_places(list_of_keywords, list_of_zipcodes):
    '''
    Takes a list of keywords (n) and a list of zip codes (m).
    Sends n*m requests.
    Returns a DataFrame of up to n*m*20 results.
    '''
    # Collect one DataFrame per request, then concatenate once;
    # DataFrame.append was removed in pandas 2.0
    frames = []
    for zipcode in list_of_zipcodes:
        for keyword in list_of_keywords:
            frames.append(search_place(keyword, zipcode))
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True, sort=False)

## Fetch data

In [142]:
#data = search_all_places(LIST_OF_KEYWORDS, list_of_zcta)

In [144]:
# Check the shape of the data
data.shape

(9889, 16)
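
As a sanity check on this shape, the row count can be compared with the single-request limit of 20 results. The sketch below is plain arithmetic under that limit; both helper names are hypothetical, and the number of ZIP codes in the input file is not shown in this notebook, so only a lower bound is derived.

```python
import math


def max_rows(n_keywords, n_zipcodes, results_per_request=20):
    """Upper bound on rows when each request returns at most one page."""
    return n_keywords * n_zipcodes * results_per_request


def min_zipcodes_for_rows(n_rows, n_keywords, results_per_request=20):
    """Smallest number of ZIP codes that could yield n_rows rows."""
    return math.ceil(n_rows / (n_keywords * results_per_request))


# 9,889 rows from 3 keywords imply at least 165 ZIP codes were searched
lower_bound = min_zipcodes_for_rows(9889, 3)
```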

## Export data as .csv file

In [149]:
data.to_csv('../data/raw_google_data_nyc.csv')