# APIs in Python

## 1. Import Statements

In [None]:
import requests
import json
from datetime import datetime
import pandas as pd
import time
from IPython.display import Image, display

## 2. Get a Google Maps API Key

Some parts of this project require a **Google Maps API key**.

### Step 1: Create a Google Cloud Project
1. Go to:  
   https://console.cloud.google.com/
2. Sign in with your Google account.
3. Create a **new project** (or select an existing one).

### Step 2: Enable the Places API
1. In the Google Cloud Console, go to **APIs & Services → Library**.
2. Search for **Places API** (or Nearby Search API).
3. Click **Enable**.

### Step 3: Create an API Key
1. Go to **APIs & Services → Credentials**.
2. Click **Create Credentials → API Key**.
3. Copy your API key and keep it private.

> **Important:** Google may require billing information, but the free tier is sufficient for this lesson.

### Retrieving Existing API Key
1. Go to **APIs & Services → Credentials**.
2. Click **Show Key**.

## 3. Copy Your API Key into this string
If you plan to publish code that uses an API, do **not** include your API key in it.

In [None]:
API_KEY = ''
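
One way to keep the key out of published code is to read it from an environment variable instead of typing it into the notebook. The sketch below assumes the key was stored beforehand in a variable named `GOOGLE_MAPS_API_KEY`; that name and the `load_api_key` helper are illustrative choices, not something Google requires.

```python
import os

def load_api_key(env_var="GOOGLE_MAPS_API_KEY"):
    """Return the API key stored in an environment variable (hypothetical name)."""
    key = os.environ.get(env_var, "")
    if not key:
        # Fail early with a readable message instead of sending requests with an empty key
        raise RuntimeError(f"Set the {env_var} environment variable before running this notebook")
    return key
```

With the variable set, `API_KEY = load_api_key()` could replace the hard-coded string.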

## 4. Searching Restaurants Given a Location
The following code block gets data for restaurants given a location and radius.

Notes:
- The first lines of the code define the location as latitude and longitude coordinates, and the radius (in meters) around that point where our search will take place.
- Each request is capped at 20 restaurants. For the sake of this exercise, this is sufficient.
- The output is a JSON file. We will learn more about reading JSON files later in this notebook.
- The name of the output file includes the date and time. This prevents accidentally overwriting data, because every run of the code produces a file with a unique name. This is up to the user's preference and can be changed.

### Note
Do not run the code over and over again. There is a limit to the number of times you can use the API before being charged (about 6,000 requests per month).
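
One way to avoid spending requests while experimenting is to reuse data that was already saved to disk. The sketch below is only an illustration of that idea: `load_or_fetch` and `fake_fetch` are hypothetical helpers, and `fake_fetch` stands in for a real API call so the example runs without a key.

```python
import json
import os
import tempfile

def load_or_fetch(cache_path, fetch):
    """Return cached JSON data if the file exists; otherwise call fetch() and save its result."""
    if os.path.exists(cache_path):
        with open(cache_path, encoding="utf-8") as f:
            return json.load(f)
    data = fetch()  # the only line that would use the API
    with open(cache_path, "w", encoding="utf-8") as f:
        json.dump(data, f, indent=2)
    return data

def fake_fetch():
    # Stand-in for a real request; returns hand-written sample data
    return [{"name": "Example Diner", "rating": 4.2}]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "restaurants_cache.json")
    first = load_or_fetch(path, fake_fetch)   # calls fake_fetch and writes the file
    second = load_or_fetch(path, fake_fetch)  # reads the file; fake_fetch is not called
    print(first == second)
```

The trade-off is that cached data can go stale, so the timestamped file names used in this notebook remain the safer choice when fresh data matters.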

### Coordinates of Major Cities

| City               | Latitude  | Longitude   |
|-------------------|----------|------------|
| New York, NY       | 40.7128  | -74.0060   |
| Los Angeles, CA    | 34.0522  | -118.2437  |
| Chicago, IL        | 41.8781  | -87.6298   |
| Honolulu, HI       | 21.3069  | -157.8583  |
| Miami, FL          | 25.7617  | -80.1918   |
| Houston, TX        | 29.7604  | -95.3698   |
| Seattle, WA        | 47.6062  | -122.3321  |
| Boston, MA         | 42.3601  | -71.0589   |
| Denver, CO         | 39.7392  | -104.9903  |
| San Francisco, CA  | 37.7749  | -122.4194  |
| Indianapolis, IN   | 39.7684  | -86.1581   |
| Phoenix, AZ        | 33.4484  | -112.0740  |
| Des Moines, IA     | 41.5868  | -93.6250   |
| Columbus, OH       | 39.9612  | -82.9988   |
| Hartford, CT       | 41.7658  | -72.6734   |


In [None]:
# input() makes a text box where the user can type a value
latitude = input('Input your latitude coordinate (example 40.7644): ')
longitude = input('Input your longitude coordinate (example -73.9184): ')
LOCATION = f"{latitude.strip()},{longitude.strip()}"  # input() already returns strings
RADIUS = 2000  # in meters, feel free to change this too

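# Using the API
# Passing the query values through `params` lets requests build and encode the URL for us
url = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"
params = {"location": LOCATION, "radius": RADIUS, "type": "restaurant", "key": API_KEY}
response = requests.get(url, params=params)
response.raise_for_status()  # stop here if the request failed (e.g. a 4xx or 5xx error)
restaurants = response.json().get('results', [])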

# This part creates the file name with the time stamp. Feel free to change this if you would like.
# Important - make sure your filename ends in .json
timestamp = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
filename = f'restaurants_{timestamp}.json'
   
with open(filename, 'w', encoding='utf-8') as f:
    json.dump(restaurants, f, indent=2)
    
print(f"Saved {len(restaurants)} restaurants to {filename}")
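
Because `input()` returns whatever text was typed, a typo in a coordinate only shows up later as an empty result list. The sketch below shows one possible way to check the values before sending a request; `parse_coordinate` is a hypothetical helper and the sample values are the examples from the prompts above.

```python
def parse_coordinate(text, low, high):
    """Convert typed text to a float and check that it lies between low and high."""
    value = float(text.strip())  # raises ValueError for text such as 'abc'
    if not low <= value <= high:
        raise ValueError(f"{value} is outside the range {low} to {high}")
    return value

# Latitude runs from -90 to 90 and longitude from -180 to 180
sample_lat = parse_coordinate(" 40.7644 ", -90, 90)
sample_lng = parse_coordinate("-73.9184", -180, 180)
sample_location = f"{sample_lat},{sample_lng}"
print(sample_location)
```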

#### Check
Now let's check to see if `restaurants` has stored our data.
We should also check the folder this file is in to see if the file was created.

In [None]:
# Let's look at the type of our data, it should be a list
type(restaurants)

In [None]:
# Assuming it is a list, let's look at the first element
print(f"This should be a dictionary: {type(restaurants[0])}")
restaurants[0]

#### Result
The above code should produce something like this:

{'business_status': 'OPERATIONAL',<br>
 'geometry': {'location': {'lat': 40.7555621, 'lng': -73.9224658},<br>
  'viewport': {'northeast': {'lat': 40.7569783802915,<br>
    'lng': -73.9213070197085},<br>
...
}

You can also check that the other elements of `restaurants` follow a similar format.

We will go further into analyzing the data later in the lesson.

#### Reading Files
The output said that our data was saved to a .json file. Let's practice reading JSON files.

This cell should output a list of dictionaries similar to the one above.

In [None]:
file_name = "example_file.json" # You can replace this with your file name
with open(file_name) as f:
    data_from_json = json.load(f)
data_from_json[0]  # You can change this to look at different elements
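
Not every place in a saved file has every field, so indexing with `place["rating"]` can raise a `KeyError`. The sketch below writes a small hand-made sample (not real API output) to a temporary file, reads it back the same way as the cell above, and uses `.get()` to pull out a name and rating for each element.

```python
import json
import os
import tempfile

# Hypothetical sample in the same shape as the saved search results
sample = [
    {"name": "Example Diner", "rating": 4.2, "user_ratings_total": 310},
    {"name": "Sample Cafe"},  # some places have no rating yet
]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sample, f)
    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)

# .get() returns None instead of raising KeyError when a field is missing
summary = [(place.get("name"), place.get("rating")) for place in loaded]
print(summary)
```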

## 5. Getting Coordinates from a Zip Code
The above code depends on having the latitude and longitude coordinates for the center of your search. Often, we do not have those but have a zip code instead. We can get latitude and longitude coordinates from a zip code using Google's Geocoding API.

We first need to enable it using the following steps:
- Go to https://console.cloud.google.com/
- Make sure the correct project is selected (top bar).
- Navigate to APIs & Services → Library
- Search for Geocoding API
- Click Enable

In [None]:
ZIP_CODE = "10001" # You can change this

url = f"https://maps.googleapis.com/maps/api/geocode/json?address={ZIP_CODE}&key={API_KEY}"
response = requests.get(url)
data = response.json()
if data['results']:
    location = data['results'][0]['geometry']['location']
    lat = location['lat']
    lng = location['lng']
    print(f"Coordinates for {ZIP_CODE}: {round(lat,3)}, {round(lng,3)}")
else:
    print("No results found")
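
The cell above only checks whether `results` is empty. Geocoding responses also carry a `status` field (for example `"OK"` or `"ZERO_RESULTS"`), which can help tell an unknown zip code apart from a problem with the key. The sketch below is one possible way to use it; `extract_coordinates` is a hypothetical helper and the two responses are hand-written samples, not real API output.

```python
def extract_coordinates(data):
    """Return (lat, lng) from a geocoding response dict, or None if there is no usable result."""
    if data.get("status") != "OK" or not data.get("results"):
        # error_message is only present on some failures, so fall back to the status
        print("Geocoding failed:", data.get("error_message", data.get("status")))
        return None
    location = data["results"][0]["geometry"]["location"]
    return location["lat"], location["lng"]

# Hand-written sample responses
ok_response = {
    "status": "OK",
    "results": [{"geometry": {"location": {"lat": 40.75, "lng": -73.99}}}],
}
empty_response = {"status": "ZERO_RESULTS", "results": []}

print(extract_coordinates(ok_response))
print(extract_coordinates(empty_response))
```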

## 6. Getting Data from a Specific Address
Suppose we wanted to get rating data for a policy we plan to write or have already written. We would probably already have the address and name of the business. Let's get data from the API given a list of specific locations.

#### Note
When looking up restaurants and other businesses, the API works better if you put the name of the business in the search rather than the building number. This is because the API is designed to search for prominent locations which are primarily identified by their unique names rather than just raw address data.

In [None]:
# Addresses that we will be testing. Feel free to add and/or remove items
addresses = [
    "Gino’s Pizzeria and Restaurant Astoria, NY 11103",
    "Applebee's Grill + Bar 35th Avenue, Astoria, NY 11101",
    "DiWine Natural Wine Bar & Restaurant, Astoria, NY 11103",
    "Spyce Astoria, Astoria, NY 11103",
    "German Doner Kebab Steinway St, Astoria, NY 11103",
    "Chuck E. Cheese 48th Street, Long Island City"
]
MAX_PHOTOS = 3  # cap number of images per location

def get_place_data(address):
    '''
    Searches for a place from an address and returns the JSON response.
    The place id is found in the "candidates" list of that response. The address must be a string.
    '''
    url = "https://maps.googleapis.com/maps/api/place/findplacefromtext/json"
    params = {
        "input": address,
        "inputtype": "textquery",
        "fields": "place_id",
        "key": API_KEY
    }
    response = requests.get(url, params=params)
    return response.json()

def get_place_details(place_id):
    '''
    Returns detailed place information given a place_id.
    Includes fields useful for insurance pricing and risk assessment.
    '''
    url = "https://maps.googleapis.com/maps/api/place/details/json"
    params = {
        "place_id": place_id,
        "fields": (
            "name,"
            "formatted_address,"
            "types,"
            "price_level,"
            "rating,"
            "user_ratings_total,"
            "business_status,"
            "reviews,"
            "photos,"
            "website,"
            "international_phone_number,"
            "opening_hours,"
            "geometry,"
            "vicinity,"
            "plus_code"
        ),
        "key": API_KEY
    }
    response = requests.get(url, params=params)
    return response.json().get("result", {})


def build_photo_urls(photos, maxwidth=400, max_photos=MAX_PHOTOS):
    '''
    Convert photo references into usable image URLs
    '''
    urls = []
    for p in photos[:max_photos]:  # cap applied here
        ref = p.get("photo_reference")
        if ref:
            urls.append(
                "https://maps.googleapis.com/maps/api/place/photo"
                f"?maxwidth={maxwidth}&photo_reference={ref}&key={API_KEY}"
            )
    return urls

results = []

for addr in addresses:
    print(f"Searching: {addr}")

    data = get_place_data(addr)
    candidates = data.get("candidates", [])

    if candidates:
        place_id = candidates[0]["place_id"]
        details = get_place_details(place_id)

        photos = details.get("photos", [])
        details["photo_urls"] = build_photo_urls(photos)

        results.append(details)
    else:
        print("No match found.")

    time.sleep(0.2)  # Respect rate limits

timestamp = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
filename = f"restaurants_by_address_{timestamp}.json"

with open(filename, "w", encoding="utf-8") as f:
    json.dump(results, f, indent=2)

print(f"Saved {len(results)} restaurants to {filename}")


### Check
Let's see if our data pull worked as intended

In [None]:
# This is the data from the first item in the list
results[0]

In [None]:
# Here are the fields that item has
list(results[0].keys())

In [None]:
# Let's take a closer look at the images using IPython's Image and display functions
def display_photo(url):
    response = requests.get(url)
    display(Image(data=response.content))

# Assuming you pulled data for n restaurants, make this a number 0 to n-1
restaurant_number = 1
print(results[restaurant_number]['name'])
for photo_url in results[restaurant_number]['photo_urls']:
    display_photo(photo_url)

In [None]:
# Let's look more into reviews
print(len(results[0]['reviews']))
results[0]['reviews'][0]

In [None]:
# Cleaning up the formatting
restaurant_number = 3
print(results[restaurant_number]['name'])
print()
for review in results[restaurant_number]['reviews']:
    print(review['author_name'])
    print(f"{review['rating']}/5")
    print(review['text'])
    print('\n')

## 7. Cleaning and Interpreting Data

We have now been over multiple ways of pulling data from the API.

We will now go over methods to make our data easier to understand and use.

In [None]:
# Pick one of the json files you made in the code above
# You can also use the pre-made example file
file_name = 'example_file_detailed.json'
with open(file_name) as f:
    data_from_json = json.load(f)
data_from_json[0]

In [None]:
# Let's check to make sure our data types are all right
print(f"This should be a list: {type(data_from_json)}")
print(f"This should be a dict: {type(data_from_json[0])}")

## 8. Cleaning Data with Pandas

In [None]:
# We can convert our data from the list of dictionaries to a dataframe
pd.DataFrame(data=data_from_json)

In [None]:
# We will save the dataframe as df
df = pd.DataFrame(data=data_from_json)

In [None]:
df.columns

In [None]:
# Let's move the 'name' column to be the first in the list
columns_without_name = [c for c in df.columns if c != 'name']  # all columns besides name, in the same order
reordered_columns = ['name'] + columns_without_name  # name and then all other columns
df = df[reordered_columns]
df.head()

### Flattening Dictionaries
Some of our columns have data stored as dictionaries. Some of these dictionaries have dictionaries inside of them. Let's separate the data by column and key to make the data more readable.

In [None]:
def flatten_dict(d, parent_key=""):
    """
    Convert a nested dictionary into a single-level dictionary.

    This function takes a dictionary that may contain other dictionaries
    as values and "flattens" it so that there are no nested dictionaries.
    The keys from inner dictionaries are combined with their parent keys
    using underscores.

    Example:
        Input:
            {"geometry": {"location": {"lat": 40.7,"lng": -73.9}}}

        Output:
            {
                "geometry_location_lat": 40.7,
                "geometry_location_lng": -73.9
            }

    Parameters:
        d (dict):
            The dictionary to flatten. It may contain nested dictionaries.
        parent_key (str):
            Used internally during recursion to build up the full key name.
            When calling the function yourself, you should not set this.
    """
    items = {}

    # Loop through each key-value pair in the dictionary
    for k, v in d.items():
        # Create a new key name.
        # If we are inside a nested dictionary, prepend the parent key.
        # Otherwise, just use the current key.
        new_key = f"{parent_key}_{k}" if parent_key else k

        # If the value is another dictionary, flatten it recursively
        if isinstance(v, dict):
            # Flatten the inner dictionary and add it to the items
            items.update(flatten_dict(v, new_key))
        else:
            # If the value is not a dictionary, store it directly
            items[new_key] = v
    return items


# Columns where at least one value is a dictionary
dict_cols = [col for col in df.columns if df[col].apply(lambda x: isinstance(x, dict)).any()]

df_expanded = df.copy()

for col in dict_cols:
    flattened = df[col].apply(lambda x: flatten_dict(x) if isinstance(x, dict) else {})
    new_cols = pd.json_normalize(flattened.tolist()).add_prefix(f"{col}_")
    df_expanded = pd.concat([df_expanded, new_cols], axis=1)

# Remove the original columns with dictionaries in them
df = df_expanded.drop(columns=dict_cols)
df.head()
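
pandas can also flatten nested dictionaries directly when building a dataframe from a list of records. The sketch below uses `pd.json_normalize` with `sep="_"` on two hand-made records (not real API output) and yields the same style of column names as `flatten_dict`.

```python
import pandas as pd

# Hypothetical records with the same nesting as the `geometry` field
records = [
    {"name": "Example Diner", "geometry": {"location": {"lat": 40.7, "lng": -73.9}}},
    {"name": "Sample Cafe", "geometry": {"location": {"lat": 40.8, "lng": -73.8}}},
]

# sep="_" joins parent and child keys with underscores instead of the default "."
flat = pd.json_normalize(records, sep="_")
print(list(flat.columns))
```

Writing the function by hand, as above, makes the recursion visible; `json_normalize` is the shorter option once the idea is clear.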

### Unique Values
Columns that only have 1 unique value are most likely not useful. Let's write code to see the columns with only 1 unique value.

In [None]:
# Let's see the number of unique values per column
for col in df.columns:
    print()
    uniques = set(df[col].astype(str))  # convert to strings so lists can be compared
    print(f"{col}: {len(uniques)}")
    if len(uniques) == 1:
        print(f"UNIQUE VALUE: {uniques}")
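
The same check can be written without a loop. The sketch below, on a small hand-made dataframe, converts every column to strings and then uses `nunique()` to list the columns that hold a single value; the column names are only examples.

```python
import pandas as pd

sample = pd.DataFrame({
    "name": ["Example Diner", "Sample Cafe", "Test Grill"],
    "business_status": ["OPERATIONAL", "OPERATIONAL", "OPERATIONAL"],
    "types": [["bar", "restaurant"], ["cafe"], ["restaurant"]],
})

# Converting to strings first lets nunique() handle columns that hold lists
unique_counts = sample.astype(str).nunique()
constant_columns = unique_counts[unique_counts == 1].index.tolist()
print(constant_columns)
```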

In [None]:
df['opening_hours_open_now']

### Drop Unnecessary Columns

In [None]:
# Dropping unnecessary columns
df = df.drop(columns=['opening_hours_open_now', 'geometry_viewport_northeast_lat',
                      'geometry_viewport_northeast_lng', 'geometry_viewport_southwest_lat',
                      'geometry_viewport_southwest_lng', 'plus_code_compound_code',
                      'plus_code_global_code'])
# Feel free to add or change this
df.head()

### Making Binary Variables for `types`
`types` is a column in our dataset containing a list of descriptors for the establishment.

For each descriptor, we can make a binary variable equal to `1` if the establishment contains the descriptor and `0` otherwise.

In [None]:
# First, let's find all unique values within the types
unique_types = set()
for sublist in df['types']:
    unique_types = unique_types | set(sublist) # This is the union of 2 sets
unique_types

In [None]:
for restaurant_type in unique_types:
    # 1 if the restaurant has this descriptor, 0 otherwise
    df[restaurant_type] = [int(restaurant_type in type_list) for type_list in df['types']]
df.head()
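
pandas offers a shortcut for this kind of binary encoding. The sketch below, on hand-made sample data, joins each list into one string and lets `str.get_dummies` split it back into 0/1 columns. It assumes no descriptor contains the `|` character used as the separator.

```python
import pandas as pd

sample = pd.DataFrame({
    "name": ["Example Diner", "Sample Cafe"],
    "types": [["bar", "restaurant"], ["cafe", "restaurant"]],
})

# "bar|restaurant" -> one column per descriptor, 1 where it is present
type_dummies = sample["types"].str.join("|").str.get_dummies(sep="|")
sample = pd.concat([sample, type_dummies], axis=1)
print(list(type_dummies.columns))
```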

### Making Binary Columns Based on Reviews
The presence of certain words in a review can be a risk indicator.

If a review has the words `"rats"`, `"mold"`, or `"dirty"`, it can be a sign that the premises are dirty.

The API returns at most 5 reviews per place. If a keyword is present in multiple reviews, we set the indicator to `1`.

We can use the review rating to identify negative reviews and make our search more refined.

In [None]:
# Keywords
cleanliness_keywords = ['rats','roaches','mold','dirty','dirt','filthy','bugs','flies','odor','grease','grime','unsanitary','sticky','stains']
food_safety_keywords = ['food poisoning','sick','vomiting','diarrhea','undercooked','raw','spoiled','expired','bad food','stale','nauseous','ache','stomach']
safety_security_keywords = ['unsafe','dangerous','fight','fighting','assault','aggressive','threatening','stolen','theft','robbery','security','police','crime']
slip_fall_keywords = ['slipped and fell','slip and fall','fell on the floor','wet floor','no wet floor sign','slippery floor','uneven floor','broken tile','loose tile','stairs without railing','no handrail']

# These will be populated with 1s and 0s as we loop through each restaurant
cleanliness_binary = []
food_safety_binary = []
safety_binary = []
slip_fall_binary = []

review_number_threshold = 2 # 2 or more flagged reviews make the binary 1
maximum_review_rating = 3 # If the rating is above a 3, the review is most likely not negative

def count_flagged_reviews(review_list, keywords):
    '''
    Counts the reviews rated at or below maximum_review_rating that contain at least one keyword.
    '''
    flagged = 0
    for review in review_list:
        text = review['text'].lower()
        if review['rating'] <= maximum_review_rating and any(k in text for k in keywords):
            flagged += 1
    return flagged

# Each restaurant has a list of reviews
for review_list in df['reviews']:
    # A restaurant without reviews has a missing value instead of a list
    if not isinstance(review_list, list):
        review_list = []

    # How many reviews were flagged for each type of flag
    clean_flag = count_flagged_reviews(review_list, cleanliness_keywords)
    food_safety_flag = count_flagged_reviews(review_list, food_safety_keywords)
    safe_flag = count_flagged_reviews(review_list, safety_security_keywords)
    slip_flag = count_flagged_reviews(review_list, slip_fall_keywords)

    # If the threshold is reached, append 1, otherwise append 0
    cleanliness_binary.append(int(clean_flag >= review_number_threshold))
    food_safety_binary.append(int(food_safety_flag >= review_number_threshold))
    safety_binary.append(int(safe_flag >= review_number_threshold))
    slip_fall_binary.append(int(slip_flag >= review_number_threshold))

# Put the binary lists in the dataframe
df['Unclean_Flag'] = cleanliness_binary
df['Food_Safety_Flag'] = food_safety_binary
df['Crime_Flag'] = safety_binary
df['Slip_Flag'] = slip_fall_binary
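
The keyword checks above use substring matching (`in`), so a short keyword such as `raw` also matches inside `strawberry`. If that produces too many false flags, one possible refinement is to match whole words with a regular expression. The sketch below is an illustration only; `build_keyword_pattern` is a hypothetical helper and the sentences are made up.

```python
import re

def build_keyword_pattern(keywords):
    """Compile one pattern that matches any keyword as a whole word or phrase."""
    escaped = "|".join(re.escape(k) for k in keywords)
    # \b marks a word boundary, so 'raw' no longer matches inside 'strawberry'
    return re.compile(rf"\b(?:{escaped})\b", flags=re.IGNORECASE)

pattern = build_keyword_pattern(["raw", "ache", "food poisoning"])
inside_word = bool(pattern.search("The strawberry cheesecake was great"))
whole_word = bool(pattern.search("My chicken was RAW in the middle"))
print(inside_word, whole_word)
```

The trade-off is that whole-word matching is stricter: a keyword like `rat` would no longer match `rats`, so plural and other forms need to be listed explicitly.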

In [None]:
df

In [None]:
# Writing .xlsx files requires an Excel writer package such as openpyxl to be installed
df.to_excel('output.xlsx', index=False)