
## Google Maps Platform - Places API Exploration

This notebook explores the usage of the Google Maps Platform - Places API, using the retrieval of restaurant reviews as an example.

**Google Maps Platform Places API Overview** - https://developers.google.com/maps/documentation/places/web-service/overview

Version:
- Revised by Shi Yingfei shi_yingfei@u.nus.edu in January 2024.
- Revised by Wang Qiuhong in August 2024.

### Import Packages Needed

In [1]:
import requests
import json
import time
import pandas as pd

In [2]:
from google.colab import drive
drive.mount('drive')

Mounted at drive


In [3]:
# List of MRT stations for test purpose
test_mrt_stations = ["Toa Payoh", "Clementi", "Ang Mo Kio"]

In [4]:
# Test search queries for restaurants near the MRT stations
res_search_query = "restaurants in {}"
test_search_queries = [res_search_query.format(mrt) for mrt in test_mrt_stations]

print(test_search_queries)

['restaurants in Toa Payoh', 'restaurants in Clementi', 'restaurants in Ang Mo Kio']


### Generate API key

To create a key, refer to the **How to use the Places API** section: https://developers.google.com/maps/documentation/places/web-service/overview#how-to-use-the-places-api

In [5]:
API_KEY = 'YOUR_API_KEY'  # Replace with your own API key; never commit a real key to a shared notebook
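
A key pasted into a notebook cell is visible to anyone who can open the notebook. The following is a minimal sketch of one alternative, assuming the key has been stored in an environment variable; the variable name `GOOGLE_MAPS_API_KEY` and the helper `load_api_key` are illustrative choices, not part of the Places API.

```python
import os


def load_api_key(env_var="GOOGLE_MAPS_API_KEY"):
    """Return the API key stored in an environment variable.

    The variable name is an assumption for this sketch; use whatever
    name you set in your own environment.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key
```

With the variable set, the cell above could become `API_KEY = load_api_key()`.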

### Text Search

**Text Search documentation** - https://developers.google.com/maps/documentation/places/web-service/search-text

In [6]:
def call_text_search_api(query, next_page_token=None, api_key=None):
    """Makes an API call to the Google Places API.

    Args:
        query (str): The search query to pass to the Google Places API.
        next_page_token (str): Token for the next page of results, if available.
        api_key (str): Your Google Places API key.

    Returns:
        A tuple containing a list of place IDs from the API and the next page token, if available.
    """
    # Pass the parameters via `params` so that requests URL-encodes the query text
    url = "https://maps.googleapis.com/maps/api/place/textsearch/json"
    params = {"query": query, "key": api_key}

    # If a next_page_token is provided, send it as the pagetoken parameter
    if next_page_token:
        params["pagetoken"] = next_page_token

    # Make the API request; the timeout keeps a stalled connection from hanging the notebook
    response = requests.get(url, params=params, timeout=30)

    # Check if the API request was successful (status code 200)
    if response.status_code == 200:
        # Parse the JSON response
        data = response.json()

        # Extract place IDs from the results (empty list if "results" is absent)
        place_ids = [item["place_id"] for item in data.get("results", [])]

        # Extract the next page token, if available
        next_token = data.get("next_page_token", None)

        # Return the place IDs and next page token as a tuple
        return place_ids, next_token
    else:
        # Raise an exception if the API call fails
        raise Exception(f"Failed to make API call. Status code: {response.status_code}")
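
An HTTP status of 200 does not by itself mean that the search succeeded: the Places web service also reports a `status` field in the JSON body (for example `OK`, `ZERO_RESULTS`, `REQUEST_DENIED`, `INVALID_REQUEST`). The sketch below shows one way to check it; `PlacesAPIError` and `check_places_status` are hypothetical names introduced here, and treating `ZERO_RESULTS` as a non-error is a design assumption.

```python
class PlacesAPIError(Exception):
    """Raised when the JSON body reports a non-success status."""


def check_places_status(data):
    """Return the parsed body unchanged if its status indicates success."""
    status = data.get("status")
    if status in ("OK", "ZERO_RESULTS"):
        return data
    message = data.get("error_message", "no error message provided")
    raise PlacesAPIError(f"Places API returned {status}: {message}")
```

Inside `call_text_search_api`, this check could be applied to `data` before the place IDs are read, so that a denied or malformed request is not mistaken for an empty result.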

In [7]:
# Retrieve all place IDs based on the test search queries
all_place_ids = []

# Iterate through each search query
for query in test_search_queries:
    next_token = None

    # Continue fetching results until there is no next page token
    while True:
        # Make Text Search API call to get place IDs and the next page token
        place_ids, next_token = call_text_search_api(query, next_token, API_KEY)

        # Extend the list of place IDs with the retrieved ones
        all_place_ids.extend(place_ids)


        # If no next page token, break the loop
        if not next_token:
            break

        # Introduce a delay of 2 seconds between API calls (Google API's multi-page pacing requirements)
        time.sleep(2)

# Remove duplicates from the list of all place IDs
unique_place_ids = list(set(all_place_ids))

# Print several unique place IDs and total count
print(unique_place_ids[:10])
print(f"Number of unique place IDs: {len(unique_place_ids)}")

['ChIJz86GbZka2jERAFWm-I-74wU', 'ChIJx5tHAHwX2jERZuVzoos-cGQ', 'ChIJPRf_WwAb2jERa-Q8bxSZzJk', 'ChIJ410Ar3oX2jERATqJiwgW658', 'ChIJef0GcZUX2jERAArkEu6jCiU', 'ChIJl--U8-QX2jERgySBbJ44Zq0', 'ChIJ9y7Rbo4a2jERLC1BGHJVV_w', 'ChIJ4aryd_oX2jER8wW0zUiUhPk', 'ChIJa9-WMuYW2jERpX9iQ7SEmws', 'ChIJYZ3qqZ8b2jER456-75ycqgw']
Number of unique place IDs: 180
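
The pagination loop above can also be packaged as a reusable function. This is a sketch rather than the notebook's method: `collect_place_ids` is a hypothetical helper, and it takes the page-fetching function as an argument so the loop can be exercised without network access. It de-duplicates with a dictionary instead of `set`, which keeps the IDs in the order they were first seen.

```python
import time


def collect_place_ids(queries, fetch_page, delay_seconds=2.0):
    """Collect unique place IDs across queries, following page tokens.

    fetch_page(query, token) must return (place_ids, next_token), the same
    shape that call_text_search_api returns.
    """
    seen = {}  # dict keys keep insertion order, unlike a set
    for query in queries:
        token = None
        while True:
            place_ids, token = fetch_page(query, token)
            seen.update(dict.fromkeys(place_ids))
            if not token:
                break
            # Pause before requesting the next page
            time.sleep(delay_seconds)
    return list(seen)
```

A possible call would be `collect_place_ids(test_search_queries, lambda q, t: call_text_search_api(q, t, API_KEY))`.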


### Place Details

**Place Details documentation** - https://developers.google.com/maps/documentation/places/web-service/details

In [8]:
def call_place_api(url):
    """
    Makes an API call to the Google Places API.

    Args:
        url (str): The URL of the API endpoint.

    Returns:
        dict: A dictionary containing the JSON response from the API.
    """
    try:
        # Make a GET request to the specified URL (with a timeout so it cannot hang)
        response = requests.get(url, timeout=30)

        # Check if the response status code is 200 (OK)
        if response.status_code == 200:
            # Parse and return the JSON content of the response
            return response.json()
        else:
            # Raise an exception if the API call fails
            raise Exception(f"Failed to make API call. Status code: {response.status_code}")

    except Exception as e:
        # Re-raise with context, chaining the original exception for debugging
        raise Exception(f"An error occurred during the API call: {str(e)}") from e
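
Place Details requests accept a `fields` parameter that restricts the response to the listed fields. The sketch below builds such a URL with the standard library so that every value is URL-encoded; `build_details_url` and `DETAILS_ENDPOINT` are hypothetical names, and the default field list is only an assumption about what this notebook needs.

```python
from urllib.parse import urlencode

DETAILS_ENDPOINT = "https://maps.googleapis.com/maps/api/place/details/json"


def build_details_url(place_id, api_key, fields=("name", "rating", "reviews")):
    """Build a Place Details URL that requests only the listed fields."""
    params = {
        "place_id": place_id,
        "fields": ",".join(fields),
        "key": api_key,
    }
    # urlencode percent-encodes each value, including the commas in `fields`
    return f"{DETAILS_ENDPOINT}?{urlencode(params)}"
```

The resulting string can be passed directly to `call_place_api`.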

In [9]:
def get_reviews(place_ids, api_key):
    """Retrieve reviews for a list of place IDs using the Google Places API.

    Args:
        place_ids (list): A list of place IDs.
        api_key (str): Your Google API key.

    Returns:
        dict: A dictionary where keys are place IDs and values are lists of reviews for each place.
    """

    all_reviews = {}

    for place_id in place_ids:
        # Construct the URL for Place Details API call
        url = f"https://maps.googleapis.com/maps/api/place/details/json?place_id={place_id}&key={api_key}"

        # Make the API call to get place details
        response = call_place_api(url)

        # Check if the response contains reviews
        if "result" in response and "reviews" in response["result"]:
            # Extract reviews and store them in the dictionary
            reviews = response["result"]["reviews"]
            all_reviews[place_id] = [{"rating": review["rating"], "text": review.get("text", "")} for review in reviews]
        else:
            # If no reviews are found, store an empty list
            all_reviews[place_id] = []

    return all_reviews
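
As written, `get_reviews` stops at the first place ID whose request raises an exception, discarding everything collected so far. One possible variant, sketched here under the assumption that partial results are acceptable, records failures and carries on. `get_reviews_tolerant` and its `fetch_details` argument are hypothetical; `fetch_details(place_id)` stands in for whatever function returns the parsed Place Details response.

```python
def get_reviews_tolerant(place_ids, fetch_details):
    """Collect reviews per place ID, recording failures instead of aborting.

    fetch_details(place_id) is expected to return the parsed Place Details
    response (a dict) or raise an exception.
    """
    reviews_by_place = {}
    failed = {}
    for place_id in place_ids:
        try:
            response = fetch_details(place_id)
        except Exception as exc:  # broad on purpose: any failure is recorded
            failed[place_id] = str(exc)
            continue
        reviews = response.get("result", {}).get("reviews", [])
        reviews_by_place[place_id] = [
            {"rating": r.get("rating"), "text": r.get("text", "")} for r in reviews
        ]
    return reviews_by_place, failed
```

The second return value lists the place IDs that can be retried later.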

In [10]:
# Get reviews for all unique place IDs
reviews_dict = get_reviews(unique_place_ids, API_KEY)

In [11]:
# Check the list of reviews of one place_id
for place_id, reviews in reviews_dict.items():
    print(f"Reviews for Place ID: {place_id}")
    for review in reviews:
        print(review, '\n')

    break

Reviews for Place ID: ChIJz86GbZka2jERAFWm-I-74wU
{'rating': 5, 'text': 'Very quaint and cosy place hidden in the neighbourhood. Decoration of the restaurant gives a very warm and welcoming vibe. Staff’s service is good too, frequently checking on our needs and quality of food. Food serving size is big for the prices. Do pre-order Mummy’s Curry Chicken in case out of stock! Honey Mustard Wings is a must order. Crispy Fish & Fries is very nicely done and not oily. A good place for family and friends.'}

{'rating': 4, 'text': 'Quaint and charming western restaurant located in the quiet Faber neighbourhood.\n\nFood was unpretentious and earnestly prepared! Fish and chips was really solid; pumpkin soup was tasty and the kids meal and a bit of everything. The steak was a tad unevenly and under cooked, which was a waste. But we will certainly be back to try the rest of the menu.\n\nLoved the kids corner which had plenty of toys and books to entertain the little ones.\n\nPrice is slightly o

### Save Data

In [19]:
# Save reviews_dict as a JSON file
with open('/content/drive/MyDrive/temp/test_reviews.json', 'w') as file:
    json.dump(reviews_dict, file)
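
The `open(..., 'w')` call above fails if the `temp` folder does not exist yet, and `json.dump` escapes non-ASCII characters such as `’` by default. The helper below is a sketch of one way to handle both; `save_json` is a hypothetical name, and the choice of UTF-8 with `indent=2` is an assumption about how the file will be read later.

```python
import json
from pathlib import Path


def save_json(obj, path):
    """Write obj as UTF-8 JSON, creating parent folders if they are missing."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as file:
        # ensure_ascii=False keeps characters such as ’ readable in the file
        json.dump(obj, file, ensure_ascii=False, indent=2)
    return path
```

For example, `save_json(reviews_dict, '/content/drive/MyDrive/temp/test_reviews.json')`.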

In [36]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [21]:
# Load the JSON file
with open('/content/drive/MyDrive/temp/test_reviews.json', 'r') as file:
    data = json.load(file)

In [22]:
# Flatten reviews data created from a nested dictionary (data) to a list of dictionaries
flattened_data = []

for place_id, reviews in data.items():
    for review in reviews:
        # Add the place_id to each review
        review['place_id'] = place_id
        flattened_data.append(review)

# Create a Pandas DataFrame from the flattened data
scraped_data = pd.DataFrame(flattened_data)

# Check the resulting DataFrame
print(scraped_data.shape)
scraped_data.head()

(895, 3)


| | rating | text | place_id |
|---|---|---|---|
| 0 | 5 | Very quaint and cosy place hidden in the neigh... | ChIJz86GbZka2jERAFWm-I-74wU |
| 1 | 4 | Quaint and charming western restaurant located... | ChIJz86GbZka2jERAFWm-I-74wU |
| 2 | 4 | Hidden inside Faber Garden. Easy to find parki... | ChIJz86GbZka2jERAFWm-I-74wU |
| 3 | 4 | Had grandma curry chicken paired w fluffy brea... | ChIJz86GbZka2jERAFWm-I-74wU |
| 4 | 5 | Love this place and have been a patron for the... | ChIJz86GbZka2jERAFWm-I-74wU |
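
The flattening loop above adds `place_id` to the review dictionaries in place, so the loaded `data` object is modified as a side effect. A non-mutating variant might look like the sketch below; `flatten_reviews` is a hypothetical helper name.

```python
def flatten_reviews(reviews_by_place):
    """Return one row per review, adding place_id without mutating the input."""
    return [
        {**review, "place_id": place_id}  # new dict; the original review is untouched
        for place_id, reviews in reviews_by_place.items()
        for review in reviews
    ]
```

The result can be passed to `pd.DataFrame(...)` in the same way as `flattened_data`.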


In [39]:
# Store the scraped data into a csv file
scraped_data.to_csv("/content/drive/MyDrive/temp/test_scraped_review.csv", index = False)

In [24]:
# Check the scraped review of one place id
### [Note] Based on the Place Details documentation, the reviews contain a JSON array of up to five reviews only.
# Place - reviews:
# A JSON array of up to five reviews. By default, the reviews are sorted in order of relevance. Use the reviews_sort
# request parameter to control sorting.
# For most_relevant (default), reviews are sorted by relevance; the service will bias the results to return reviews
# originally written in the preferred language.
# For newest, reviews are sorted in chronological order; the preferred language does not affect the sort order.
# Google recommends indicating to users whether results are ordered by most_relevant or newest.

scraped_data.loc[scraped_data.place_id == unique_place_ids[0]]

| | rating | text | place_id |
|---|---|---|---|
| 0 | 5 | Very quaint and cosy place hidden in the neigh... | ChIJz86GbZka2jERAFWm-I-74wU |
| 1 | 4 | Quaint and charming western restaurant located... | ChIJz86GbZka2jERAFWm-I-74wU |
| 2 | 4 | Hidden inside Faber Garden. Easy to find parki... | ChIJz86GbZka2jERAFWm-I-74wU |
| 3 | 4 | Had grandma curry chicken paired w fluffy brea... | ChIJz86GbZka2jERAFWm-I-74wU |
| 4 | 5 | Love this place and have been a patron for the... | ChIJz86GbZka2jERAFWm-I-74wU |


# Questions for you
1. Could you extract the “name” for each place ID and include it in the JSON file?

2. What are the limitations of the Google Maps Places API for review retrieval? Is there any legitimate way to work around them?

## How to submit?

1. Download the example source code from the files folder on Canvas.
2. For programming tasks, revise the original scripts or insert your own to complete the task, using markdown to highlight the changes you made or the scripts you inserted. Show and keep your output as evidence.
3. For short-answer questions, provide your answer at the end of the example notebook file.
4. If you want to contribute something else, describe it and justify your contribution at the end of the example notebook file.
5. Submit your source code via the corresponding assignment on Canvas.


# Your answer

1. Yes, the “name” of each place ID can be extracted and included in the JSON file. The code below adds the name to both the JSON file and the CSV file.

In [57]:
def get_name(place_ids, api_key):
    """Retrieve name for a list of place IDs using the Google Places API.

    Args:
        place_ids (list): A list of place IDs.
        api_key (str): Your Google API key.

    Returns:
        dict: A dictionary where keys are place IDs and values are the place names ("NA" if no name is returned).
    """

    all_names = {}

    for place_id in place_ids:
        # Construct the URL for Place Details API call
        url = f"https://maps.googleapis.com/maps/api/place/details/json?place_id={place_id}&key={api_key}"

        # Make the API call to get place details
        response = call_place_api(url)

        # Check if the response contains a name
        if "result" in response and "name" in response["result"]:
            # Extract the name and store it in the dictionary
            name = response["result"]["name"]
            all_names[place_id] = name
        else:
            # If no name is found, store the placeholder "NA"
            all_names[place_id] = "NA"

    return all_names

In [58]:
names_dict = get_name(unique_place_ids, API_KEY)
df = scraped_data.copy()  # Work on a copy so that scraped_data itself is left unchanged

In [59]:


# Look up each row's place name by its place_id; IDs without a name become "NA"
df["name"] = df["place_id"].map(names_dict).fillna("NA")

# Store the DataFrame into a CSV file
df.to_csv("/content/drive/MyDrive/temp/test_scraped_review.csv", index=False)


In [60]:
def get_reviews_with_names(place_ids, api_key):
    """Retrieve reviews and names for a list of place IDs using the Google Places API.

    Args:
        place_ids (list): A list of place IDs.
        api_key (str): Your Google API key.

    Returns:
        list: A list of dictionaries containing place_id, name, and reviews.
    """
    all_data = {}
    all_reviews = {}

    for place_id in place_ids:
        # Construct the URL for Place Details API call
        url = f"https://maps.googleapis.com/maps/api/place/details/json?place_id={place_id}&key={api_key}"

        # Make the API call to get place details
        response = call_place_api(url)

        # Check if the response contains the place name
        if "result" in response and "name" in response["result"]:
            # Extract name and store it in the dictionary
            name = response["result"]["name"]
            all_data[place_id] = name
        else:
            # If no name is found, store an empty string
            all_data[place_id] = ""

        # Check if the response contains reviews
        if "result" in response and "reviews" in response["result"]:
            # Extract reviews and store them in the dictionary
            reviews = response["result"]["reviews"]
            all_reviews[place_id] = [{"rating": review["rating"], "text": review.get("text", "")} for review in reviews]
        else:
            # If no reviews are found, store an empty list
            all_reviews[place_id] = []

    # Perform a "left join" between all_data and all_reviews
    combined_data = []
    for place_id in all_data:
        combined_entry = {
            "place_id": place_id,
            "name": all_data[place_id],
            "reviews": all_reviews.get(place_id, [])
        }
        combined_data.append(combined_entry)

    return combined_data


In [61]:
combined_data = get_reviews_with_names(unique_place_ids, API_KEY)

# Save combined_data (place ID, name, reviews) as a JSON file; this overwrites the earlier test_reviews.json
with open('/content/drive/MyDrive/temp/test_reviews.json', 'w') as file:
    json.dump(combined_data, file)

# Iterate over the combined data
for place in combined_data:
    print(f"Place ID: {place['place_id']}")
    print(f"Name: {place['name']}")

    # Check if there are reviews
    if place['reviews']:
        print("Reviews:")
        for review in place['reviews']:
            print(f"  Rating: {review['rating']}")
            print(f"  Review Text: {review['text']}")
    else:
        print("No reviews found.")

    print("-" * 40)  # Separator between places


Place ID: ChIJz86GbZka2jERAFWm-I-74wU
Name: Jovis Cafe - The Dining Place
Reviews:
  Rating: 5
  Review Text: Very quaint and cosy place hidden in the neighbourhood. Decoration of the restaurant gives a very warm and welcoming vibe. Staff’s service is good too, frequently checking on our needs and quality of food. Food serving size is big for the prices. Do pre-order Mummy’s Curry Chicken in case out of stock! Honey Mustard Wings is a must order. Crispy Fish & Fries is very nicely done and not oily. A good place for family and friends.
  Rating: 4
  Review Text: Quaint and charming western restaurant located in the quiet Faber neighbourhood.

Food was unpretentious and earnestly prepared! Fish and chips was really solid; pumpkin soup was tasty and the kids meal and a bit of everything. The steak was a tad unevenly and under cooked, which was a waste. But we will certainly be back to try the rest of the menu.

Loved the kids corner which had plenty of toys and books to entertain the lit
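
The list returned by `get_reviews_with_names` nests the reviews under each place. To put the names and reviews into a single CSV, the entries can be expanded into one row per review. The sketch below uses only the standard library; `combined_to_rows` and `rows_to_csv_text` are hypothetical helper names, and places without reviews are simply omitted, which is an assumption about the desired output.

```python
import csv
import io


def combined_to_rows(combined_data):
    """Expand each place entry into one row per review, carrying the name."""
    rows = []
    for place in combined_data:
        for review in place.get("reviews", []):
            rows.append({
                "place_id": place["place_id"],
                "name": place.get("name", ""),
                "rating": review.get("rating"),
                "text": review.get("text", ""),
            })
    return rows


def rows_to_csv_text(rows, fieldnames=("place_id", "name", "rating", "text")):
    """Serialise rows to CSV text; the csv module quotes multi-line review text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

The same rows can also be handed to `pd.DataFrame(...)` if a DataFrame is preferred over raw CSV text.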


---

### **2. Limitations of Google Maps Places API for Review Retrieval**

**Limited Number of Reviews:**

The Google Places API only allows you to retrieve up to 5 reviews per place, even if there are many more reviews available on Google Maps. This can be a significant limitation if you're trying to gather comprehensive feedback on a location, as you'll miss out on a large portion of the user opinions that could provide valuable insights.

**No Pagination, No Historical Data:**

The five reviews are a fixed sample rather than the first page of a longer list. By default they are sorted by relevance, and the `reviews_sort` request parameter can switch the order to newest, but there is no way to page through the remaining reviews. If you're interested in seeing how the perception of a place has changed over time, or you want to analyze older reviews, the API won't be helpful: it doesn’t offer access to a place's full review history.
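
The Place Details documentation quoted earlier in this notebook describes a `reviews_sort` request parameter with the values `most_relevant` and `newest`. One approach that stays within the API is to request both orderings and merge the results, which yields at most ten distinct reviews per place. The sketch below is illustrative only: `merge_reviews` and `details_params` are hypothetical helpers, and identifying a review by its `author_name` and `time` fields is an assumption about the response.

```python
def merge_reviews(*review_lists):
    """Merge review lists, dropping repeats of the same review."""
    merged = {}
    for reviews in review_lists:
        for review in reviews:
            # Assumed identity of a review: who wrote it and when
            key = (review.get("author_name"), review.get("time"))
            merged.setdefault(key, review)
    return list(merged.values())


def details_params(place_id, api_key, reviews_sort):
    """Query parameters for one Place Details call with a given review order."""
    return {
        "place_id": place_id,
        "fields": "name,reviews",
        "reviews_sort": reviews_sort,  # "most_relevant" or "newest"
        "key": api_key,
    }
```

Each parameter set could be sent with `requests.get(url, params=...)`, and the two `reviews` lists passed to `merge_reviews`. This widens the sample slightly but does not remove the underlying cap.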

### **Avoiding the Limitation**

**Use Web Scraping (With Caution):**

To get around the limit of five reviews per place, one alternative is web scraping, which involves programmatically extracting data directly from the Google Maps website instead of using the API. This method could allow you to access all available reviews, not just the five that the API returns.

However, it's important to note that web scraping is not officially supported by Google and can be against their terms of service. This means that if Google detects that you are scraping their site, they could block your IP address, or worse, take legal action. Additionally, scraping often involves complex code and can be affected by changes in the website’s structure. Because of these risks, it's essential to consider the ethical and legal implications carefully before deciding to use web scraping. For academic purposes or legitimate data analysis, it’s better to explore other methods or to seek permission from Google.

