# Spot-Checking Places Insights Data with Functions and Sample Place IDs

### Overall Goal

This notebook demonstrates a workflow for spot-checking Places Insights data. It starts with a high-level statistical query to find restaurant density and then **directly visualizes both the high-level density and ground-truth sample locations from the city's busiest areas on a single, combined map.**

### Key Technologies Used

*   **[Places Insights](https://developers.google.com/maps/documentation/placesinsights):** To provide the Places Data and Place Count Function.
*   **[BigQuery](https://cloud.google.com/bigquery):** To run the `PLACES_COUNT_PER_H3` function, which provides aggregated place counts and `sample_place_ids`.
*   **[Google Maps Place Details API](https://developers.google.com/maps/documentation/places/web-service/place-details):** To fetch rich, detailed information (name, address, rating, and a Google Maps link) for the specific `sample_place_ids`.
*   **[Google Maps 2D Tiles](https://developers.google.com/maps/documentation/tile/2d-tiles-overview):** To use Google Maps as the basemap.
*   **Python Libraries:**
    * **[GeoPandas](https://geopandas.org/en/stable/)** for spatial data manipulation.
    * **[Folium](https://python-visualization.github.io/folium/latest/)** for creating the final interactive, layered map.

See [Google Maps Platform Pricing](https://mapsplatform.google.com/intl/en_uk/pricing/) for API costs associated with running this notebook.

### The Step-by-Step Workflow

1.  **Query Aggregated Data:** We begin by querying BigQuery to count all highly-rated, operational restaurants across London, grouping them into H3 hexagonal cells. This query provides the statistical foundation for our analysis and, crucially, a list of `sample_place_ids` for each cell.

2.  **Identify Hotspots & Fetch Details:** The notebook then **automatically** identifies the 20 busiest H3 cells. It consolidates the `sample_place_ids` from all of these top hotspots into a single master list and uses the Places API to fetch detailed information for each one.

3.  **Create a Combined Visualization:** In the final step, we generate a single, layered map.
    *   The **base layer** is a choropleth "heatmap" showing restaurant density across the entire city.
    *   The **top layer** displays individual pins for all the sample restaurants from the top 20 hotspots, providing a direct, ground-level view of the locations that make up the aggregated counts. Each pin's popup includes a link to open the location directly in Google Maps.

### How to Use This Notebook

1.  **Set Up Secrets:** Before you begin, you must configure two secrets in the Colab "Secrets" tab (the **🔑 key icon** on the left menu):
    *   `GCP_PROJECT`: Your Google Cloud Project ID with access to Places Insights.
    *   `GMP_API_KEY`: Your Google Maps Platform API key. Ensure the **Map Tiles API** is enabled for this key in your GCP console.

2.  **Run the Cells:** Once the secrets are set, simply run the cells in order from top to bottom. Each visualization will appear as the output of its corresponding code cell.
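
If you run the code outside Colab, the Secrets tab is not available. The following is a minimal sketch of one way to handle that, assuming you export the same names (`GCP_PROJECT`, `GMP_API_KEY`) as environment variables. The helper `get_secret` is hypothetical and is not part of this notebook.

```python
import os


def get_secret(name: str) -> str:
    """Return a secret from Colab's Secrets tab, or from an environment variable elsewhere.

    Hypothetical helper: the notebook itself calls userdata.get() directly.
    """
    try:
        # This import only succeeds inside Google Colab.
        from google.colab import userdata
    except ImportError:
        userdata = None

    # In Colab, userdata.get() raises its own error if the secret is missing.
    value = userdata.get(name) if userdata is not None else os.environ.get(name)
    if not value:
        raise ValueError(f"Secret '{name}' is not set.")
    return value
```

For example, `get_secret("GCP_PROJECT")` would return the project ID in either environment, provided the value has been configured.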

In [None]:
# Install necessary libraries
# We use folium and its ecosystem for mapping.
!pip install google-cloud-bigquery geopandas shapely folium mapclassify xyzservices google-maps-places googlemaps

In [None]:
# Import libraries
from google.cloud import bigquery
from google.colab import auth, userdata, data_table
from google.api_core import exceptions

from google.maps import places_v1

import requests

import geopandas as gpd
import shapely
import sys

import pandas as pd

# Import the mapping libraries
import folium
import mapclassify # Used by .explore() for data classification
import xyzservices # Provides tile layers

In [None]:
# Configure GCP Authentication
# This part securely gets your GCP Project ID.
GCP_PROJECT_SECRET_KEY_NAME = "GCP_PROJECT" #@param {type:"string"}
GCP_PROJECT_ID = None

if "google.colab" in sys.modules:
    try:
        GCP_PROJECT_ID = userdata.get(GCP_PROJECT_SECRET_KEY_NAME)
        if GCP_PROJECT_ID:
            print(f"Authenticating to GCP project: {GCP_PROJECT_ID}")
            auth.authenticate_user(project_id=GCP_PROJECT_ID)
        else:
            raise ValueError(f"Could not retrieve GCP Project ID from secret named '{GCP_PROJECT_SECRET_KEY_NAME}'. "
                             "Please make sure the secret is set in your Colab environment.")
    except userdata.SecretNotFoundError:
        raise ValueError(f"Secret named '{GCP_PROJECT_SECRET_KEY_NAME}' not found. "
                         "Please create it in the 'Secrets' tab (key icon) in Colab.")

In [None]:
API_KEY_SECRET_NAME = "GMP_API_KEY" #@param {type:"string"}

# Initialize a variable to hold our key.
gmp_api_key = None

try:
  # Attempt to retrieve the secret value using its name.
  gmp_api_key = userdata.get(API_KEY_SECRET_NAME)
  print("Successfully retrieved API key.")

except userdata.SecretNotFoundError:
  raise ValueError(f"Secret named '{API_KEY_SECRET_NAME}' not found. "
                         "Please create it in the 'Secrets' tab (key icon) in Colab.")

In [None]:
# Enable interactive tables for pandas DataFrames
data_table.enable_dataframe_formatter()
client = bigquery.Client(project=GCP_PROJECT_ID)

In [None]:
restaurants_in_london_sql = """
-- Declare a variable to hold the GEOGRAPHY for London.
DECLARE london_boundary GEOGRAPHY;
-- Set the variable by dynamically loading the boundary
-- from the Overture Maps public dataset.
SET london_boundary = (
  SELECT geometry
  FROM `bigquery-public-data.overture_maps.division_area`
  WHERE names.primary = 'London' AND country = 'GB' LIMIT 1
);
-- Call the function with all parameters in a single JSON_OBJECT.
SELECT *
FROM
  `places_insights___gb.PLACES_COUNT_PER_H3`(
    JSON_OBJECT(
      -- Define the search area
      'geography', london_boundary,
      -- Set the aggregation grid size and other filters
      'h3_resolution', 8,
      'types', ['restaurant'],
      'business_status', ['OPERATIONAL'],
      'min_rating', 3.5,
      -- Only include places with 100 or more user ratings.
      'min_user_rating_count', 100
    )
  )
ORDER BY
  count DESC;
"""

In [None]:
# Step 1.2: Execute Query and Create GeoDataFrame
print("Running London query...")
df_restaurants_in_london = client.query(restaurants_in_london_sql).to_dataframe()

df_restaurants_in_london['geography'] = df_restaurants_in_london['geography'].dropna().apply(shapely.from_wkt)
gdf_restaurants_in_london = gpd.GeoDataFrame(df_restaurants_in_london, geometry='geography', crs='EPSG:4326')
print(f"Successfully processed {len(gdf_restaurants_in_london)} cells.")

### Visualizing Restaurant Density on a Heatmap

The code below uses the GeoDataFrame to generate a heatmap. Here's what it shows:

*   **Choropleth Map:** Each H3 hexagon on the map is colored based on the number of qualifying restaurants it contains (operational, rated 3.5 or higher, with at least 100 user ratings).
*   **Color Scale:** The map uses a yellow-to-red color scale (`YlOrRd`). Lighter, yellow areas have fewer restaurants, while darker, red areas represent the densest hotspots.
*   **Interactivity:** You can hover over any hexagon to see its unique H3 index and the exact restaurant count. Clicking on a hexagon will open a popup with all the data for that cell.

This visualization gives us an immediate and intuitive understanding of where the major dining hubs are located throughout the city.
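
To make the color bucketing concrete, here is a minimal sketch of how counts can be assigned to classes. The map itself uses the `NaturalBreaks` scheme from `mapclassify`. This sketch swaps in a simpler quantile classification built on the standard library, so the break values will differ from what the map uses. The function name `classify_counts` and the sample counts are illustrative only.

```python
from bisect import bisect_right
from statistics import quantiles


def classify_counts(counts, k=5):
    """Assign each count to one of k classes using quantile breaks.

    Illustrative stand-in for NaturalBreaks: class 0 maps to the lightest
    color and class k - 1 to the darkest.
    """
    breaks = quantiles(counts, n=k)  # k - 1 cut points
    return [bisect_right(breaks, c) for c in counts]


# Made-up restaurant counts for ten cells.
sample_counts = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
print(classify_counts(sample_counts))
```

Each class index then selects one shade of the `YlOrRd` ramp, which is why neighboring cells with similar counts share a color.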

**Note:** This cell uses [2D Map Tiles](https://developers.google.com/maps/documentation/tile/2d-tiles-overview). Please review the documentation for pricing.
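
Because the map fetches the basemap one tile at a time, it can help to estimate how many tiles a view covers. The sketch below assumes the `{z}/{x}/{y}` coordinates in the tile URL follow the standard Web Mercator tiling scheme. The function names are hypothetical, and the result is a rough estimate of tiles covering a bounding box, not a statement about how requests are billed.

```python
import math


def lat_lon_to_tile(lat, lon, zoom):
    """Convert a WGS84 coordinate to Web Mercator tile x/y at a zoom level."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    return x, y


def tiles_for_bounds(west, south, east, north, zoom):
    """Count the tiles needed to cover a bounding box at a zoom level."""
    x_min, y_min = lat_lon_to_tile(north, west, zoom)  # top-left tile
    x_max, y_max = lat_lon_to_tile(south, east, zoom)  # bottom-right tile
    return (x_max - x_min + 1) * (y_max - y_min + 1)
```

With the notebook's data, `tiles_for_bounds(*gdf_restaurants_in_london.total_bounds, zoom=10)` would give the approximate tile count for the initial view, since `total_bounds` is ordered west, south, east, north.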

In [None]:
# Define the columns from your GeoDataFrame that you want to see in the tooltip
restaurant_tooltip_cols = [
    'h3_cell_index',
    'count'
]

# Verify the GMP API key exists.
if 'gmp_api_key' not in locals() or gmp_api_key is None:
    raise NameError("The 'gmp_api_key' variable is not defined. Please run the API key cell first.")


# Get Session Token
session_url = f"https://tile.googleapis.com/v1/createSession?key={gmp_api_key}"
payload = {"mapType": "roadmap", "language": "en-US", "region": "US"}
headers = {"Content-Type": "application/json"}

response_session = requests.post(session_url, json=payload, headers=headers)
response_session.raise_for_status()
session_data = response_session.json()
session_token = session_data['session']


# Get Dynamic Attribution from Viewport API
# We need to define a bounding box for the viewport request.
# We'll use the total bounds of our GeoDataFrame.
bounds = gdf_restaurants_in_london.total_bounds
viewport_url = (
    f"https://tile.googleapis.com/tile/v1/viewport?key={gmp_api_key}"
    f"&session={session_token}"
    f"&zoom=10"
    f"&north={bounds[3]}&south={bounds[1]}"
    f"&west={bounds[0]}&east={bounds[2]}"
)

response_viewport = requests.get(viewport_url)
response_viewport.raise_for_status()
viewport_data = response_viewport.json()

# Extract the mandatory copyright/attribution string.
google_attribution = viewport_data.get('copyright', 'Google') # Fallback to 'Google'

# Construct Tile URL and Display Map
google_tiles = f"https://tile.googleapis.com/v1/2dtiles/{{z}}/{{x}}/{{y}}?session={session_token}&key={gmp_api_key}"

# Create the map using the .explore() function on your GeoDataFrame
# This will create a choropleth map where the color of each H3 cell
# is based on the number of restaurants it contains.
london_restaurants_map = gdf_restaurants_in_london.explore(
    column="count", # The column to color the map by
    cmap="YlOrRd",   # A color map that's great for density (Yellow-Orange-Red)
    scheme="NaturalBreaks", # A smart way to group data into color buckets
    tooltip=restaurant_tooltip_cols, # The columns to show when you hover
    popup=True, # Show a popup with all data on click
    tiles=google_tiles,
    attr=google_attribution,
    style_kwds={"stroke": True, "color": "black", "weight": 0.2, "fillOpacity": 0.7} # Styling for the hexagons
)

# Display the map
display(london_restaurants_map)

### Identify Top Hotspots and Consolidate Place IDs

Now that we have the density data for all of London, we will focus our analysis on the 20 busiest areas.

The code below isolates these top 20 H3 cells and extracts up to 10 `sample_place_ids` from each one. It then displays a summary table of these hotspots before consolidating all the IDs into a single master list for analysis. This list will be used in the next step to fetch detailed information for each location.
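
The selection logic can be sketched in plain Python on toy data. The cell names, counts, and Place IDs below are made up for illustration, and `consolidate_sample_ids` is a hypothetical helper. The notebook performs the same steps with pandas.

```python
def consolidate_sample_ids(cells, top_n=20, per_cell=10):
    """Take the busiest cells, keep a few sample IDs from each, and de-duplicate."""
    top_cells = sorted(cells, key=lambda c: c["count"], reverse=True)[:top_n]
    seen = set()
    unique_ids = []
    for cell in top_cells:
        for place_id in cell["sample_place_ids"][:per_cell]:
            if place_id not in seen:  # a place can appear in more than one sample
                seen.add(place_id)
                unique_ids.append(place_id)
    return unique_ids


# Toy data: three cells with made-up IDs.
toy_cells = [
    {"h3_cell_index": "cell_a", "count": 12, "sample_place_ids": ["p1", "p2", "p3"]},
    {"h3_cell_index": "cell_b", "count": 40, "sample_place_ids": ["p4", "p2"]},
    {"h3_cell_index": "cell_c", "count": 3, "sample_place_ids": ["p5"]},
]
print(consolidate_sample_ids(toy_cells, top_n=2, per_cell=2))  # ['p4', 'p2', 'p1']
```

De-duplicating before the lookup step means each Place ID is fetched only once, which keeps the number of Place Details calls down.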

In [None]:
# Ensure Colab's interactive data table formatter is enabled.
data_table.enable_dataframe_formatter()

# Isolate the top 20 H3 cells with the highest restaurant counts.
print(f"Identifying the top 20 busiest H3 cells from the {len(gdf_restaurants_in_london)} total cells...")
top_20_cells_df = gdf_restaurants_in_london.sort_values(by='count', ascending=False).head(20).reset_index(drop=True)
print("Top 20 cells identified.")

# For each of the top 20 cells, take the first 10 sample_place_ids.
# The .apply() method performs this slicing operation on each row's list individually.
print("Extracting up to 10 sample Place IDs from each of the top 20 cells...")
sliced_ids_series = top_20_cells_df['sample_place_ids'].apply(lambda id_list: id_list[:10])

# --- Create and display the summary table ---
print("Generating summary table of top hotspots...")
summary_df = pd.DataFrame({
    'H3 Cell Index': top_20_cells_df['h3_cell_index'],
    'Total Places in Cell': top_20_cells_df['count'],
    'Sample IDs to Analyze': sliced_ids_series.apply(len)
})
display(summary_df)


# Consolidate all the sliced lists into a single series.
# The .explode() function creates a new row for each Place ID.
all_place_ids_series = sliced_ids_series.explode()

# Get a final list of unique Place IDs to process.
place_ids_to_process = all_place_ids_series.unique().tolist()

print(f"\nConsolidated a total of {len(place_ids_to_process)} unique sample Place IDs to analyze.")

# Display the first 5 IDs as a sample
print("\nSample of Place IDs to be processed:")
print(place_ids_to_process[:5])

### Fetch Details for Consolidated Place IDs

With our consolidated list of unique Place IDs, we now use the Google Maps Places API to fetch rich, detailed information for each location.

The script below loops through each ID and retrieves its name, address, rating, total rating count, latitude/longitude coordinates, and Google Maps link. All of this information is then compiled into a single DataFrame, `details_df`, which will power our final, combined map visualization.

**Note:** This cell uses [Place Details API](https://developers.google.com/maps/documentation/places/web-service/place-details). Please review the documentation for pricing.
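
The notebook calls Place Details through the `places_v1` client library. As an alternative sketch, the same lookup can be expressed as a plain HTTPS request, assuming the Places API (New) REST endpoint and an API key that has that API enabled. The helper `build_place_details_request` is hypothetical and only assembles the request; it does not send it.

```python
def build_place_details_request(place_id, api_key, fields):
    """Assemble the URL and headers for a Place Details (New) REST call.

    Hypothetical helper: the field mask limits the response to the
    fields you name, mirroring the x-goog-fieldmask metadata used
    with the client library.
    """
    return {
        "url": f"https://places.googleapis.com/v1/places/{place_id}",
        "headers": {
            "X-Goog-Api-Key": api_key,
            "X-Goog-FieldMask": ",".join(fields),
        },
    }


fields = ["displayName", "formattedAddress", "rating", "location"]
request = build_place_details_request("PLACE_ID_HERE", "YOUR_API_KEY", fields)
print(request["url"])
```

You could then pass the result to `requests.get(request["url"], headers=request["headers"])`. Requesting only the fields you need matters because the fields in the mask determine what the response contains.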

In [None]:
# Ensure Colab's interactive data table formatter is enabled.
data_table.enable_dataframe_formatter()

# Check if the list of Place IDs from the previous step exists.
if 'place_ids_to_process' in locals() and place_ids_to_process:

    places_client = places_v1.PlacesClient()
    if places_client:
        # Loop through the list of Place IDs and fetch details.
        place_details_list = []
        # Add 'googleMapsUri' to the list of fields we are requesting.
        fields_to_request = "displayName,formattedAddress,rating,userRatingCount,location,googleMapsUri"

        total_ids = len(place_ids_to_process)
        print(f"\nFetching details for {total_ids} unique Place IDs...")

        for i, place_id in enumerate(place_ids_to_process):
            if i > 0 and i % 50 == 0:
                print(f"  ...processed {i} of {total_ids} IDs.")

            try:
                request = {"name": f"places/{place_id}"}
                response = places_client.get_place(
                    request=request,
                    metadata=[("x-goog-fieldmask", fields_to_request)]
                )

                place_details_list.append({
                    "Name": response.display_name.text,
                    "Address": response.formatted_address,
                    "Rating": response.rating,
                    "Total Ratings": response.user_rating_count,
                    "Place ID": place_id,
                    "Latitude": response.location.latitude,
                    "Longitude": response.location.longitude,
                    # Add the new URI field to our collected data.
                    "Google Maps URI": response.google_maps_uri
                })
            except exceptions.GoogleAPICallError as e:
                print(f"  - Warning: Could not fetch details for Place ID '{place_id}': {e.message}")

        # Convert the list of details into a pandas DataFrame.
        if place_details_list:
            print(f"\nSuccessfully fetched details for {len(place_details_list)} places.")
            details_df = pd.DataFrame(place_details_list)

            # Define which columns we want to show in the summary table.
            columns_to_display = ["Name", "Address", "Rating", "Total Ratings", "Place ID"]

            print("Here is a sample of the retrieved data (Google Maps URI is hidden):")
            # Display only the selected columns from the head of the DataFrame.
            display(details_df[columns_to_display].head())
        else:
            print("\nCould not fetch details for any of the sample Place IDs.")
            details_df = pd.DataFrame()

else:
    print("The 'place_ids_to_process' list does not exist or is empty. "
          "Please run the previous cell to generate the list of IDs first.")

### Create the Combined Map

This is the final step where we bring all our analysis together into a single visualization.

The code below first creates the restaurant density heatmap, coloring each H3 cell based on the number of qualifying restaurants. Then, it iterates through the restaurant data we fetched in the previous step and overlays a pin for each restaurant onto the map.

The result is a layered map that shows both the high-level "hotspots" and the ground-truth, individual places that make up those dense areas.

**Note:** This cell uses [2D Map Tiles](https://developers.google.com/maps/documentation/tile/2d-tiles-overview). Please review the documentation for pricing.
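
Place names and addresses can contain characters such as `&` or `<` that have special meaning in HTML. The sketch below shows one way to build popup content with those values escaped, using the standard library's `html.escape`. The helper `build_popup_html` is hypothetical; the notebook builds its popup string inline.

```python
import html


def build_popup_html(name, rating, total_ratings, address, uri):
    """Build marker popup HTML with every dynamic value escaped."""
    # Backticks are swapped out as in the notebook, then HTML is escaped.
    safe_name = html.escape(str(name).replace("`", "'"))
    safe_address = html.escape(str(address))
    safe_uri = html.escape(str(uri), quote=True)
    return (
        f"<b>{safe_name}</b><br>"
        f"Rating: {html.escape(str(rating))} ({int(total_ratings)} reviews)<br>"
        '<hr style="margin: 4px 0;">'
        f"{safe_address}<br><br>"
        f'<a href="{safe_uri}" target="_blank">View on Google Maps</a>'
    )
```

The returned string can be passed to `folium.Popup(popup_html, max_width=300)` in the same way as the inline version.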

In [None]:
# Check if the required DataFrames from previous steps exist.
if 'gdf_restaurants_in_london' in locals() and not gdf_restaurants_in_london.empty and 'details_df' in locals() and not details_df.empty:

    restaurant_tooltip_cols = ['h3_cell_index', 'count']

    # Create the base choropleth map from the H3 cell data using GeoPandas .explore()
    print("Generating base choropleth map of restaurant density...")
    combined_map = gdf_restaurants_in_london.explore(
        column="count",
        cmap="YlOrRd",
        scheme="NaturalBreaks",
        tooltip=restaurant_tooltip_cols,
        popup=True,
        tiles=google_tiles,
        attr=google_attribution,
        style_kwds={"stroke": True, "color": "black", "weight": 0.2, "fillOpacity": 0.7}
    )

    # Iterate through the detailed restaurant data to add a marker for each one.
    print(f"Adding {len(details_df)} individual restaurant markers with Google Maps links to the map...")
    skipped_count = 0

    for index, row in details_df.iterrows():
        # Defensively validate that coordinates exist for the record.
        lat = row['Latitude']
        lon = row['Longitude']
        if pd.isna(lat) or pd.isna(lon):
            skipped_count += 1
            continue # Skip this record if coordinates are missing.

        # Clean and sanitize all data that will be used in the popup.
        name = str(row['Name']) if pd.notna(row['Name']) else "Unnamed Place"
        rating = row['Rating'] if pd.notna(row['Rating']) else "N/A"
        total_ratings = int(row['Total Ratings']) if pd.notna(row['Total Ratings']) else 0
        address = str(row['Address']) if pd.notna(row['Address']) else "No Address Provided"
        uri = str(row['Google Maps URI']) if pd.notna(row['Google Maps URI']) else "#"
        name = name.replace('`', "'")

        # Create the full HTML content for the marker's popup.
        popup_html = f"""
        <b>{name}</b><br>
        Rating: {rating} ({total_ratings} reviews)<br>
        <hr style="margin: 4px 0;">
        {address}<br><br>
        <a href="{uri}" target="_blank">View on Google Maps</a>
        """

        popup = folium.Popup(popup_html, max_width=300)

        # Create the marker and add it to the existing map object.
        folium.Marker(
            location=[lat, lon],
            tooltip=name,
            popup=popup,
            icon=folium.Icon(color='blue', icon='utensils', prefix='fa')
        ).add_to(combined_map)

    # Provide a summary message if any records were skipped.
    if skipped_count > 0:
        print(f"\nWarning: Skipped {skipped_count} marker(s) due to missing coordinates.")

    print("Map layers combined successfully. Displaying below.")
    display(combined_map)

else:
    print("One or more required DataFrames ('gdf_restaurants_in_london', 'details_df') do not exist or are empty.")
    print("Please ensure you have run the previous cells successfully before this one.")