# ☕ Data Extraction: Merida Coffee Shops

This data acquisition stage uses two services from the Google Places API, part of the Google Maps Platform, to collect data on coffee shops in Merida, Yucatan.

---

## Google Places API Workflow

This project follows a two-step extraction process:

1. **Places Text Search (New)** – performs text-based queries to extract the Place IDs.
2. **Place Details** – uses the collected Place IDs to retrieve the complete metadata for each place.

---

## 1. Defining the Search Area

The Google Places API provides two parameters to geographically constrain results: **locationBias** and **locationRestriction**.  
Since this project requires results confined to Merida's area, the **locationRestriction** parameter is used. It defines a rectangular bounding box and excludes any place outside it, whereas **locationBias** only favors places near the given area and can still return results beyond it.
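
To make the difference concrete, the sketch below builds the same query with each parameter. The helper name `make_rectangle` and the corner coordinates are illustrative assumptions, not part of the original notebook; only the `rectangle`, `low`, and `high` keys mirror the request body used later in this document.

```python
# Hypothetical helper: builds the rectangle structure shared by both parameters.
def make_rectangle(sw, ne):
    """Return a Places API rectangle from (lat, lng) SW and NE corners."""
    return {
        'rectangle': {
            'low': {'latitude': sw[0], 'longitude': sw[1]},
            'high': {'latitude': ne[0], 'longitude': ne[1]},
        }
    }

# Illustrative corners only (roughly one grid cell in the SW of the study area).
sw, ne = (20.8915, -89.7327), (20.9050, -89.7192)

# locationBias: nearby places are preferred, but places outside may still appear.
biased_payload = {
    'textQuery': 'cafeteria',
    'locationBias': make_rectangle(sw, ne),
}

# locationRestriction: places outside the rectangle are excluded.
restricted_payload = {
    'textQuery': 'cafeteria',
    'locationRestriction': make_rectangle(sw, ne),
}
```

Only the key name changes between the two payloads, so switching strategies later would not require restructuring the request.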

#### Gridding the City with Folium

Because each Text Search request returns at most 20 results, Merida’s area is subdivided into smaller rectangular viewports.  
Each grid cell defines its own locationRestriction (SW and NE coordinates) and is queried independently with the Text Search service.  
Once all Place IDs are collected, the **Place Details** service is called to retrieve the complete metadata for each place.

The **Folium** library is used to visualize and verify the grid layout, ensuring full coverage of the study area and validating the data extraction process.
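
One limitation of a fixed grid is that a dense cell may contain more places than a single request returns. A possible refinement, sketched below under the assumption that a full page signals a "saturated" cell, is to split such a cell into four quadrants and query each one again. The names `split_viewport`, `is_saturated`, and `PAGE_LIMIT` are hypothetical and do not appear in the original notebook.

```python
PAGE_LIMIT = 20  # maximum results per request, as described above

def split_viewport(low, high):
    """Split a (SW, NE) rectangle into four equal quadrants."""
    mid_lat = (low[0] + high[0]) / 2
    mid_lng = (low[1] + high[1]) / 2
    return [
        (low, (mid_lat, mid_lng)),                # SW quadrant
        ((low[0], mid_lng), (mid_lat, high[1])),  # SE quadrant
        ((mid_lat, low[1]), (high[0], mid_lng)),  # NW quadrant
        ((mid_lat, mid_lng), high),               # NE quadrant
    ]

def is_saturated(places):
    """A cell that fills the whole page may be hiding additional places."""
    return len(places) >= PAGE_LIMIT

# Example: a 2 x 2 degree rectangle becomes four 1 x 1 degree rectangles.
quadrants = split_viewport((0.0, 0.0), (2.0, 2.0))
```

In the extraction loop, a saturated cell could be replaced by its quadrants and queried again, which keeps the rest of the grid coarse while refining only the busiest areas.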


In [25]:
import folium

#Create the map of Merida using an approximate center point.
m = folium.Map(location=[20.9939879883004, -89.62853393602846], min_zoom=12)
delta = 0.0135 #Side length of each rectangle, in degrees of latitude and longitude.

#Initial SW point from which the loop generates the rest of the viewport rectangles.
initial_lat, initial_lng = 20.891532412575916, -89.73272017481521


#The points generated will be stored and will be used when the API is called.
rectangles_viewports = []

#Generate each SW and NE point from each rectangle using the initial point
for i in range(15):
    for j in range(15):
        low = [initial_lat+i*delta, initial_lng+j*delta]
        high = [initial_lat+(i+1)*delta, initial_lng+(j+1)*delta]
        folium.Rectangle(
            bounds = [low, high],
            tooltip = f'({i+1},{j+1})',
            fill = True
        ).add_to(m)
        
        rectangles_viewports.append((tuple(low), tuple(high)))
m

In [26]:
#Number of calls that will be made to the API
print(len(rectangles_viewports))

225
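
As a quick sanity check on the grid, the sketch below derives the overall extent and the approximate size of each cell from the same `delta` and starting point. The constant `KM_PER_DEGREE` is an approximation introduced here for illustration, and the width uses a simple cosine correction at the grid's middle latitude.

```python
import math

delta = 0.0135
initial_lat, initial_lng = 20.891532412575916, -89.73272017481521
n_rows = n_cols = 15

# NE corner of the whole grid.
max_lat = initial_lat + n_rows * delta
max_lng = initial_lng + n_cols * delta

KM_PER_DEGREE = 111.32  # approximate length of one degree of latitude (assumption)
mid_lat = (initial_lat + max_lat) / 2

# A degree of longitude shrinks with the cosine of the latitude.
cell_height_km = delta * KM_PER_DEGREE
cell_width_km = delta * KM_PER_DEGREE * math.cos(math.radians(mid_lat))

print(f'Latitude range:  {initial_lat:.4f} to {max_lat:.4f}')
print(f'Longitude range: {initial_lng:.4f} to {max_lng:.4f}')
print(f'Approximate cell size: {cell_width_km:.2f} km x {cell_height_km:.2f} km')
```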


## 2. Data Acquisition
Two Google Places API services are used in sequence: the Text Search service first extracts the Place IDs, and the Place Details service then retrieves the relevant information for each place.

#### Place IDs extraction


In [27]:
import time
import requests
from dotenv import load_dotenv
import os
import pandas as pd
import json

In [28]:
#Loading the API key to make the requests
load_dotenv()
API_KEY = os.getenv('API_KEY')

In [29]:
#Define the API endpoint and headers to make the calls
search_url = "https://places.googleapis.com/v1/places:searchText"

search_headers = {
    'Content-Type' : 'application/json',
    'X-Goog-Api-Key': API_KEY,
    'X-Goog-FieldMask': 'places.id' #Only the Place ID is requested at this stage.
}

In [30]:
places_id_raw = [] #The raw data returned by the API will be stored here

#Extract the Place IDs by iterating through each of the 225 rectangle viewports; the results are then saved to a .json file.
for low, high in rectangles_viewports:
    try: 
        payload = {
            'textQuery' : 'cafeteria',
            'includedType': 'cafe',
            'strictTypeFiltering':True,
            'pageSize': 20,
            'locationRestriction' : {
                'rectangle':{
                    'low':{
                        'latitude' : low[0],
                        'longitude' : low[1]
                    },
                    'high':{
                        'latitude': high[0],
                        'longitude': high[1]
                        
                    }
                }
            }
        }
        
        response = requests.post(url = search_url, json=payload, headers = search_headers)
        time.sleep(0.5)
        response.raise_for_status()
        
        data = response.json()
        
        places_id_raw.extend(data.get('places', []))
        
    except requests.exceptions.RequestException as e:
        print(f'ERROR!!! --> {e}')
    

os.makedirs('data', exist_ok=True) #Make sure the output folder exists before writing
with open('data/places_id_raw.json', 'w') as f:
    json.dump(places_id_raw, f, indent=4)

In [31]:
#Reload the stored Place IDs for the "Place Details" calls
with open('data/places_id_raw.json', 'r') as file:
    ids_dict = json.load(file)

In [32]:
#Number of Place IDs collected
len(ids_dict)

831
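
The count above covers every entry returned across all viewports. The notebook does not state whether the same place can appear in more than one response, so the sketch below shows one way to guard against repeats before calling Place Details. The function name `unique_place_ids` and the sample data are hypothetical.

```python
def unique_place_ids(places):
    """Return Place IDs in first-seen order, dropping repeats and empty entries."""
    seen = set()
    ids = []
    for place in places:
        place_id = place.get('id')
        if place_id and place_id not in seen:
            seen.add(place_id)
            ids.append(place_id)
    return ids

# Made-up sample in the same shape as the stored JSON: a list of {'id': ...} dicts.
sample = [{'id': 'a'}, {'id': 'b'}, {'id': 'a'}, {}]
print(unique_place_ids(sample))
```

Comparing `len(unique_place_ids(ids_dict))` with `len(ids_dict)` would show whether any duplicates were collected.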

#### Places details data extraction


In [33]:
details_headers = {
    'Content-Type' : 'application/json',
    'X-Goog-Api-Key': API_KEY,
    'X-Goog-FieldMask':'displayName,formattedAddress,postalAddress,location,businessStatus,primaryTypeDisplayName,priceRange,rating,userRatingCount,websiteUri,regularOpeningHours'
}
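
A long, hand-written field mask is easy to get wrong, for example by listing a field twice. As an optional alternative, the mask could be assembled from a list, as in the sketch below. The variable names `detail_fields` and `field_mask` are introduced here for illustration.

```python
detail_fields = [
    'displayName', 'formattedAddress', 'postalAddress', 'location',
    'businessStatus', 'primaryTypeDisplayName', 'priceRange', 'rating',
    'userRatingCount', 'websiteUri', 'regularOpeningHours',
]

# dict.fromkeys keeps the first occurrence of each name and preserves order,
# so an accidental repeat in the list would not reach the header.
field_mask = ','.join(dict.fromkeys(detail_fields))
```

The resulting string could then be assigned to the `X-Goog-FieldMask` header.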

In [34]:
details_data_raw = []

for place in ids_dict:
    try:
        #Double quotes outside so the nested ['id'] lookup also parses on Python versions before 3.12
        details_url = f"https://places.googleapis.com/v1/places/{place['id']}"
        
        response = requests.get(url=details_url, headers=details_headers)
        time.sleep(0.5)
        response.raise_for_status()
        
        data = response.json()
        
        details_data_raw.append(data)
        
    except requests.exceptions.RequestException as e:
        print(f'ERROR!!! --> {e}')

pd.DataFrame(details_data_raw).to_csv('data/details_data_raw.csv')

In [35]:
pd.read_csv('data/details_data_raw.csv', index_col=0).tail()

Unnamed: 0,formattedAddress,location,rating,regularOpeningHours,businessStatus,userRatingCount,displayName,primaryTypeDisplayName,postalAddress,priceRange,websiteUri
826,"C. 49-B 926, entre 112 y 108 A, Fraccionamient...","{'latitude': 21.0818943, 'longitude': -89.6616...",5.0,"{'openNow': False, 'periods': [{'open': {'day'...",OPERATIONAL,10.0,"{'text': 'Frapplo', 'languageCode': 'es'}","{'text': 'Coffee Shop', 'languageCode': 'en-US'}","{'regionCode': 'MX', 'languageCode': 'en-US', ...","{'startPrice': {'currencyCode': 'MXN', 'units'...",https://www.instagram.com/frapplo_/
827,"Tablaje 34469sn Komchen, 97302 Mérida, Yuc., M...","{'latitude': 21.0824061, 'longitude': -89.6354...",5.0,,OPERATIONAL,1.0,"{'text': 'Barra andatti Gourmet Xcanatún', 'la...","{'text': 'Coffee Shop', 'languageCode': 'en-US'}","{'regionCode': 'MX', 'languageCode': 'en-US', ...",,
828,"Carr. Mérida - Progreso, 97302 Xcanatún, Yuc.,...","{'latitude': 21.0814901, 'longitude': -89.6352...",4.3,"{'openNow': False, 'periods': [{'open': {'day'...",OPERATIONAL,183.0,"{'text': 'Starbucks Carretera Progreso', 'lang...","{'text': 'Coffee Shop', 'languageCode': 'en-US'}","{'regionCode': 'MX', 'languageCode': 'en-US', ...","{'startPrice': {'currencyCode': 'MXN', 'units'...",
829,"97302 Chablekal, 97302 Mérida, Yuc., Mexico","{'latitude': 21.0935445, 'longitude': -89.5776...",,"{'openNow': False, 'periods': [{'open': {'day'...",OPERATIONAL,,"{'text': 'Colorín Colorado Chablekal', 'langua...","{'text': 'Coffee Shop', 'languageCode': 'en-US'}","{'regionCode': 'MX', 'languageCode': 'en-US', ...",,
830,"C. 21 entre 4, 97302 Chablekal, Yuc., Mexico","{'latitude': 21.091645099999997, 'longitude': ...",5.0,"{'openNow': False, 'periods': [{'open': {'day'...",OPERATIONAL,1.0,"{'text': 'DRAGÓN SUSHI🐉', 'languageCode': 'es'}","{'text': 'Coffee Shop', 'languageCode': 'en-US'}","{'regionCode': 'MX', 'languageCode': 'en-US', ...",,
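
The output above shows that nested fields such as `displayName` and `location` come back from the CSV as stringified dictionaries. A minimal sketch of how they could be turned back into Python objects follows, using the standard library's `ast.literal_eval`. The helper name `parse_cell` is hypothetical, and the sample string is copied from the `displayName` column above.

```python
import ast

def parse_cell(value):
    """Convert a stringified dict back into a dict; leave other values untouched."""
    if isinstance(value, str) and value.strip().startswith('{'):
        try:
            return ast.literal_eval(value)
        except (ValueError, SyntaxError):
            # Not a valid Python literal: keep the original text.
            return value
    return value

raw = "{'text': 'Frapplo', 'languageCode': 'es'}"
name = parse_cell(raw)['text']
print(name)
```

On the DataFrame, the same helper could be applied column by column, for example with `df['displayName'].apply(parse_cell)`. Missing values pass through unchanged because they are not strings.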
