This project helps identify the best locations to open a dumpling restaurant in Mumbai using open-source data. It scores areas based on foot traffic, nearby competitors, and local demographics, then visualizes the results on a map. It's a practical blend of data science, geospatial analysis, and real-world business insight.

In [17]:
# Installing required libraries
!pip install folium geopandas --quiet


## 🍽️ Competitor Detection

We use the Overpass API (based on OpenStreetMap) to query all restaurants located in the South Mumbai zone, including Lower Parel, Worli, Mahalaxmi, and nearby areas.

This data helps us identify potential competitors near our target launch zones.


In [2]:
import requests
import pandas as pd

# Define bounding box for South Mumbai (Lower Parel, Worli, etc.)
bbox = (18.972, 72.820, 19.035, 72.840)

# Overpass QL query to fetch all restaurants
query = f"""
[out:json];
node["amenity"="restaurant"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});
out;
"""

# Send request to Overpass API
url = "http://overpass-api.de/api/interpreter"
response = requests.get(url, params={'data': query})
data = response.json()

# Convert results to DataFrame
elements = data['elements']
restaurants = pd.DataFrame([{
    'name': el['tags'].get('name', 'Unknown'),
    'lat': el['lat'],
    'lon': el['lon'],
    'cuisine': el['tags'].get('cuisine', 'Unknown')
} for el in elements])

restaurants.head()


Unnamed: 0,name,lat,lon,cuisine
0,Tote on the Turf,18.980469,72.820376,indian
1,Light Of Bharat,19.024242,72.837649,Unknown
2,Tasting Room,18.993249,72.822934,Unknown
3,SpiceKlub,18.994491,72.825748,indian
4,Hard Rock Cafe,19.006826,72.829369,american


## 🗺️ Visualizing Competitor Restaurants on a Map

We now plot all restaurants retrieved from OpenStreetMap on an interactive map using `folium`.  
Each red marker represents a restaurant, and the popup shows its name and cuisine type.

This gives us a quick view of how saturated the area is with dining options — crucial for identifying competition.


In [3]:
import folium

mumbai_center = [19.0144, 72.8366]  # Center of South Mumbai
m = folium.Map(location=mumbai_center, zoom_start=14)

for _, row in restaurants.iterrows():
    folium.Marker(
        [row['lat'], row['lon']],
        popup=f"{row['name']} ({row['cuisine']})",
        icon=folium.Icon(color='red', icon='cutlery', prefix='fa')
    ).add_to(m)

m


## 🚶 Foot Traffic Hotspots in South Mumbai

We query OpenStreetMap (via the Overpass API) for locations that generate high foot traffic:
- 🚉 Railway stations
- 🏬 Malls, marketplaces, retail buildings
- 🏫 Schools, colleges, universities
- 🏥 Hospitals and clinics
- 🏢 Government & corporate offices

These data points help us assess how busy each zone might be during the day — an important indicator for restaurant visibility and walk-in potential.


In [6]:
import requests
import pandas as pd

# South Mumbai bounding box (same as earlier)
bbox = (18.972, 72.820, 19.035, 72.840)

# Build Overpass QL query
query = f"""
[out:json];
(
  node["highway"="pedestrian"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});
  node["railway"="station"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});
  node["shop"="mall"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});
  node["building"="retail"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});
  node["amenity"="marketplace"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});

  node["office"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});
  node["amenity"="townhall"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});
  node["amenity"="school"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});
  node["amenity"="college"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});
  node["university"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});
  node["amenity"="hospital"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});
  node["healthcare"]({bbox[0]},{bbox[1]},{bbox[2]},{bbox[3]});
);
out;
"""


# Request Overpass
url = "http://overpass-api.de/api/interpreter"
response = requests.get(url, params={'data': query})
data = response.json()

# Parse results
foot_traffic = pd.DataFrame([{
    'lat': el['lat'],
    'lon': el['lon'],
    'type': list(el.get('tags', {}).values())[0]  # label with first tag type
} for el in data['elements']])

foot_traffic.head()


Unnamed: 0,lat,lon,type
0,18.99568,72.830276,5
1,19.015229,72.824135,pharmacy
2,18.994551,72.832871,no
3,18.976622,72.832794,no
4,19.024254,72.837971,laboratory


## 🗺️ Combined Map: Competitors vs. Foot Traffic

This map visualizes:

🔴 **Restaurants (Competitors)** — Marked in red  
🔵 **Foot Traffic Generators** — Marked in blue (schools, hospitals, offices, malls, etc.)

This view helps us spot high-footfall areas and analyze how saturated they are with existing food options.


In [7]:
import folium

# Start map
m = folium.Map(location=[19.0144, 72.8366], zoom_start=14)

# Add restaurants (red)
for _, row in restaurants.iterrows():
    folium.CircleMarker(
        [row['lat'], row['lon']],
        radius=5,
        color='red',
        fill=True,
        fill_opacity=0.7,
        popup=row['name']
    ).add_to(m)

# Add foot traffic indicators (blue)
for _, row in foot_traffic.iterrows():
    folium.CircleMarker(
        [row['lat'], row['lon']],
        radius=6,
        color='blue',
        fill=True,
        fill_opacity=0.5,
        popup=row['type']
    ).add_to(m)

m


## 🧮 Spatial Grid for Scoring Zones

To evaluate which areas are promising for a dumpling restaurant, we divide South Mumbai into square zones (250m x 250m).

This allows us to:
- Count the number of foot-traffic sources and competitors in each grid.
- Assign opportunity/saturation scores.


In [8]:
import geopandas as gpd
from shapely.geometry import Point, box
import numpy as np
import pandas as pd

# Convert restaurant and foot traffic data to GeoDataFrames
def to_gdf(df, lat_col='lat', lon_col='lon'):
    return gpd.GeoDataFrame(df, geometry=gpd.points_from_xy(df[lon_col], df[lat_col]), crs='EPSG:4326')

gdf_rest = to_gdf(restaurants)
gdf_traffic = to_gdf(foot_traffic)

# Project to meters (so we can use meters for grid size)
gdf_rest = gdf_rest.to_crs(epsg=3857)
gdf_traffic = gdf_traffic.to_crs(epsg=3857)

# Get bounds
bounds = gdf_rest.total_bounds
minx, miny, maxx, maxy = bounds

# Grid size in meters
cell_size = 250

# Create grid cells
cols = np.arange(minx, maxx, cell_size)
rows = np.arange(miny, maxy, cell_size)
polygons = [box(x, y, x + cell_size, y + cell_size) for x in cols for y in rows]
grid = gpd.GeoDataFrame({'geometry': polygons}, crs=gdf_rest.crs)


## 📊 Score Each Grid Cell

We now:
- Count the number of **competitors** (restaurants) in each grid cell.
- Count the number of **foot traffic sources** (offices, schools, etc.).
- Compute a simple score:
  
\[
\text{Score} = \text{Foot Traffic Count} - \text{Competitor Count}
\]

Higher scores mean more opportunity and less saturation.


In [9]:
# Count competitors per cell
rest_join = gpd.sjoin(grid, gdf_rest, how='left')
rest_counts = rest_join.groupby(rest_join.index).size()

# Count foot traffic per cell
traffic_join = gpd.sjoin(grid, gdf_traffic, how='left')
traffic_counts = traffic_join.groupby(traffic_join.index).size()

# Combine counts and compute score
grid['competitors'] = rest_counts.fillna(0)
grid['foot_traffic'] = traffic_counts.fillna(0)
grid['score'] = grid['foot_traffic'] - grid['competitors']


## 🗺️ Visualize Opportunity Zones on Map

We now visualize each 250m x 250m grid cell on a **Folium interactive map**:
- 🟩 **Green** zones = higher foot traffic than competitors.
- 🟥 **Red** zones = higher competitor saturation.
- Hovering on each cell shows:


In [10]:
import folium
from folium import Choropleth

# Convert back to lat/lon for mapping
grid = grid.to_crs(epsg=4326)

# Create folium map
m = folium.Map(location=[19.0144, 72.8366], zoom_start=14)

# Add grid scores as color
for _, row in grid.iterrows():
    folium.GeoJson(
        row['geometry'],
        style_function=lambda x, score=row['score']: {
            'fillColor': 'green' if score > 0 else 'red',
            'color': 'gray',
            'weight': 0.5,
            'fillOpacity': 0.3 + min(score / 10, 0.4)
        },
        tooltip=f"Score: {row['score']:.0f}, FT: {int(row['foot_traffic'])}, Comp: {int(row['competitors'])}"
    ).add_to(m)

m


In [12]:
!pip install geopy --quiet


## 🏙️ Reverse Geocode: Top Dumpling Zones

We take the **top 5 scoring zones** (high foot traffic, low competitor count) and use `geopy` to **reverse geocode** their coordinates into human-readable **addresses or localities**.

This helps translate abstract map zones into **real neighborhood names**.


In [15]:
from geopy.geocoders import Nominatim
from time import sleep

# Set up geocoder with a unique user agent
geolocator = Nominatim(user_agent="dumpling_zone_locator")

# Get centroid coordinates of each zone
top_zones = grid.sort_values(by="score", ascending=False).head(5).copy()
top_zones["lat"] = top_zones.geometry.centroid.y
top_zones["lon"] = top_zones.geometry.centroid.x

# Reverse geocode function
def reverse_geocode(lat, lon):
    try:
        location = geolocator.reverse((lat, lon), timeout=10)
        return location.address if location else "Unknown"
    except:
        return "Error"

# Apply reverse geocoding with polite pauses (1 second between each call)
addresses = []
for i, row in top_zones.iterrows():
    address = reverse_geocode(row.lat, row.lon)
    addresses.append(address)
    print(f"{i+1}. 📍 {address}")
    print(f"   🚶 Foot Traffic: {int(row.foot_traffic)} | 🍽️ Competitors: {int(row.competitors)} | 📈 Score: {int(row.score)}\n")
    sleep(1)  # Respect Nominatim rate limit

# Store addresses in the DataFrame if needed
top_zones["address"] = addresses



  top_zones["lat"] = top_zones.geometry.centroid.y

  top_zones["lon"] = top_zones.geometry.centroid.x


204. 📍 दादा रेगे मार्ग, दादर पश्चिम, G/N Ward, Zone 2, Mumbai City, Maharashtra, 400028, India
   🚶 Foot Traffic: 7 | 🍽️ Competitors: 2 | 📈 Score: 5

230. 📍 दादा रेगे मार्ग, दादर पश्चिम, G/N Ward, Zone 2, Mumbai City, Maharashtra, 400028, India
   🚶 Foot Traffic: 4 | 🍽️ Competitors: 1 | 📈 Score: 3

160. 📍 Om Shanti Churchgate J.D.F Mumbai Medical Centre Trust, 1, Parsi building, ground floor, Bawlawadi,opp. voltas house, Dr Babasaheb Ambedkar Marg, Byculla East, E Ward, Zone 1, Mumbai City, Maharashtra, 400012, India
   🚶 Foot Traffic: 3 | 🍽️ Competitors: 1 | 📈 Score: 2

149. 📍 Raobahadur S.K. Bole Marg, दादर पश्चिम, G/N Ward, Zone 2, Mumbai City, Maharashtra, 400028, India
   🚶 Foot Traffic: 3 | 🍽️ Competitors: 1 | 📈 Score: 2

111. 📍 Sakubai Mohite Marg, BDD Chawl, G/S Ward, Zone 2, Mumbai City, Maharashtra, 400013, India
   🚶 Foot Traffic: 3 | 🍽️ Competitors: 1 | 📈 Score: 2

