# Main sandbox for R&D

## Initialization
<p> All imports go here </p>

In [2]:
import pandas as pd
import osmnx as ox
import folium
import geopandas as gpd
from shapely.geometry import Point, Polygon
import plotly.express as px
import hdbscan
import math

ox.settings.log_console = True
ox.settings.use_cache = True


## Raw Data Visualization
<p> First, let's take a look at the raw data on the map to see what we are dealing with. </p>

In [3]:
# Reading the given csv file
coord_df = pd.read_csv("../data/rides-data.csv", index_col=0).reset_index(drop=True)
coord_df

Unnamed: 0,id,origin_lat,origin_lng,destination_lat,destination_lng
0,8050084674,35.769844,51.366798,35.761581,51.403202
1,8055547548,35.698536,51.490612,35.658817,51.397949
2,8052893242,35.729149,51.554649,35.742706,51.565334
3,8067231026,35.685909,51.419937,35.745754,51.420338
4,8051092783,35.742085,51.438961,35.603508,51.400173
...,...,...,...,...,...
99995,8074092556,35.787411,51.503502,35.795544,51.498772
99996,8064670176,35.763897,51.348629,35.775799,51.347256
99997,8050188722,35.685860,51.414677,35.784729,51.353745
99998,8071806219,35.738853,51.467167,35.745167,51.398777


### Reshaping the data for viewing.
<p> At first glance, we do not need to distinguish between origin and destination coordinates: people who ride to crowded places such as universities and malls will probably also use Snapp to get back. What matters is the overall density of points around these places, whether riders are arriving or leaving. </p>

In [4]:
# Reshaping the DataFrame
df_long = pd.melt(
    coord_df,
    id_vars=["id"],
    value_vars=["origin_lat", "origin_lng", "destination_lat", "destination_lng"],
    var_name="type",
    value_name="coordinate",
)

# Split 'type' into two columns 'location_type' and 'coord_type'
df_long["location_type"] = df_long["type"].apply(lambda x: x.split("_")[0])
df_long["coord_type"] = df_long["type"].apply(lambda x: x.split("_")[1])

# Pivot the table to get 'lat' and 'lng' in separate columns
df_points = df_long.pivot_table(
    index=["id", "location_type"],
    columns="coord_type",
    values="coordinate",
    aggfunc="first",
).reset_index()

# Rename columns for clarity: 'location_type' holds origin/destination, and 'lng' becomes 'long'
df_points.columns = ["id", "location_type", "lat", "long"]

df_points.to_csv("../data/exploded_rides_data.csv")

<p> Looking at the map, especially at the places mentioned in the hint, we see a high density of points there. We can treat finding the entrances of these crowded places as an unsupervised learning problem:
specifically, we need a clustering mechanism for the points around each place, and the centroids of the resulting clusters can probably give us good estimates of the entrances. </p>

## Point Filtering

<p>
Now we want to find the entrances of the places mentioned in the hint. Let's sit back and think about how to do that. As mentioned, we want to cluster the data.
But first, which points should we cluster?

The obvious restriction is that we should not use all the points available to us, because doing so would introduce large errors. Malls and universities are not the only crowded places: squares (Enghelab, Azadi), intersections (Valiasr), and other spots are crowded too, yet have no entrances.

So first we need to filter the points. But how can we do so?

One solution that comes to mind is to find the center of the place, set a radius (for example, 100 meters), keep only the ride points inside that circle, and then cluster them.

There is a principal flaw in this method: depending on the structure of the place, the circle we choose as the boundary may miss relevant points or, in the worst case, contain no points at all.
Let's illustrate this with some images.

<img src="../images/bad-boundary1.svg" width="400" height="400">
<br>
<small>Bad Boundary Example: The building is too long and the circular boundary does not contain the relevant points.</small>
<br><br>
<img src="../images/bad-boundary2.svg" width="400" height="400"><br>
<small>Bad Boundary Example: The center is outside the building structure itself.</small>

We could increase the 100-meter radius, but that may pull in points that are not relevant to that specific place. If two malls are close to each other and the radius is too large, the boundary of one mall may contain points from the other (several malls, such as Setin, Milad Noor, and Lidoma, sit close together near San'at square), which leads to clustering errors.

This approach also includes points that are inside the building (entrances are not located inside a place).
As you can see, this method will not work for our case.

So what other options do we have?

How about choosing the points that lie within a certain distance of the place's boundary (a buffered polygon)?
This is a more intuitive and logical approach, as it is shape-agnostic and does not depend on the structure of the building. It also reduces the chance of overlap between places that are near each other (this of course requires a good distance value from the polygon boundaries, a heuristic that needs to be chosen carefully).
Another merit of this approach is that it does not contain the points inside the building.<br>
<b>NOTE:</b> Judging by the point visualization, there are not many points inside buildings, and these could probably be removed with outlier detection methods, but that is error-prone; the buffered polygon is a safer and more accurate approach.


Overall, this approach lets us select more accurate points for clustering and, in turn, for finding entrances.


<img src="../images/iranmall-buffer.png" width="488" height="348">
<br>
<small>Buffered Iran Mall Region</small>

To summarize, we discussed two methods for point filtering:
<ul>
    <li>
        <b>Circular Boundary From The Center</b><br>
        <small>Disadvantages</small>
        <ul>
            <li>May not contain all points</li>
            <li>May not contain any points at all</li>
            <li>May overlap with points of other places</li>
            <li>Contains points inside of a place</li>
        </ul>
        <small>Advantages</small>
        <ul>
            <li>Easy to implement and understand</li>
        </ul>
    </li>
    <li>
        <b>Buffered Polygon</b><br>
        <small>Disadvantages</small>
        <ul>
            <li>Harder to implement</li>
        </ul>
        <small>Advantages</small>
        <ul>
            <li>More intuitive</li>
            <li>Contains relevant points</li>
            <li>Excludes points inside of a place</li>
        </ul>
    </li>
</ul>
</p>
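To make the circular-boundary flaw concrete, here is a minimal sketch using only the standard library. The building, its doors, and the helper names `haversine_m` and `within_radius` are hypothetical and not part of this notebook; the Earth radius is the usual spherical approximation.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius (spherical approximation)


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in meters between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(center, points, radius_m):
    """Keep only the points that fall inside a circle around `center`."""
    return [p for p in points if haversine_m(*center, *p) <= radius_m]


# Hypothetical elongated building: its center, one door on each far end,
# and one door on the long side close to the center.
center = (35.7000, 51.4000)
west_door = (35.7000, 51.3975)  # roughly 225 m west of the center
east_door = (35.7000, 51.4025)  # roughly 225 m east of the center
side_door = (35.7004, 51.4000)  # roughly 45 m north of the center

kept = within_radius(center, [west_door, east_door, side_door], radius_m=100)
print(kept)  # only the side door survives the 100 m circle
```

With a 100-meter circle, both end doors are dropped even though they are legitimate entrances, which is the first failure mode listed above.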

<p> Now let's dive into coding </p>

### Finding polygons of places

In [5]:
"""
# Use queries to find polygons
place_query = "South Terminal, Tehran, Iran"

# Fetch the geometries data
results = ox.features_from_place(place_query, tags={"place": True})
results
"""


<p>
These sources were used for finding the OSM IDs:
<ul>
    <li><a href="https://www.openstreetmap.org">OpenStreetMap</a></li>
    <li><a href="https://nominatim.openstreetmap.org">Nominatim API</a></li>
</ul>
</p>

In [6]:
# Obtained by searching through Google Maps and OSM results.
# The buffer (in meters) is a heuristic chosen by inspecting the map and by trial and error.
places_meta = [
    {"place": "Opal", "osm_query": "W498492266", "buffer": 5},
    {"place": "Koroush", "osm_query": "W320902874", "buffer": 30},
    {"place": "Iran Mall", "osm_query": "R8129683", "buffer": 50},
    {"place": "Paladium", "osm_query": "W678453222", "buffer": 10},
    {"place": "Mehr Abad", "osm_query": "W175770954", "buffer": -50},
    {"place": "West Terminal", "osm_query": "W182016096", "buffer": 50},
    {"place": "Imam Khomeini Hospital", "osm_query": "W191445129", "buffer": 10},
    {"place": "Shariati Hospital", "osm_query": "W438148006", "buffer": 10},
    {
        "place": "Technical Faculties of Tehran University",
        "osm_query": "W385628505",
        "buffer": 10,
    },
]

In [7]:
# The South Terminal polygon was not available in OSM, so I drew it manually on https://geojson.io/
south_terminal_geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {},
            "geometry": {
                "coordinates": [
                    [
                        [51.41681851494866, 35.65187874692117],
                        [51.41744830103832, 35.64760059502382],
                        [51.42087064573448, 35.64719545756819],
                        [51.421273776391985, 35.64733883570075],
                        [51.42128511578716, 35.65056043270057],
                        [51.41681851494866, 35.65187874692117],
                    ]
                ],
                "type": "Polygon",
            },
        }
    ],
}
# Extract coordinates
coordinates = south_terminal_geojson["features"][0]["geometry"]["coordinates"][0]
ST_GDF = gpd.GeoDataFrame(
    geometry=[Polygon(coordinates)]
)  # South Terminal Geo Dataframe
ST_GDF = ST_GDF.set_crs(epsg=4326)

### Visualizing buffered polygons

In [8]:
# The function for handling the South Terminal buffered polygon.
def handle_south_terminal(m) -> gpd.GeoDataFrame:
    """Function for handling the south terminal polygon.
    Args:
        m: The folium map object.

    Returns:
        A GeoDataFrame containing the south terminal polygon and its buffered version.
    """
    # Project to UTM zone 39N, appropriate for Tehran (for accurate distance measurements)
    polygon = ST_GDF.to_crs(epsg=32639)
    # Buffer the polygon by 20 meters
    buffered_polygon = polygon["geometry"].buffer(20)

    # Project back to WGS84 for mapping
    polygon = polygon.to_crs(epsg=4326)
    buffered_polygon = buffered_polygon.to_crs(epsg=4326)

    # Adding the polygon to the map
    folium.GeoJson(polygon, name="Polygons").add_to(m)
    # Adding the buffered polygon to the map
    folium.GeoJson(
        buffered_polygon,
        name="Buffers",
        style_function=lambda x: {
            "color": "red",
            "fillColor": "red",
            "fillOpacity": 0.1,
        },
    ).add_to(m)

    return gpd.GeoDataFrame(
        {
            "name": ["South Terminal"],
            "original_polygon": [polygon.geometry.iloc[0]],
            "buffered_polygon": [buffered_polygon.geometry.iloc[0]],
        },
        geometry="original_polygon",
        crs="EPSG:4326",
    )
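The reason for projecting to a metric CRS before calling `buffer` is that, in EPSG:4326, distances are expressed in degrees, and a degree of longitude is shorter on the ground than a degree of latitude at Tehran's latitude. The sketch below illustrates this with a spherical-Earth approximation; the constant and the helper name `meters_per_deg_lng` are illustrative assumptions, not part of the notebook.

```python
import math

# Approximate ground length of one degree of latitude on a sphere of radius 6371 km.
METERS_PER_DEG_LAT = 111_195


def meters_per_deg_lng(lat_deg):
    """Approximate ground length of one degree of longitude at a given latitude."""
    return METERS_PER_DEG_LAT * math.cos(math.radians(lat_deg))


tehran_lat = 35.7
buffer_m = 20

# The same 20 m offset corresponds to a different number of degrees on each axis.
dlat = buffer_m / METERS_PER_DEG_LAT
dlng = buffer_m / meters_per_deg_lng(tehran_lat)
print(f"20 m ~ {dlat:.6f} deg of latitude, {dlng:.6f} deg of longitude")
```

A buffer given as a single number of degrees would therefore be stretched unevenly, which is why the cells here buffer in EPSG:32639 and only then project back to EPSG:4326.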

In [9]:
# Initialize the map with Azadi square
m = folium.Map(location=[35.699704, 51.337433], zoom_start=15)
entries = []

for place_meta in places_meta:
    # Fetch geometries from OSM
    polygon = ox.geocode_to_gdf(place_meta["osm_query"], by_osmid=True)
    # Project to UTM zone appropriate for Tehran (for accurate distance measurements)
    polygon = polygon.to_crs(epsg=32639)
    # Buffer the polygon by the place-specific distance (in meters)
    buffered_polygon = polygon["geometry"].buffer(place_meta["buffer"])

    # Project back to WGS84 for mapping
    polygon = polygon.to_crs(epsg=4326)
    buffered_polygon = buffered_polygon.to_crs(epsg=4326)

    # Adding the polygon to the map
    folium.GeoJson(polygon, name="Polygons").add_to(m)
    # Adding the buffered polygon to the map
    folium.GeoJson(
        buffered_polygon,
        name="Buffers",
        style_function=lambda x: {
            "color": "red",
            "fillColor": "red",
            "fillOpacity": 0.1,
        },
    ).add_to(m)
    # Append data to the GeoDataFrame
    temp_gdf = gpd.GeoDataFrame(
        {
            "name": [place_meta["place"]],
            "original_polygon": [polygon.geometry.iloc[0]],
            "buffered_polygon": [buffered_polygon.geometry.iloc[0]],
        },
        geometry="original_polygon",
        crs="EPSG:4326",
    )
    entries.append(temp_gdf)

# For the sake of readability and cleaner code we defined a function for handling South Terminal
south_terminal_gdf = handle_south_terminal(m)
entries.append(south_terminal_gdf)
# The geoDataFrame containing original and buffered polygons for each place.
gdf = pd.concat(entries, ignore_index=True)

# Visualizing buffered polygons
folium.LayerControl().add_to(m)
m.save("../data/buffered_polygons.html")

In [10]:
gdf

Unnamed: 0,name,original_polygon,buffered_polygon
0,Opal,"POLYGON ((51.35069 35.77713, 51.35075 35.77685...","POLYGON ((51.35067872746119 35.77717418172322,..."
1,Koroush,"POLYGON ((51.31350 35.73830, 51.31369 35.73827...","POLYGON ((51.31317420547965 35.73834576203107,..."
2,Iran Mall,"POLYGON ((51.18893 35.75469, 51.18955 35.75195...",POLYGON ((51.188825132572084 35.75512866221741...
3,Paladium,"POLYGON ((51.41324 35.79645, 51.41335 35.79616...",POLYGON ((51.41316458343045 35.796513166194565...
4,Mehr Abad,"POLYGON ((51.26210 35.68252, 51.26216 35.68238...",POLYGON ((51.262671789664985 35.68255298474467...
5,West Terminal,"POLYGON ((51.33123 35.70714, 51.33157 35.70185...","POLYGON ((51.33085766113199 35.70747283366887,..."
6,Imam Khomeini Hospital,"POLYGON ((51.37803 35.70742, 51.37980 35.70737...",POLYGON ((51.377922560887576 35.70741795313682...
7,Shariati Hospital,"POLYGON ((51.38520 35.71957, 51.38697 35.71986...","POLYGON ((51.38509296800884 35.7195772130028, ..."
8,Technical Faculties of Tehran University,"POLYGON ((51.38450 35.72639, 51.38453 35.72506...","POLYGON ((51.38448632080887 35.72647678402527,..."
9,South Terminal,"POLYGON ((51.41682 35.65188, 51.41745 35.64760...",POLYGON ((51.416893643085324 35.65204832742704...


### Assigning Points

In [11]:
# Converting points to Geo Dataframe
points_geodf = gpd.GeoDataFrame(
    {
        "geometry": [
            Point(i["long"], i["lat"])
            for i in df_points[["long", "lat"]].to_dict("records")
        ]
    },
    crs="EPSG:4326",
)
points_geodf

Unnamed: 0,geometry
0,POINT (51.44328 35.71973)
1,POINT (51.39507 35.73000)
2,POINT (51.15490 35.61023)
3,POINT (51.16530 35.60750)
4,POINT (51.48682 35.63324)
...,...
199995,POINT (51.33881 35.72066)
199996,POINT (51.49934 35.70968)
199997,POINT (51.45044 35.67175)
199998,POINT (51.31819 35.73754)


In [12]:
# Finding which place each point belongs to, based on the buffered polygons
points_geodf["belongs_to"] = None
for idx, row in gdf.iterrows():
    # A point must be inside the buffered polygon
    within_mask = points_geodf.within(row["buffered_polygon"])
    # # But outside the original polygon (Mehr Abad is the exception because of its inaccurate polygon)
    # exclusion_mask = (
    #     ~points_geodf.within(row["original_polygon"])
    #     if row["name"] != "Mehr Abad"
    #     else pd.Series([True] * len(points_geodf))
    # )
    # combined_mask = within_mask & exclusion_mask
    points_geodf.loc[
        within_mask & points_geodf["belongs_to"].isnull(), "belongs_to"
    ] = row["name"]
points_geodf = (
    points_geodf[["geometry", "belongs_to"]]
    .dropna(subset=["belongs_to"])
    .reset_index(drop=True)
)
points_geodf.to_csv("../data/points_geodf.csv", index=False)
points_geodf

Unnamed: 0,geometry,belongs_to
0,POINT (51.19345 35.75567),Iran Mall
1,POINT (51.35123 35.77675),Opal
2,POINT (51.33088 35.68904),Mehr Abad
3,POINT (51.41695 35.65041),South Terminal
4,POINT (51.32251 35.69165),Mehr Abad
...,...,...
2772,POINT (51.32184 35.69231),Mehr Abad
2773,POINT (51.31413 35.73798),Koroush
2774,POINT (51.32312 35.69128),Mehr Abad
2775,POINT (51.32174 35.69170),Mehr Abad
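
The assignment loop above gives each point to the first place whose buffered polygon contains it, so the order of the places decides what happens where buffers overlap. Below is a minimal pure-Python sketch of that first-match rule. The notebook itself relies on GeoPandas' `within`; the helper names `point_in_polygon` and `assign_first_match` and the toy squares are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test; `polygon` is a list of (x, y) vertices."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_first_match(points, places):
    """Label each point with the first place whose polygon contains it, else None."""
    labels = []
    for p in points:
        label = None
        for name, polygon in places:  # order matters: earlier places win overlaps
            if point_in_polygon(p, polygon):
                label = name
                break
        labels.append(label)
    return labels


# Two hypothetical, overlapping buffered regions (plain squares for simplicity).
places = [
    ("Mall A", [(0, 0), (4, 0), (4, 4), (0, 4)]),
    ("Mall B", [(3, 0), (7, 0), (7, 4), (3, 4)]),
]
points = [(1, 1), (3.5, 2), (6, 2), (9, 9)]
labels = assign_first_match(points, places)
print(labels)  # ['Mall A', 'Mall A', 'Mall B', None]
```

The point at `(3.5, 2)` lies in both squares and goes to "Mall A" only because that place is listed first; points matching no place get `None`, mirroring the rows removed by `dropna` above.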


### Ride Points Visualization
<p> Based on buffered polygon </p>

In [13]:
"""
# Visualizing the points on map
for idx, point in points_geodf.iterrows():
    folium.Marker(
        location=[point.geometry.y, point.geometry.x],
        popup=str(point["belongs_to"]),
        icon=folium.Icon(color="green", icon="info-sign"),
    ).add_to(m)
# m.save("/home/reza/Desktop/snapp-task/points_inside_buffer.html")
m.save("../data/points_inside_buffer.html")
"""


### Density of each place
<p> Let's view the density of each place </p>

In [14]:
# Count the occurrences of each place
counts = points_geodf["belongs_to"].value_counts().reset_index()
counts.columns = ["belongs_to", "count"]

# Create a bar chart
fig = px.bar(
    counts,
    x="belongs_to",
    y="count",
    title="Count of Points by Location",
    text="count",
    color_continuous_scale="Viridis",
)
fig.update_layout(
    xaxis_title="Location",
    yaxis_title="Count",
    xaxis={"categoryorder": "total descending"},
)
fig.show()

## ML

<p> This is where the magic happens! </p>
<p>
Let's talk about the clustering algorithm.

When clustering is mentioned, the first algorithm that comes to mind is K-means.
But let's first think about the nature of our data.
It is density-based: where points are dense, there is probably an entrance.
The density can also vary between entrances, meaning one entrance may be more crowded than another. For example, the Tehran Pardis entrance near Kargar St. is more crowded than the opposite entrance.


We also don't know the exact number of entrances; it varies from place to place.


Finally, our clusters are not necessarily spherical: the points may spread along the streets near the entrances.


With these in mind, let's compare some algorithms.

<ul>
    <li><a href="https://en.wikipedia.org/wiki/K-means_clustering">K-Means</a>
        <ul>
            <li>Use Cases
                <ul>
                    <li>When the number of clusters k is known a priori.</li>
                    <li>Suitable for clusters that are roughly <u>spherical and of similar size</u>.</li>
                </ul>
            </li>
            <li>Limitations
                <ul>
                    <li>The need to <u>specify k in advance</u>.</li>
                    <li>Sensitivity to initial centroid placement.</li>
                    <li>Not suitable for identifying clusters with non-spherical shapes or varying densities.</li>
                </ul>
            </li>
        </ul>
    </li>
    <li><a href="https://en.wikipedia.org/wiki/DBSCAN">DBSCAN</a>
        <ul>
            <li>Use Cases
                <ul>
                    <li>Suitable for data containing <u>clusters of similar density</u>.</li>
                    <li>Works well with large datasets.</li>
                    <li>Good for applications where the cluster structure is not known in advance and can vary in shape.</li>
                </ul>
            </li>
            <li>Limitations
                <ul>
                    <li><u>Sensitivity to the eps and minPts parameters,</u> which can be hard to tune.</li>
                    <li>Does not perform well with data of <u>varying densities</u> or when there are large differences in cluster sizes.</li>
                </ul>
            </li>
        </ul>
    </li>
    <li><a href="https://hdbscan.readthedocs.io/en/latest/how_hdbscan_works.html">HDBSCAN</a>
        <ul>
            <li>Use Cases
                <ul>
                    <li>When the <u>number of clusters is unknown</u> or there is variance in the density of clusters.</li>
                    <li>Effective at handling noise and outliers.</li>
                    <li>Capable of identifying clusters of arbitrary shapes.</li>
                </ul>
            </li>
            <li>Limitations
                <ul>
                    <li>Higher computational complexity, especially with very large datasets.</li>
                    <li><u>Performance can be sensitive to the parameters</u> used for defining density (min_samples, min_cluster_size).</li>
                    <li>Can be harder to interpret and tune compared to more straightforward algorithms like k-means.</li>
                </ul>
            </li>
        </ul>
    </li>
    <li><a href="https://en.wikipedia.org/wiki/Spectral_clustering">Spectral Clustering</a>
        <ul>
            <li>Use Cases
                <ul>
                    <li>Works well for <u>non-convex clusters</u> and where the cluster structure is connected but not necessarily compact or separated by linear boundaries.</li>
                    <li>Effective in scenarios where the data <u>can be represented as a graph</u>, e.g. a social network.</li>
                </ul>
            </li>
            <li>Limitations
                <ul>
                    <li>Computationally intensive</li>
                    <li><u>Choosing the number of clusters ahead of time</u> is required, similar to k-means.</li>
                    <li>Performance heavily depends on the quality of the similarity graph.</li>
                </ul>
            </li>
        </ul>
    </li>
</ul>

Comparing the algorithms and their pros and cons, <code>HDBSCAN</code> seems to be the best choice for our use case.


<b>Reasons</b>
<ul>
    <li>It does not assume clusters are spherical.</li>
    <li>It does not need the number of clusters to be defined beforehand.</li>
    <li>It handles outliers.</li>
    <li>It needs less domain knowledge to tune than <code>DBSCAN</code>, whose <code>eps</code> parameter must be chosen to fit the data.</li>
</ul>

<p>With the algorithm chosen, let's move on to the code.</p>
</p>
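
To illustrate what "density-based" means in the comparison above, here is a minimal sketch of plain DBSCAN in pure Python. It is deliberately not HDBSCAN (the notebook uses the `hdbscan` library for that); it only shows how a density-based method discovers the number of clusters on its own and labels isolated points as noise (`-1`). The `eps` and `min_pts` values and the toy coordinates, assumed to be in meters, are illustrative.

```python
import math


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: returns one label per point, with -1 meaning noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # Indices of all points within `eps` of point i (including i itself).
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, so keep expanding
    return labels


# Two hypothetical groups of pickup points near two gates, plus one stray point.
gate_a = [(0, 0), (1, 0), (0, 1), (1, 1)]
gate_b = [(50, 50), (51, 50), (50, 51), (51, 51), (52, 50)]
stray = [(25, 100)]

labels = dbscan(gate_a + gate_b + stray, eps=2, min_pts=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, 1, -1]
```

No cluster count was passed in, yet two clusters come out and the stray point is flagged as noise. The fixed `eps` is exactly the parameter that makes plain DBSCAN struggle when entrances differ in density.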


In [15]:
"""
What is happening in this cell?
┌───────────────┐      ┌─────────┐      ┌──────────────┐     ┌────────────────────┐
│               │      │         │      │              │     │                    │
│ Process Data  ├──────► Cluster ├──────►Find Centroids├─────►  Visualize On Map  │
│               │      │         │      │              │     │                    │
└───────────────┘      └─────────┘      └──────────────┘     └────────────────────┘
"""

places = points_geodf["belongs_to"].unique().tolist()
place_entrances = {}
# We are going to perform clustering for each region
for place in places:
    # ---------------------------------- Process Data ----------------------------------
    # Filtering points (copy so the column assignments below do not trigger SettingWithCopyWarning)
    place_gdf = points_geodf[points_geodf["belongs_to"] == place].copy()

    # Extracting longitude and latitude
    place_gdf["long"] = place_gdf.geometry.x
    place_gdf["lat"] = place_gdf.geometry.y
    place_gdf = place_gdf[["long", "lat"]].reset_index(drop=True)

    # Convert to numpy for usage in clustering
    coords = place_gdf[["long", "lat"]].values

    # ---------------------------------- Clustering ----------------------------------
    # min_cluster_size is a heuristic chosen by trial and error; the hdbscan library
    # rejects values below 2, hence the lower bound for very small places.
    clusterer = hdbscan.HDBSCAN(
        min_cluster_size=max(2, int(math.sqrt(len(coords)))), metric="euclidean"
    )
    cluster_labels = clusterer.fit_predict(coords)

    # Assigning cluster labels to the DataFrame
    place_gdf["cluster"] = cluster_labels

    # ---------------------------------- Finding centroids ----------------------------------
    centroids = []
    for label in set(cluster_labels):
        members = place_gdf[place_gdf["cluster"] == label][["long", "lat"]]
        centroid = members.mean(axis=0)
        centroids.append((centroid["long"], centroid["lat"], label))

    # Convert centroids to a DataFrame
    centroids_df = pd.DataFrame(centroids, columns=["Longitude", "Latitude", "label"])

    entrances = []
    # ---------------------------------- Visualization on map ----------------------------------
    for idx, point in centroids_df.iterrows():
        # If HDBSCAN marks every point as an outlier (label -1), there is a single centroid and we keep it as the entrance.
        # If there are real clusters, we skip the centroid of the outliers.
        if (point["label"] == -1 and len(centroids_df) == 1) or (
            len(centroids_df) >= 2 and point["label"] != -1
        ):
            entrances.append((point["Latitude"], point["Longitude"]))
            folium.Marker(
                location=[point["Latitude"], point["Longitude"]],
                popup="Entrance",
                icon=folium.Icon(color="red", icon="star"),
            ).add_to(m)
    place_entrances[place] = entrances
# m.save("/home/reza/Desktop/snapp-task/test_center.html")
m.save("./entrances-final.html")
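
The centroid step and the noise rule in the cell above can be sketched in isolation as follows. This is a simplified stand-in using plain tuples rather than DataFrames, and the helper name `pick_entrances` is hypothetical. One small difference is labeled in the code: a lone non-noise cluster is also kept here.

```python
from collections import defaultdict


def pick_entrances(coords, labels):
    """Average each cluster's points; keep the noise centroid only if nothing else exists."""
    groups = defaultdict(list)
    for (lng, lat), label in zip(coords, labels):
        groups[label].append((lng, lat))

    # Centroid of a cluster = arithmetic mean of its members' coordinates.
    centroids = {
        label: (
            sum(p[0] for p in pts) / len(pts),
            sum(p[1] for p in pts) / len(pts),
        )
        for label, pts in groups.items()
    }

    # Everything was labeled noise (-1): fall back to the single overall centroid.
    if set(centroids) == {-1}:
        return [centroids[-1]]
    # Otherwise drop the noise centroid and keep the real clusters
    # (assumption: a lone non-noise cluster is kept as well).
    return [c for label, c in sorted(centroids.items()) if label != -1]


coords = [(0, 0), (2, 0), (10, 10), (12, 10), (100, 100)]
labels = [0, 0, 1, 1, -1]
print(pick_entrances(coords, labels))            # [(1.0, 0.0), (11.0, 10.0)]
print(pick_entrances([(0, 0), (2, 2)], [-1, -1]))  # [(1.0, 1.0)]
```

Averaging raw longitude and latitude is a reasonable approximation at the scale of a single building, which is the only scale used here.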




<p>
And voilà, we have our entrances!
There are some errors, of course. For example, the polygon for <u>Mehr Abad</u> was defined inaccurately, so the filtering step lets in extra points and we end up with some extra entrances.
</p>

In [18]:
# Saving entrances per polygon for insertion into postgres
gdf["entrances"] = gdf["name"].map(lambda name: place_entrances.get(name, None))

gdf.to_csv("../data/places_meta.csv")
gdf

Unnamed: 0,name,original_polygon,buffered_polygon,entrances
0,Opal,"POLYGON ((51.35069 35.77713, 51.35075 35.77685...","POLYGON ((51.35067872746119 35.77717418172322,...","[(35.776886962180946, 51.35172280599907), (35...."
1,Koroush,"POLYGON ((51.31350 35.73830, 51.31369 35.73827...","POLYGON ((51.31317420547965 35.73834576203107,...","[(35.739664579692636, 51.31443144145764), (35...."
2,Iran Mall,"POLYGON ((51.18893 35.75469, 51.18955 35.75195...",POLYGON ((51.188825132572084 35.75512866221741...,"[(35.75570514385517, 51.19295988816483), (35.7..."
3,Paladium,"POLYGON ((51.41324 35.79645, 51.41335 35.79616...",POLYGON ((51.41316458343045 35.796513166194565...,"[(35.79658985137938, 51.41338109970094)]"
4,Mehr Abad,"POLYGON ((51.26210 35.68252, 51.26216 35.68238...",POLYGON ((51.262671789664985 35.68255298474467...,"[(35.68895283867331, 51.33064978262957), (35.6..."
5,West Terminal,"POLYGON ((51.33123 35.70714, 51.33157 35.70185...","POLYGON ((51.33085766113199 35.70747283366887,...","[(35.701796454352305, 51.33142120773727), (35...."
6,Imam Khomeini Hospital,"POLYGON ((51.37803 35.70742, 51.37980 35.70737...",POLYGON ((51.377922560887576 35.70741795313682...,"[(35.70791048322404, 51.38310372488839), (35.7..."
7,Shariati Hospital,"POLYGON ((51.38520 35.71957, 51.38697 35.71986...","POLYGON ((51.38509296800884 35.7195772130028, ...","[(35.722119418057545, 51.38687942967272), (35...."
8,Technical Faculties of Tehran University,"POLYGON ((51.38450 35.72639, 51.38453 35.72506...","POLYGON ((51.38448632080887 35.72647678402527,...","[(35.72532963752746, 51.38824319839477), (35.7..."
9,South Terminal,"POLYGON ((51.41682 35.65188, 51.41745 35.64760...",POLYGON ((51.416893643085324 35.65204832742704...,"[(35.64745114234185, 51.418958208637854), (35...."
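
One thing to keep in mind before loading this file elsewhere: when written to CSV, the `entrances` column is stored as the text representation of a list of tuples. A minimal sketch of the round trip follows, using only the standard library's `csv` and `ast` modules instead of pandas; the row shown is hypothetical.

```python
import ast
import csv
import io

# Hypothetical row shaped like the `entrances` column above.
row = {"name": "Example Place", "entrances": [(35.70, 51.40), (35.71, 51.41)]}

# Writing to CSV stringifies the list of tuples.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "entrances"])
writer.writeheader()
writer.writerow(row)

# Reading it back yields plain text ...
buffer.seek(0)
loaded = next(csv.DictReader(buffer))
print(type(loaded["entrances"]))  # <class 'str'>

# ... which literal_eval safely turns back into Python objects.
entrances = ast.literal_eval(loaded["entrances"])
print(entrances)  # [(35.7, 51.4), (35.71, 51.41)]
```

`ast.literal_eval` only accepts Python literals, so it is a safer choice than `eval` for parsing this column before inserting the coordinates into a database.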
