I've prepared a small subset of the Foursquare Open Source Places dataset. It is selected with a rough coordinate box, since the other options I looked at seemed likely to lose data.

## Initial mapping

Firstly, here is a simple map of London. Uncomment the final lines to show the map (you may also need to call `.head()` on the `london_places` assignment to lower the load on your CPU).

I used this to determine the coordinates I wanted to consider as 'London' (a rectangle going from Putney in the West to Greenwich in the East, and Tottenham in the North to Streatham in the South).
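The coordinate box itself amounts to a pair of range checks on latitude and longitude. As a rough sketch: the corner values below are illustrative guesses for a Putney/Greenwich/Tottenham/Streatham rectangle, not the values used to build the subset, and `BBox`, `LONDON_BOX` and `in_bbox` are hypothetical names.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    west: float   # minimum longitude
    south: float  # minimum latitude
    east: float   # maximum longitude
    north: float  # maximum latitude


# Illustrative corners only: roughly Putney (W), Streatham (S),
# Greenwich (E) and Tottenham (N). Not the notebook's actual values.
LONDON_BOX = BBox(west=-0.22, south=51.42, east=0.00, north=51.60)


def in_bbox(lat: float, lon: float, box: BBox = LONDON_BOX) -> bool:
    """Return True if the point lies inside the box (edges inclusive)."""
    return box.south <= lat <= box.north and box.west <= lon <= box.east


# The map centre used in this notebook falls inside the box
print(in_bbox(51.5, -0.06))
# A point on the south coast (roughly Brighton) does not
print(in_bbox(50.82, -0.14))
```

A box like this is crude (it ignores administrative boundaries), but it keeps every place inside the rectangle regardless of how its locality fields are filled in.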

In [1]:
import folium
import polars as pl
from london_analysis import load_london_places, load_places_geojson
from folium.plugins import MarkerCluster

london_places = load_london_places().select(pl.exclude("geom", "bbox"))
geojson_data = load_places_geojson(london_places)

def make_markers(places: pl.DataFrame) -> folium.GeoJsonTooltip:
    """Build a hover tooltip listing every column of `places`."""
    return folium.GeoJsonTooltip(
        fields=list(places.columns),  # Display every column on hover
        aliases=[k.title().replace("Po_", "PO_").replace("_", " ") for k in places.columns],
    )

def make_map():
    m = folium.Map(location=[51.5, -0.06], zoom_start=12)
    # Add the GeoJSON layer to the map
    folium.GeoJson(
        geojson_data,
        name="London Places",
        tooltip=make_markers(london_places),
    ).add_to(m)
    return m

# m = make_map()
# m

## Filtering by category

The all-important feature in this dataset is the category labels. With them we can filter the dataset down to a more modest size (small enough for a human to view and for a browser with limited CPU power to render), and also gauge how complete the dataset actually is.

Note that the null case must be handled separately: in this dataset (or at least in how Polars loads it), a place with no category labels has `null` rather than an empty list.
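To make the distinction concrete, here is a plain-Python sketch of the matching rule, with `None` standing in for Polars' `null`. `matches_category` is a hypothetical helper and the sample rows are made up for illustration.

```python
from typing import Optional


def matches_category(labels: Optional[list[str]], category: str) -> bool:
    """Mirror the filter: the "None" option selects places with no labels at all."""
    if category == "None":
        # A missing label list is a null, not an empty list
        return labels is None
    # A null list cannot contain a category, so guard before the membership test
    return labels is not None and category in labels


# Made-up sample rows: one label, no labels (null), two labels
rows = [["Arts and Entertainment"], None, ["Retail", "Arts and Entertainment"]]
print([matches_category(r, "None") for r in rows])    # [False, True, False]
print([matches_category(r, "Retail") for r in rows])  # [False, False, True]
```

Checking for an empty list instead of `None` would find no unlabelled places under this representation.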

Selecting a category from the drop-down box updates the map to show only the places in that category. For reference, 80% of places have 1 category label, 11% have none, 6% have 2, and under 2% have 3.
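Percentages like these come from counting the labels on each place, with a null counted as zero labels. A minimal plain-Python sketch, assuming the label column has been pulled out as a list of lists (or `None`); `label_count_shares` is a hypothetical name and the sample is made up:

```python
from collections import Counter


def label_count_shares(label_lists):
    """Return {number_of_labels: share_of_places}, counting None as zero labels."""
    counts = Counter(0 if labels is None else len(labels) for labels in label_lists)
    total = sum(counts.values())
    return {n: counts[n] / total for n in sorted(counts)}


# Made-up sample: 2 unlabelled places, 6 with one label, 2 with two labels
sample = [None, None] + [["A"]] * 6 + [["A", "B"]] * 2
for n, share in label_count_shares(sample).items():
    print(f"{n} label(s): {share:.0%}")
```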

In [6]:
from ipywidgets import interact, Dropdown

# print(london_places.columns)

def update_map(category):
    m = folium.Map(location=[51.5, -0.06], zoom_start=12)
    # Filter data by category
    cat_col = pl.col("fsq_category_labels")
    if category == "None":
        mask = cat_col.is_null()
    else:
        mask = cat_col.list.contains(category)
    filtered_places = london_places.filter(mask)
    print(f"{len(filtered_places):,} places")
    filtered_geojson = load_places_geojson(filtered_places)
    folium.GeoJson(
        filtered_geojson,
        name="Filtered London Places",
        tooltip=make_markers(filtered_places),
    ).add_to(m)
    return m

# Nulls sort first, so after fill_null the "None" option sits at index 0
unique_categories = (
    london_places["fsq_category_labels"].explode().unique().sort().fill_null("None").to_list()
)
# Interactive dropdown for category selection
# (interact displays the map returned by update_map)
# interact(
#     update_map,
#     category=Dropdown(
#         options=unique_categories,
#         description="Category:",
#         value=unique_categories[1],
#     ),
# )

I've left that commented out for now as we can do even better: we can search the categories in a text box, to help find the ones we want. I do that here and make a small UI:

In [8]:
import folium
import polars as pl
from ipywidgets import interact, Dropdown, Text, VBox, HBox, Output
from IPython.display import display, HTML
from london_analysis import load_london_places, load_places_geojson

# Load and preprocess the data
london_places = load_london_places().select(pl.exclude("geom", "bbox"))

# Outputs for the map and the DataFrame
map_output = Output()
df_output = Output()

# Function to update both map and DataFrame
def update_map_and_df(category):
    # A search with no matches can leave the dropdown without a selection
    if category is None:
        return

    # Filter once, before either output block, so both share the result
    cat_col = pl.col("fsq_category_labels")
    if category == "None":
        mask = cat_col.is_null()
    else:
        mask = cat_col.list.contains(category)
    filtered_places = london_places.filter(mask)

    with map_output:
        map_output.clear_output()  # Clear previous map
        m = folium.Map(location=[51.5, -0.06], zoom_start=12)

        # Display filtered places count
        print(f"{len(filtered_places):,} places")

        filtered_geojson = load_places_geojson(filtered_places)
        folium.GeoJson(
            filtered_geojson,
            name="Filtered London Places",
            tooltip=make_markers(filtered_places),
        ).add_to(m)
        display(m)  # Display map in the output widget

    with df_output:
        df_output.clear_output()  # Clear previous DataFrame
        display(HTML(filtered_places.to_pandas().to_html(index=False)))

# Create a list of unique categories
unique_categories = london_places["fsq_category_labels"].explode().unique().sort().fill_null("None").to_list()

# Create a text input widget for searching categories
search_box = Text(
    placeholder="Type to search...",
    description="Search:",
    continuous_update=True
)

# Function to filter categories based on search input
def filter_categories(change):
    search_text = change["new"].lower()
    filtered_options = [cat for cat in unique_categories if search_text in cat.lower()]
    category_dropdown.options = filtered_options

# Create a dropdown widget to select categories
category_dropdown = Dropdown(
    options=unique_categories,
    description="Category:",
    value=unique_categories[1]  # Default value
)

# Attach the filter function to the search box input
search_box.observe(filter_categories, names="value")

# Interactive widget for map and DataFrame
interact(update_map_and_df, category=category_dropdown)

# Arrange widgets and outputs in a layout
ui = VBox([search_box, category_dropdown, VBox([map_output, df_output])])

# Display the UI
display(ui)


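The search box logic is a case-insensitive substring match over the category list. Pulled out as a plain function, it can be checked without any widgets. The name `search_categories` and the sample labels are hypothetical, and the `strip()` call is a small addition that the cell above does not make.

```python
def search_categories(categories: list[str], search_text: str) -> list[str]:
    """Return the categories containing search_text, ignoring case."""
    needle = search_text.strip().lower()
    # An empty search matches everything, since "" is a substring of every string
    return [cat for cat in categories if needle in cat.lower()]


# Made-up sample of category labels
sample = ["None", "Arts and Entertainment", "Dining and Drinking", "Retail"]
print(search_categories(sample, "AND"))  # ['Arts and Entertainment', 'Dining and Drinking']
print(search_categories(sample, "zzz"))  # []
```

The empty result is worth handling in the UI, since a dropdown with no options has nothing selected.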