In [None]:
%load_ext autoreload
%autoreload 2

<div class="main-title">
<h1>Market share analysis</h1>
<p>Based on fast food restaurants in Prague, Czechia.</p>
</div>

This example shows how to analyse local market share, where each Prague resident is assigned to the nearest restaurant.

The analysis uses demographic data from the Czech Statistical Office, with residential buildings and fast food restaurant positions downloaded from OpenStreetMap.

In [None]:
from srai.regionalizers import geocode_to_region_gdf

prague_area = geocode_to_region_gdf('Praha, CZ')
prague_area.explore(height=600)

## Load demographic data

In [None]:
import geopandas as gpd

cadastral_data = gpd.read_file('data/cadastral_data.geojson')
cadastral_data.explore(column='population', tiles="CartoDB positron", style_kwds=dict(opacity=0.25), height=600)

## Load residential buildings from OpenStreetMap

The data we need is defined by the [`building=residential`](https://wiki.openstreetmap.org/wiki/Tag:building%3Dresidential) tag in OSM.

Additionally, we want to use the `building:flats` tag to give each building a weight.

Later we will parse the number of flats per building into a number (OSM tag values are strings).
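
The parsing step can be sketched as below. This is only an illustration: `parse_flats` is a hypothetical helper written for this note, not the `map_flats` function imported from `utils`, and it assumes that any value that is not a plain non-negative integer should be treated as missing.

```python
from typing import Optional


def parse_flats(value: Optional[str]) -> Optional[int]:
    """Convert a raw `building:flats` tag value to an int, or None if unusable.

    Hypothetical helper for illustration; the notebook's own `map_flats`
    may apply different rules.
    """
    if value is None:
        return None
    text = value.strip()
    # Accept plain ASCII digits only; ranges or free text are rejected.
    if text.isascii() and text.isdigit():
        return int(text)
    return None


# Made-up tag values, showing the clean and the unusable cases.
raw_values = ["12", " 8 ", "approx. 20", "", None]
parsed = [parse_flats(v) for v in raw_values]
print(parsed)
```

Returning `None` rather than guessing keeps unusable values visible, so they can be dropped or filled explicitly before being used as weights.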

In [None]:
from srai.loaders import OSMOnlineLoader
from utils import map_flats

loader = OSMOnlineLoader()
buildings = loader.load(
    prague_area, {"building": "residential", "building:flats": True}
)
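# Keep only residential buildings with a known number of flats.
# .copy() avoids pandas chained-assignment warnings when columns are modified later.
buildings = buildings[
    (buildings["building"] == "residential") & (buildings["building:flats"].notna())
].copy()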

In [None]:
buildings["building:flats"] = buildings["building:flats"].apply(map_flats)
buildings.geometry = buildings.geometry.apply(lambda geometry: geometry.centroid)

buildings.head()

## Population interpolation

Using the cadastral data and the exact building positions, we will distribute the population of each cadastral area across its buildings, using the number of flats as a weight.
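
A minimal sketch of this kind of weighted allocation, in plain Python, is shown below. It assumes each building is already matched to the cadastral area that contains it (the notebook does this spatially), and the function name and toy numbers are hypothetical. It is not the implementation of `interpolate_spatial_data` from `utils`.

```python
from collections import defaultdict

# Hypothetical toy data: population per cadastral area, and buildings
# labelled with the area they fall in and their number of flats.
region_population = {"A": 1000, "B": 300}
toy_buildings = [
    {"id": "b1", "region": "A", "flats": 30},
    {"id": "b2", "region": "A", "flats": 10},
    {"id": "b3", "region": "B", "flats": 6},
]


def allocate_population(region_population, buildings):
    """Split each region's population across its buildings, proportionally to flats."""
    # Total number of flats per region (the denominator of each building's share).
    flats_per_region = defaultdict(int)
    for building in buildings:
        flats_per_region[building["region"]] += building["flats"]

    allocated = {}
    for building in buildings:
        total_flats = flats_per_region[building["region"]]
        share = building["flats"] / total_flats if total_flats else 0.0
        allocated[building["id"]] = region_population[building["region"]] * share
    return allocated


building_population = allocate_population(region_population, toy_buildings)
print(building_population)
```

Under this scheme the building populations in an area add up to that area's population, as long as the area contains at least one building with a non-zero number of flats.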

In [None]:
from utils import interpolate_spatial_data

interpolate_spatial_data(
    regions=cadastral_data,
    features=buildings,
    weight_column="building:flats",
    result_column="population",
)

buildings.head()

Plotting buildings with population

In [None]:
from utils import plot_population

plot_population(buildings)

## Loading data about fast food restaurants

These features are defined in OSM with the [`amenity=fast_food`](https://wiki.openstreetmap.org/wiki/Tag:amenity%3Dfast_food) tag.

From those, we will keep only `KFC` and `McDonald's` to simplify the analysis.

In [None]:
brands = ["KFC", "McDonald's"]

In [None]:
pois = loader.load(prague_area, {"amenity": "fast_food", "brand": True})
# Keep only fast food locations that belong to the selected brands.
pois = pois[
    (pois["amenity"] == "fast_food") & (pois["brand"].isin(brands))
].copy()
pois.head()

Alternative: cafes (uncomment to analyse cafe chains instead)

In [None]:
# brands = ["Starbucks", "Costa"]
# pois = loader.load(prague_area, {"amenity": "cafe", "brand": True})
# pois = pois[
#     (pois["amenity"] == "cafe") & (pois["brand"].isin(brands))
# ]
# pois.head()

Alternative: supermarkets (uncomment to analyse supermarket chains instead)

In [None]:
# brands = ["Albert", "Billa", "Lidl", "PENNY", "Kaufland", "Tesco"]
# pois = loader.load(prague_area, {"shop": "supermarket", "brand": True})
# pois = pois[
#     (pois["shop"] == "supermarket") & (pois["brand"].isin(brands))
# ]
# pois.head()

In [None]:
pois.geometry = pois.geometry.apply(lambda geometry: geometry.centroid)
pois.head()

In [None]:
pois.brand.value_counts()

## Segmenting the area

Using `VoronoiRegionalizer` from the `srai` library, we can divide the area into regions with a Voronoi diagram.

Here we will use the restaurants as seeds to segment Prague.

In [None]:
from srai.regionalizers import VoronoiRegionalizer

voronoi_regions = VoronoiRegionalizer(seeds=pois).transform(gdf=prague_area)
voronoi_regions.head()

In [None]:
from srai.plotting import plot_regions

plot_regions(voronoi_regions)

Now we can join buildings with population into those generated regions. This way, we can assign the closest restaurant to each building.
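
The reason this works is that a point lies in the Voronoi cell of a seed exactly when that seed is the nearest one (ties fall on cell boundaries). The sketch below shows that equivalence with a direct nearest-seed search on made-up planar coordinates. Treating coordinates as planar is a simplification for illustration, and `srai` may compute distances differently.

```python
import math

# Hypothetical planar coordinates for two restaurants (seeds) and three buildings.
seeds = {"kfc_1": (0.0, 0.0), "mcd_1": (10.0, 0.0)}
building_points = {"b1": (1.0, 1.0), "b2": (9.0, -2.0), "b3": (4.0, 5.0)}


def nearest_seed(point, seeds):
    """Return the name of the seed closest to `point` (Euclidean distance)."""
    return min(seeds, key=lambda name: math.dist(point, seeds[name]))


# Assigning each building to its nearest seed is equivalent to finding
# the Voronoi cell that contains the building.
assignment = {
    building_id: nearest_seed(point, seeds)
    for building_id, point in building_points.items()
}
print(assignment)
```

A brute-force search like this compares every building with every seed, so precomputing the regions once and using a spatial join is the more practical choice for a whole city.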

In [None]:
population_in_regions = (
    voronoi_regions.sjoin(buildings).groupby("region_id")["population"].sum()
)
regions_with_population = (
    voronoi_regions.join(pois[["brand"]]).join(population_in_regions).fillna(0)
)
regions_with_population.head()

Using a simple grouping operation, we can see how Prague's market is split between the two brands.
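
As a rough illustration of the arithmetic, the same aggregation can be written in plain Python: sum the population and count the locations per brand, then divide each brand's population by the overall total. The region values below are made up, and `market_share` is a hypothetical helper, not part of the notebook.

```python
from collections import defaultdict

# Hypothetical regions: one per restaurant, with the population assigned to it.
toy_regions = [
    {"brand": "KFC", "population": 1200.0},
    {"brand": "McDonald's", "population": 2500.0},
    {"brand": "KFC", "population": 300.0},
    {"brand": "McDonald's", "population": 0.0},
]


def market_share(regions):
    """Aggregate population and location count per brand, plus percentage share."""
    population = defaultdict(float)
    locations = defaultdict(int)
    for region in regions:
        population[region["brand"]] += region["population"]
        locations[region["brand"]] += 1

    total = sum(population.values())
    return {
        brand: {
            "population": population[brand],
            "locations": locations[brand],
            # Guard against an empty market to avoid division by zero.
            "percentage": 100 * population[brand] / total if total else 0.0,
        }
        for brand in population
    }


shares = market_share(toy_regions)
print(shares)
```

Regions with zero population still count as locations, which mirrors filling missing population values with 0 before grouping.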

In [None]:
brand_population = (
    regions_with_population.groupby("brand")
    .agg({"population": "sum", "geometry": "count"})
    .reset_index()
)
brand_population.rename(columns={"geometry": "locations"}, inplace=True)
brand_population["percentage"] = (
    100 * brand_population["population"] / brand_population["population"].sum()
)
brand_population

## Map plotting

To analyse this market further, we will plot the regions in two distinct gradients, each based on a brand's color.

In [None]:
from utils import plot_market_share

plot_market_share(regions_with_population, pois)