## Load and Explore Global Streetscapes data

This notebook explores a subset of the [Global Streetscapes](https://ual.sg/project/global-streetscapes/) dataset saved in Parquet format. The approach to plotting and using the data is adapted from the [notebooks](https://github.com/ualsg/global-streetscapes/blob/main/notebooks/visualise_dataset.ipynb) provided by the [Urban Analytics Lab (UAL)](https://ual.sg/).

### Load Parquet dataset

In [1]:
# --------------------------------------
import osmnx as ox

# --------------------------------------
import geopandas as gp

# --------------------------------------
import matplotlib.pyplot as plt

# --------------------------------------
import pandas as pd

# --------------------------------------
from streetscapes import conf

In [2]:
df = pd.read_parquet(conf.OUTPUT_DIR / "streetscapes-data.parquet")
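
Before subsetting, it can help to confirm that the loaded table has the columns the later cells rely on. The following is a minimal sketch: `missing_columns` is a hypothetical helper (not part of `streetscapes`), the column list is taken from the cells in this notebook, and the toy frame stands in for the real data.

```python
import pandas as pd

# Columns used by the later cells of this notebook.
REQUIRED = [
    "city", "lat", "lon", "lighting_condition", "view_direction",
    "Building", "Wall", "Total",
]


def missing_columns(frame: pd.DataFrame, required=REQUIRED) -> list:
    """Return the required column names that are absent from `frame`."""
    return [name for name in required if name not in frame.columns]


# Toy frame standing in for the real dataset (values are made up).
toy = pd.DataFrame({"city": ["Amsterdam"], "lat": [52.37], "lon": [4.90]})
print(missing_columns(toy))
```

On the real data, `missing_columns(df)` should return an empty list if every column used below is present.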

### Subset dataset

First, subset the data to a city of interest. This example uses Amsterdam, but the same workflow applies to any city in the dataset. Then create a GeoDataFrame from the longitude and latitude columns.

In [3]:
city = df[df["city"] == "Amsterdam"]

In [4]:
city_pts = gp.GeoDataFrame(
    city, geometry=gp.points_from_xy(city.lon, city.lat), crs="EPSG:4326"
)

We can now subset the data further for images that were taken during the day and that have a viewing direction from the side.

In [5]:
city_day = city_pts[city_pts["lighting_condition"] == "day"]
city_side = city_day[city_day["view_direction"] == "side"]
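
The two filters can also be combined into a single boolean mask. This is a small sketch on a toy frame; the category values are illustrative and may not match every value in the real columns.

```python
import pandas as pd

# Toy frame with the two columns used for filtering (values are illustrative).
toy = pd.DataFrame({
    "lighting_condition": ["day", "night", "day", "day"],
    "view_direction": ["side", "side", "front/back", "side"],
})

# Combine both conditions with `&`; each comparison needs its own parentheses.
mask = (toy["lighting_condition"] == "day") & (toy["view_direction"] == "side")
toy_side = toy[mask]
print(toy_side.index.tolist())
```

Keeping the intermediate `city_day` frame, as the cell above does, is still useful here because the next step filters the daytime images separately.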

We can take this further and calculate the fraction of pixels in each image that belong to an object class of interest. In this example, we keep daytime images (`city_day`) in which more than 25% of the pixels are building, or more than 25% are wall.

In [None]:
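# Work on an explicit copy so the new columns are not written to a slice of city_pts
city_day = city_day.copy()
# Column-wise division is equivalent to the row-wise apply, but faster
city_day['building_fraction'] = city_day['Building'] / city_day['Total']
city_day['wall_fraction'] = city_day['Wall'] / city_day['Total']
city_walls = city_day[(city_day["wall_fraction"] > 0.25) | (city_day["building_fraction"] > 0.25)]
city_walls.head(5)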
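
To make the threshold concrete, here is a worked example on made-up pixel counts. It assumes the same column names (`Building`, `Wall`, `Total`) as the cell above; the numbers are not taken from the dataset.

```python
import pandas as pd

# Toy pixel counts per image (made-up numbers, not from the dataset).
toy = pd.DataFrame({
    "Building": [30, 25, 5],
    "Wall": [0, 10, 40],
    "Total": [100, 100, 100],
})

# Column-wise division gives the per-image fraction of pixels in each class.
toy["building_fraction"] = toy["Building"] / toy["Total"]
toy["wall_fraction"] = toy["Wall"] / toy["Total"]

# Keep rows where either class covers strictly more than 25% of the image.
kept = toy[(toy["building_fraction"] > 0.25) | (toy["wall_fraction"] > 0.25)]
print(kept.index.tolist())
```

The middle row is dropped: its building fraction is exactly 0.25, which does not pass the strict `>` comparison, and its wall fraction is only 0.10.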

### Create plot

Now that we have subset the data, we can create a plot to visualise where these images are. 

In [None]:
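# Download the street network inside the points' bounding box to use as the figure background.
# total_bounds is ordered (minx, miny, maxx, maxy), i.e. (west, south, east, north).
# Note: the positional (north, south, east, west) call matches older OSMnx releases; newer releases changed this signature.
west, south, east, north = city_pts.total_bounds
G = ox.graph_from_bbox(north, south, east, west, network_type='all_private', simplify=True, retain_all=True, truncate_by_edge=False)
G2 = ox.get_undirected(G)
df_G = ox.graph_to_gdfs(G2)
df_lines = df_G[1].copy()  # edges GeoDataFrame (df_G[0] holds the nodes)
df_lines_proj = ox.project_gdf(df_lines).reset_index().reset_index()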
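
The index juggling on `total_bounds` is easy to get wrong, so here is a small sketch of the mapping. GeoPandas returns the bounds as `(minx, miny, maxx, maxy)`; `bounds_to_nsew` is a hypothetical helper and the coordinates are made up.

```python
def bounds_to_nsew(bounds):
    """Reorder (minx, miny, maxx, maxy) into (north, south, east, west)."""
    west, south, east, north = bounds
    return north, south, east, west


# Made-up bounding box roughly around Amsterdam, as (minx, miny, maxx, maxy).
toy_bounds = (4.73, 52.28, 5.07, 52.43)
print(bounds_to_nsew(toy_bounds))
```

This matches the indices `[3], [1], [2], [0]` used in the cell above.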

In [8]:
# Get points for all dataframes for plotting
all_pts = ox.project_gdf(city_pts)
side_pts = ox.project_gdf(city_side)
wall_pts = ox.project_gdf(city_walls)
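
Projecting moves the points from longitude/latitude into a metric coordinate system so they line up with the projected street network. The sketch below approximates this step with plain GeoPandas, assuming the projection target is the UTM zone estimated from the data; it is not a drop-in replacement for `ox.project_gdf`, and the point is made up.

```python
import geopandas as gp

# One made-up point near Amsterdam, in longitude/latitude (EPSG:4326).
toy_pts = gp.GeoDataFrame(
    geometry=gp.points_from_xy([4.90], [52.37]), crs="EPSG:4326"
)

# Estimate a UTM zone from the data's extent and reproject to it (units: metres).
utm_crs = toy_pts.estimate_utm_crs()
toy_proj = toy_pts.to_crs(utm_crs)
print(toy_proj.crs.to_epsg())
```

All layers drawn on the same axis need to share one CRS, which is why the street lines and the three point sets are each projected before plotting.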

In [None]:
# create figure
fig, ax = plt.subplots(figsize=(8,8))
ax.set_facecolor('#EEEEEE')

# plot the projected street network as the background on ax
df_lines_proj.plot(ax=ax, color='white', zorder=5)

# plot all points
all_pts.plot(ax=ax, marker='o', color='#998cc0', markersize=2, alpha=0.5, zorder=10)

# plot side view points
side_pts.plot(ax=ax, marker='o', color='red', markersize=2, alpha=0.7, zorder=15)

# plot points with walls and buildings
wall_pts.plot(ax=ax, marker='o', color='blue', markersize=2, alpha=0.7, zorder=15)