### Social Availability Index

The **Social Availability Index** shows how many hours of the day each district stays active.
For every district, hourly activity (unique users per hour) is normalized by that district's busiest hour, and the index counts the hours whose activity exceeds 30% of that peak; the counts are then rescaled to 0 to 100 across districts.
High scores represent lively, all-day neighborhoods, while low scores indicate areas that “sleep” outside peak hours.
In this way the index captures the temporal rhythm of social life across the city.
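
As a small illustration of the idea, the sketch below runs the same counting logic on invented data. The function name `social_availability` and the toy frame are assumptions made for this example only; the column names mirror those used in the notebook.

```python
import pandas as pd

def social_availability(events: pd.DataFrame, threshold: float = 0.3) -> pd.DataFrame:
    """Count, per district, the hours whose unique-user count exceeds
    `threshold` times that district's busiest hour."""
    hourly = (
        events.groupby(["district", "hour"])["user_id"]
              .nunique()
              .rename("unique_users")
              .reset_index()
    )
    # Peak hourly activity of each district, aligned row by row with `hourly`.
    peak = hourly.groupby("district")["unique_users"].transform("max")
    hourly["is_active"] = hourly["unique_users"] / peak > threshold
    return hourly.groupby("district")["is_active"].sum().reset_index(name="active_hours")

# Toy data (invented): district A is spread over three hours,
# district B is concentrated in a single hour.
toy = pd.DataFrame({
    "district": ["A", "A", "A", "A", "B", "B", "B", "B", "B"],
    "hour":     [8, 8, 12, 20, 9, 9, 9, 9, 15],
    "user_id":  [1, 2, 3, 4, 1, 2, 3, 4, 1],
})
toy_hours = social_availability(toy)
print(toy_hours)
```

District A ends up with three active hours and district B with one, because B's 15:00 activity is only 25% of its 09:00 peak.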

In [1]:
import pandas as pd
import numpy as np

In [2]:
df = pd.read_csv("./data/hackplay_warszawa_with_districts.csv")

In [4]:
df["start_dttm"] = pd.to_datetime(df["start_dttm"], errors="coerce")
df["hour"] = df["start_dttm"].dt.hour
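
With `errors="coerce"`, timestamps that cannot be parsed become `NaT`, their `hour` becomes `NaN`, and the later `groupby` drops those rows without warning. A minimal sketch of how one might check how many rows are affected, using an invented three-row sample rather than the real file:

```python
import pandas as pd

# Invented sample: the second timestamp is deliberately malformed.
sample = pd.DataFrame({
    "start_dttm": ["2024-05-01 08:15:00", "not a timestamp", "2024-05-01 23:59:00"],
    "user_id": [1, 2, 3],
})
sample["start_dttm"] = pd.to_datetime(sample["start_dttm"], errors="coerce")
sample["hour"] = sample["start_dttm"].dt.hour  # float column because of the NaN

# Number of rows that failed to parse and would be excluded from the index.
n_unparsed = int(sample["start_dttm"].isna().sum())

# Optionally drop them explicitly so `hour` can be stored as an integer.
valid = sample.dropna(subset=["start_dttm"]).astype({"hour": "int64"})
print(n_unparsed, valid["hour"].tolist())
```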

In [5]:
hourly = (
    df.groupby(["district", "hour"], as_index=False)["user_id"]
      .nunique()
      .rename(columns={"user_id": "unique_users"})
)

In [7]:
hourly["norm_activity"] = (
    hourly.groupby("district")["unique_users"]
          .transform(lambda x: x / x.max() if x.max() > 0 else 0)
)

In [8]:
threshold = 0.3  # an hour counts as "active" when it exceeds 30% of the district's peak
active_hours = (
    hourly.groupby("district")["norm_activity"]
          .apply(lambda x: (x > threshold).sum())
          .reset_index(name="active_hours")
)
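
The 0.3 cutoff is a judgment call, so it can be worth checking how the hour counts move when it changes. The sketch below does this on an invented normalized profile; `demo`, `count_active_hours`, and the alternative thresholds 0.2 and 0.5 are assumptions for illustration, not values taken from the analysis.

```python
import pandas as pd

# Invented normalized hourly profiles for two districts.
demo = pd.DataFrame({
    "district": ["A"] * 4 + ["B"] * 4,
    "hour": [8, 12, 16, 20] * 2,
    "norm_activity": [1.0, 0.6, 0.45, 0.35, 1.0, 0.25, 0.15, 0.05],
})

def count_active_hours(frame: pd.DataFrame, threshold: float) -> pd.Series:
    """Number of hours per district with normalized activity above `threshold`."""
    return frame["norm_activity"].gt(threshold).groupby(frame["district"]).sum()

# One column per candidate threshold, one row per district.
sensitivity = pd.DataFrame(
    {f"thr_{t}": count_active_hours(demo, t) for t in (0.2, 0.3, 0.5)}
)
print(sensitivity)
```

If the ranking of districts stays stable across nearby thresholds, the exact choice of cutoff matters less.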

In [9]:
vals = active_hours["active_hours"]
span = vals.max() - vals.min()  # 0 when all districts tie, which would otherwise give 0/0
active_hours["social_availability_score"] = ((vals - vals.min()) / span * 100).round(1) if span > 0 else 0.0

In [12]:
active_hours = active_hours.sort_values("social_availability_score", ascending=False)
active_hours.to_csv("./data/warsaw_social_availability.csv", index=False)
active_hours.head()  # last expression in the cell, so the notebook displays it