### District Rhythm Index

The **District Rhythm Index** captures the daily heartbeat of each Warsaw district.
It is derived from hourly counts of unique anonymized users in each district, which show how activity rises and falls over the day.
Each district's hourly profile is min-max scaled to a 0 to 100 range, and the score is the equal-weight average of that profile's amplitude and its mean level.
The index also records each district's peak hour, showing when that part of the city is busiest.

In [1]:
import pandas as pd
import numpy as np

In [3]:
df = pd.read_csv("./data/hackplay_warszawa_with_districts.csv")

In [4]:
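# Parse timestamps; unparseable values become NaT and are dropped.
df["start_dttm"] = pd.to_datetime(df["start_dttm"], errors="coerce")
df = df.dropna(subset=["start_dttm"]).copy()
# Hour of day (0-23) in which each record starts.
df["hour"] = df["start_dttm"].dt.hour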
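
Because `errors="coerce"` silently turns bad timestamps into `NaT`, it can be worth counting how many rows are lost before dropping them. The following is a minimal sketch on made-up rows (the `sample` frame and its values are hypothetical, not taken from the dataset):

```python
import pandas as pd

# Hypothetical sample rows; the real file's contents are not shown here.
sample = pd.DataFrame({
    "user_id": [1, 2, 3],
    "start_dttm": ["2024-05-01 08:15:00", "not a date", "2024-05-01 23:59:59"],
})

# Unparseable strings become NaT instead of raising an error.
sample["start_dttm"] = pd.to_datetime(sample["start_dttm"], errors="coerce")

# Count the rows that failed to parse before discarding them.
dropped = int(sample["start_dttm"].isna().sum())

# .copy() avoids chained-assignment warnings when adding the new column.
sample = sample.dropna(subset=["start_dttm"]).copy()
sample["hour"] = sample["start_dttm"].dt.hour

print(f"dropped {dropped} row(s); hours kept: {sample['hour'].tolist()}")
```

A large `dropped` count would suggest a format problem in the source column rather than a handful of bad records.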

In [5]:
hourly = (
    df.groupby(["district", "hour"], as_index=False)["user_id"]
      .nunique()
      .rename(columns={"user_id": "unique_users"})
)
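
Two properties of this aggregation are easy to miss: a user with several records in the same hour is counted once, and hours with no recorded users do not appear in the result at all. The sketch below illustrates both on hypothetical events and, as one possible refinement, fills the missing district-hour pairs with zeros so that every district has a full 24-hour profile. The names `events`, `counts`, `full_grid`, and `hourly_demo` are illustrative only.

```python
import pandas as pd

# Hypothetical events: user "a" appears twice in Mokotów at 08:00.
events = pd.DataFrame({
    "district": ["Mokotów", "Mokotów", "Mokotów", "Wola"],
    "hour": [8, 8, 8, 9],
    "user_id": ["a", "a", "b", "a"],
})

# Unique users per (district, hour); duplicates within an hour count once.
counts = events.groupby(["district", "hour"])["user_id"].nunique()

# Build the full district x 24-hour grid and fill absent hours with 0.
full_grid = pd.MultiIndex.from_product(
    [counts.index.get_level_values("district").unique(), range(24)],
    names=["district", "hour"],
)
hourly_demo = (
    counts.reindex(full_grid, fill_value=0)
    .rename("unique_users")
    .reset_index()
)

print(hourly_demo[hourly_demo["unique_users"] > 0])
```

Whether empty hours should count as zero activity or be left out is a modelling choice; filling them lowers the per-district minimum and mean, so it changes any score built on those values.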


In [6]:
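# Min-max scale each district's hourly counts to 0-100.
# The 1e-9 term avoids division by zero when a district's counts are all equal.
hourly["activity_norm"] = (
    hourly.groupby("district")["unique_users"]
    .transform(lambda x: (x - x.min()) / (x.max() - x.min() + 1e-9) * 100)
)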
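
To make the effect of the per-district scaling concrete, here is a small worked example on invented counts (the districts "A" and "B" and their values are hypothetical). It shows that the scaling removes differences in absolute volume: each district's quietest hour maps to 0 and its busiest hour to roughly 100.

```python
import pandas as pd

# Hypothetical counts: district B is ten times busier than district A.
hourly_demo = pd.DataFrame({
    "district": ["A"] * 3 + ["B"] * 3,
    "hour": [8, 12, 18] * 2,
    "unique_users": [10, 20, 30, 100, 100, 400],
})


def min_max_scale(x: pd.Series) -> pd.Series:
    """Scale a series to 0-100; the epsilon guards against a zero range."""
    return (x - x.min()) / (x.max() - x.min() + 1e-9) * 100


hourly_demo["activity_norm"] = (
    hourly_demo.groupby("district")["unique_users"].transform(min_max_scale)
)

print(hourly_demo.round(1))
```

One consequence is that the range of `activity_norm` within any district with a non-flat profile is always close to 100, so only the shape of the profile, not its size, carries over to later steps.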

In [7]:
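rhythm = (
    hourly.groupby("district")
    .agg(
        # Look up the hour label of the busiest row; taking the row index
        # modulo 24 is only correct when every district has all 24 hours.
        peak_hour=("activity_norm", lambda x: hourly.loc[x.idxmax(), "hour"]),
        # After per-district min-max scaling this is ~100 unless the profile is flat.
        activity_amplitude=("activity_norm", lambda x: x.max() - x.min()),
        avg_activity=("activity_norm", "mean"),
    )
    .reset_index()
)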
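
Since `activity_norm` is min-max scaled within each district, its maximum minus its minimum is close to 100 for every district with a non-flat profile, so `activity_amplitude` does little to separate districts. One possible alternative, sketched here as an assumption rather than as the notebook's method, is to measure amplitude on the raw counts relative to the district's own peak, and to read the peak hour from the `hour` column by label. The `relative_amplitude` name and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical hourly counts; district B lacks most hours on purpose.
hourly_demo = pd.DataFrame({
    "district": ["A", "A", "A", "B", "B"],
    "hour": [8, 12, 18, 7, 20],
    "unique_users": [10, 20, 30, 90, 100],
})

# Peak hour: take the row with the most users per district, then read its hour.
peak_rows = hourly_demo.loc[
    hourly_demo.groupby("district")["unique_users"].idxmax()
]
peak_hour = peak_rows.set_index("district")["hour"]

# Relative amplitude: (max - min) as a percentage of the district's own peak.
stats = hourly_demo.groupby("district")["unique_users"].agg(["min", "max"])
relative_amplitude = (stats["max"] - stats["min"]) / stats["max"] * 100

rhythm_demo = (
    pd.DataFrame({"peak_hour": peak_hour, "relative_amplitude": relative_amplitude})
    .rename_axis("district")
    .reset_index()
)

print(rhythm_demo.round(1))
```

In this sketch district A swings by two thirds of its peak while district B barely moves, a difference that min-max scaled amplitude would hide.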

In [8]:
rhythm["rhythm_score"] = (
    0.5 * rhythm["activity_amplitude"] + 0.5 * rhythm["avg_activity"]
).round(1)
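
The score is an equal-weight average of the two components. As a hedged illustration, the helper below (a hypothetical `rhythm_score` function with an assumed `w_amplitude` parameter) makes the weighting explicit and shows how the result behaves when the amplitude is the same for every district.

```python
import pandas as pd

# Hypothetical per-district summary values.
rhythm_demo = pd.DataFrame({
    "district": ["A", "B"],
    "activity_amplitude": [100.0, 100.0],
    "avg_activity": [50.0, 33.0],
})


def rhythm_score(frame: pd.DataFrame, w_amplitude: float = 0.5) -> pd.Series:
    """Weighted average of amplitude and mean activity, rounded to 1 decimal."""
    w_avg = 1.0 - w_amplitude
    score = (
        w_amplitude * frame["activity_amplitude"]
        + w_avg * frame["avg_activity"]
    )
    return score.round(1)


rhythm_demo["rhythm_score"] = rhythm_score(rhythm_demo)
print(rhythm_demo)
```

With identical amplitudes, the ranking is decided entirely by `avg_activity`, which is worth keeping in mind when interpreting the final ordering.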

In [11]:
rhythm = rhythm.sort_values("rhythm_score", ascending=False)
rhythm.to_csv("./data/warsaw_district_rhythm.csv", index=False)