In [14]:
import pandas as pd
import geopandas
from sklearn.preprocessing import StandardScaler

# Estimating Activity based on Mobility Traces

By aggregating mobility data, it is possible to estimate the number of devices (and, by proxy, people) detected within each tile during each time period. Note, however, that the mobility panel covers only a subset of the people in an area, so these counts do not represent total population density.

<iframe width="100%" height="500px" src="https://studio.foursquare.com/public/55af1cba-9659-4f10-811b-f7f08dfe2ed8/embed" frameborder="0" allowfullscreen></iframe>

## Data

In this step, we import the clipping boundary defined by the **area(s) of interest** below.

In [2]:
AOI = geopandas.read_file("../../data/interim/tessellation/SYRTUR_tessellation.gpkg")

And the *activity* generated by the filtered-down panel of data points located within the **area of interest**.

In [3]:
ACTIVITY = pd.read_csv("../../data/interim/SYRTUR_activity_index.csv")

In [4]:
ACTIVITY["date"] = pd.to_datetime(ACTIVITY["date"])
ACTIVITY["weekday"] = ACTIVITY["date"].dt.weekday

## Methodology

| Feature      | Description |
| ----------- | ----------- |
| Population sample      | Counts the number of devices that were captured in the Veraset Movement panel.        |
| Spatial aggregation   | H3 level 6         |
| Temporal aggregation   | Daily UTC        |

### Calculate `BASELINE`

In this experiment, we choose the baseline to be the 4-week period spanning January 2, 2023 to January 29, 2023.

In [5]:
BASELINE = ACTIVITY[ACTIVITY["date"].between("2023-01-02", "2023-01-29")]
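
As a quick sanity check on the baseline window, the sketch below confirms that the stated period covers exactly four occurrences of each weekday. It rebuilds the date range from the two dates given above rather than reading the data; the names `window` and `days_per_weekday` are illustrative and not part of the notebook.

```python
import pandas as pd

# The baseline window quoted in the text; both endpoints are inclusive.
window = pd.date_range("2023-01-02", "2023-01-29", freq="D")

# pandas numbers weekdays from Monday=0 to Sunday=6.
days_per_weekday = pd.Series(window.weekday).value_counts().sort_index()

print(len(window))
print(days_per_weekday.to_dict())
```

Because the window starts on a Monday and spans 28 days, every weekday contributes the same number of observations to the baseline.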

Creating a `StandardScaler` for each `hex_id`, fitted on that tile's counts over the baseline period,

In [6]:
scalers = {}

for hex_id in BASELINE["hex_id"].unique():
    scaler = StandardScaler()
    scaler.fit(BASELINE[BASELINE["hex_id"] == hex_id][["count"]])

    scalers[hex_id] = scaler
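
For intuition, a fitted `StandardScaler` standardizes a value by subtracting the mean and dividing by the population standard deviation (`ddof=0`) of the data it was fitted on. The sketch below reproduces that arithmetic by hand for a single tile; the counts are hypothetical and `z_score` is an illustrative helper, not a function from the notebook.

```python
import numpy as np

# Hypothetical baseline counts for one tile (not taken from the data).
baseline_counts = np.array([40.0, 50.0, 45.0, 55.0, 60.0])

mean = baseline_counts.mean()
std = baseline_counts.std(ddof=0)  # population standard deviation


def z_score(count):
    """Number of baseline standard deviations between `count` and the baseline mean."""
    return (count - mean) / std


# Standardizing the baseline itself gives mean 0 and unit variance.
baseline_z = z_score(baseline_counts)
print(z_score(50.0), z_score(60.0))
```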

In [7]:
BASELINE = BASELINE.groupby(["hex_id", "weekday"]).agg({"count": ["mean", "std"]})
BASELINE.columns = BASELINE.columns.map(".".join)
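
The aggregation above yields two-level column labels such as `("count", "mean")`, which the `".".join` mapping flattens into `count.mean` and `count.std`. A minimal sketch on a hypothetical toy frame (the `toy` and `stats` names are illustrative):

```python
import pandas as pd

# Hypothetical toy data: one tile, two observations for each of two weekdays.
toy = pd.DataFrame(
    {
        "hex_id": ["a", "a", "a", "a"],
        "weekday": [0, 0, 1, 1],
        "count": [10, 14, 20, 30],
    }
)

stats = toy.groupby(["hex_id", "weekday"]).agg({"count": ["mean", "std"]})

# Join each (column, statistic) pair into a single label.
stats.columns = stats.columns.map(".".join)

print(stats)
```

Note that the pandas `std` aggregation defaults to the sample standard deviation (`ddof=1`). With a four-week baseline, each `(hex_id, weekday)` group holds at most four observations, so `count.std` is a rough estimate.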

Taking a sneak peek, 

In [8]:
BASELINE[BASELINE.index.get_level_values("hex_id").isin(["862da898fffffff"])]

hex_id,weekday,count.mean,count.std
862da898fffffff,0,5819.75,2285.557901
862da898fffffff,1,6675.25,1918.023527
862da898fffffff,2,7020.0,2137.928281
862da898fffffff,3,6586.0,2345.257484
862da898fffffff,4,5671.5,2838.52949
862da898fffffff,5,6300.0,2516.413718
862da898fffffff,6,6891.75,2462.698029


### Calculate `Z-Score`

Joining with `AOI`,

In [9]:
ACTIVITY = ACTIVITY.merge(AOI, how="left", on="hex_id").drop(["geometry"], axis=1)

Joining with `BASELINE`,

In [10]:
ACTIVITY = pd.merge(ACTIVITY, BASELINE, on=["hex_id", "weekday"], how="left")
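
A left join silently leaves `NaN` in the baseline columns for any tile and weekday pair that has no baseline record. One way to surface such rows, sketched here on hypothetical toy frames, is to pass the `validate` and `indicator` options of `merge`. Unlike the notebook's `BASELINE`, which keeps `hex_id` and `weekday` in its index, the toy `baseline` frame holds them as ordinary columns.

```python
import pandas as pd

# Hypothetical toy frames; tile "b" has no baseline record.
activity = pd.DataFrame(
    {"hex_id": ["a", "a", "b"], "weekday": [0, 1, 0], "count": [12, 25, 7]}
)
baseline = pd.DataFrame(
    {"hex_id": ["a", "a"], "weekday": [0, 1], "count.mean": [12.0, 25.0]}
)

merged = activity.merge(
    baseline,
    on=["hex_id", "weekday"],
    how="left",
    validate="many_to_one",  # raise if a (hex_id, weekday) key repeats in baseline
    indicator=True,          # add a "_merge" column recording where each row matched
)

# Rows present only in the activity frame have no baseline to compare against.
unmatched = merged[merged["_merge"] == "left_only"]
print(unmatched)
```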

Preparing columns, 

In [11]:
ACTIVITY["n_baseline"] = ACTIVITY["count.mean"]
ACTIVITY["n_difference"] = ACTIVITY["count"] - ACTIVITY["n_baseline"]

ACTIVITY["activity"] = ACTIVITY["log_count"]
ACTIVITY["percent_change"] = 100 * (ACTIVITY["count"] / (ACTIVITY["n_baseline"]) - 1)
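
As a worked example of these formulas, take the first row of the preview output further below (tile `862c3056fffffff` on 2023-01-01), which has a count of 54 against a weekday baseline of 47.25:

```python
# Values copied from the first row of the preview table.
count = 54
n_baseline = 47.25

n_difference = count - n_baseline
percent_change = 100 * (count / n_baseline - 1)

print(n_difference)              # 6.75
print(round(percent_change, 6))  # 14.285714
```

Both results match the `n_difference` and `percent_change` columns of that row. When a tile has a baseline of zero, the same division in pandas produces `inf` or `NaN` rather than an error, so such rows may need filtering before the values are interpreted.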

In [12]:
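for hex_id, scaler in scalers.items():
    predicate = ACTIVITY["hex_id"] == hex_id
    try:
        # Standardize the tile's counts against its own baseline mean and standard deviation
        activity = scaler.transform(ACTIVITY.loc[predicate, ["count"]])
        ACTIVITY.loc[predicate, "z_score"] = activity.ravel()
    except ValueError:
        # scikit-learn raises ValueError for input it cannot transform; leave z_score as NaN
        pass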
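
The per-tile loop can also be written without scikit-learn. The sketch below is an assumed equivalent on hypothetical toy data, not the notebook's own method: it computes each tile's baseline mean and population standard deviation with `groupby` and maps them back onto every row. Replacing a zero standard deviation with 1 mirrors how `StandardScaler` treats constant features.

```python
import pandas as pd

# Hypothetical toy data: tile "c" has no observation inside the baseline window.
activity = pd.DataFrame(
    {
        "hex_id": ["a", "a", "a", "a", "b", "b", "c"],
        "date": pd.to_datetime(
            ["2023-01-02", "2023-01-03", "2023-01-04", "2023-02-10",
             "2023-01-02", "2023-02-10", "2023-02-10"]
        ),
        "count": [10.0, 20.0, 30.0, 40.0, 5.0, 5.0, 8.0],
    }
)

in_baseline = activity["date"].between("2023-01-02", "2023-01-29")
grouped = activity[in_baseline].groupby("hex_id")["count"]

mean = grouped.mean()
std = grouped.std(ddof=0).replace(0, 1.0)  # population std; avoid dividing by zero

# Tiles absent from the baseline map to NaN, so their z_score stays NaN.
activity["z_score"] = (
    activity["count"] - activity["hex_id"].map(mean)
) / activity["hex_id"].map(std)

print(activity)
```

This avoids one pass over the full table per tile, at the cost of re-implementing the scaling arithmetic by hand.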

Taking a sneak peek, 

In [13]:
ACTIVITY[ACTIVITY["n_baseline"] > 10]

,hex_id,date,count,scaled_count,log_count,weekday,ADM0_PCODE,ADM1_PCODE,ADM2_PCODE,count.mean,count.std,n_baseline,n_difference,activity,percent_change,z_score
4,862c3056fffffff,2023-01-01,54,9.369189,0.971702,6,TR,TUR021,TUR021015,47.25,11.026483,47.25,6.75,0.971702,14.285714,0.956822
5,862c3056fffffff,2023-01-02,47,8.263825,0.917181,0,TR,TUR021,TUR021015,45.00,4.082483,45.00,2.00,0.917181,4.444444,0.203661
24,862c34477ffffff,2023-01-02,20,4.000275,0.602090,0,TR,TUR021,TUR021008,31.50,18.806027,31.50,-11.50,0.602090,-36.507937,-0.957021
25,862c34637ffffff,2023-01-01,6,1.789546,0.252743,6,TR,TUR021,TUR021006,11.50,8.582929,11.50,-5.50,0.252743,-47.826087,-0.716242
26,862c34897ffffff,2023-01-01,1470,232.968657,2.367297,6,TR,TUR023,TUR023005,977.75,305.672346,977.75,492.25,2.367297,50.345180,1.022632
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
58676,862dae807ffffff,2023-03-09,51,8.895461,0.949168,3,TR,TUR031,TUR031013,227.00,40.603777,227.00,-176.00,0.949168,-77.533040,-3.743512
58677,862dae807ffffff,2023-03-10,140,22.949383,1.360771,4,TR,TUR031,TUR031013,240.75,34.654245,240.75,-100.75,1.360771,-41.848390,-1.944663
58678,862dae807ffffff,2023-03-11,218,35.266303,1.547360,5,TR,TUR031,TUR031013,234.25,87.914257,234.25,-16.25,1.547360,-6.937033,-0.368143
58679,862dae807ffffff,2023-03-12,209,33.845120,1.529496,6,TR,TUR031,TUR031013,256.25,68.080222,256.25,-47.25,1.529496,-18.439024,-0.550049
