# 01 — Build features (MVP)

This notebook populates `data/seed/features_template_v0_0_2.csv` for the 75 towns.

**What it does:**
- Loads `places_seed_v0_0_1.csv` and `features_template_v0_0_2.csv`
- Leaves a slot for you to join in IMD, crime/ASB, charities and grants, and the micro‑survey baseline
- Saves the result back to `features_template_v0_0_2.csv`, overwriting the file in place

> Keep the columns exactly as in the template for the MVP scripts.
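
One way to guard against column drift before saving is to compare the edited frame with the template's column list. The sketch below is illustrative only: `check_template_columns` is a hypothetical helper, and `place_id` and `notes` are made-up column names used for the toy data, not names taken from the template.

```python
import pandas as pd

def check_template_columns(df, template_columns):
    """Report columns that are missing, unexpected, or out of order."""
    actual = list(df.columns)
    expected = list(template_columns)
    return {
        "missing": [c for c in expected if c not in actual],
        "extra": [c for c in actual if c not in expected],
        "same_order": actual == expected,
    }

# Toy example: 'place_id' and 'notes' are hypothetical column names.
template_cols = ["place_id", "imd2019_decile", "crime_rate_per_1k"]
edited = pd.DataFrame(
    {"place_id": [1], "crime_rate_per_1k": [12.3], "notes": ["x"]}
)
report = check_template_columns(edited, template_cols)
print(report)
```

In the notebook itself, the expected list could be captured as `list(features.columns)` right after loading the template, then checked again just before `to_csv`.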


In [None]:
from pathlib import Path
import pandas as pd
import numpy as np

REPO = Path('..').resolve()
SEED = REPO / 'data' / 'seed'

places_path = SEED / 'places_seed_v0_0_1.csv'
features_path = SEED / 'features_template_v0_0_2.csv'
places = pd.read_csv(places_path)
features = pd.read_csv(features_path)

print('places:', places.shape, places_path)
print('features:', features.shape, features_path)
display(places.head(3))
display(features.head(3))

## TODO: Attach open data

Fill these fields from your sources, keeping the template's rows and columns unchanged:

- `imd2019_decile` → join from IMD (LSOA level)
- `crime_rate_per_1k`, `asb_trend_12m` → aggregate Police.uk (rolling 12‑month rate plus trend slope)
- `charities_active_count`, `recent_grants_24m_gbp`, `grant_funding_3y_gbp` → Charity Commission + 360Giving
- `belonging_baseline_value` → micro‑survey (CLS wording) or neutral prior
- Flags: `lighting_presence_flag`, `grime_hotspot_flag`, `youth_facility_within_800m`
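
As a rough sketch of the Police.uk step, the two crime fields could be derived from monthly counts per place. Everything about the input layout here is an assumption: the `place_id`, `month`, `crime_count`, and `asb_count` columns, the `population` lookup, and the helper name `crime_features` are hypothetical. It also assumes `asb_trend_12m` means the least-squares slope of monthly ASB counts over the last 12 months, which the template does not confirm.

```python
import numpy as np
import pandas as pd

def crime_features(monthly, population):
    """Last-12-month crime rate per 1,000 residents and ASB slope per place."""
    rows = []
    for place_id, grp in monthly.sort_values("month").groupby("place_id"):
        last12 = grp.tail(12)  # most recent 12 monthly rows for this place
        rate = last12["crime_count"].sum() / population.loc[place_id] * 1000
        # Slope (incidents per month) of a straight line fitted to ASB counts
        if len(last12) >= 2:
            x = np.arange(len(last12))
            y = last12["asb_count"].to_numpy(dtype=float)
            slope = np.polyfit(x, y, 1)[0]
        else:
            slope = np.nan
        rows.append(
            {"place_id": place_id, "crime_rate_per_1k": rate, "asb_trend_12m": slope}
        )
    return pd.DataFrame(rows)

# Toy data: two hypothetical places, 12 months each (ISO month strings sort correctly)
months = [f"2023-{m:02d}" for m in range(1, 13)]
monthly = pd.DataFrame({
    "place_id": ["A"] * 12 + ["B"] * 12,
    "month": months * 2,
    "crime_count": [10] * 12 + [5] * 12,
    "asb_count": list(range(12)) + [3] * 12,
})
population = pd.Series({"A": 12000, "B": 6000})
out = crime_features(monthly, population)
print(out)
```

The result can then be merged onto `features` by whatever key the template uses for a place. A positive slope means ASB counts are rising over the window, a negative one that they are falling.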


In [None]:
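# Example neutral priors / dtype hygiene (remove once real data is joined)
# Coerce the baseline to numeric and fill gaps with a neutral prior of 55.
# If the column is absent, fall back to an all-NaN Series so .fillna() still works
# (DataFrame.get would otherwise return None and the chain would raise).
baseline = features.get(
    'belonging_baseline_value',
    pd.Series(np.nan, index=features.index),
)
features['belonging_baseline_value'] = pd.to_numeric(baseline, errors='coerce').fillna(55)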
num_cols = [
    'imd2019_decile','pop_density_per_km2','pct_under_18','pct_65_plus','pct_social_rent',
    'crime_rate_per_1k','asb_trend_12m','charities_active_count','recent_grants_24m_gbp','grant_funding_3y_gbp'
]
for c in num_cols:
    if c in features.columns:
        features[c] = pd.to_numeric(features[c], errors='coerce')

features.to_csv(features_path, index=False)
features_path

## Quick peek at results
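
Alongside a random sample, a small sanity report can surface gaps left by the joins. The sketch below counts missing values per column and flags IMD deciles outside 1 to 10. `sanity_report` is a hypothetical helper, and the toy frame stands in for `features`.

```python
import pandas as pd

def sanity_report(df):
    """Count missing values per column and out-of-range IMD deciles."""
    report = {"missing_by_column": df.isna().sum().to_dict()}
    if "imd2019_decile" in df.columns:
        decile = pd.to_numeric(df["imd2019_decile"], errors="coerce")
        # IMD deciles run from 1 to 10 inclusive
        bad = decile.notna() & ~decile.between(1, 10)
        report["decile_out_of_range"] = int(bad.sum())
    return report

# Toy frame standing in for `features`
toy = pd.DataFrame({
    "imd2019_decile": [1, 10, 11, None],
    "crime_rate_per_1k": [50.0, None, 80.0, 65.0],
})
report = sanity_report(toy)
print(report)
```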

In [None]:
features.sample(min(5, len(features)))