# 🔄 Habitual Analysis: The Mirroring Model

Welcome to the core of our study. This notebook implements the **Behavioral Mirroring Model**. Our goal is to identify 'Mirror Stations': locations where casual riders behave like members.

We calculate a **Routine Score (RS)** for every station, which is a weighted composite of three key behavioral pillars:
1.  **Commute Alignment**: The share of casual trips that occur during weekday rush hours.
2.  **Duration Consistency**: How closely the median casual ride length matches the median member ride length at the same station.
3.  **Destination Purpose**: One minus the share of casual trips that are 'Leisure Loops' (trips that end where they started).
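
The analysis below reads `is_commute` and `is_leisure_loop` as ready-made columns, and this notebook does not show how they were built. As a rough sketch only, flags like these could be derived as follows. The `started_at` and `end_station_name` column names, the `add_behavior_flags` helper, and the rush-hour windows (07:00-09:00 and 16:00-18:00) are all assumptions for illustration, not the definitions used upstream.

```python
import pandas as pd

# Hypothetical rush-hour windows: start hour inclusive, end hour exclusive.
RUSH_WINDOWS = [(7, 9), (16, 18)]

def add_behavior_flags(trips: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `trips` with `is_commute` and `is_leisure_loop` columns."""
    out = trips.copy()
    started = pd.to_datetime(out["started_at"])  # assumed timestamp column
    is_weekday = started.dt.dayofweek < 5        # Monday=0 ... Friday=4
    hour = started.dt.hour

    in_rush = pd.Series(False, index=out.index)
    for window_start, window_end in RUSH_WINDOWS:
        in_rush |= (hour >= window_start) & (hour < window_end)

    out["is_commute"] = is_weekday & in_rush
    # A leisure loop ends at the station where it started.
    out["is_leisure_loop"] = out["start_station_name"] == out["end_station_name"]
    return out

sample = pd.DataFrame({
    "started_at": ["2024-05-06 08:15:00", "2024-05-11 08:15:00", "2024-05-07 13:00:00"],
    "start_station_name": ["A", "A", "B"],
    "end_station_name": ["C", "A", "B"],
})
flags = add_behavior_flags(sample)
print(flags[["is_commute", "is_leisure_loop"]])
```

Only the first sample trip (a Monday at 08:15) counts as a commute. The other two start and end at the same station, so they are flagged as leisure loops.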

### 1. Essential Libraries and Data
We load our processed and geospatial data to begin the cross-segment analysis.

In [1]:
import pandas as pd
from pathlib import Path

In [2]:
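DATA_DIR = Path("../data/processed")
# Trip-level facts and their geospatial attributes share the `ride_id` key.
df = pd.read_csv(DATA_DIR / "fact_trips.csv")
df_geo = pd.read_csv(DATA_DIR / "fact_trips_geo.csv")
# Inner join (the pandas default): trips without a geospatial row are dropped.
df = df.merge(df_geo, on='ride_id')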
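
Because the merge is an inner join, any trip missing from the geospatial file disappears without warning, and a duplicated `ride_id` would multiply rows. The sketch below shows one way to check for both using the `validate` and `indicator` options of `DataFrame.merge`. The two small frames and their columns are made-up stand-ins for the real CSV files, and the left join is used here only to make unmatched trips visible.

```python
import pandas as pd

# Tiny stand-ins for fact_trips.csv and fact_trips_geo.csv (hypothetical columns).
trips = pd.DataFrame({
    "ride_id": ["r1", "r2", "r3"],
    "member_casual": ["casual", "member", "casual"],
})
geo = pd.DataFrame({
    "ride_id": ["r1", "r2"],
    "start_lat": [41.79, 41.80],
})

# validate="one_to_one" raises pandas.errors.MergeError if ride_id repeats on
# either side; indicator=True adds a `_merge` column naming each row's origin.
checked = trips.merge(
    geo, on="ride_id", how="left", validate="one_to_one", indicator=True
)
unmatched = int((checked["_merge"] == "left_only").sum())
print(f"{unmatched} of {len(checked)} trips have no geospatial row")
```

If the unmatched count is zero on the real data, the inner join used above loses nothing.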

### 2. The Behavioral Mirroring Engine
We've built a custom analyzer to process each station. The `Routine Score` is calculated using the following weights:
- **Commute Weight (50%)**: Emphasis on weekday peak usage.
- **Duration Weight (30%)**: Alignment with the median member trip length at the same station.
- **Purpose Weight (20%)**: Absence of recreational loop-riding.
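
To make the weighting concrete, here is a small worked example. The `routine_score` function name and the three component values are hypothetical; only the weights come from the list above. Since the weights sum to 1 and each component lies between 0 and 1, the score also stays between 0 and 1.

```python
# Weights taken from the list above.
W_COMMUTE, W_DURATION, W_PURPOSE = 0.50, 0.30, 0.20

def routine_score(commute_alignment, duration_consistency, destination_purpose):
    """Weighted sum of the three behavioral pillars."""
    return (commute_alignment * W_COMMUTE
            + duration_consistency * W_DURATION
            + destination_purpose * W_PURPOSE)

# Hypothetical station: 30% commute trips, 0.70 duration match, 90% non-loop trips.
rs = routine_score(0.30, 0.70, 0.90)
print(round(rs, 2))  # 0.15 + 0.21 + 0.18
```

Commute alignment carries half the weight, so a station with few rush-hour casual trips cannot score highly even when the other two components are strong.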

In [3]:
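class BehavioralAnalyzer:
    """Scores each start station by how closely casual riders mirror members."""

    def __init__(self, data):
        self.data = data
        # Pillar weights; they sum to 1, so the score stays within [0, 1].
        self.W_COMMUTE = 0.50
        self.W_DURATION = 0.30
        self.W_PURPOSE = 0.20

    def calculate_station_metrics(self):
        """Return one row of Routine Score components per qualifying station."""
        results = []
        for station, group in self.data.groupby('start_station_name'):
            casuals = group[group['member_casual'] == 'casual']
            members = group[group['member_casual'] == 'member']

            # Skip stations with fewer than 50 trips from either rider type.
            if len(casuals) < 50 or len(members) < 50:
                continue

            # Share of casual trips flagged as weekday rush-hour rides.
            commute_pct = casuals['is_commute'].mean()

            # Relative gap between median ride lengths, capped at 1.
            c_dur = casuals['ride_length'].median()
            m_dur = members['ride_length'].median()
            if m_dur <= 0:
                continue  # avoid dividing by a zero or negative median
            dur_score = 1 - min(abs(c_dur - m_dur) / m_dur, 1)

            # Share of casual trips that do not end where they started.
            purpose_score = 1 - casuals['is_leisure_loop'].mean()

            rs = (commute_pct * self.W_COMMUTE) + \
                 (dur_score * self.W_DURATION) + \
                 (purpose_score * self.W_PURPOSE)

            results.append({
                'start_station_name': station,
                'routine_score': rs,
                'commute_alignment': commute_pct,
                'duration_consistency': dur_score,
                'destination_purpose': purpose_score,
                'casual_volume': len(casuals)
            })
        return pd.DataFrame(results)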
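
The duration component is the least obvious of the three, so here it is in isolation. The `duration_consistency` function name and the sample medians are hypothetical, and returning NaN for a non-positive member median is a choice made for this sketch. The score is a ratio, so the unit of `ride_length` does not affect it.

```python
import math

def duration_consistency(casual_median, member_median):
    """1 minus the relative gap between the two medians, clipped to [0, 1]."""
    if member_median <= 0:
        return float("nan")  # undefined when there is no positive baseline
    relative_gap = abs(casual_median - member_median) / member_median
    return 1.0 - min(relative_gap, 1.0)

# Hypothetical medians against a member median of 600.
for casual_median in (600, 900, 1500):
    print(casual_median, duration_consistency(casual_median, 600))
```

A casual median equal to the member median scores 1.0, a median 50% longer scores 0.5, and anything at least twice as long scores 0.0. The cap means casual rides that are far longer than member rides are all treated alike, while casual rides shorter than member rides only reach 0.0 when the casual median itself is 0.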

### 3. Execution and Export
We run the analysis across all stations and save the results. The stations with the highest Routine Scores are the candidate locations for our marketing campaign, since casual riders there already behave most like members.

In [4]:
analyzer = BehavioralAnalyzer(df)
results = analyzer.calculate_station_metrics()

# Sort once so the exported file and the preview below share the same ranking.
results = results.sort_values('routine_score', ascending=False)

output_path = DATA_DIR / "habitual_metrics.csv"
results.to_csv(output_path, index=False)
print(f"\u2705 SUCCESS: Habitual metrics saved to {output_path}")
results.head(10)

✅ SUCCESS: Habitual metrics saved to ..\data\processed\habitual_metrics.csv


| | start_station_name | routine_score | commute_alignment | duration_consistency | destination_purpose | casual_volume |
|---|---|---|---|---|---|---|
| 116 | University Ave & 57th St | 0.542289 | 0.329432 | 0.650882 | 0.913043 | 2638 |
| 2364 | E 57th St & S Shore Dr | 0.536761 | 0.316832 | 0.661157 | 0.892743 | 303 |
| 1640 | Chicago State University | 0.533276 | 0.392157 | 0.552941 | 0.852941 | 102 |
| 3488 | Loomis St & 89th St | 0.529815 | 0.342105 | 0.62766 | 0.855263 | 76 |
| 106 | Ellis Ave & 58th St | 0.526279 | 0.297449 | 0.6865 | 0.857657 | 3526 |
| 552 | Woodlawn Ave & 55th St | 0.521852 | 0.301385 | 0.655977 | 0.872576 | 722 |
| 1394 | Yale Ave & 119th St | 0.521361 | 0.396552 | 0.52381 | 0.827586 | 58 |
| 119 | University Ave & 58th St | 0.519965 | 0.29051 | 0.672691 | 0.864539 | 1623 |
| 111 | Kimbark Ave & 53rd St | 0.518884 | 0.291771 | 0.669888 | 0.860349 | 802 |
| 1540 | Ellis Ave & 60th St | 0.516766 | 0.270923 | 0.722645 | 0.822262 | 1100 |
