# 📂 Station Portfolio Segmentation

Where should we focus our marketing dollars? In this notebook, we segment our stations into four distinct portfolio quadrants based on their Routine Score and casual rider volume. This classification allows us to differentiate between high-volume tourist hubs and the 'Elite Anchor' stations that are ripe for member conversion.

### 1. Preparation
We load our refined behavioral scores to begin the final segmentation process.

In [1]:
import pandas as pd
from pathlib import Path

In [2]:
DATA_DIR = Path("../data/processed")
input_path = DATA_DIR / "refined_behavioral_scores.csv"
output_path = DATA_DIR / "station_portfolio_segments.csv"

if not input_path.exists():
    raise FileNotFoundError("\u274c refined_behavioral_scores.csv not found.")

df = pd.read_csv(input_path)
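
Before segmenting, it can help to confirm that the loaded file has the columns the later cells rely on. The sketch below is illustrative only: `count_missing_inputs` is a hypothetical helper that is not part of the original notebook, and the demo rows are made up. The column names are the ones used further down in this notebook.

```python
import pandas as pd

# Columns that the segmentation and export cells read.
REQUIRED_COLUMNS = ["start_station_name", "routine_score", "casual_volume"]

def count_missing_inputs(frame: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: fail fast on absent columns, then count nulls per column."""
    absent = [col for col in REQUIRED_COLUMNS if col not in frame.columns]
    if absent:
        raise KeyError(f"Missing required columns: {absent}")
    # Null scores or volumes would silently fall into the last segment later on.
    return frame[REQUIRED_COLUMNS].isna().sum()

# Made-up rows for illustration; the real notebook would pass `df` instead.
demo = pd.DataFrame({
    "start_station_name": ["Station A", "Station B", "Station C"],
    "routine_score": [0.52, None, 0.31],
    "casual_volume": [1200, 800, 2400],
})
null_counts = count_missing_inputs(demo)
print(null_counts)
```

In the notebook itself, the equivalent call would be `count_missing_inputs(df)` right after loading.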

### 2. Defining the Portfolio Quadrants
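We categorize each station into one of four key groups, using a Routine Score cutoff of 0.45 and casual-volume cutoffs of 1,000 and 2,000:
1.  **Anchor (Priority 1)**: High reliability (Routine Score of at least 0.45) and a casual volume of at least 1,000. These are our best conversion targets.
2.  **Emerging (Priority 2)**: High reliability (Routine Score of at least 0.45) but a casual volume below 1,000. Great for organic growth.
3.  **Transit Hub (Volume Only)**: Lower reliability (Routine Score below 0.45) but a casual volume of at least 2,000. Good for general brand awareness.
4.  **Peripheral**: Every remaining station, including lower-reliability stations with a casual volume between 1,000 and 1,999. Minimal marketing focus needed.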

In [3]:
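def segment_station(row):
    """Return the portfolio segment label for a single station row."""
    rs = row['routine_score']
    vol = row['casual_volume']

    # High-routine stations split on the 1,000 casual-volume cutoff.
    if rs >= 0.45 and vol >= 1000:
        return "Anchor (Priority 1)"
    elif rs >= 0.45 and vol < 1000:
        return "Emerging (Priority 2)"
    # Low-routine stations need a casual volume of at least 2,000 to count as a hub.
    elif rs < 0.45 and vol >= 2000:
        return "Transit Hub (Volume Only)"
    # Everything else, including rows with missing values, is Peripheral.
    else:
        return "Peripheral"

# Label every station row by row.
df['final_status'] = df.apply(segment_station, axis=1)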
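
The same rules can also be expressed without a row-wise `apply`. The sketch below uses `numpy.select` as an alternative, not as the method this notebook actually runs. The function name `segment_stations_vectorized`, the threshold constants, and the demo rows are assumptions introduced for illustration; the cutoffs and labels are copied from the cell above.

```python
import numpy as np
import pandas as pd

# Cutoffs copied from segment_station; the constant names are hypothetical.
RS_MIN = 0.45
ANCHOR_VOL_MIN = 1000
HUB_VOL_MIN = 2000

def segment_stations_vectorized(frame: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: label all stations at once with boolean masks."""
    rs = frame["routine_score"]
    vol = frame["casual_volume"]
    conditions = [
        (rs >= RS_MIN) & (vol >= ANCHOR_VOL_MIN),
        (rs >= RS_MIN) & (vol < ANCHOR_VOL_MIN),
        (rs < RS_MIN) & (vol >= HUB_VOL_MIN),
    ]
    labels = [
        "Anchor (Priority 1)",
        "Emerging (Priority 2)",
        "Transit Hub (Volume Only)",
    ]
    # np.select picks the first matching condition; unmatched rows get the default.
    picked = np.select(conditions, labels, default="Peripheral")
    return pd.Series(picked, index=frame.index)

# Made-up rows: one per branch, plus a low-routine station in the 1,000 to 1,999 gap.
demo = pd.DataFrame({
    "routine_score": [0.60, 0.50, 0.20, 0.20, 0.10],
    "casual_volume": [1500, 300, 2500, 1500, 50],
})
demo["final_status"] = segment_stations_vectorized(demo)
print(demo)
```

The fourth demo row shows the asymmetry in the rules: a low-routine station with a casual volume of 1,500 is labeled Peripheral, because the Transit Hub cutoff is 2,000 rather than 1,000.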

### 3. Final Portfolio Export
We finalize our dataset and export the segmentation. This portfolio distribution provides a clear roadmap for the Cyclistic marketing team.

In [4]:
final_df = df[['start_station_name', 'routine_score', 'casual_volume', 'final_status']]
final_df.to_csv(output_path, index=False)

print("-" * 50)
print(f"\u2705 SUCCESS: Segments saved to {output_path}")
print("\nPortfolio Distribution:")
print(final_df["final_status"].value_counts())

--------------------------------------------------
✅ SUCCESS: Segments saved to ..\data\processed\station_portfolio_segments.csv

Portfolio Distribution:
final_status
Peripheral                  3249
Transit Hub (Volume Only)     164
Emerging (Priority 2)          68
Anchor (Priority 1)             8
Name: count, dtype: int64
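
Raw counts can be easier to compare as percentage shares. The sketch below rebuilds the printed counts by hand so that it runs on its own; in the notebook, `final_df["final_status"].value_counts(normalize=True)` would give the same proportions directly. The variable names are illustrative.

```python
import pandas as pd

# Counts copied from the printed Portfolio Distribution above.
counts = pd.Series(
    {
        "Peripheral": 3249,
        "Transit Hub (Volume Only)": 164,
        "Emerging (Priority 2)": 68,
        "Anchor (Priority 1)": 8,
    },
    name="count",
)

total_stations = int(counts.sum())
# Share of all stations in each segment, as a percentage rounded to two decimals.
share_pct = (counts / total_stations * 100).round(2)

print(f"Total stations: {total_stations}")
print(share_pct)
```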
