# 🚗 Smart Parking Pricing System – Final Summer Analytics Submission


## 📌 Overview
This project implements a dynamic pricing system for parking lots driven by real-time data: occupancy, queue length, nearby traffic conditions, vehicle type, and special days. It progresses through three models, from a simple linear rule to demand-based pricing to competitive pricing against nearby lots, and supports real-time simulation with visualization.



## 🧰 Tech Stack Used
- **Python**: Data manipulation, model logic
- **Pandas / NumPy / Scikit-learn**: Preprocessing and math
- **Pathway**: Real-time data simulation
- **Bokeh**: Interactive visualization
- **Google Colab**: Notebook-based implementation



## 🏗️ Architecture Diagram (Mermaid)
```mermaid
graph TD
    A[Input Dataset] --> B[Preprocessing]
    B --> C1[Model 1: Linear Pricing]
    B --> C2[Model 2: Demand-Based Pricing]
    B --> C3[Model 3: Competitive Pricing]
    C1 --> D[Pricing Output Table]
    C2 --> D
    C3 --> D
    D --> E["Real-Time Simulation (Pathway)"]
    D --> F["Visualization (Bokeh)"]
```



## 🔁 Project Workflow
1. **Data Preprocessing**:
   - Parse and format timestamps
   - Encode categorical values (vehicle type, traffic condition)
   - Normalize numerical fields (occupancy rate, queue length)

2. **Model 1 – Linear Pricing**:
   - Formula: `price = base_price + α × (occupancy/capacity)`

3. **Model 2 – Demand-Based Pricing**:
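   - Weighted demand score over normalized occupancy, queue length, traffic level, vehicle type, and the special-day flag, clamped to [0, 1]
   - Formula: `price = base_price × (1 + λ × demand)`, bounded to the range [5, 20]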

4. **Model 3 – Competitive Pricing**:
   - Uses haversine distance to find lots within 1 km, then raises or lowers the price by 1 depending on their average price and occupancy

5. **Real-Time Simulation**:
   - Stream prices using Pathway, following the Pathway documentation

6. **Visualization**:
   - Real-time Bokeh plot per parking lot
   - Shows pricing trend with hover tooltips
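
Step 5 can be pictured without any streaming framework. The sketch below is a plain-Python stand-in, not the Pathway API: it replays already-priced records in timestamp order and keeps the latest price per lot, which is roughly what a live plot would consume. The `replay` helper, the field names, and the sample records are all assumptions made for illustration.

```python
from datetime import datetime

def replay(records):
    """Yield records one at a time, oldest first, as a stream would deliver them."""
    yield from sorted(records, key=lambda r: r["timestamp"])

# Hypothetical priced records, deliberately out of order.
records = [
    {"lot": "A", "timestamp": datetime(2024, 1, 1, 8, 30), "price": 14.0},
    {"lot": "B", "timestamp": datetime(2024, 1, 1, 8, 0), "price": 11.5},
    {"lot": "A", "timestamp": datetime(2024, 1, 1, 8, 0), "price": 12.5},
]

latest = {}  # most recent price seen so far for each lot
for rec in replay(records):
    latest[rec["lot"]] = rec["price"]
    print(rec["timestamp"].isoformat(), rec["lot"], rec["price"])
```

A real streaming engine would also handle late or duplicate records, which this sketch ignores.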


In [None]:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Load dataset
df = pd.read_csv("dataset.csv")

# Combine the separate date and time columns into a single datetime column
df['Timestamp'] = pd.to_datetime(df['LastUpdatedDate'] + ' ' + df['LastUpdatedTime'], format='%d-%m-%Y %H:%M:%S')

# Encode categoricals as numeric weights; any value missing from a mapping becomes NaN
df['VehicleTypeEncoded'] = df['VehicleType'].map({'car': 1.0, 'bike': 0.5, 'truck': 1.5})
df['TrafficConditionEncoded'] = df['TrafficConditionNearby'].map({'low': 0, 'medium': 1, 'high': 2})

# Occupancy as a fraction of capacity, then min-max scale both inputs to [0, 1]
df['OccupancyRate'] = df['Occupancy'] / df['Capacity']
scaler = MinMaxScaler()
df[['NormalizedOccupancyRate', 'NormalizedQueueLength']] = scaler.fit_transform(
    df[['OccupancyRate', 'QueueLength']]
)
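
Min-max scaling maps each column to [0, 1] using `(x - min) / (max - min)`. The plain-Python sketch below applies that formula to made-up queue lengths to show what the normalized columns contain; `min_max` is a hypothetical helper written for this example, not a scikit-learn function.

```python
def min_max(values):
    """Scale values linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column has no spread; map every entry to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

queue_lengths = [0, 2, 5, 10]  # made-up sample values
scaled = min_max(queue_lengths)
print(scaled)
```

Because the minimum and maximum come from the full dataset, values that arrive later in a live stream could fall outside [0, 1] unless the bounds are fixed in advance.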


In [None]:

def linear_pricing_model(occupancy, capacity, base_price=10.0, alpha=5.0):
    """Model 1: base price plus alpha scaled by the occupancy rate."""
    occupancy_rate = occupancy / capacity
    return round(base_price + alpha * occupancy_rate, 2)

# Price each row from its raw occupancy and capacity
df['LinearPrice'] = df.apply(lambda row: linear_pricing_model(row['Occupancy'], row['Capacity']), axis=1)
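
With the default `base_price=10.0` and `alpha=5.0`, the linear price runs from 10.0 for an empty lot to 15.0 for a full one. The sketch below checks those endpoints and adds a guard for non-positive capacity, which `linear_pricing_model` does not check for. `safe_linear_price` is a hypothetical variant, and falling back to the base price is an assumed policy, not something the project specifies.

```python
def safe_linear_price(occupancy, capacity, base_price=10.0, alpha=5.0):
    """Model 1 price; falls back to base_price when capacity is not positive."""
    if capacity <= 0:
        # Assumed fallback: avoid dividing by zero on a malformed record.
        return round(base_price, 2)
    return round(base_price + alpha * occupancy / capacity, 2)

# Empty, half full, and full lot with a made-up capacity of 100.
for occupancy in (0, 50, 100):
    print(occupancy, safe_linear_price(occupancy, 100))
```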


In [None]:

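def demand_based_price(row, base_price=10.0, lambd=0.5):
    """Model 2: scale the base price by a weighted demand score in [0, 1]."""
    # Feature weights; they sum to 1.0
    alpha, beta, gamma, delta, epsilon = 0.4, 0.3, 0.1, 0.1, 0.1
    demand = (
        alpha * row['NormalizedOccupancyRate'] +
        beta * row['NormalizedQueueLength'] +
        gamma * row['TrafficConditionEncoded'] / 2 +   # encoded 0..2, scaled to 0..1
        delta * row['VehicleTypeEncoded'] / 1.5 +      # encoded 0.5..1.5, scaled to at most 1
        epsilon * row['IsSpecialDay']
    )
    # Clamp the score to [0, 1], then bound the final price to [5, 20]
    demand = min(max(demand, 0), 1)
    return round(min(max(base_price * (1 + lambd * demand), 5), 20), 2)

df['DemandPrice'] = df.apply(demand_based_price, axis=1)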
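
Since the five weights sum to 1 and each term is scaled to at most 1, the demand score stays in [0, 1], so with `base_price=10.0` and `lambd=0.5` the price lands between 10 and 15 and the outer [5, 20] bound never takes effect at these defaults. The sketch below reproduces the calculation on plain dictionaries to show both extremes; `demand_price` and the `WEIGHTS` table are hypothetical names, and the features are assumed to be scaled to [0, 1] already.

```python
WEIGHTS = {"occupancy": 0.4, "queue": 0.3, "traffic": 0.1, "vehicle": 0.1, "special": 0.1}

def demand_price(features, base_price=10.0, lambd=0.5):
    """Model 2 on features already scaled to [0, 1]; returns (demand, price)."""
    demand = sum(WEIGHTS[name] * features[name] for name in WEIGHTS)
    demand = min(max(demand, 0.0), 1.0)
    price = min(max(base_price * (1 + lambd * demand), 5.0), 20.0)
    return round(demand, 4), round(price, 2)

quiet = dict.fromkeys(WEIGHTS, 0.0)  # every signal at its minimum
busy = dict.fromkeys(WEIGHTS, 1.0)   # every signal at its maximum
print(demand_price(quiet))
print(demand_price(busy))
```

A larger `lambd` (above 1.0 with this base price) would be needed before the upper bound of 20 starts to matter.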


In [None]:

import numpy as np

def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    R = 6371  # mean Earth radius in km
    phi1, phi2 = np.radians(lat1), np.radians(lat2)
    dphi = np.radians(lat2 - lat1)
    dlambda = np.radians(lon2 - lon1)
    a = np.sin(dphi / 2)**2 + np.cos(phi1) * np.cos(phi2) * np.sin(dlambda / 2)**2
    return R * 2 * np.arcsin(np.sqrt(a))

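def competitive_price(index, df, radius_km=1.0):
    """Adjust a row's DemandPrice by at most 1 based on other lots within radius_km."""
    lat1, lon1 = df.loc[index, ['Latitude', 'Longitude']]
    own_price = df.loc[index, 'DemandPrice']
    own_lot = df.loc[index, 'SystemCodeNumber']
    neighbors = []
    for i, row in df.iterrows():
        # Skip every row of the same lot: its other timestamps sit at
        # distance 0 and would otherwise be counted as competitors.
        if row['SystemCodeNumber'] == own_lot:
            continue
        dist = haversine(lat1, lon1, row['Latitude'], row['Longitude'])
        if dist <= radius_km:
            neighbors.append((row['DemandPrice'], row['Occupancy'] / row['Capacity']))
    if not neighbors:
        return own_price
    avg_price = np.mean([p for p, _ in neighbors])
    avg_occ = np.mean([o for _, o in neighbors])
    # Nearby lots are nearly full and pricier: raise by 1, capped at 20
    if avg_occ > 0.9 and avg_price > own_price:
        return min(own_price + 1, 20)
    # Nearby lots are cheaper: lower by 1, floored at 5
    elif avg_price < own_price:
        return max(own_price - 1, 5)
    return own_price

# Iterate over index labels so .loc stays valid without a default RangeIndex.
# Note: this is O(n^2) in the number of rows and compares rows across all timestamps.
df['CompetitivePrice'] = [competitive_price(i, df) for i in df.index]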
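
The two pieces of Model 3, the distance check and the adjustment rule, can be exercised on their own. The sketch below is a standard-library version written for illustration: `haversine_km` mirrors the formula above using `math` instead of NumPy, and `adjust` is a hypothetical helper that applies the same three-way rule to a list of `(price, occupancy_rate)` pairs. The sample neighbours are made up.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return radius_km * 2 * math.asin(math.sqrt(a))

def adjust(own_price, neighbors, step=1.0, floor=5.0, cap=20.0):
    """Apply the Model 3 rule to a list of (price, occupancy_rate) neighbours."""
    if not neighbors:
        return own_price
    avg_price = sum(p for p, _ in neighbors) / len(neighbors)
    avg_occ = sum(o for _, o in neighbors) / len(neighbors)
    if avg_occ > 0.9 and avg_price > own_price:
        return min(own_price + step, cap)
    if avg_price < own_price:
        return max(own_price - step, floor)
    return own_price

# One degree of latitude along a meridian is roughly 111 km.
print(round(haversine_km(0.0, 0.0, 1.0, 0.0), 1))

print(adjust(12.0, [(14.0, 0.95), (13.0, 0.92)]))  # nearly full, pricier neighbours
print(adjust(12.0, [(10.0, 0.50)]))                # cheaper neighbour
print(adjust(12.0, [(14.0, 0.40)]))                # pricier but not full
```

Note the asymmetry in the rule: a price cut needs only cheaper neighbours, while a price rise needs neighbours that are both pricier and above 90% occupancy.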


In [None]:

from bokeh.plotting import figure, show, output_notebook
from bokeh.models import ColumnDataSource, HoverTool
output_notebook()

lot_id = df['SystemCodeNumber'].unique()[0]
lot_data = df[df['SystemCodeNumber'] == lot_id].sort_values(by='Timestamp')
source = ColumnDataSource(data={
    'x': lot_data['Timestamp'],
    'y': lot_data['CompetitivePrice'],
    'occupancy': lot_data['Occupancy'],
    'queue': lot_data['QueueLength']
})

p = figure(x_axis_type='datetime', title=f"Pricing Trend for Lot {lot_id}", width=900, height=350)
p.line('x', 'y', source=source, line_width=2, color='navy', legend_label='Price')
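# scatter() draws circle markers by default; circle() with a size argument is deprecated in recent Bokeh releases
p.scatter('x', 'y', source=source, size=6, color='orange', fill_alpha=0.5)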
p.add_tools(HoverTool(tooltips=[
    ("Time", "@x{%F %T}"),
    ("Price", "@y"),
    ("Occupancy", "@occupancy"),
    ("Queue", "@queue")
], formatters={'@x': 'datetime'}, mode='vline'))
p.xaxis.axis_label = 'Time'
p.yaxis.axis_label = 'Price ($)'
p.legend.click_policy = 'hide'
show(p)
