# Smart Parking Dynamic Pricing – Capstone Project

## Student Information

* **Name**: Dikshita Sarma
* **Course**: Bachelors in Computer Science
* **Date**: 7th July 2025

---

## Objective

This project builds a dynamic pricing engine for smart parking systems.
Parking slot prices are adjusted based on real-time factors such as:

* Occupancy levels
* Traffic and queue conditions
* Special event days

The pricing models are also run on a simulated live data stream.

---

## What I Implemented

* **Model 1**: Simple linear pricing based on occupancy
* **Model 2**: Demand-based pricing using traffic, queue length, and vehicle type
* **Model 3**: Competitive pricing based on nearby parking lots (optional)
* **Real-Time Streaming**: Simulated using Pathway for Model 1 and Model 2

---

## Step 1: Data Cleaning & Preprocessing

* Combined `LastUpdatedDate` and `LastUpdatedTime` into a single `Timestamp` datetime column
* Calculated the occupancy ratio (occupancy / capacity)
* Mapped vehicle type and traffic level to numeric weights
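
As a dependency-free illustration of these three steps, the sketch below cleans a single made-up record using only the standard library. The `preprocess` helper, the sample values, and the exact date format string are assumptions for illustration; the column names and weight tables follow the notebook, and the fallback weights (1.0 and 0.5) are the defaults used in the Pathway cells.

```python
from datetime import datetime

VEHICLE_WEIGHTS = {"bike": 0.5, "car": 1.0, "truck": 1.5}
TRAFFIC_WEIGHTS = {"low": 0.3, "medium": 0.6, "high": 1.0}


def preprocess(row):
    """Return the derived fields for one raw record (hypothetical helper)."""
    # Day-first date plus time; the format string is an assumed example
    timestamp = datetime.strptime(
        f"{row['LastUpdatedDate']} {row['LastUpdatedTime']}", "%d-%m-%Y %H:%M:%S"
    )
    capacity = row["Capacity"]
    vehicle = row["VehicleType"].strip().lower()
    traffic = row["TrafficConditionNearby"].strip().lower()
    return {
        "Timestamp": timestamp,
        # Guard against a zero capacity instead of dividing by zero
        "OccupancyRatio": row["Occupancy"] / capacity if capacity else 0.0,
        "VehicleWeight": VEHICLE_WEIGHTS.get(vehicle, 1.0),
        "TrafficLevel": TRAFFIC_WEIGHTS.get(traffic, 0.5),
    }


# Made-up sample record, not taken from the dataset
sample = {
    "LastUpdatedDate": "04-10-2016",
    "LastUpdatedTime": "07:59:00",
    "Capacity": 500,
    "Occupancy": 125,
    "VehicleType": " Car ",
    "TrafficConditionNearby": "Low",
}
cleaned = preprocess(sample)
print(cleaned)
```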

---

## Step 2: Model 1 – Linear Pricing (Occupancy Based)

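* Price = 10 + 2 × occupancy ratio, clipped to the range [5, 20] (the batch loop adds the increment to each lot's previous price; the Pathway version applies the formula directly)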
* Visualized pricing using Bokeh plot
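
The formula above can be written as a small stateless function. This is a minimal sketch: the name `model1_price` is only illustrative, and the [5, 20] bounds mirror the clipping used in the notebook code.

```python
def model1_price(occupancy_ratio, base_price=10.0, alpha=2.0, low=5.0, high=20.0):
    """Linear price in the occupancy ratio, clipped to [low, high]."""
    return min(high, max(low, base_price + alpha * occupancy_ratio))


# Empty, half-full, and full lot
for ratio in (0.0, 0.5, 1.0):
    print(ratio, model1_price(ratio))
```

With these defaults, an empty lot is priced at the base price of 10 and a full lot at 12, so the clipping only matters for out-of-range inputs.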

---

## Step 3: Model 2 – Demand-Based Pricing

* Formula based on multiple features: traffic, vehicle type, queue length, special days
* Normalized demand score and applied dynamic pricing
* Visualized using Bokeh plot
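
The pipeline of demand score, min-max normalization, and price can be sketched without pandas or scikit-learn. The coefficients are the ones used in the notebook; the helper names and the three sample rows are made up for illustration, and `min_max` is a plain-Python stand-in for `MinMaxScaler`.

```python
def demand_score(occ_ratio, queue, traffic, special_day, vehicle_weight,
                 alpha=2.0, beta=0.5, gamma=1.0, delta=1.5, epsilon=0.8):
    """Weighted sum of the demand features; traffic enters as a penalty."""
    return (alpha * occ_ratio + beta * queue - gamma * traffic
            + delta * special_day + epsilon * vehicle_weight)


def min_max(values):
    """Scale values to [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def model2_price(normalized, base_price=10.0, lam=0.3, low=5.0, high=20.0):
    """Scale the base price by normalized demand, then clip to [low, high]."""
    return min(high, max(low, base_price * (1 + lam * normalized)))


# (occupancy ratio, queue length, traffic level, special day, vehicle weight)
rows = [
    (0.2, 0, 0.3, 0, 1.0),
    (0.6, 2, 0.6, 0, 1.0),
    (0.9, 5, 1.0, 1, 1.5),
]
scores = [demand_score(*r) for r in rows]
prices = [model2_price(n) for n in min_max(scores)]
print([round(p, 2) for p in prices])
```

Because the normalized demand lies in [0, 1] and λ = 0.3, prices from this formula stay between 10 and 13 before clipping.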

---

## Step 4: Model 3 – Competitive Pricing (Optional)

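* Compared each lot with the other lots within 300 m at the same timestamp
* Lowered the Model 2 price by 0.5 when the lot is full and a nearby lot is cheaper; raised it by 0.5 when the lot has space and a nearby lot is full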
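
The competitive rule implemented in the notebook is a fixed step of 0.5 up or down, applied to the Model 2 price. The sketch below isolates that rule; `model3_price` and the tuple layout for neighbours are hypothetical, and distances are assumed to be precomputed in meters.

```python
def model3_price(price2, is_full, neighbours, radius_m=300.0, step=0.5,
                 low=5.0, high=20.0):
    """Adjust a Model 2 price using lots within radius_m.

    neighbours: list of (distance_m, neighbour_is_full, neighbour_price).
    """
    nearby = [n for n in neighbours if n[0] <= radius_m]
    nearby_full = any(full for _, full, _ in nearby)
    cheaper_exists = any(price < price2 for _, _, price in nearby)

    if is_full and cheaper_exists:
        price3 = price2 - step      # full lot, cheaper competitor nearby
    elif not is_full and nearby_full:
        price3 = price2 + step      # space available, competitor is full
    else:
        price3 = price2
    return min(high, max(low, price3))


print(model3_price(12.0, True, [(100.0, False, 11.0)]))
print(model3_price(12.0, False, [(100.0, True, 13.0)]))
print(model3_price(12.0, False, [(500.0, True, 13.0)]))
```

The third call leaves the price unchanged because the only competitor is outside the 300 m radius.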

---

## Step 5: Real-Time Simulation – Model 1 (Pathway)

* Used a mini dataset of 20 rows
* Calculated occupancy ratio on stream
* Applied Model 1 logic
* Output written to `pathway_output.jsonl`
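
To make the row-by-row flow concrete, here is a plain-Python stand-in for the streaming step. It does not use Pathway: a generator plays the role of the stream, and JSON lines are written to an in-memory buffer instead of `pathway_output.jsonl`. The CSV content and lot names are made up.

```python
import csv
import io
import json

# Made-up input with the columns Model 1 needs
CSV_TEXT = """SystemCodeNumber,Capacity,Occupancy,Timestamp
LOT_A,100,25,2025-01-01 08:00:00
LOT_A,100,50,2025-01-01 08:30:00
LOT_B,0,0,2025-01-01 08:00:00
"""


def stream_rows(text):
    """Yield rows one at a time, as a streaming source would deliver them."""
    yield from csv.DictReader(io.StringIO(text))


def price_row(row):
    """Compute the occupancy ratio and the clipped Model 1 price for one row."""
    capacity = int(row["Capacity"])
    ratio = int(row["Occupancy"]) / capacity if capacity else 0.0
    price = min(20.0, max(5.0, 10.0 + 2.0 * ratio))
    return {
        "SystemCodeNumber": row["SystemCodeNumber"],
        "Timestamp": row["Timestamp"],
        "OccupancyRatio": ratio,
        "Price": price,
    }


output = io.StringIO()
for row in stream_rows(CSV_TEXT):
    output.write(json.dumps(price_row(row)) + "\n")

records = [json.loads(line) for line in output.getvalue().splitlines()]
print(records)
```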

---

## Step 6: Real-Time Simulation – Model 2 (Pathway)

* Used additional features like queue, traffic, vehicle type
* Computed demand score and normalized it
* Output written to `pathway_output_model2.jsonl`
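
The Pathway cell for Model 2 normalizes with fixed bounds of -1 and 6 rather than a fitted scaler, presumably because the global minimum and maximum are not known while rows are still arriving. The sketch below shows what that implies: scores outside the assumed range produce a normalized value above 1, and the final clip to [5, 20] keeps the price bounded. Function names and sample scores are illustrative.

```python
MIN_D, MAX_D = -1.0, 6.0  # fixed bounds taken from the notebook


def normalize_fixed(score, lo=MIN_D, hi=MAX_D):
    """Min-max style scaling with bounds chosen in advance (not fitted)."""
    return (score - lo) / (hi - lo)


def model2_price(normalized, base_price=10.0, lam=0.3, low=5.0, high=20.0):
    """Scale the base price by normalized demand, then clip to [low, high]."""
    return min(high, max(low, base_price * (1 + lam * normalized)))


# The last score is deliberately outside the assumed range
for score in (-1.0, 2.5, 6.0, 30.0):
    nd = normalize_fixed(score)
    print(score, round(nd, 3), round(model2_price(nd), 2))
```

One consequence of this design is that batch and streaming prices for the same row can differ, since the batch version scales by the observed minimum and maximum instead.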

---

## Conclusion

This capstone project helped me apply data analysis, modeling, and real-time simulation concepts to solve a real-world parking problem. I learned how to work with multiple models and integrate streaming logic to mimic dynamic pricing. Pathway helped me simulate a smart parking system that reacts to real-time demand effectively.

Thank you for going through my project!


In [None]:
import pandas as pd
import numpy as np


df = pd.read_csv("dataset.csv")


df.info()
df.head()


In [None]:

print("Missing values:\n", df.isnull().sum())


print("Unique vehicle types:", df['VehicleType'].unique())
print("Unique traffic levels:", df['TrafficConditionNearby'].unique())
print("Special day flags:", df['IsSpecialDay'].unique())

# Checking for invalid values
print("Negative queue lengths:", df[df['QueueLength'] < 0].shape[0])
print("Negative occupancy values:", df[df['Occupancy'] < 0].shape[0])
print("Occupancy > Capacity:", df[df['Occupancy'] > df['Capacity']].shape[0])


In [None]:

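# Strip stray whitespace from column names
df.columns = df.columns.str.strip()

# Combine date and time into a single datetime column (dates are day-first)
df['Timestamp'] = pd.to_datetime(df['LastUpdatedDate'] + ' ' + df['LastUpdatedTime'], dayfirst=True)
df.drop(['LastUpdatedDate', 'LastUpdatedTime'], axis=1, inplace=True)

# Fraction of the lot that is occupied
df['OccupancyRatio'] = df['Occupancy'] / df['Capacity']

# Map vehicle type to a weight; unmapped types fall back to 1.0, as in the Pathway cells
df['VehicleType'] = df['VehicleType'].str.strip().str.lower()
vehicle_weights = {'bike': 0.5, 'car': 1.0, 'truck': 1.5}
df['VehicleWeight'] = df['VehicleType'].map(vehicle_weights).fillna(1.0)

# Map traffic level to a weight; unmapped levels fall back to 0.5, as in the Pathway cells
df['TrafficConditionNearby'] = df['TrafficConditionNearby'].str.strip().str.lower()
traffic_weights = {'low': 0.3, 'medium': 0.6, 'high': 1.0}
df['TrafficLevel'] = df['TrafficConditionNearby'].map(traffic_weights).fillna(0.5)

# Order rows by lot and time so per-lot price updates run chronologically
df.sort_values(by=['SystemCodeNumber', 'Timestamp'], inplace=True)
df.reset_index(drop=True, inplace=True)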


df.head()


## Model 1


In [None]:

alpha = 2.0        # price increment per unit of occupancy ratio
base_price = 10.0  # starting price for every lot

df['Price_Model1'] = np.nan

# Update each lot's price step by step in chronological order
for lot_id in df['SystemCodeNumber'].unique():
    lot_df = df[df['SystemCodeNumber'] == lot_id].sort_values('Timestamp')
    previous_price = base_price

    for idx in lot_df.index:
        occupancy_ratio = df.loc[idx, 'OccupancyRatio']
        new_price = previous_price + alpha * occupancy_ratio

        # Keep the price within the [5, 20] bounds
        new_price = max(5, min(20, new_price))

        df.at[idx, 'Price_Model1'] = new_price
        previous_price = new_price

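The loop above adds `alpha * occupancy_ratio` to the previous price at every timestamp, so within a lot the price never decreases and eventually reaches the cap of 20. The small check below counts how many updates that takes for a constant occupancy ratio; `steps_to_cap` and the constant-occupancy scenario are hypothetical and only meant to illustrate the behavior.

```python
def steps_to_cap(occupancy_ratio, base_price=10.0, alpha=2.0, low=5.0, high=20.0,
                 max_steps=1000):
    """Count cumulative updates until the price reaches the upper bound."""
    price = base_price
    for step in range(1, max_steps + 1):
        price = max(low, min(high, price + alpha * occupancy_ratio))
        if price >= high:
            return step
    return None  # the cap was never reached within max_steps


print(steps_to_cap(0.5))  # half-full lot
print(steps_to_cap(1.0))  # full lot
```

A stateless variant (base price plus `alpha` times the ratio, as in the Pathway cell) avoids this saturation, at the cost of ignoring price history.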

In [None]:
df[['SystemCodeNumber', 'Timestamp', 'OccupancyRatio', 'Price_Model1']].head(10)


In [None]:
!pip install bokeh


In [None]:
from bokeh.plotting import figure, show
from bokeh.models import HoverTool
from bokeh.io import output_notebook
output_notebook()


lot_id = df['SystemCodeNumber'].unique()[0]
lot_df = df[df['SystemCodeNumber'] == lot_id]


p = figure(x_axis_type="datetime", title=f"Model 1 Pricing for Parking Lot {lot_id}",
           width=800, height=400)

# Adding line for pricing
p.line(lot_df['Timestamp'], lot_df['Price_Model1'], line_width=2, color='navy', legend_label="Price")


hover = HoverTool(
    tooltips=[
        ("Time", "@x{%F %H:%M}"),
        ("Price", "@y{$0.00}")
    ],
    formatters={'@x': 'datetime'},
    mode='vline'
)

p.add_tools(hover)
p.xaxis.axis_label = "Time"
p.yaxis.axis_label = "Price"
p.legend.location = "top_left"

show(p)


In [None]:
# Step 3: Model 2 – Demand-Based Pricing

# Set base price and coefficients
base_price = 10
alpha = 2.0       # occupancy weight
beta = 0.5        # queue weight
gamma = 1.0       # traffic penalty
delta = 1.5       # special day boost
epsilon = 0.8     # vehicle weight
lam = 0.3         # strength of pricing adjustment

# Calculate the raw demand score
df['DemandScore'] = (
    alpha * df['OccupancyRatio'] +
    beta * df['QueueLength'] -
    gamma * df['TrafficLevel'] +
    delta * df['IsSpecialDay'] +
    epsilon * df['VehicleWeight']
)


from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler()
df['NormalizedDemand'] = scaler.fit_transform(df[['DemandScore']])


df['Price_Model2'] = base_price * (1 + lam * df['NormalizedDemand'])


df['Price_Model2'] = df['Price_Model2'].clip(lower=5, upper=20)

# Preview
df[['SystemCodeNumber', 'Timestamp', 'NormalizedDemand', 'Price_Model2']].head(10)


In [None]:
from bokeh.plotting import figure, show, output_notebook
from bokeh.models import HoverTool
output_notebook()

# Picking a parking lot again
lot_id = df['SystemCodeNumber'].unique()[0]
lot_df = df[df['SystemCodeNumber'] == lot_id]

# Create figure
p2 = figure(x_axis_type="datetime", title=f"Model 2 Pricing for Parking Lot {lot_id}",
            width=800, height=400)

# Model 2 price line
p2.line(lot_df['Timestamp'], lot_df['Price_Model2'], line_width=2, color='green', legend_label="Model 2 Price")

# Hover tool
hover2 = HoverTool(
    tooltips=[
        ("Time", "@x{%F %H:%M}"),
        ("Price", "@y{$0.00}")
    ],
    formatters={'@x': 'datetime'},
    mode='vline'
)

p2.add_tools(hover2)
p2.xaxis.axis_label = "Time"
p2.yaxis.axis_label = "Price"
p2.legend.location = "top_left"

show(p2)


In [None]:
# Haversine distance calculation
from math import radians, sin, cos, sqrt, atan2

def haversine(lat1, lon1, lat2, lon2):
    R = 6371000  # Earth radius in meters
    phi1 = radians(lat1)
    phi2 = radians(lat2)
    delta_phi = radians(lat2 - lat1)
    delta_lambda = radians(lon2 - lon1)

    a = sin(delta_phi/2.0)**2 + cos(phi1) * cos(phi2) * sin(delta_lambda/2.0)**2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    return R * c  # in meters

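As a quick sanity check of the formula, one degree of latitude should come out at roughly 111 km, and two identical points at 0 m. The function is repeated here so the snippet runs on its own; the coordinates are arbitrary test values.

```python
from math import radians, sin, cos, sqrt, atan2


def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    R = 6371000  # Earth radius in meters
    phi1 = radians(lat1)
    phi2 = radians(lat2)
    delta_phi = radians(lat2 - lat1)
    delta_lambda = radians(lon2 - lon1)

    a = sin(delta_phi / 2.0) ** 2 + cos(phi1) * cos(phi2) * sin(delta_lambda / 2.0) ** 2
    c = 2 * atan2(sqrt(a), sqrt(1 - a))
    return R * c


one_degree = haversine(0.0, 0.0, 1.0, 0.0)
same_point = haversine(26.14, 91.73, 26.14, 91.73)
print(round(one_degree / 1000, 1), "km;", same_point, "m")
```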

In [None]:
# Test on a small sample (the first day of data)
test_df = df[df['Timestamp'].dt.date == df['Timestamp'].dt.date.min()].copy()
test_df.reset_index(drop=True, inplace=True)

test_df['Price_Model3'] = test_df['Price_Model2']


for idx, row in test_df.iterrows():
    lat1, lon1 = row['Latitude'], row['Longitude']
    time = row['Timestamp']
    lot_id = row['SystemCodeNumber']
    price2 = row['Price_Model2']
    capacity = row['Capacity']
    occupancy = row['Occupancy']

    # Nearby competitors at same time
    nearby_df = test_df[(test_df['Timestamp'] == time) & (test_df['SystemCodeNumber'] != lot_id)]

    nearby_full = 0
    cheaper_exists = False

    for _, other in nearby_df.iterrows():
        dist = haversine(lat1, lon1, other['Latitude'], other['Longitude'])
        if dist <= 300:
            if other['Occupancy'] >= other['Capacity']:
                nearby_full += 1
            if other['Price_Model2'] < price2:
                cheaper_exists = True

    # Apply the competitive pricing logic
    if occupancy >= capacity and cheaper_exists:
        price3 = price2 - 0.5
    elif occupancy < capacity and nearby_full > 0:
        price3 = price2 + 0.5
    else:
        price3 = price2

    test_df.at[idx, 'Price_Model3'] = min(max(price3, 5), 20)

# Preview result
test_df[['SystemCodeNumber', 'Timestamp', 'Price_Model2', 'Price_Model3']].head(10)


In [None]:
from bokeh.plotting import figure, show, output_notebook
from bokeh.models import HoverTool
output_notebook()


lot_id = test_df['SystemCodeNumber'].unique()[0]
lot_df = test_df[test_df['SystemCodeNumber'] == lot_id]

# Create figure
p3 = figure(x_axis_type="datetime", title=f"Model 3 Competitive Pricing for Lot {lot_id}",
            width=800, height=400)

# Line plot for Model 3 price
p3.line(lot_df['Timestamp'], lot_df['Price_Model3'], line_width=2, color='orange', legend_label="Model 3 Price")


hover3 = HoverTool(
    tooltips=[
        ("Time", "@x{%F %H:%M}"),
        ("Price", "@y{$0.00}")
    ],
    formatters={'@x': 'datetime'},
    mode='vline'
)

p3.add_tools(hover3)
p3.xaxis.axis_label = "Time"
p3.yaxis.axis_label = "Price"
p3.legend.location = "top_left"

show(p3)


In [None]:
!pip install pathway


In [None]:
# Export just 20 rows for super-fast Pathway testing
df[['SystemCodeNumber', 'Capacity', 'Occupancy', 'QueueLength', 'VehicleType',
    'TrafficConditionNearby', 'IsSpecialDay', 'Latitude', 'Longitude', 'Timestamp']] \
    .head(20).to_csv('mini_dataset.csv', index=False)


In [None]:
import pathway as pw

# Define schema
class ParkingData(pw.Schema):
    SystemCodeNumber: str
    Capacity: int
    Occupancy: int
    QueueLength: int
    VehicleType: str
    TrafficConditionNearby: str
    IsSpecialDay: int
    Latitude: float
    Longitude: float
    Timestamp: str

# Read stream
input_stream = pw.io.csv.read(
    "mini_dataset.csv",
    schema=ParkingData,
    mode="streaming"
)

# Calculate the occupancy ratio
with_occupancy = input_stream.select(
    **input_stream,
    OccupancyRatio=pw.apply(
        lambda occ, cap: occ / cap if cap != 0 else 0,
        input_stream.Occupancy,
        input_stream.Capacity
    )
)

# Apply Model 1 pricing
pricing_stream = pw.apply(
    lambda occ_ratio: min(20, max(5, 10 + 2 * occ_ratio)),
    with_occupancy.OccupancyRatio
)

# Combine
final_output = with_occupancy.select(
    with_occupancy.SystemCodeNumber,
    with_occupancy.Timestamp,
    with_occupancy.OccupancyRatio,
    Price=pricing_stream
)

# Write to file. In streaming mode pw.run() keeps watching the input,
# so the cell runs until it is interrupted.
pw.io.jsonlines.write(final_output, filename="pathway_output.jsonl")
pw.run()


In [None]:
with open("pathway_output.jsonl", "r") as file:
    for line in file.readlines():
        print(line.strip())


In [None]:
import pathway as pw

# 1. Define schema
class ParkingData(pw.Schema):
    SystemCodeNumber: str
    Capacity: int
    Occupancy: int
    QueueLength: int
    VehicleType: str
    TrafficConditionNearby: str
    IsSpecialDay: int
    Latitude: float
    Longitude: float
    Timestamp: str

# 2. Load stream
input_stream = pw.io.csv.read(
    "mini_dataset.csv",
    schema=ParkingData,
    mode="streaming"
)

# 3. Feature engineering
with_features = input_stream.select(
    **input_stream,

    # Occupancy Ratio
    OccupancyRatio=pw.apply(
        lambda occ, cap: occ / cap if cap else 0,
        input_stream.Occupancy,
        input_stream.Capacity
    ),

    # Vehicle Weight
    VehicleWeight=pw.apply(
        lambda vt: {'bike': 0.5, 'car': 1.0, 'truck': 1.5}.get(vt.strip().lower(), 1.0),
        input_stream.VehicleType
    ),

    # Traffic Weight
    TrafficLevel=pw.apply(
        lambda t: {'low': 0.3, 'medium': 0.6, 'high': 1.0}.get(t.strip().lower(), 0.5),
        input_stream.TrafficConditionNearby
    )
)

# 4. Compute Demand Score
with_demand = with_features.select(
    **with_features,

    DemandScore=pw.apply(
        lambda occ, q, tr, sd, vw: (
            2.0 * occ +               # α
            0.5 * q -                 # β
            1.0 * tr +                # γ
            1.5 * sd +                # δ
            0.8 * vw                  # ε
        ),
        with_features.OccupancyRatio,
        with_features.QueueLength,
        with_features.TrafficLevel,
        with_features.IsSpecialDay,
        with_features.VehicleWeight
    )
)


# 5. Normalize the demand score with fixed bounds (assumed range of the score)
min_d = -1
max_d = 6
with_scaled = with_demand.select(
    **with_demand,
    NormalizedDemand=pw.apply(
        lambda d: (d - min_d) / (max_d - min_d),
        with_demand.DemandScore
    )
)

# 6. Compute Final Price (Model 2)
pricing_stream = pw.apply(
    lambda nd: min(20, max(5, 10 * (1 + 0.3 * nd))),  # base = 10, λ = 0.3
    with_scaled.NormalizedDemand
)

# 7. Final output
final_output_m2 = with_scaled.select(
    with_scaled.SystemCodeNumber,
    with_scaled.Timestamp,
    with_scaled.OccupancyRatio,
    with_scaled.DemandScore,
    with_scaled.NormalizedDemand,
    Price_Model2=pricing_stream
)

# 8. Write output to file
pw.io.jsonlines.write(final_output_m2, filename="pathway_output_model2.jsonl")
pw.run()


In [None]:
with open("pathway_output_model2.jsonl", "r") as file:
    for line in file.readlines():
        print(line.strip())


In [None]:
from google.colab import files
files.download('pathway_output.jsonl')


In [None]:
from google.colab import files
files.download('pathway_output_model2.jsonl')
