# Dynamic Pricing for Urban Parking Lots  
**Summer Analytics 2025 – IIT Guwahati**  
**Author: Ujjwal Singh**

### 🎯 Objective
To design pricing models that dynamically adjust parking lot prices based on real-time demand, occupancy, traffic, vehicle types, and nearby lot competition.


In [None]:
from google.colab import files

# Prompt to upload file
uploaded = files.upload()

Saving dataset.csv to dataset (2).csv


In [None]:
import pandas as pd

df = pd.read_csv("dataset.csv")
df.head()

Unnamed: 0,ID,SystemCodeNumber,Capacity,Latitude,Longitude,Occupancy,VehicleType,TrafficConditionNearby,QueueLength,IsSpecialDay,LastUpdatedDate,LastUpdatedTime
0,0,BHMBCCMKT01,577,26.144536,91.736172,61,car,low,1,0,04-10-2016,07:59:00
1,1,BHMBCCMKT01,577,26.144536,91.736172,64,car,low,1,0,04-10-2016,08:25:00
2,2,BHMBCCMKT01,577,26.144536,91.736172,80,car,low,2,0,04-10-2016,08:59:00
3,3,BHMBCCMKT01,577,26.144536,91.736172,107,car,low,2,0,04-10-2016,09:32:00
4,4,BHMBCCMKT01,577,26.144536,91.736172,150,bike,low,2,0,04-10-2016,09:59:00


In [None]:
!pip install bokeh



## 2. Data Loading & Preprocessing

We load the dataset and prepare it by combining date and time fields into a single timestamp.  
This timestamp helps simulate real-time behavior and sort the events chronologically.
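
The dataset writes dates day-first (`04-10-2016` is 4 October 2016), so an explicit format avoids a silent month/day swap. The snippet below is a minimal standard-library sketch of the same parsing step; `parse_timestamp` is a hypothetical helper name, not part of the notebook.

```python
from datetime import datetime

# Day-month-year followed by 24-hour time, as in the dataset's two columns.
FMT = "%d-%m-%Y %H:%M:%S"

def parse_timestamp(date_str: str, time_str: str) -> datetime:
    """Join the date and time fields and parse them with an explicit format."""
    return datetime.strptime(f"{date_str} {time_str}", FMT)

ts = parse_timestamp("04-10-2016", "07:59:00")
print(ts.isoformat())  # 2016-10-04T07:59:00
```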

In [None]:
# Load and preprocess dataset

import pandas as pd

# Load dataset
df = pd.read_csv("dataset.csv")

# Combine date and time into a single datetime column
df["Timestamp"] = pd.to_datetime(df["LastUpdatedDate"] + " " + df["LastUpdatedTime"],
                                  format="%d-%m-%Y %H:%M:%S")

# Sort by parking lot and time for sequential analysis
df = df.sort_values(by=["SystemCodeNumber", "Timestamp"]).reset_index(drop=True)

# View basic info
print("Rows:", df.shape[0])
print("Columns:", df.shape[1])
print("\nSample Rows:\n")
df[["SystemCodeNumber", "Timestamp", "Occupancy", "Capacity", "QueueLength",
    "VehicleType", "TrafficConditionNearby", "IsSpecialDay"]].head(10)


Rows: 18368
Columns: 13

Sample Rows:



Unnamed: 0,SystemCodeNumber,Timestamp,Occupancy,Capacity,QueueLength,VehicleType,TrafficConditionNearby,IsSpecialDay
0,BHMBCCMKT01,2016-10-04 07:59:00,61,577,1,car,low,0
1,BHMBCCMKT01,2016-10-04 08:25:00,64,577,1,car,low,0
2,BHMBCCMKT01,2016-10-04 08:59:00,80,577,2,car,low,0
3,BHMBCCMKT01,2016-10-04 09:32:00,107,577,2,car,low,0
4,BHMBCCMKT01,2016-10-04 09:59:00,150,577,2,bike,low,0
5,BHMBCCMKT01,2016-10-04 10:26:00,177,577,3,car,low,0
6,BHMBCCMKT01,2016-10-04 10:59:00,219,577,6,truck,high,0
7,BHMBCCMKT01,2016-10-04 11:25:00,247,577,5,car,average,0
8,BHMBCCMKT01,2016-10-04 11:59:00,259,577,5,cycle,average,0
9,BHMBCCMKT01,2016-10-04 12:29:00,266,577,8,bike,high,0


## 3. Model 1: Linear Pricing Based on Occupancy

This baseline model starts from a fixed base price and, at every time step, adds an increment proportional to the current occupancy ratio:

$$
\text{Price}_{t+1} = \text{Price}_t + \alpha \cdot \left(\frac{\text{Occupancy}}{\text{Capacity}}\right)
$$

Where:
- Base price = ₹10
- Alpha = 2.0
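
As a quick hand check of the recurrence, the sketch below applies it to the first five occupancy readings of lot `BHMBCCMKT01` (capacity 577). `model1_prices` is a hypothetical helper name; it assumes, as the notebook does, that the first reading gets the base price and that each price is rounded to two decimals before the next step.

```python
def model1_prices(occupancies, capacity, base_price=10.0, alpha=2.0):
    """Return the Model 1 price path for one lot.

    The first reading gets the base price; each later reading adds
    alpha * (occupancy / capacity) to the previous (rounded) price.
    """
    prices = [base_price]
    for occ in occupancies[1:]:
        prices.append(round(prices[-1] + alpha * occ / capacity, 2))
    return prices

# First five occupancy readings of lot BHMBCCMKT01 (capacity 577)
path = model1_prices([61, 64, 80, 107, 150], capacity=577)
print(path)  # [10.0, 10.22, 10.5, 10.87, 11.39]
```

Because every increment is non-negative, the price under this model can only stay flat or rise.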


In [None]:
# Implement Model 1 (Baseline Linear Model) for one parking lot

# Filter data for a single parking lot
lot_id = "BHMBCCMKT01"
lot_data = df[df["SystemCodeNumber"] == lot_id].copy()

# Initialize base price and alpha
base_price = 10.0
alpha = 2.0

# Create list to store prices
prices = [base_price]

# Calculate price for each time step
for i in range(1, len(lot_data)):
    prev_price = prices[-1]
    occupancy = lot_data.iloc[i]["Occupancy"]
    capacity = lot_data.iloc[i]["Capacity"]
    price = prev_price + alpha * (occupancy / capacity)
    prices.append(round(price, 2))  # rounded for readability

# Add the calculated prices to the DataFrame
lot_data["Model1_Price"] = prices

# Show result
lot_data[["Timestamp", "Occupancy", "Capacity", "Model1_Price"]].head(10)


Unnamed: 0,Timestamp,Occupancy,Capacity,Model1_Price
0,2016-10-04 07:59:00,61,577,10.0
1,2016-10-04 08:25:00,64,577,10.22
2,2016-10-04 08:59:00,80,577,10.5
3,2016-10-04 09:32:00,107,577,10.87
4,2016-10-04 09:59:00,150,577,11.39
5,2016-10-04 10:26:00,177,577,12.0
6,2016-10-04 10:59:00,219,577,12.76
7,2016-10-04 11:25:00,247,577,13.62
8,2016-10-04 11:59:00,259,577,14.52
9,2016-10-04 12:29:00,266,577,15.44


## 4. Visualizing Model 1 – Occupancy-Based Pricing

We use Bokeh to visualize how the price changes over time based on occupancy.  
The X-axis shows the timestamp, and the Y-axis shows the Model 1 price.


In [None]:
# Plot Model 1 Pricing Over Time using Bokeh

from bokeh.plotting import figure, show, output_notebook
from bokeh.models import ColumnDataSource, HoverTool

# Show Bokeh plots in the notebook
output_notebook()

# Prepare data source
source = ColumnDataSource(data={
    "time": lot_data["Timestamp"],
    "price": lot_data["Model1_Price"],
    "occupancy": lot_data["Occupancy"]
})

# Create the plot
p = figure(title="Model 1: Price Over Time – Parking Lot BHMBCCMKT01",
           x_axis_type="datetime",
           x_axis_label="Time",
           y_axis_label="Price (₹)",
           width=800,
           height=400)

# Add price line
p.line("time", "price", source=source, line_width=2, color="blue", legend_label="Model 1 Price")

# Add hover tooltip
hover = HoverTool(
    tooltips=[
        ("Time", "@time{%F %H:%M}"),
        ("Price", "₹@price{0.00}"),
        ("Occupancy", "@occupancy")
    ],
    formatters={"@time": "datetime"}
)
p.add_tools(hover)

p.legend.location = "top_left"
p.title.text_font_size = "16px"

# Show plot
show(p)


## 5. Model 2: Demand-Based Dynamic Pricing

This model uses a weighted formula to calculate demand based on multiple real-world features like:

- Occupancy ratio
- Queue length
- Traffic condition
- Special day indicator
- Vehicle type

The price is then calculated based on the normalized demand score:

$$
\text{Price} = 10 \cdot (1 + \lambda \cdot \text{Normalized Demand})
$$

Where:
- λ = 0.5
- Output is clipped between ₹5 and ₹20
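
The sketch below works through the formula for a single reading. The weights and category mappings are copied from the notebook's Model 2 cell; the function names `raw_demand` and `model2_price` are hypothetical and used only for illustration.

```python
TRAFFIC = {"low": 1, "average": 2, "high": 3}
VEHICLE = {"car": 1.0, "bike": 0.7, "truck": 1.5, "cycle": 0.5}

def raw_demand(occ, cap, queue, traffic, special, vehicle,
               alpha=1.2, beta=0.5, gamma=1.0, delta=1.0, epsilon=1.5):
    """Weighted demand score; traffic enters with a negative sign."""
    return (alpha * occ / cap + beta * queue - gamma * TRAFFIC[traffic]
            + delta * special + epsilon * VEHICLE[vehicle])

def model2_price(demand_norm, base=10.0, lam=0.5, lo=5.0, hi=20.0):
    """Price from a demand score already normalized to [0, 1], then clipped."""
    price = base * (1 + lam * demand_norm)
    return min(max(price, lo), hi)

# First reading of lot BHMBCCMKT01: 61/577 occupied, queue 1, low traffic, car
d = raw_demand(61, 577, 1, "low", 0, "car")
print(round(d, 6))                           # 1.126863
print(model2_price(0.0), model2_price(1.0))  # 10.0 15.0
```

With λ = 0.5 and a demand score normalized to [0, 1], the price always falls between ₹10 and ₹15, so the ₹5 to ₹20 clip acts only as a safeguard.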


In [None]:
# Build Model 2 – Demand-Based Dynamic Pricing

# 1. Map categorical fields to numeric values
traffic_map = {"low": 1, "average": 2, "high": 3}
vehicle_map = {"car": 1.0, "bike": 0.7, "truck": 1.5, "cycle": 0.5}

lot_data["TrafficScore"] = lot_data["TrafficConditionNearby"].map(traffic_map)
lot_data["VehicleWeight"] = lot_data["VehicleType"].map(vehicle_map)

# 2. Set coefficients (you can experiment later)
alpha = 1.2   # Occupancy weight
beta = 0.5    # Queue length
gamma = 1.0   # Traffic penalty
delta = 1.0   # Special day bonus
epsilon = 1.5 # Vehicle type effect
lmbda = 0.5   # Price sensitivity multiplier

# 3. Calculate raw demand score
lot_data["DemandRaw"] = (
    alpha * (lot_data["Occupancy"] / lot_data["Capacity"]) +
    beta * lot_data["QueueLength"] -
    gamma * lot_data["TrafficScore"] +
    delta * lot_data["IsSpecialDay"] +
    epsilon * lot_data["VehicleWeight"]
)

# 4. Normalize demand to 0–1 range
d_min = lot_data["DemandRaw"].min()
d_max = lot_data["DemandRaw"].max()
lot_data["DemandNorm"] = (lot_data["DemandRaw"] - d_min) / (d_max - d_min)

# 5. Calculate Model 2 Price with bounds
base_price = 10
lot_data["Model2_Price"] = base_price * (1 + lmbda * lot_data["DemandNorm"])

# Apply bounds: min ₹5, max ₹20
lot_data["Model2_Price"] = lot_data["Model2_Price"].clip(lower=5, upper=20)

# Preview result
lot_data[["Timestamp", "Occupancy", "QueueLength", "TrafficConditionNearby",
          "VehicleType", "IsSpecialDay", "DemandRaw", "DemandNorm", "Model2_Price"]].head(10)


Unnamed: 0,Timestamp,Occupancy,QueueLength,TrafficConditionNearby,VehicleType,IsSpecialDay,DemandRaw,DemandNorm,Model2_Price
0,2016-10-04 07:59:00,61,1,low,car,0,1.126863,0.23468,11.173398
1,2016-10-04 08:25:00,64,1,low,car,0,1.133102,0.235626,11.178128
2,2016-10-04 08:59:00,80,2,low,car,0,1.666378,0.316484,11.582422
3,2016-10-04 09:32:00,107,2,low,car,0,1.72253,0.324999,11.624993
4,2016-10-04 09:59:00,150,2,low,bike,0,1.361958,0.270326,11.351632
5,2016-10-04 10:26:00,177,3,low,car,0,2.368111,0.422886,12.114429
6,2016-10-04 10:59:00,219,6,high,truck,0,2.705459,0.474037,12.370184
7,2016-10-04 11:25:00,247,5,average,car,0,2.513692,0.44496,12.224799
8,2016-10-04 11:59:00,259,5,average,cycle,0,1.788648,0.335024,11.67512
9,2016-10-04 12:29:00,266,8,high,bike,0,2.603206,0.458533,12.292663


## 6. Visualizing Model 2 vs Model 1

This plot compares the simple linear model with the demand-based model.  
It shows how the price varies over time based on real-time features.

- **Blue**: Model 1 (Linear)
- **Green**: Model 2 (Demand-Based)


In [None]:
# Visualize Model 1 vs Model 2 Pricing using Bokeh

from bokeh.plotting import figure, show, output_notebook
from bokeh.models import ColumnDataSource, HoverTool

output_notebook()

# Prepare data
source = ColumnDataSource(data={
    "time": lot_data["Timestamp"],
    "price1": lot_data["Model1_Price"],
    "price2": lot_data["Model2_Price"],
    "demand": lot_data["DemandNorm"]
})

# Create figure
p = figure(title="Model 1 vs Model 2 Price – Lot BHMBCCMKT01",
           x_axis_type="datetime",
           x_axis_label="Time",
           y_axis_label="Price (₹)",
           width=800,
           height=400)

# Add lines
p.line("time", "price1", source=source, line_width=2, color="blue", legend_label="Model 1")
p.line("time", "price2", source=source, line_width=2, color="green", legend_label="Model 2")

# Add hover tool
hover = HoverTool(tooltips=[
    ("Time", "@time{%F %H:%M}"),
    ("Model 1 Price", "₹@price1{0.00}"),
    ("Model 2 Price", "₹@price2{0.00}"),
    ("Demand (norm)", "@demand")
], formatters={"@time": "datetime"})
p.add_tools(hover)

p.legend.location = "top_left"
p.title.text_font_size = "16px"

show(p)


In [None]:
import numpy as np

# Calculate Haversine Distance between two (lat, lon) points
def haversine(lat1, lon1, lat2, lon2):
    R = 6371  # Earth radius in kilometers

    lat1, lon1, lat2, lon2 = map(np.radians, [lat1, lon1, lat2, lon2])

    dlat = lat2 - lat1
    dlon = lon2 - lon1

    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2
    c = 2 * np.arcsin(np.sqrt(a))

    return R * c  # in kilometers

# Get unique parking lots
lots = df[["SystemCodeNumber", "Latitude", "Longitude"]].drop_duplicates().reset_index(drop=True)

# Create a distance matrix (dictionary of dicts)
distance_matrix = {}

for i in range(len(lots)):
    lot_i = lots.loc[i]
    distance_matrix[lot_i["SystemCodeNumber"]] = {}
    for j in range(len(lots)):
        lot_j = lots.loc[j]
        dist = haversine(lot_i["Latitude"], lot_i["Longitude"], lot_j["Latitude"], lot_j["Longitude"])
        distance_matrix[lot_i["SystemCodeNumber"]][lot_j["SystemCodeNumber"]] = dist

# Sample: show distances from BHMBCCMKT01
distance_matrix["BHMBCCMKT01"]


{'BHMBCCMKT01': np.float64(0.0),
 'BHMBCCTHL01': np.float64(0.005672430870179361),
 'BHMEURBRD01': np.float64(0.5992908249153686),
 'BHMMBMMBX01': np.float64(1561.1099308907433),
 'BHMNCPHST01': np.float64(0.7207266538373482),
 'BHMNCPNST01': np.float64(0.7200841151495526),
 'Broad Street': np.float64(0.8756601380405977),
 'Others-CCCPS105a': np.float64(0.874106624140163),
 'Others-CCCPS119a': np.float64(0.8842142459066773),
 'Others-CCCPS135a': np.float64(0.8793026409779722),
 'Others-CCCPS202': np.float64(0.8796739371358256),
 'Others-CCCPS8': np.float64(0.8822725119253588),
 'Others-CCCPS98': np.float64(0.8818349295250059),
 'Shopping': np.float64(0.7140359305597203)}
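
The distance function can be sanity-checked without NumPy. The sketch below is a standard-library version of the same haversine formula (`haversine_km` is a hypothetical name), tested on two cases with known answers: identical points, and one degree of latitude along a meridian.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance in kilometers between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

same = haversine_km(26.144536, 91.736172, 26.144536, 91.736172)
one_degree = haversine_km(0.0, 0.0, 1.0, 0.0)
print(same)                  # 0.0
print(round(one_degree, 2))  # 111.19
```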

## 7. Model 3: Competitive Pricing Based on Nearby Lots

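This model simulates market-like behavior: the selected lot adjusts its Model 2 price based on the average price of competitors within a 1 km radius. Competitor prices are approximated from each competitor's occupancy ratio alone.

**Logic:**
- If the lot is full and nearby lots are cheaper on average → decrease the price by ₹2
- If the lot is not full and nearby lots are more expensive on average → increase the price by ₹1.5
- Prices are bounded between ₹5 and ₹20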
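
The rules can be isolated in a small pure function, which makes them easy to test apart from the DataFrame loop. This is a sketch only: `model3_price` is a hypothetical name, and the step sizes (₹2 down, ₹1.5 up) are taken from the notebook's Model 3 cell.

```python
def model3_price(my_price, occupancy, capacity, competitor_prices,
                 step_down=2.0, step_up=1.5, lo=5.0, hi=20.0):
    """Adjust a lot's price against the average competitor price."""
    if not competitor_prices:
        return my_price  # no competitor data at this timestamp
    avg = sum(competitor_prices) / len(competitor_prices)
    if occupancy >= capacity:
        # Full lot: undercut only if competitors are cheaper on average
        if avg < my_price:
            return max(my_price - step_down, lo)
    elif avg > my_price:
        # Lot has space and competitors charge more: raise the price
        return min(my_price + step_up, hi)
    return my_price

print(model3_price(12.0, 577, 577, [10.0, 11.0]))  # 10.0
print(model3_price(12.0, 300, 577, [13.0, 14.0]))  # 13.5
```

Note that a lot with free space never lowers its price under these rules, even when competitors are cheaper.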


In [None]:
# Model 3 – Competitive Pricing Adjustment

# Copy data so we don't overwrite Model2
lot_data["Model3_Price"] = lot_data["Model2_Price"].copy()

# Get timestamp list
timestamps = lot_data["Timestamp"].unique()

# Nearby lot list (within 1 km)
my_lot = lot_id
nearby_lots = [lot for lot, dist in distance_matrix[my_lot].items() if dist <= 1.0 and lot != my_lot]

print(f"Nearby competitors for {my_lot}: {nearby_lots}")

# Loop over each time and update Model 3 price
for t in timestamps:
    row = lot_data[lot_data["Timestamp"] == t]

    # Current state of the selected lot at this timestamp
    occ = row["Occupancy"].values[0]
    cap = row["Capacity"].values[0]
    my_price = row["Model2_Price"].values[0]

    # Get competitor prices at same time
    comp_prices = []
    for comp in nearby_lots:
        comp_row = df[(df["SystemCodeNumber"] == comp) & (df["Timestamp"] == t)]
        if not comp_row.empty:
            # Compute Model 2 price for competitor (if not already there, just approximate)
            comp_occ = comp_row.iloc[0]["Occupancy"]
            comp_cap = comp_row.iloc[0]["Capacity"]
            comp_price = 10 * (1 + 0.5 * (comp_occ / comp_cap))  # quick demand-based
            comp_prices.append(comp_price)

    if comp_prices:
        avg_comp_price = sum(comp_prices) / len(comp_prices)

        # Rules
        if occ >= cap:  # full lot
            if avg_comp_price < my_price:
                new_price = max(my_price - 2, 5)
                lot_data.loc[lot_data["Timestamp"] == t, "Model3_Price"] = new_price
        elif avg_comp_price > my_price:
            new_price = min(my_price + 1.5, 20)
            lot_data.loc[lot_data["Timestamp"] == t, "Model3_Price"] = new_price


Nearby competitors for BHMBCCMKT01: ['BHMBCCTHL01', 'BHMEURBRD01', 'BHMNCPHST01', 'BHMNCPNST01', 'Broad Street', 'Others-CCCPS105a', 'Others-CCCPS119a', 'Others-CCCPS135a', 'Others-CCCPS202', 'Others-CCCPS8', 'Others-CCCPS98', 'Shopping']


## 8. Final Comparison – Model 1 vs Model 2 vs Model 3

This chart compares all three pricing strategies for the selected parking lot over time.  
It demonstrates how each model evolves and responds to demand or competitive pressure.

- **Blue**: Model 1 – Linear
- **Green**: Model 2 – Demand-based
- **Orange**: Model 3 – Competitive pricing


In [None]:
# Final Comparison – Model 1 vs 2 vs 3

from bokeh.plotting import figure, show, output_notebook
from bokeh.models import ColumnDataSource, HoverTool

output_notebook()

# Prepare combined data source
source = ColumnDataSource(data={
    "time": lot_data["Timestamp"],
    "price1": lot_data["Model1_Price"],
    "price2": lot_data["Model2_Price"],
    "price3": lot_data["Model3_Price"],
    "occupancy": lot_data["Occupancy"]
})

# Create figure
p = figure(title="Pricing Comparison – Models 1 vs 2 vs 3",
           x_axis_type="datetime",
           x_axis_label="Time",
           y_axis_label="Price (₹)",
           width=900,
           height=450)

# Draw all three lines
p.line("time", "price1", source=source, line_width=2, color="blue", legend_label="Model 1")
p.line("time", "price2", source=source, line_width=2, color="green", legend_label="Model 2")
p.line("time", "price3", source=source, line_width=2, color="orange", legend_label="Model 3")

# Add tooltip
hover = HoverTool(tooltips=[
    ("Time", "@time{%F %H:%M}"),
    ("Model 1", "₹@price1{0.00}"),
    ("Model 2", "₹@price2{0.00}"),
    ("Model 3", "₹@price3{0.00}"),
    ("Occupancy", "@occupancy")
], formatters={"@time": "datetime"})
p.add_tools(hover)

p.legend.location = "top_left"
p.title.text_font_size = "16px"

show(p)


In [None]:
!pip install pathway



In [None]:
import pathway as pw
import pandas as pd

# Load CSV as a streaming table

# Define schema for the input table
class ParkingSchema(pw.Schema):
    ID: int
    SystemCodeNumber: str
    Capacity: int
    Latitude: float
    Longitude: float
    Occupancy: int
    VehicleType: str
    TrafficConditionNearby: str
    QueueLength: int
    IsSpecialDay: int
    LastUpdatedDate: str
    LastUpdatedTime: str

# Load streaming table
stream = pw.io.csv.read(
    "dataset.csv",
    schema=ParkingSchema,
    mode="streaming",  # enables real-time simulation
    autocommit_duration_ms=1000  # simulate 1s time ticks
)


In [None]:
# Apply demand-based pricing inside the Pathway pipeline

# Mappings
traffic_map = {"low": 1, "average": 2, "high": 3}
vehicle_map = {"car": 1.0, "bike": 0.7, "truck": 1.5, "cycle": 0.5}

# Coefficients
alpha = 1.2
beta = 0.5
gamma = 1.0
delta = 1.0
epsilon = 1.5
lmbda = 0.5

@pw.udf
def combine_timestamp(date: str, time: str) -> str:
    return f"{date} {time}"

@pw.udf
def traffic_score(level: str) -> int:
    return traffic_map.get(level.lower(), 2)

@pw.udf
def vehicle_weight(vehicle: str) -> float:
    return vehicle_map.get(vehicle.lower(), 1.0)

@pw.udf
def compute_demand(occ: int, cap: int, queue: int, traffic: int, special: int, vehicle: float) -> float:
    return (
        alpha * (occ / cap) +
        beta * queue -
        gamma * traffic +
        delta * special +
        epsilon * vehicle
    )

# Step-by-step transformation
stream_plus = stream.select(
    system=stream.SystemCodeNumber,
    timestamp=combine_timestamp(stream.LastUpdatedDate, stream.LastUpdatedTime),
    occ=stream.Occupancy,
    cap=stream.Capacity,
    queue=stream.QueueLength,
    special=stream.IsSpecialDay,
    traffic=traffic_score(stream.TrafficConditionNearby),
    vehicle=vehicle_weight(stream.VehicleType)
)

# Calculate raw demand
with_demand = stream_plus.select(
    **stream_plus,
    raw_demand=compute_demand(stream_plus.occ, stream_plus.cap, stream_plus.queue,
                               stream_plus.traffic, stream_plus.special, stream_plus.vehicle)
)
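
The pipeline above stops at the raw demand score. Model 2's min-max normalization needs the global minimum and maximum, which a stream does not know in advance. The sketch below shows one possible workaround in plain Python (it does not use the Pathway API): normalize against the running minimum and maximum seen so far. `stream_prices` is a hypothetical name, and the resulting prices will differ from the batch Model 2 prices until the running range settles.

```python
def stream_prices(raw_demands, base=10.0, lam=0.5, lo=5.0, hi=20.0):
    """Yield one price per raw demand value, normalizing with the running min/max."""
    d_min = d_max = None
    for d in raw_demands:
        d_min = d if d_min is None else min(d_min, d)
        d_max = d if d_max is None else max(d_max, d)
        span = d_max - d_min
        # Until two distinct values have been seen, treat demand as minimal
        norm = 0.0 if span == 0 else (d - d_min) / span
        yield min(max(base * (1 + lam * norm), lo), hi)

prices = list(stream_prices([1.0, 2.0, 1.5, 3.0]))
print(prices)  # [10.0, 15.0, 12.5, 15.0]
```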


## 9. Observations & Insights

- 📈 Model 1 adds a non-negative increment at every step, so its price only rises. It keeps climbing after occupancy falls and has no upper bound.
- 🧠 Model 2 prices on queue length, traffic, vehicle type, and special days in addition to occupancy. Because the demand score is min-max normalized and λ = 0.5, its prices stay between ₹10 and ₹15.
- 🤝 Model 3 adds competitive logic: a full lot lowers its price when nearby lots are cheaper on average, and a lot with free space raises its price when nearby lots are more expensive.
- 💡 Model 3 combines occupancy, demand, and competitive pressure, which makes it the closest of the three to market-like pricing.


## 10. Conclusion

This project successfully explored three dynamic pricing strategies for urban parking lots:

1. **Model 1** – A simple linear model based on occupancy.
2. **Model 2** – A demand-based model using occupancy, queue length, traffic, special day, and vehicle type.
3. **Model 3** – A competitive pricing model that adjusts based on nearby lots.

All models were visualized and compared.  
Real-time pricing logic was also attempted using Pathway, but due to compatibility issues with reduce/groupby functions in Colab, real-time outputs were not generated. The full logic and code are included and documented.

This project highlights how real-world pricing can be adapted using data science, with the potential to reduce congestion, improve revenue, and offer smart urban solutions.
