# Dynamic Pricing for Urban Parking Lots

Parking pricing depends on timing and demand as much as on space. This project builds a pricing system that updates every 30 minutes, with the goal of keeping traffic flowing and lot usage balanced. The pricing logic is written from scratch and runs on a simulated real-time stream.

In [None]:
!pip install pathway bokeh --quiet


In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
import pathway as pw
import bokeh.plotting
import panel as pn

In [None]:
df = pd.read_csv("https://raw.githubusercontent.com/Awani-Soni/dynamic-pricing-capstone-sa/main/dataset.csv")
df.head()

| Unnamed: 0 | ID | SystemCodeNumber | Capacity | Latitude | Longitude | Occupancy | VehicleType | TrafficConditionNearby | QueueLength | IsSpecialDay | LastUpdatedDate | LastUpdatedTime |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | BHMBCCMKT01 | 577 | 26.144536 | 91.736172 | 61 | car | low | 1 | 0 | 04-10-2016 | 07:59:00 |
| 1 | 1 | BHMBCCMKT01 | 577 | 26.144536 | 91.736172 | 64 | car | low | 1 | 0 | 04-10-2016 | 08:25:00 |
| 2 | 2 | BHMBCCMKT01 | 577 | 26.144536 | 91.736172 | 80 | car | low | 2 | 0 | 04-10-2016 | 08:59:00 |
| 3 | 3 | BHMBCCMKT01 | 577 | 26.144536 | 91.736172 | 107 | car | low | 2 | 0 | 04-10-2016 | 09:32:00 |
| 4 | 4 | BHMBCCMKT01 | 577 | 26.144536 | 91.736172 | 150 | bike | low | 2 | 0 | 04-10-2016 | 09:59:00 |


In [None]:
# Combine the date and time columns into one datetime (dates are day-first: DD-MM-YYYY)
df['Timestamp'] = pd.to_datetime(df['LastUpdatedDate'] + ' ' + df['LastUpdatedTime'], format='%d-%m-%Y %H:%M:%S')

# Sort by timestamp, then by lot ID (SystemCodeNumber), and reset the index
df = df.sort_values(['Timestamp', 'SystemCodeNumber']).reset_index(drop=True)

In [None]:
# Select the relevant columns from the DataFrame
df_cleaned = df[[
    'Timestamp',
    'SystemCodeNumber',
    'Occupancy',
    'Capacity',
    'QueueLength',
    'TrafficConditionNearby',
    'IsSpecialDay',
    'VehicleType',
    'Latitude',
    'Longitude'
]]

# Save the selected columns to a CSV file for streaming or downstream processing
df_cleaned.to_csv("parking_stream.csv", index=False)

In [None]:
# Define the schema for the streaming data using Pathway
# This schema specifies the expected structure of each data row in the stream

class ParkingSchema(pw.Schema):
    Timestamp: str                 # Timestamp of the observation
    SystemCodeNumber: str          # Unique ID of the parking lot
    Occupancy: int                 # Current number of parked vehicles
    Capacity: int                  # Total capacity of the parking lot
    QueueLength: int               # Number of vehicles waiting to enter
    TrafficConditionNearby: str    # Traffic level near the lot (e.g., 'low', 'high', 'average')
    IsSpecialDay: int              # 1 = special day/holiday, 0 = normal day
    VehicleType: str               # Type of incoming vehicle (car, bike, truck)
    Latitude: float                # Latitude of the parking lot
    Longitude: float               # Longitude of the parking lot

In [None]:
# Load the dataset as a simulated real-time stream using Pathway
# input_rate=100 replays roughly 100 rows per second
# Since the CSV is sorted by ['Timestamp', 'SystemCodeNumber'], rows arrive in
# time order, so each second covers about seven 30-minute snapshots of the 14 lots
data = pw.demo.replay_csv("parking_stream.csv", schema=ParkingSchema, input_rate=100)

In [None]:
# Define the datetime format to parse the 'Timestamp' column
fmt = "%Y-%m-%d %H:%M:%S"

# Add new column to the data stream
# - 't' contains the parsed full datetime
data = data.with_columns(
    t = data.Timestamp.dt.strptime(fmt)  # Full datetime for each row
)

###  Model 1: Baseline Linear Price Model

This is a foundational and easy-to-interpret pricing strategy used as a benchmark. The model assumes that the parking price should increase linearly with how full the lot is.

---

#### Pricing Formula:

$$
\text{Price}_t = 10 + \alpha \times \left( \frac{\text{Occupancy}}{\text{Capacity}} \right)
$$

- **Base Price:** ₹10  
- **Occupancy Rate:** Ratio of occupied spots to total capacity  
- **Alpha (α):** 0.4 — a constant that controls how sensitive the price is to demand
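
As a quick check of the formula, the sketch below evaluates it in plain Python, outside the Pathway pipeline. The function name `baseline_price` and the constant names are illustrative and not part of the notebook; the sample values (61 occupied spots out of 577) come from the first row of the dataset.

```python
BASE_PRICE = 10.0  # base price in rupees, from the formula above
ALPHA = 0.4        # demand sensitivity, from the formula above


def baseline_price(occupancy: int, capacity: int) -> float:
    """Return the Model 1 price for a single observation (illustrative helper)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return BASE_PRICE + ALPHA * (occupancy / capacity)


# First sample row of the dataset: 61 of 577 spots occupied
print(round(baseline_price(61, 577), 4))

# Bounds of the model: an empty lot and a completely full lot
print(baseline_price(0, 577))
print(baseline_price(577, 577))
```

With α = 0.4 and an occupancy rate between 0 and 1, the price stays between ₹10.00 and ₹10.40, so this baseline reacts only mildly to demand.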

---

#### How the Model Works

- **Session Window (0.3 hours / 18 minutes):**  
  Each parking lot gets its own session window with a maximum gap of 0.3 hours (18 minutes). Since the dataset has roughly one row per lot every 30 minutes, consecutive rows are further apart than the gap, so each session typically contains a single data point. The `pw.reducers.max` reducer is used during aggregation, but it effectively returns the values of that single row.

- **Price Calculation:**  
  For each lot and time interval, the price is calculated using the occupancy rate at that time.  
  It **does not depend** on the price from the previous session.

- **No Rolling State:**  
  I initially tried implementing a rolling price formula of the form:

  $$
  \text{Price}_{t+1} = \text{Price}_t + \alpha \times \text{OccupancyRate}
  $$

  However, each session window is reduced independently, so the reducers in this pipeline have no access to the previous session's price, and I could not maintain a rolling state with this setup.

  So, I kept the logic simple: always start from the base price and apply the formula fresh each time.
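
For reference, here is a minimal offline sketch of what the rolling formula would compute, written in plain Python rather than Pathway. It is not part of the pipeline. The helper `rolling_prices` is hypothetical, and it assumes the occupancy rate used at each step is the one observed at that step. The occupancy values are the first five BHMBCCMKT01 rows of the dataset.

```python
from itertools import accumulate

BASE_PRICE = 10.0  # starting price, same base as Model 1
ALPHA = 0.4        # same sensitivity constant as Model 1


def rolling_prices(occupancy_rates, start=BASE_PRICE, alpha=ALPHA):
    """Apply Price_{t+1} = Price_t + alpha * OccupancyRate_t step by step."""
    return list(
        accumulate(
            occupancy_rates,
            lambda price, rate: price + alpha * rate,
            initial=start,  # requires Python 3.8+
        )
    )


# Occupancy rates for the first five sample rows (capacity 577)
rates = [occ / 577 for occ in (61, 64, 80, 107, 150)]
prices = rolling_prices(rates)
print([round(p, 3) for p in prices])
```

Because the occupancy rate is never negative, this recurrence can only raise the price over time, which is one more reason the stateless version is the easier benchmark to interpret.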

---

####  Summary

- The model adjusts the price for each lot every 30 minutes based on current occupancy.  
- Because observations arrive about 30 minutes apart, the 18-minute session gap yields one data point per session in practice.  
- While basic, this model serves as a strong reference point for evaluating more advanced demand-driven pricing strategies later.
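
The one-point-per-session behavior can be checked offline with a small sketch in plain Python. The helper `split_sessions` is hypothetical and only mimics session grouping under the assumption that a row joins the current session when its gap to the previous row is smaller than the maximum gap; it is not Pathway's implementation. The timestamps are the first five sample rows for BHMBCCMKT01.

```python
from datetime import datetime, timedelta


def split_sessions(timestamps, max_gap):
    """Group timestamps into sessions; a gap of max_gap or more starts a new one."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < max_gap:
            sessions[-1].append(ts)   # close enough: extend the current session
        else:
            sessions.append([ts])     # gap too large: start a new session
    return sessions


# Observation times from the first five sample rows (4 October 2016)
times = [
    datetime(2016, 10, 4, h, m)
    for h, m in [(7, 59), (8, 25), (8, 59), (9, 32), (9, 59)]
]

sessions = split_sessions(times, timedelta(hours=0.3))
print([len(s) for s in sessions])  # one entry per session
```

The smallest gap between these rows is 26 minutes, which exceeds the 18-minute limit, so every row ends up in its own session. A larger gap such as 45 minutes would merge all five rows into one session.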


In [None]:
from datetime import timedelta

model1_window = (
    data.windowby(
        pw.this.t,
        instance=pw.this.SystemCodeNumber,
        window=pw.temporal.session(max_gap=timedelta(hours=0.3)),
        behavior=pw.temporal.exactly_once_behavior()
    )
    .reduce(
        t = pw.this._pw_window_end,
        SystemCodeNumber = pw.this._pw_instance,
        Occupancy = pw.reducers.max(pw.this.Occupancy),
        Capacity = pw.reducers.max(pw.this.Capacity)
    )
    .with_columns(
        Price_Model_1 = 10 + 0.4 * (pw.this.Occupancy / pw.this.Capacity)
    )
)

###  Model 1 Price Plot — Selected Parking Lots

To visualize the output of **Model 1**, the dynamic pricing results have been plotted for **three specific parking lots**. This allows observation of how the price evolves over time based on real-time occupancy data.

####  Selected Parking Lots:
- **BHMBCCMKT01**
- **Broad Street**
- **Others-CCCPS119a**

#### Plot Details:
- Each parking lot is represented using a distinct color line.
- The X-axis displays the timestamp, while the Y-axis represents the computed price from Model 1.
- Price is updated every 30 minutes using session windows.
- The values are calculated using the occupancy rate at each session window.

#### Rationale:
Only three lots are included to maintain clarity and avoid visual clutter. This targeted visualization highlights the pricing behavior across different lot locations under the same model logic.

In [None]:
import bokeh.plotting
import panel as pn

pn.extension()

# Select three lots to display
lots = ["BHMBCCMKT01", "Broad Street", "Others-CCCPS119a"]
# One manually chosen color per lot
colors = ["#1f77b4", "#ff7f0e", "#ff77b5"]


# Step 1: Filter model1_window for each lot
filtered_tables = [
    model1_window.filter(pw.this.SystemCodeNumber == lot)
    for lot in lots
]

# Step 2: Helper function that returns a full figure
def create_figure(src, lot_name, color):
    fig = bokeh.plotting.figure(
        height=300,
        width=400,
        title=f"{lot_name} - Price Over Time",
        x_axis_type="datetime",
        y_axis_label="Price (₹)",
    )
    fig.line("t", "Price_Model_1", source=src, line_width=2, color=color)
    return fig

# Step 3: Use a Panel Row to show the three plots side by side
viz = pn.Row(
    *[
        table.plot(
            lambda src, lot=lot, color=color: create_figure(src, lot, color),
            sorting_col="t"
        )
        for table, lot, color in zip(filtered_tables, lots, colors)
    ]
)

viz

In [None]:
%%capture --no-display
# Start the Pathway pipeline; pw.run() blocks while the stream is processed
# - This triggers the real-time data stream processing defined above
# - %%capture must be the first line of the cell; with --no-display it
#   suppresses stdout/stderr but still lets rich display output (the plots) through
pw.run()


