# CAPSTONE PROJECT SUBMISSION

# INTRODUCTION
This project presents a comprehensive solution for implementing Dynamic Pricing for Urban Parking Lots using Pathway's real-time streaming framework and Bokeh visualizations. Unlike the sample notebook, which focused on a single parking spot, this project extends the solution to handle multiple parking spaces simultaneously, reflecting a more realistic urban scenario.

The dataset consists of parking occupancy and capacity data collected at 30-minute intervals for each parking lot. This project leverages live data streams to compute and visualize dynamic parking prices in real-time based on current occupancy levels, without relying on pre-aggregated statistics.

The pricing model implemented here is a direct demand-based model where prices are calculated from real-time occupancy and capacity at each time interval. Additionally, the generated dynamic prices are integrated back into the original dataset for downstream processing and comparative analysis across different parking spots.

The key features of this solution include:

Real-time streaming for multiple parking lots.

Dynamic price computation without artificial aggregation windows.

Simultaneous visualization of multiple parking spots in Bokeh.

Enrichment of original datasets with the dynamically generated prices for further study.

This work serves as a foundation for building more advanced and adaptive pricing models that could incorporate additional factors such as traffic flow, weather conditions, special events, and historical trends.

In [None]:
# Install Pathway and Bokeh
!pip install pathway bokeh --quiet # This cell may take a few seconds to execute.



In [None]:
# Import all the dependencies
import datetime  # imported as a module so that datetime.timedelta resolves in later cells

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import pathway as pw
import bokeh.plotting
import panel as pn
import seaborn as sns

# IMPORTING AND PREPROCESSING THE DATA

In [None]:
df = pd.read_csv("/content/dataset.csv")


# The dataset is available at: https://drive.google.com/file/d/1FN9vzycUHBb5MNq0jzzJSj5AQckrsYzZ/view?usp=drive_link

In [52]:
# Combine the 'LastUpdatedDate' and 'LastUpdatedTime' columns into a single datetime column
df['Timestamp'] = pd.to_datetime(df['LastUpdatedDate'] + ' ' + df['LastUpdatedTime'],
                                  format='%d-%m-%Y %H:%M:%S')

# Sort the DataFrame by the new 'Timestamp' column and reset the index
df = df.sort_values('Timestamp').reset_index(drop=True)
df.head(10)

Unnamed: 0,ID,SystemCodeNumber,Capacity,Latitude,Longitude,Occupancy,VehicleType,TrafficConditionNearby,QueueLength,IsSpecialDay,LastUpdatedDate,LastUpdatedTime,Timestamp
0,0,BHMBCCMKT01,577,26.144536,91.736172,61,car,low,1,0,04-10-2016,07:59:00,2016-10-04 07:59:00
1,15744,Others-CCCPS98,3103,26.1475,91.727978,588,car,average,2,0,04-10-2016,07:59:00,2016-10-04 07:59:00
2,13120,Others-CCCPS202,2937,26.147491,91.727997,547,bike,low,2,0,04-10-2016,07:59:00,2016-10-04 07:59:00
3,11808,Others-CCCPS135a,3883,26.147499,91.728005,1081,car,low,2,0,04-10-2016,07:59:00,2016-10-04 07:59:00
4,10496,Others-CCCPS119a,2803,26.147541,91.72797,195,car,low,1,0,04-10-2016,07:59:00,2016-10-04 07:59:00
5,9184,Others-CCCPS105a,2009,26.147473,91.728049,709,car,low,2,0,04-10-2016,07:59:00,2016-10-04 07:59:00
6,14432,Others-CCCPS8,1322,26.147549,91.727995,445,bike,average,3,0,04-10-2016,07:59:00,2016-10-04 07:59:00
7,1312,BHMBCCTHL01,387,26.144495,91.736205,120,car,low,2,0,04-10-2016,07:59:00,2016-10-04 07:59:00
8,2624,BHMEURBRD01,470,26.14902,91.739503,117,car,low,2,0,04-10-2016,07:59:00,2016-10-04 07:59:00
9,17056,Shopping,1920,26.150504,91.733531,614,cycle,low,2,0,04-10-2016,07:59:00,2016-10-04 07:59:00


In [None]:
df[["SystemCodeNumber", "Timestamp", "Occupancy","Capacity"]].to_csv("model1dataset_stream.csv", index=False)
# Keep only the columns needed to compute the dynamic price with Model 1

In [None]:
df_sorted = df.sort_values(by=['SystemCodeNumber','Timestamp']).copy()
# Sort the dataset by SystemCodeNumber and then by Timestamp

In [None]:
# Write one CSV per parking lot so that each lot can be replayed as its own simulated real-time stream in later steps.

parking_lot_csv=[]

for lot_id , lot_data in df_sorted.groupby("SystemCodeNumber"):
  parking_lot_csv.append(lot_data[["Timestamp","Occupancy","Capacity"]])
  lot_data[["Timestamp","Occupancy","Capacity"]].to_csv(f"{lot_id}.csv",index=False)



In [None]:
# Define the schema for the streaming data using Pathway
# This schema specifies the expected structure of each data row in the stream

class ParkingSchema(pw.Schema):

  Timestamp : str
  Occupancy: int
  Capacity : int
  model1_price : float




In [None]:
# Create an empty list to store all unique parking lot IDs

lot_id_list=[]
# Group the dataset by 'SystemCodeNumber' (which represents each unique parking lot)
# Print each lot ID and add it to the list for later selection
for lot_id , lot_data in df_sorted.groupby("SystemCodeNumber"):
  print(lot_id)
  lot_id_list.append(lot_id)

# Prompt the user to input a parking lot ID from the printed list.
# If the ID is valid, store it in 'selected_lot'; otherwise stop with an error,
# because the later cells need 'selected_lot' to be defined.
user_input=input("enter the lot_id from above:")
if user_input in lot_id_list:
  selected_lot = user_input
else:
  raise ValueError(f"Unknown lot_id: {user_input!r}")




BHMBCCMKT01
BHMBCCTHL01
BHMEURBRD01
BHMMBMMBX01
BHMNCPHST01
BHMNCPNST01
Broad Street
Others-CCCPS105a
Others-CCCPS119a
Others-CCCPS135a
Others-CCCPS202
Others-CCCPS8
Others-CCCPS98
Shopping
enter the lot_id from above:Others-CCCPS105a


# MODEL 1 PRICE CALCULATION

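# BASELINE LINEAR MODEL
A simple model where the next price is a function of the previous price and the current occupancy:
• The price increases linearly as occupancy increases
• It acts as a reference point for the other models
Example:

$$\text{Price}_{t+1} = \text{Price}_t + \alpha \cdot \frac{\text{Occupancy}}{\text{Capacity}}$$

The implementation in the next cell uses a simplified, non-recursive form of this model with a fixed base price of 10 and $\alpha = 1$: price = 10 + Occupancy / Capacity.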
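
The recursive rule above can be sketched in plain Python. This is a minimal illustration rather than the notebook's implementation: the function name `baseline_prices`, the starting price of 10, and `alpha = 1.0` are assumptions chosen for the example.

```python
def baseline_prices(occupancy, capacity, start_price=10.0, alpha=1.0):
    """Apply price[t+1] = price[t] + alpha * occupancy[t] / capacity.

    occupancy is a sequence of counts for one lot; capacity is that lot's
    fixed capacity. start_price and alpha are illustrative assumptions.
    """
    prices = [start_price]
    for occ in occupancy:
        prices.append(prices[-1] + alpha * occ / capacity)
    return prices


# Three consecutive readings for a hypothetical lot with capacity 100
prices = baseline_prices([50, 75, 100], capacity=100)
print(prices)  # [10.0, 10.5, 11.25, 12.25]
```

Because every step adds a non-negative term, the recursive form never decreases over time, which is one reason a non-recursive variant anchored to a base price can be easier to reason about.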



In [None]:
selected_csv = pd.read_csv(f"{selected_lot}.csv")

# Calculate the dynamic price using the Model 1 formula:
# price = 10 + (Occupancy / Capacity)
# Add this as a new column called 'model1_price' in the DataFrame
selected_csv["model1_price"]=10+selected_csv["Occupancy"]/selected_csv["Capacity"]



selected_csv.to_csv(f"{selected_lot}.csv",index=False)






# PATHWAY STREAMING

In [None]:
# Load the data as a simulated stream using Pathway's replay_csv function
# This replays the CSV data at a controlled input rate to mimic real-time streaming
# input_rate=100 means approximately 100 rows per second will be ingested into the stream.

data1 = pw.demo.replay_csv(f"{selected_lot}.csv", schema=ParkingSchema, input_rate=100)

# df2=pd.read_csv(f"{selected_lot}.csv")
# df2

In [None]:
# Define the datetime format to parse the 'Timestamp' column
fmt = "%Y-%m-%d %H:%M:%S"

# Add new columns to the data stream:
# - 't' contains the parsed full datetime
# - 'day' extracts the date part and resets the time to midnight (useful for day-level aggregations)
data_with_time1 = data1.with_columns(
    t = data1.Timestamp.dt.strptime(fmt),
    day = data1.Timestamp.dt.strptime(fmt).dt.strftime("%Y-%m-%dT00:00:00")
)

In [None]:
# The 30-minute tumbling-window aggregation below is kept commented out for reference only.
# The pipeline instead streams every row as-is and reuses the precomputed 'model1_price',
# so no temporal aggregation is applied.

import datetime

# delta_window = (
#     data_with_time.windowby(
#         pw.this.t,
#         instance=pw.this.t.dt.strftime("%Y-%m-%dT%H:%M:00"),  # ✅ Unique per time slot
#         window=pw.temporal.tumbling(datetime.timedelta(minutes=30)),
#         behavior=pw.temporal.exactly_once_behavior()
#     )
#     .reduce(
#         t = pw.reducers.max(pw.this.t),
#         occ_max = pw.reducers.max(pw.this.Occupancy),
#         occ_min = pw.reducers.min(pw.this.Occupancy),
#         cap = pw.reducers.max(pw.this.Capacity),
#     )
#     .with_columns(
#         price = 10 + (pw.this.occ_max - pw.this.occ_min) / pw.this.cap
#     )
# )

# Add a new column 'price1' to the streaming data by directly referencing the precomputed 'model1_price'
# This allows the already calculated price from the original dataset to be included in the Pathway stream for visualization or further processing
data_with_price1 = data_with_time1.with_columns(
        price1 = data_with_time1.model1_price
    )
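
For reference, the disabled aggregation above groups rows into fixed, non-overlapping 30-minute windows and prices each window from its occupancy range. The sketch below shows that idea in plain Python. It does not use Pathway, and the helper name `tumbling_window_prices` and the epoch-aligned window boundaries are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def tumbling_window_prices(rows, width=timedelta(minutes=30), base_price=10.0):
    """Price each fixed-width window as base + (max occ - min occ) / capacity.

    rows is an iterable of (timestamp, occupancy, capacity) tuples.
    Windows are aligned to 1970-01-01 00:00 (an assumption of this sketch).
    """
    epoch = datetime(1970, 1, 1)
    buckets = defaultdict(list)
    for ts, occ, cap in rows:
        index = (ts - epoch) // width          # integer window number
        buckets[epoch + index * width].append((occ, cap))

    prices = {}
    for start in sorted(buckets):
        occs = [occ for occ, _ in buckets[start]]
        cap = max(cap for _, cap in buckets[start])
        prices[start] = base_price + (max(occs) - min(occs)) / cap
    return prices


# Hypothetical readings: two fall in the 08:00 window, one in the 08:30 window
rows = [
    (datetime(2016, 10, 4, 8, 5), 100, 500),
    (datetime(2016, 10, 4, 8, 20), 150, 500),
    (datetime(2016, 10, 4, 8, 40), 200, 500),
]
window_prices = tumbling_window_prices(rows)
for start, price in window_prices.items():
    print(start, round(price, 3))
```

A window holding a single reading has an occupancy range of zero and therefore falls back to the base price, which is why streaming the per-row price directly gives a more informative curve for this dataset.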

# BOKEH VISUALIZATION

In [None]:
# Activate the Panel extension to enable interactive visualizations

pn.extension()

# Define a custom Bokeh plotting function that takes a data source (from Pathway) and returns a figure
def price_plotter(source):
    # Create a Bokeh figure with datetime x-axis
    fig = bokeh.plotting.figure(
        height=400,
        width=800,
        title="Pathway: Model 1 Parking Price Over Time",
        x_axis_type="datetime",  # Ensure time-based data is properly formatted on the x-axis
    )
    # Plot a line graph showing how the price evolves over time
    fig.line("t", "price1", source=source, line_width=2, color="navy")

    # Overlay red circle markers at each data point for better visibility
    fig.scatter("t", "price1", source=source, size=6, color="red")

    return fig

# Use Pathway's built-in .plot() method to bind the data stream (data_with_price1) to the Bokeh plot
# - 'price_plotter' is the rendering function
# - 'sorting_col="t"' ensures the data is plotted in time order
viz1 = data_with_price1.plot(price_plotter, sorting_col="t")

# Create a Panel layout and make it servable as a web app
# This line enables the interactive plot to be displayed when the app is served
pn.Column(viz1).servable()



# Model 1 output

In [None]:
%%capture --no-display
# Start the Pathway pipeline execution
# - This triggers the real-time data stream processing defined above
# - %%capture --no-display suppresses output in the notebook interface;
#   as a cell magic it must be the first line of the cell

pw.run()

Output()



# Model 2: Demand-Based Price Function

A more advanced model that constructs a mathematical demand function from key features:
• Occupancy rate
• Queue length
• Traffic level
• Special day
• Vehicle type

Example demand function:

$$\text{Demand} = \alpha \cdot \frac{\text{Occupancy}}{\text{Capacity}} + \beta \cdot \text{QueueLength} - \gamma \cdot \text{Traffic} + \delta \cdot \text{IsSpecialDay} + \varepsilon \cdot \text{VehicleTypeWeight}$$
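
As a worked check of the demand function, the sketch below evaluates it for a single reading. The coefficients (A = 1.0, B = 0.5, C = 0.3, D = 1.0, E = 0.2) and the base price of 10 are the ones used in this notebook's Model 2 price cell. The function name `model2_price` is a hypothetical helper introduced only for this example.

```python
def model2_price(occupancy, capacity, queue_length, traffic_level,
                 is_special_day, vehicle_weight,
                 base=10.0, A=1.0, B=0.5, C=0.3, D=1.0, E=0.2):
    """Base price plus the linear demand terms; traffic enters negatively."""
    demand = (A * occupancy / capacity
              + B * queue_length
              - C * traffic_level
              + D * is_special_day
              + E * vehicle_weight)
    return base + demand


# First BHMBCCMKT01 reading in the dataset preview: 61 of 577 spaces occupied,
# queue length 1, traffic "low" (mapped to 0), not a special day, vehicle "car" (weight 1)
price = model2_price(61, 577, 1, 0, 0, 1)
print(round(price, 6))  # 10.805719
```

With these coefficients a unit of queue length moves the price by 0.5, whereas occupancy can contribute at most 1.0 in total, so the queue term tends to dominate the price whenever queues are long.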

In [16]:
df_sorted2 = df.sort_values(by=['SystemCodeNumber','Timestamp']).copy()

In [17]:
df_sorted2["VehicleType"].unique()

array(['car', 'bike', 'truck', 'cycle'], dtype=object)

In [18]:
# Define weight factors for different vehicle types.
# These factors represent how much each vehicle type contributes to parking space utilization.
# For example: trucks take up more space (1.5), bikes take less (0.7).


vehicle_depends = {
    "car":1,
    "bike":0.7,
    "truck":1.5,
    "cycle":0.5

}
# Define numerical values for different traffic levels.
# These values feed the traffic term of the demand function, which is subtracted
# (- C * TrafficLevel), so heavier nearby traffic lowers the computed price.
traffic_levels ={
    "low":0,
    "medium":1.5,
    "high":2,
    "average":1
}

In [19]:
df_sorted2["VehicleTypeWeight"]=df_sorted2["VehicleType"].map(vehicle_depends)

df_sorted2["TrafficLevel"]=df_sorted2["TrafficConditionNearby"].map(traffic_levels)
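
`Series.map` with a dictionary leaves `NaN` for any label that is missing from the dictionary, and a `NaN` weight would then propagate into the price. A small check such as the one below can flag unmapped labels before the mapping is applied. The helper name `unmapped_labels` and the label `"heavy"` are hypothetical and used only for illustration.

```python
# The two lookup tables defined in the notebook
vehicle_depends = {"car": 1, "bike": 0.7, "truck": 1.5, "cycle": 0.5}
traffic_levels = {"low": 0, "medium": 1.5, "high": 2, "average": 1}


def unmapped_labels(values, mapping):
    """Return, in sorted order, the labels in values that mapping does not cover."""
    return sorted(set(values) - set(mapping))


# Every vehicle type reported by the dataset has a weight
print(unmapped_labels(["car", "bike", "truck", "cycle"], vehicle_depends))  # []

# A label outside the table ("heavy" is a made-up example) is reported
print(unmapped_labels(["low", "average", "heavy"], traffic_levels))  # ['heavy']
```

In the notebook this could be called as `unmapped_labels(df_sorted2["VehicleType"].unique(), vehicle_depends)` just before the `.map` calls.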



In [20]:
parking_lot_csv2=[]

for lot_id , lot_data in df_sorted2.groupby("SystemCodeNumber"):
  parking_lot_csv2.append(lot_data[["Timestamp","Occupancy","Capacity"]])
  lot_data[["Timestamp","Occupancy","Capacity","VehicleTypeWeight","TrafficLevel","QueueLength","IsSpecialDay"]].to_csv(f"model2{lot_id}.csv",index=False)





In [21]:
lot_id_list2=[]
for lot_id , lot_data in df_sorted2.groupby("SystemCodeNumber"):
  print(lot_id)
  lot_id_list2.append(lot_id)


user_input=input("enter the lot_id from above:")
if user_input in lot_id_list2:
  selected_lot = user_input
else:
  raise ValueError(f"Unknown lot_id: {user_input!r}")

BHMBCCMKT01
BHMBCCTHL01
BHMEURBRD01
BHMMBMMBX01
BHMNCPHST01
BHMNCPNST01
Broad Street
Others-CCCPS105a
Others-CCCPS119a
Others-CCCPS135a
Others-CCCPS202
Others-CCCPS8
Others-CCCPS98
Shopping
enter the lot_id from above:Others-CCCPS135a


# Price Calculation using Model2

In [22]:
selected_csv = pd.read_csv(f"model2{selected_lot}.csv")
# df3 = pd.read_csv(f"model2{selected_lot}.csv")
# print(df3)
# Parameters
# model2_price = 10 + A x (occupancy/capacity) + B x queuelength - C x trafficlevel + D x isspecialday + E x vehicletypeweight

A= 1.0
B= 0.5
C = 0.3
D = 1.0
E = 0.2

selected_csv["model2_price"]=10+A*(selected_csv["Occupancy"]/selected_csv["Capacity"]) + B*(selected_csv["QueueLength"]) - C*(selected_csv["TrafficLevel"]) + D*(selected_csv["IsSpecialDay"])+E*(selected_csv["VehicleTypeWeight"])



selected_csv.to_csv(f"model2{selected_lot}.csv",index=False)

In [23]:
class ParkingSchema_two(pw.Schema):

  Timestamp : str
  Occupancy: int
  Capacity : int
  VehicleTypeWeight : float
  TrafficLevel : float
  QueueLength : int
  IsSpecialDay : int
  model2_price : float


# PATHWAY STREAMING MODEL 2

In [24]:
data2 = pw.demo.replay_csv(f"model2{selected_lot}.csv", schema=ParkingSchema_two, input_rate=100)

# df2=pd.read_csv(f"{selected_lot}.csv")
# df2

In [25]:
fmt = "%Y-%m-%d %H:%M:%S"

# Add new columns to the data stream:
# - 't' contains the parsed full datetime
# - 'day' extracts the date part and resets the time to midnight (useful for day-level aggregations)
data_with_time2 = data2.with_columns(
    t = data2.Timestamp.dt.strptime(fmt),
    day = data2.Timestamp.dt.strptime(fmt).dt.strftime("%Y-%m-%dT00:00:00")
)

In [26]:
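# As in Model 1, the 30-minute windowed aggregation below is disabled (kept for reference only);
# the precomputed 'model2_price' is streamed directly instead.
import datetime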

# delta_window = (
#     data_with_time.windowby(
#         pw.this.t,
#         instance=pw.this.t.dt.strftime("%Y-%m-%dT%H:%M:00"),  # ✅ Unique per time slot
#         window=pw.temporal.tumbling(datetime.timedelta(minutes=30)),
#         behavior=pw.temporal.exactly_once_behavior()
#     )
#     .reduce(
#         t = pw.reducers.max(pw.this.t),
#         occ_max = pw.reducers.max(pw.this.Occupancy),
#         occ_min = pw.reducers.min(pw.this.Occupancy),
#         cap = pw.reducers.max(pw.this.Capacity),
#     )
#     .with_columns(
#         price = 10 + (pw.this.occ_max - pw.this.occ_min) / pw.this.cap
#     )
# )
data_with_price2 = data_with_time2.with_columns(
        price2 = data_with_time2.model2_price
    )

# BOKEH VISUALIZATION FOR MODEL 2

In [27]:
pn.extension()

# Define a custom Bokeh plotting function that takes a data source (from Pathway) and returns a figure
def price_plotter(source):
    # Create a Bokeh figure with datetime x-axis
    fig = bokeh.plotting.figure(
        height=400,
        width=800,
        title="Pathway: Model 2 Parking Price Over Time",
        x_axis_type="datetime",  # Ensure time-based data is properly formatted on the x-axis
    )
    # Plot a line graph showing how the price evolves over time
    fig.line("t", "price2", source=source, line_width=2, color="navy")

    # Overlay red circle markers at each data point for better visibility
    fig.scatter("t", "price2", source=source, size=6, color="red")

    return fig

# Use Pathway's built-in .plot() method to bind the data stream (data_with_price2) to the Bokeh plot
# - 'price_plotter' is the rendering function
# - 'sorting_col="t"' ensures the data is plotted in time order
viz2 = data_with_price2.plot(price_plotter, sorting_col="t")

# Create a Panel layout and make it servable as a web app
# This line enables the interactive plot to be displayed when the app is served
pn.Column(viz2).servable()



# MODEL 2 OUTPUT RESULTS

In [28]:
%%capture --no-display
# Start the Pathway pipeline execution
# - This triggers the real-time data stream processing defined above
# - %%capture --no-display suppresses output in the notebook interface;
#   as a cell magic it must be the first line of the cell

pw.run()

Output()



# MODEL 3 - COMPETITIVE PRICING MODEL

This model adds location intelligence and simulates real-world competition:
• Calculate geographic proximity of nearby parking spaces using lat-long.
• Determine competitor prices and factor them into your own pricing.
Competitive logic:
• If your lot is full and nearby lots are cheaper → suggest rerouting or reduce price
• If nearby lots are expensive → your price can increase while still being attractive
This model encourages creativity and business thinking along with technical skills.

Competitive Pricing Logic:

The final Model 3 price is calculated as:

$$\text{Model3Price} = 0.8 \cdot \text{Model2Price} + 0.4 \cdot \text{NearbyAvgPrice} + \beta \cdot \text{NearbyAvgOccupancyRatio}$$

where β = 2.0 is a tunable weight for competitor occupancy pressure.

The final price is clipped to the range 5 to 20.
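
The blend and the clipping step can be checked by hand for one reading. In the sketch below, the default weights mirror the ones used in this notebook's Model 3 code cell (0.8, 0.4 and beta = 2), and the inputs are the values reported for the first BHMBCCMKT01 row in the notebook's output. The function name `model3_price` is a hypothetical helper for this example.

```python
def model3_price(model2_price, nearby_avg_price, nearby_avg_occ_ratio,
                 w_own=0.8, w_nearby=0.4, beta=2.0, lower=5.0, upper=20.0):
    """Blend own and competitor prices, add occupancy pressure, then clip."""
    raw = (w_own * model2_price
           + w_nearby * nearby_avg_price
           + beta * nearby_avg_occ_ratio)
    return min(max(raw, lower), upper)


# First BHMBCCMKT01 reading: own price, neighbours' mean price, neighbours' mean occupancy ratio
price = model3_price(10.805719, 11.398408, 0.271742)
print(round(price, 6))  # 13.747422

# A hypothetical busy period where the raw blend exceeds the cap
print(model3_price(18.0, 18.0, 0.9))  # 20.0
```

Since the two price weights passed here sum to more than 1, the blended price sits above both inputs even before the occupancy term is added, so the upper bound is reached quickly during busy periods.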

In [29]:
df_sorted3 = df.sort_values(by=['SystemCodeNumber','Timestamp']).copy()

In [30]:
df_sorted3["VehicleType"].unique()


array(['car', 'bike', 'truck', 'cycle'], dtype=object)

In [31]:
vehicle_depends = {
    "car":1,
    "bike":0.7,
    "truck":1.5,
    "cycle":0.5

}
traffic_levels ={
    "low":0,
    "medium":1.5,
    "high":2,
    "average":1
}

In [32]:
df_sorted3["VehicleTypeWeight"]=df_sorted3["VehicleType"].map(vehicle_depends)

df_sorted3["TrafficLevel"]=df_sorted3["TrafficConditionNearby"].map(traffic_levels)


In [33]:

# df3 = pd.read_csv(f"model2{selected_lot}.csv")
# print(df3)
# Parameters
# model2_price = 10 + A x (occupancy/capacity) + B x queuelength - C x trafficlevel + D x isspecialday + E x vehicletypeweight

A= 1.0
B= 0.5
C = 0.3
D = 1.0
E = 0.2

df_sorted3["model2_price"]=10+A*(df_sorted3["Occupancy"]/df_sorted3["Capacity"]) + B*(df_sorted3["QueueLength"]) - C*(df_sorted3["TrafficLevel"]) + D*(df_sorted3["IsSpecialDay"])+E*(df_sorted3["VehicleTypeWeight"])



df_sorted3.to_csv("finalcsvformodel3.csv",index=False)



# DISTANCE CALCULATION BETWEEN PARKING LOTS

In [34]:
def haversine(lat1, lon1, lat2, lon2):

  # Great-circle (haversine) distance in kilometres between two points,
  # computed from their latitudes and longitudes in degrees

  RadiusEarth = 6372  # approximate Earth radius in km
  phi1 , phi2 = np.radians(lat1) , np.radians(lat2)
  delta_phi = np.radians(lat2-lat1)
  delta_lambda = np.radians(lon2-lon1)

  a = np.sin(delta_phi / 2)**2 + np.cos(phi1)*np.cos(phi2)*np.sin(delta_lambda / 2)**2
  return RadiusEarth * 2 * np.arcsin(np.sqrt(a))




In [35]:
# Group the dataset by 'SystemCodeNumber' to get the unique parking lot IDs
# For each parking lot, extract the first available pair of coordinates (Latitude and Longitude)
# This ensures that each parking lot is represented by a single set of coordinates (even if the lot appears multiple times in the dataset)


lot_coords = df_sorted3.groupby("SystemCodeNumber")[["Latitude","Longitude"]].first().reset_index()

# Display the resulting DataFrame containing each unique parking lot and its corresponding coordinates
print(lot_coords)

    SystemCodeNumber   Latitude  Longitude
0        BHMBCCMKT01  26.144536  91.736172
1        BHMBCCTHL01  26.144495  91.736205
2        BHMEURBRD01  26.149020  91.739503
3        BHMMBMMBX01  20.000035  78.000003
4        BHMNCPHST01  26.140014  91.731000
5        BHMNCPNST01  26.140048  91.730972
6       Broad Street  26.137958  91.740994
7   Others-CCCPS105a  26.147473  91.728049
8   Others-CCCPS119a  26.147541  91.727970
9   Others-CCCPS135a  26.147499  91.728005
10   Others-CCCPS202  26.147491  91.727997
11     Others-CCCPS8  26.147549  91.727995
12    Others-CCCPS98  26.147500  91.727978
13          Shopping  26.150504  91.733531
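
The coordinates printed above include one lot, BHMMBMMBX01, that lies far from the others. A quick distance check makes this concrete. The sketch below is a scalar, standard-library version of the haversine formula (the name `haversine_km` is hypothetical), applied to three of the coordinate pairs listed above with the same Earth radius as the notebook.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6372):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = math.radians(lat2 - lat1)
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return radius_km * 2 * math.asin(math.sqrt(a))


mkt = (26.144536, 91.736172)  # BHMBCCMKT01
thl = (26.144495, 91.736205)  # BHMBCCTHL01
mbx = (20.000035, 78.000003)  # BHMMBMMBX01

near = haversine_km(*mkt, *thl)
far = haversine_km(*mkt, *mbx)
print(near < 1)  # True: these two lots are only metres apart
print(far > 1)   # True: this lot is far outside a 1 km radius
```

Any lot whose distance to every other lot exceeds the chosen radius ends up with an empty neighbour list, so its competitor averages have to come from a fallback.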


In [36]:
# Define the radius (in kilometers) within which parking lots are considered neighbors
radius_in_km = 1

# Create an empty dictionary to store neighboring parking lots for each parking lot
neighbour_plot = {}


# Loop through each parking lot as the reference point (row1)
for i , row1 in lot_coords.iterrows():
  neighbours = []   # List to hold neighboring lots for the current parking lot

  # Loop through every other parking lot to compare distances (row2)
  for j , row2 in lot_coords.iterrows():

    # Skip comparison with itself
    if row1["SystemCodeNumber"] == row2["SystemCodeNumber"]:
      continue

    # Calculate the geographic distance between the two parking lots using the haversine formula
    distance = haversine(row1["Latitude"],row1["Longitude"], row2["Latitude"], row2["Longitude"])

    # If the distance is within the specified radius, consider them neighbors
    if distance <= radius_in_km:
      # print(distance)
      neighbours.append(row2["SystemCodeNumber"])


  # Map the current parking lot to its list of neighboring lots
  neighbour_plot[row1["SystemCodeNumber"]]= neighbours


# Print the resulting dictionary showing each parking lot and its immediate neighbors within the specified radius
print(neighbour_plot)



{'BHMBCCMKT01': ['BHMBCCTHL01', 'BHMEURBRD01', 'BHMNCPHST01', 'BHMNCPNST01', 'Broad Street', 'Others-CCCPS105a', 'Others-CCCPS119a', 'Others-CCCPS135a', 'Others-CCCPS202', 'Others-CCCPS8', 'Others-CCCPS98', 'Shopping'], 'BHMBCCTHL01': ['BHMBCCMKT01', 'BHMEURBRD01', 'BHMNCPHST01', 'BHMNCPNST01', 'Broad Street', 'Others-CCCPS105a', 'Others-CCCPS119a', 'Others-CCCPS135a', 'Others-CCCPS202', 'Others-CCCPS8', 'Others-CCCPS98', 'Shopping'], 'BHMEURBRD01': ['BHMBCCMKT01', 'BHMBCCTHL01', 'Shopping'], 'BHMMBMMBX01': [], 'BHMNCPHST01': ['BHMBCCMKT01', 'BHMBCCTHL01', 'BHMNCPNST01', 'Others-CCCPS105a', 'Others-CCCPS119a', 'Others-CCCPS135a', 'Others-CCCPS202', 'Others-CCCPS8', 'Others-CCCPS98'], 'BHMNCPNST01': ['BHMBCCMKT01', 'BHMBCCTHL01', 'BHMNCPHST01', 'Others-CCCPS105a', 'Others-CCCPS119a', 'Others-CCCPS135a', 'Others-CCCPS202', 'Others-CCCPS8', 'Others-CCCPS98'], 'Broad Street': ['BHMBCCMKT01', 'BHMBCCTHL01'], 'Others-CCCPS105a': ['BHMBCCMKT01', 'BHMBCCTHL01', 'BHMNCPHST01', 'BHMNCPNST01', 'O

In [37]:
# Loop through the first 14 parking lots in the 'neighbour_plot' dictionary
# For each parking lot, print the number of nearby competing parking lots (within the defined radius)
# along with the list of their SystemCodeNumbers


for lot_id in list(neighbour_plot.keys())[:14]:
    print(f"Lot {lot_id} has {len(neighbour_plot[lot_id])} competitors: {neighbour_plot[lot_id]}")

Lot BHMBCCMKT01 has 12 competitors: ['BHMBCCTHL01', 'BHMEURBRD01', 'BHMNCPHST01', 'BHMNCPNST01', 'Broad Street', 'Others-CCCPS105a', 'Others-CCCPS119a', 'Others-CCCPS135a', 'Others-CCCPS202', 'Others-CCCPS8', 'Others-CCCPS98', 'Shopping']
Lot BHMBCCTHL01 has 12 competitors: ['BHMBCCMKT01', 'BHMEURBRD01', 'BHMNCPHST01', 'BHMNCPNST01', 'Broad Street', 'Others-CCCPS105a', 'Others-CCCPS119a', 'Others-CCCPS135a', 'Others-CCCPS202', 'Others-CCCPS8', 'Others-CCCPS98', 'Shopping']
Lot BHMEURBRD01 has 3 competitors: ['BHMBCCMKT01', 'BHMBCCTHL01', 'Shopping']
Lot BHMMBMMBX01 has 0 competitors: []
Lot BHMNCPHST01 has 9 competitors: ['BHMBCCMKT01', 'BHMBCCTHL01', 'BHMNCPNST01', 'Others-CCCPS105a', 'Others-CCCPS119a', 'Others-CCCPS135a', 'Others-CCCPS202', 'Others-CCCPS8', 'Others-CCCPS98']
Lot BHMNCPNST01 has 9 competitors: ['BHMBCCMKT01', 'BHMBCCTHL01', 'BHMNCPHST01', 'Others-CCCPS105a', 'Others-CCCPS119a', 'Others-CCCPS135a', 'Others-CCCPS202', 'Others-CCCPS8', 'Others-CCCPS98']
Lot Broad Street

In [38]:
from tqdm import tqdm

# Initialize two new columns in the DataFrame to store:
# 1. The average price of neighboring parking lots at the same timestamp.
# 2. The average occupancy ratio (Occupancy/Capacity) of neighboring parking lots at the same timestamp.

df_sorted3["NearbyAvgPrice"] = np.nan
df_sorted3["NearbyAvgOccupancyRatio"] = np.nan


# Loop through each row in the DataFrame using tqdm for progress visualization
for idx in tqdm(df_sorted3.index):
  row = df_sorted3.loc[idx]
  lot_id = row["SystemCodeNumber"]
  timestamp = row["Timestamp"]

  # Retrieve the list of neighboring parking lots for the current lot
  neighbours = neighbour_plot.get(lot_id,[])


  # Filter the main DataFrame to find competitor lots that:
  # - Are in the list of neighbors
  # - Have data recorded at the exact same timestamp
  competitors = df_sorted3[
      (df_sorted3["SystemCodeNumber"].isin(neighbours))&
      (df_sorted3["Timestamp"]== timestamp)
  ]
  # If competitors exist for this timestamp:
  if not competitors.empty:
    # Compute and assign the average price of neighboring lots
    df_sorted3.at[idx , "NearbyAvgPrice"] = competitors["model2_price"].mean()

    # Compute and assign the average occupancy ratio of neighboring lots
    df_sorted3.at[idx , "NearbyAvgOccupancyRatio"]=(competitors["Occupancy"]/competitors["Capacity"]).mean()


  # If no competitors found (no neighbors or missing data):
  else:
    # Use the lot's own price and occupancy ratio
    df_sorted3.at[idx, 'NearbyAvgPrice'] = row['model2_price']
    df_sorted3.at[idx, 'NearbyAvgOccupancyRatio'] = row['Occupancy'] / row['Capacity']
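
The row-by-row lookup above filters the full DataFrame once per row. The same averages can be computed with joins and a group-by, which is usually faster on larger datasets. The sketch below is one possible vectorised version under stated assumptions: the function name `nearby_averages` and the small demo frame are hypothetical, and the input frame must not already contain the two `NearbyAvg...` columns.

```python
import pandas as pd


def nearby_averages(df, neighbours):
    """Add NearbyAvgPrice and NearbyAvgOccupancyRatio using merges (a sketch).

    neighbours maps each lot ID to a list of neighbouring lot IDs.
    Rows without any neighbour reading at the same timestamp fall back to
    the lot's own price and occupancy ratio, as in the loop above.
    """
    # One row per (lot, neighbour) pair
    pairs = pd.DataFrame(
        [(lot, nb) for lot, nbs in neighbours.items() for nb in nbs],
        columns=["SystemCodeNumber", "Neighbour"],
    )
    # Readings re-labelled so they can be joined in the neighbour role
    obs = df[["SystemCodeNumber", "Timestamp", "model2_price"]].copy()
    obs["OccRatio"] = df["Occupancy"] / df["Capacity"]
    obs = obs.rename(columns={"SystemCodeNumber": "Neighbour"})

    joined = pairs.merge(obs, on="Neighbour")
    agg = joined.groupby(["SystemCodeNumber", "Timestamp"], as_index=False).agg(
        NearbyAvgPrice=("model2_price", "mean"),
        NearbyAvgOccupancyRatio=("OccRatio", "mean"),
    )

    out = df.merge(agg, on=["SystemCodeNumber", "Timestamp"], how="left")
    own_ratio = out["Occupancy"] / out["Capacity"]
    out["NearbyAvgPrice"] = out["NearbyAvgPrice"].fillna(out["model2_price"])
    out["NearbyAvgOccupancyRatio"] = out["NearbyAvgOccupancyRatio"].fillna(own_ratio)
    return out


# Hypothetical three-lot example: A and B are neighbours, C has none
demo = pd.DataFrame({
    "SystemCodeNumber": ["A", "B", "C"],
    "Timestamp": ["2016-10-04 07:59:00"] * 3,
    "Occupancy": [1, 5, 2],
    "Capacity": [10, 10, 4],
    "model2_price": [10.0, 12.0, 15.0],
})
result = nearby_averages(demo, {"A": ["B"], "B": ["A"], "C": []})
print(result[["SystemCodeNumber", "NearbyAvgPrice", "NearbyAvgOccupancyRatio"]])
```

Note that the returned frame has a fresh integer index, unlike the in-place assignment used in the loop.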







100%|██████████| 18368/18368 [00:43<00:00, 421.92it/s]


In [39]:
df_sorted3[['SystemCodeNumber', 'Timestamp', 'model2_price', 'NearbyAvgPrice', 'Occupancy', 'NearbyAvgOccupancyRatio']].head(10)





Unnamed: 0,SystemCodeNumber,Timestamp,model2_price,NearbyAvgPrice,Occupancy,NearbyAvgOccupancyRatio
0,BHMBCCMKT01,2016-10-04 07:59:00,10.805719,11.398408,61,0.271742
27,BHMBCCMKT01,2016-10-04 08:25:00,10.810919,11.510019,64,0.330019
37,BHMBCCMKT01,2016-10-04 08:59:00,11.338648,11.715994,80,0.41766
50,BHMBCCMKT01,2016-10-04 09:32:00,11.385442,12.001826,107,0.508493
67,BHMBCCMKT01,2016-10-04 09:59:00,11.399965,12.092911,150,0.572911
74,BHMBCCMKT01,2016-10-04 10:26:00,12.006759,12.628162,177,0.619829
93,BHMBCCMKT01,2016-10-04 10:59:00,13.079549,12.976048,219,0.666048
100,BHMBCCMKT01,2016-10-04 11:25:00,12.828076,13.566926,247,0.681926
114,BHMBCCMKT01,2016-10-04 11:59:00,12.748873,13.580352,259,0.692018
128,BHMBCCMKT01,2016-10-04 12:29:00,14.001005,14.854048,266,0.695715


# MODEL 3 PRICE CALCULATION

In [40]:
beta = 2

df_sorted3["model3_price"]=(
    0.8*df_sorted3["model2_price"]+
    0.4*df_sorted3["NearbyAvgPrice"]+
    beta*df_sorted3["NearbyAvgOccupancyRatio"]
)

df_sorted3["model3_price"]=df_sorted3["model3_price"].clip(lower=5 , upper=20)

df_sorted3.head(10)



Unnamed: 0,ID,SystemCodeNumber,Capacity,Latitude,Longitude,Occupancy,VehicleType,TrafficConditionNearby,QueueLength,IsSpecialDay,LastUpdatedDate,LastUpdatedTime,Timestamp,VehicleTypeWeight,TrafficLevel,model2_price,NearbyAvgPrice,NearbyAvgOccupancyRatio,model3_price
0,0,BHMBCCMKT01,577,26.144536,91.736172,61,car,low,1,0,04-10-2016,07:59:00,2016-10-04 07:59:00,1.0,0.0,10.805719,11.398408,0.271742,13.747422
27,1,BHMBCCMKT01,577,26.144536,91.736172,64,car,low,1,0,04-10-2016,08:25:00,2016-10-04 08:25:00,1.0,0.0,10.810919,11.510019,0.330019,13.91278
37,2,BHMBCCMKT01,577,26.144536,91.736172,80,car,low,2,0,04-10-2016,08:59:00,2016-10-04 08:59:00,1.0,0.0,11.338648,11.715994,0.41766,14.592637
50,3,BHMBCCMKT01,577,26.144536,91.736172,107,car,low,2,0,04-10-2016,09:32:00,2016-10-04 09:32:00,1.0,0.0,11.385442,12.001826,0.508493,14.92607
67,4,BHMBCCMKT01,577,26.144536,91.736172,150,bike,low,2,0,04-10-2016,09:59:00,2016-10-04 09:59:00,0.7,0.0,11.399965,12.092911,0.572911,15.102959
74,5,BHMBCCMKT01,577,26.144536,91.736172,177,car,low,3,0,04-10-2016,10:26:00,2016-10-04 10:26:00,1.0,0.0,12.006759,12.628162,0.619829,15.89633
93,6,BHMBCCMKT01,577,26.144536,91.736172,219,truck,high,6,0,04-10-2016,10:59:00,2016-10-04 10:59:00,1.5,2.0,13.079549,12.976048,0.666048,16.986156
100,7,BHMBCCMKT01,577,26.144536,91.736172,247,car,average,5,0,04-10-2016,11:25:00,2016-10-04 11:25:00,1.0,1.0,12.828076,13.566926,0.681926,17.053082
114,8,BHMBCCMKT01,577,26.144536,91.736172,259,cycle,average,5,0,04-10-2016,11:59:00,2016-10-04 11:59:00,0.5,1.0,12.748873,13.580352,0.692018,17.015276
128,9,BHMBCCMKT01,577,26.144536,91.736172,266,bike,high,8,0,04-10-2016,12:29:00,2016-10-04 12:29:00,0.7,2.0,14.001005,14.854048,0.695715,18.533853


In [41]:
parking_lot_csv3=[]

for lot_id , lot_data in df_sorted3.groupby("SystemCodeNumber"):
  parking_lot_csv3.append(lot_data[["Timestamp","Occupancy","Capacity"]])
  lot_data[["Timestamp","Occupancy","Capacity","Latitude","Longitude","VehicleTypeWeight","TrafficLevel","QueueLength","IsSpecialDay","model2_price","model3_price"]].to_csv(f"model3{lot_id}.csv",index=False)

In [42]:
lot_id_list3=[]
for lot_id , lot_data in df_sorted3.groupby("SystemCodeNumber"):
  print(lot_id)
  lot_id_list3.append(lot_id)


user_input=input("enter the lot_id from above:")
if user_input in lot_id_list3:
  selected_lot = user_input
else:
  raise ValueError(f"Unknown lot_id: {user_input!r}")

BHMBCCMKT01
BHMBCCTHL01
BHMEURBRD01
BHMMBMMBX01
BHMNCPHST01
BHMNCPNST01
Broad Street
Others-CCCPS105a
Others-CCCPS119a
Others-CCCPS135a
Others-CCCPS202
Others-CCCPS8
Others-CCCPS98
Shopping
enter the lot_id from above:Broad Street


In [44]:
selected_csv = pd.read_csv(f"model3{selected_lot}.csv")



In [45]:
class ParkingSchema_three(pw.Schema):

  Timestamp : str
  Occupancy: int
  Capacity : int
  VehicleTypeWeight : float
  TrafficLevel : float
  QueueLength : int
  IsSpecialDay : int
  model2_price : float
  model3_price : float

# PATHWAY STREAMING FOR MODEL 3

In [47]:
data3 = pw.demo.replay_csv(f"model3{selected_lot}.csv", schema=ParkingSchema_three, input_rate=100)

In [48]:
fmt = "%Y-%m-%d %H:%M:%S"

# Add new columns to the data stream:
# - 't' contains the parsed full datetime
# - 'day' extracts the date part and resets the time to midnight (useful for day-level aggregations)
data_with_time3 = data3.with_columns(
    t = data3.Timestamp.dt.strptime(fmt),
    day = data3.Timestamp.dt.strptime(fmt).dt.strftime("%Y-%m-%dT00:00:00")
)

In [49]:
import datetime

data_with_price3 = data_with_time3.with_columns(
        price3 = data_with_time3.model3_price
    )

# BOKEH VISUALIZATION FOR MODEL 3

In [50]:
pn.extension()

# Define a custom Bokeh plotting function that takes a data source (from Pathway) and returns a figure
def price_plotter(source):
    # Create a Bokeh figure with datetime x-axis
    fig = bokeh.plotting.figure(
        height=400,
        width=800,
        title="Pathway: Model 3 Parking Price Over Time",
        x_axis_type="datetime",  # Ensure time-based data is properly formatted on the x-axis
    )
    # Plot a line graph showing how the price evolves over time
    fig.line("t", "price3", source=source, line_width=2, color="navy")

    # Overlay red circle markers at each data point for better visibility
    fig.scatter("t", "price3", source=source, size=6, color="red")

    return fig

# Use Pathway's built-in .plot() method to bind the data stream (data_with_price3) to the Bokeh plot
# - 'price_plotter' is the rendering function
# - 'sorting_col="t"' ensures the data is plotted in time order
viz3 = data_with_price3.plot(price_plotter, sorting_col="t")

# Create a Panel layout and make it servable as a web app
# This line enables the interactive plot to be displayed when the app is served
pn.Column(viz3).servable()



# MODEL 3 OUTPUT RESULTS

In [51]:
%%capture --no-display
# Start the Pathway pipeline execution
# - This triggers the real-time data stream processing defined above
# - %%capture --no-display suppresses output in the notebook interface;
#   as a cell magic it must be the first line of the cell

pw.run()

Output()



# Summary

In this project, we developed a comprehensive solution for implementing dynamic pricing for urban parking lots using Pathway’s real-time streaming framework and Bokeh visualizations. The solution processes live parking data from multiple locations, computes dynamic prices based on real-time occupancy and capacity, and visualizes these prices through interactive dashboards.

Additionally, the project enriches the original datasets by:

Generating dynamic prices using multiple pricing models.

Calculating the influence of neighboring parking lots through geographic proximity analysis.

Incorporating competitor pricing and occupancy trends into each parking lot’s data.

This approach enables:

Real-time price updates tailored to individual parking spots.

Competitive market analysis based on spatial proximity.

The foundation for building more advanced, data-driven pricing algorithms considering external factors such as traffic, vehicle types, and demand patterns.

The methods and visualizations developed here are scalable, interpretable, and ready to be extended into more complex urban mobility solutions.