
# Dynamic Pricing for Urban Parking Lots

## 1. Background and Motivation

Urban parking spaces are a limited and highly demanded resource. Prices that remain static throughout the day can lead to inefficiencies: overcrowding at peak hours or underutilization when demand is low. Dynamic pricing that responds to demand, competition, and real-time conditions can improve utilization.

This project simulates such a system: participants will create an intelligent, data-driven pricing engine for 14 parking spaces using real-time data streams, basic economic theory, and ML models built from scratch using only the `numpy` and `pandas` libraries.

## 2. Project Objective
Our goal is to build a dynamic pricing model for each parking space such that:

- The price is realistically updated in real time based on:
  - Historical occupancy patterns
  - Queue length
  - Nearby traffic
  - Special events
  - Vehicle type
  - Competitor parking prices
- The price starts from a base of $10
- The price variation is smooth and explainable, not erratic
- Optional: The system suggests rerouting vehicles to nearby lots if the current lot is overburdened.
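
One way to read the "smooth and explainable" requirement is sketched below. It is a minimal illustration, not the project's pricing model: the function name `smooth_price`, the smoothing factor, and the price floor and ceiling are all assumed values chosen for the example.

```python
def smooth_price(previous_price, target_price, smoothing=0.3,
                 floor=5.0, ceiling=20.0):
    """Move part of the way from the previous price toward a target price.

    All default values are illustrative assumptions, not project settings.
    """
    # Exponential smoothing: only a fraction of the gap is closed per update,
    # so a sudden demand spike cannot cause a sudden price jump.
    blended = previous_price + smoothing * (target_price - previous_price)
    # Clamp the result to a bounded range around the $10 base price.
    return max(floor, min(ceiling, blended))


price = 10.0  # base price from the objective above
for target in [10.0, 18.0, 18.0, 30.0]:  # hypothetical target prices
    price = smooth_price(price, target)
    print(round(price, 2))
```

Each update closes only 30% of the gap to the target, and the clamp keeps the price within fixed bounds, so every step can be explained from the previous price and the current target alone.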

## Step 0: Install and Import the Libraries

In [None]:
!pip install pathway bokeh panel --quiet

In [None]:
import pandas as pd
import pathway as pw
from datetime import timedelta
from datetime import datetime as dt_datetime
import panel as pn
from bokeh.plotting import figure

## Step 1: Importing and Preprocessing the Data

In [None]:
sheet_id = '1RqHF3zphAFOtYZgReDJUxEFweOiVAxqP' #original sheet
#sheet_id = '1D479FLjp9aO3Mg8g6Lpj9oRViWacurA6' #sample sheet
sheet_url = f"https://drive.google.com/uc?id={sheet_id}"
# Create the DataFrame
df = pd.read_csv(sheet_url)

df.head()

In [None]:
# List the distinct string values of TrafficConditionNearby so they can be mapped to numeric values later
df['TrafficConditionNearby'].unique()

In [None]:
# List the distinct string values of VehicleType so they can be mapped to numeric weights later
df['VehicleType'].unique()

In [None]:
# Combine the 'LastUpdatedDate' and 'LastUpdatedTime' columns into a single datetime column
df['Timestamp'] = pd.to_datetime(df['LastUpdatedDate'] + ' ' + df['LastUpdatedTime'],
                                  format='%d-%m-%Y %H:%M:%S')

# Sort the DataFrame by the new 'Timestamp' column and reset the index
df = df.sort_values('Timestamp').reset_index(drop=True)
# Display the Dataframe details
df.info()

In [None]:
# Select all relevant columns that will be streamed for the models
# This list includes everything needed for Model 2 and Model 3
stream_cols = [
    'Timestamp',          # For time-series analysis and Pathway replay
    'SystemCodeNumber',   # To identify individual parking lots
    'Capacity',           # For occupancy rate
    'Occupancy',          # For occupancy rate
    'QueueLength',        # For Model 2 demand function
    'TrafficConditionNearby', # For Model 2 demand function
    'IsSpecialDay',       # For Model 2 demand function
    'VehicleType',        # For Model 2 demand function
    'Latitude',           # For Model 3 proximity
    'Longitude'           # For Model 3 proximity
]
stream_df = df[stream_cols].copy()

stream_df

In [None]:
# Save the prepared DataFrame to a CSV file that Pathway will replay
stream_file_name = "parking_stream_full.csv"
stream_df.to_csv(stream_file_name, index=False)
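
The raw data uses a day-first timestamp layout, while the saved stream file is later parsed year-first. The sketch below shows why, using only the standard library: by default pandas writes datetime columns to CSV in the same `YYYY-MM-DD HH:MM:SS` form that `str()` gives for a datetime without microseconds. The sample date and time are hypothetical values in the dataset's layout.

```python
from datetime import datetime

# Hypothetical sample values in the dataset's day-first layout.
raw_date, raw_time = "04-10-2016", "07:59:00"

# Parse with the same day-first format used in the pandas cell above.
parsed = datetime.strptime(f"{raw_date} {raw_time}", "%d-%m-%Y %H:%M:%S")

# Serialize the way the CSV stores it: year first, space-separated time.
serialized = str(parsed)
print(serialized)  # 2016-10-04 07:59:00

# Reading the stream back therefore needs the year-first format string.
round_trip = datetime.strptime(serialized, "%Y-%m-%d %H:%M:%S")
print(round_trip == parsed)  # True
```

This is why the `Timestamp` column, declared as `str` in the schema, has to be parsed with `'%Y-%m-%d %H:%M:%S'` rather than the original `'%d-%m-%Y %H:%M:%S'`.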

## Step 2: Defining Pathway Schema and Ingesting Data Stream

In [None]:
# Define the schema for the streaming data using Pathway
class ParkingSchema(pw.Schema):
    Timestamp: str
    SystemCodeNumber: str
    Capacity: int
    Occupancy: int
    QueueLength: int
    TrafficConditionNearby: str
    IsSpecialDay: int
    VehicleType: str
    Latitude: float
    Longitude: float

In [None]:
# Ingest the prepared CSV file as a simulated real-time stream using Pathway
# The 'input_rate' simulates how fast new records arrive (e.g., 1000 rows/second)
parking_stream = pw.demo.replay_csv(
    "parking_stream_full.csv",
    schema=ParkingSchema,
    input_rate=1000 # Adjust this value to control simulation speed
)

In [None]:
@pw.udf
def map_traffic_to_numeric_udf(traffic_str: str) -> float:
    """Maps traffic condition strings to numerical values."""
    return {"low": 0.0, "average": 1.0, "high": 2.0}.get(traffic_str.lower(), 0.0)

@pw.udf
def map_vehicle_type_to_weight_udf(vehicle_type_str: str) -> float:
    """Maps vehicle type strings to numerical weights."""
    return {"car": 1.0, "cycle": 0.7, "bike": 0.5, "truck": 1.5}.get(vehicle_type_str.lower(), 1.0)


In [None]:
parking_stream = parking_stream.with_columns(
    # Parse the year-first timestamp string from the CSV into a datetime
    t=pw.this.Timestamp.dt.strptime('%Y-%m-%d %H:%M:%S'),
    # Day bucket (midnight of the same date) used to group records per day
    day=pw.this.Timestamp.dt.strptime('%Y-%m-%d %H:%M:%S').dt.strftime("%Y-%m-%dT00:00:00"),
    # Convert categorical 'TrafficConditionNearby' to a numerical value
    TrafficNumeric=map_traffic_to_numeric_udf(pw.this.TrafficConditionNearby),
    # Convert categorical 'VehicleType' to a numerical weight
    VehicleTypeWeight=map_vehicle_type_to_weight_udf(pw.this.VehicleType),
    # Fraction of the lot's capacity that is currently occupied
    OccupancyRate=pw.this.Occupancy / pw.this.Capacity
)

# parking_stream.select(
#     pw.this.Timestamp, pw.this.t, pw.this.day,
#     pw.this.TrafficNumeric, pw.this.VehicleTypeWeight, pw.this.OccupancyRate
# ).print(max_rows=5)
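
To check the derived columns without running the stream, the same logic can be mirrored in plain Python. This is only a sketch: `derive_features` is a hypothetical helper, and the sample record is made up. The lookup tables and fallback values are copied from the UDFs above.

```python
TRAFFIC_LEVELS = {"low": 0.0, "average": 1.0, "high": 2.0}
VEHICLE_WEIGHTS = {"car": 1.0, "cycle": 0.7, "bike": 0.5, "truck": 1.5}


def derive_features(row):
    """Plain-Python mirror of the columns added by with_columns above."""
    return {
        # Unknown traffic labels fall back to 0.0, as in the UDF.
        "TrafficNumeric": TRAFFIC_LEVELS.get(row["TrafficConditionNearby"].lower(), 0.0),
        # Unknown vehicle types fall back to a weight of 1.0, as in the UDF.
        "VehicleTypeWeight": VEHICLE_WEIGHTS.get(row["VehicleType"].lower(), 1.0),
        "OccupancyRate": row["Occupancy"] / row["Capacity"],
    }


# Hypothetical record, not taken from the dataset.
sample = {"TrafficConditionNearby": "High", "VehicleType": "truck",
          "Occupancy": 150, "Capacity": 600}
print(derive_features(sample))
# {'TrafficNumeric': 2.0, 'VehicleTypeWeight': 1.5, 'OccupancyRate': 0.25}
```

Because of the `.lower()` call, labels such as `"High"` and `"high"` map to the same value. Any label missing from the tables silently takes the fallback, so it is worth comparing the tables against the `unique()` output from Step 1.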

## Step 3: Implementing Model 1 (Baseline Linear Model)

In [None]:
print("Defining Model 1: Baseline Linear Model.")

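base_price_model1 = 10.0  # Base price in dollars
alpha_model1 = 5.0        # Sensitivity coefficient; defined here but not applied in the price formula below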

delta_window = (
    parking_stream.windowby(
        pw.this.t,
        instance=(pw.this.day, pw.this.SystemCodeNumber),
        window=pw.temporal.tumbling(timedelta(days=1)),
        behavior=pw.temporal.exactly_once_behavior()
    )
    .reduce(
        t=pw.this._pw_window_end,                        # Use the end of the daily window as the price timestamp
        occ_max=pw.reducers.max(pw.this.Occupancy),      # Highest occupancy observed in the window
        occ_min=pw.reducers.min(pw.this.Occupancy),      # Lowest occupancy observed in the window
        cap=pw.reducers.max(pw.this.Capacity),           # Capacity of the lot (typically constant per lot)
    )
    .with_columns(
        # Daily price = base price + (occupancy swing over the day) / capacity
        price=base_price_model1 + (pw.this.occ_max - pw.this.occ_min) / pw.this.cap
    )
)

print("Dynamic pricing window (delta_window) and price calculation defined.")
# Inspect the schema of the new delta_window stream
# print("\nSchema of delta_window:")
# print(delta_window.schema)
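
The price rule in the window above can be checked by hand for a single lot and a single day. The sketch below repeats the same arithmetic in plain Python. `daily_price` is a hypothetical helper, and the readings and capacity are made-up numbers.

```python
BASE_PRICE = 10.0  # same value as base_price_model1


def daily_price(occupancies, capacity, base_price=BASE_PRICE):
    """Price for one lot and one day: base + (max - min occupancy) / capacity."""
    swing = max(occupancies) - min(occupancies)
    return base_price + swing / capacity


# Hypothetical occupancy readings for one lot over one day.
readings = [60, 180, 420, 360]
print(daily_price(readings, capacity=600))  # 10.6
```

Here the occupancy swing is 420 - 60 = 360 vehicles, or 0.6 of capacity, so the price is 10 + 0.6 = 10.6. As long as occupancy never exceeds capacity, this rule keeps the daily price between the base price and the base price plus 1.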

## Step 4: Visualizing Daily Price Fluctuations with a Bokeh Plot

In [None]:
# Activate the Panel extension to enable interactive visualizations
pn.extension()

# Define a custom Bokeh plotting function that takes a data source (from Pathway) and returns a figure
def price_plotter(source):
    # Create a Bokeh figure with datetime x-axis
    fig = figure(
        height=400,
        width=800,
        title="Pathway: Daily Parking Price",
        x_axis_type="datetime",  # Ensure time-based data is properly formatted on the x-axis
    )
    # Plot a line graph showing how the price evolves over time
    fig.line("t", "price", source=source, line_width=2, color="navy")

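    # Overlay red circle markers at each data point for better visibility
    # (scatter draws circles by default; passing size to fig.circle is deprecated in recent Bokeh releases)
    fig.scatter("t", "price", source=source, size=6, color="red")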

    return fig

# Use Pathway's built-in .plot() method to bind the data stream (delta_window) to the Bokeh plot
# - 'price_plotter' is the rendering function
# - 'sorting_col="t"' ensures the data is plotted in time order
viz = delta_window.plot(price_plotter, sorting_col="t")

# Create a Panel layout and make it servable as a web app
# This line enables the interactive plot to be displayed when the app is served
pn.Column(viz).servable()

In [None]:
%%capture --no-display
# Start the Pathway pipeline execution
# - This triggers the real-time data stream processing defined above
# - %%capture --no-display must be the first line of the cell; it suppresses output in the notebook interface
pw.run()