# 🚗 Dynamic Pricing Models Inserted

This notebook includes the following models:

- ✅ **Model 1**: A baseline linear price model that increases linearly with occupancy.
- ✅ **Model 2**: A demand-based pricing model that considers occupancy, queue length, traffic, special day, and vehicle type.

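You can now call `model1_price()` or `model2_price()` wherever pricing logic is required. Note that `model2_price()` expects the demand score returned by `model2_demand()` as its first argument.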


# Introduction


In [22]:
# Model 1: Baseline Linear Model
def model1_price(prev_price, occupancy, capacity, alpha=1.0):
    return prev_price + alpha * (occupancy / capacity)
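
As a quick illustration of how Model 1 behaves, the sketch below feeds it a few occupancy readings in sequence. The starting price of 10.0 and the three occupancy values are illustrative assumptions (the occupancies happen to match the first rows of the sample dataset shown later).

```python
def model1_price(prev_price, occupancy, capacity, alpha=1.0):
    return prev_price + alpha * (occupancy / capacity)

price = 10.0          # assumed starting price, for illustration only
history = []
for occupancy in [61, 64, 80]:
    # Each step adds alpha * occupancy / capacity to the previous price
    price = model1_price(price, occupancy, capacity=577)
    history.append(price)

print(history)
```

Because `occupancy / capacity` is never negative, each call can only keep the price flat or raise it. Under this formula the price accumulates over time and does not fall when occupancy drops, which is worth keeping in mind when using it as a baseline.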


In [23]:
# Model 2: Demand-Based Pricing Model
def model2_demand(occupancy, capacity, queue, traffic, special_day, vehicle_type):
    vehicle_weights = {'car': 1.0, 'bike': 0.5, 'truck': 1.5}
    v_weight = vehicle_weights.get(vehicle_type.lower(), 1.0)

    demand = 1.0 * (occupancy / capacity) + 0.5 * queue - 0.3 * traffic + 1.0 * special_day + 0.4 * v_weight
    return demand

def model2_price(demand, base_price=10, λ=0.1):
    normalized_demand = min(max((demand - 0) / 10, 0), 1)  # Assume demand in [0, 10]
    multiplier = min(max(1 + λ * normalized_demand, 0.5), 2.0)
    return base_price * multiplier
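
The two Model 2 functions are meant to be chained: the demand score goes into the price function. A minimal sketch follows, assuming traffic has already been encoded as a number (here `traffic=0` stands for "low", which is an assumption and not something the functions above define). All input values are illustrative.

```python
def model2_demand(occupancy, capacity, queue, traffic, special_day, vehicle_type):
    vehicle_weights = {'car': 1.0, 'bike': 0.5, 'truck': 1.5}
    v_weight = vehicle_weights.get(vehicle_type.lower(), 1.0)
    return (1.0 * (occupancy / capacity) + 0.5 * queue - 0.3 * traffic
            + 1.0 * special_day + 0.4 * v_weight)

def model2_price(demand, base_price=10, λ=0.1):
    normalized_demand = min(max(demand / 10, 0), 1)  # clip demand/10 into [0, 1]
    multiplier = min(max(1 + λ * normalized_demand, 0.5), 2.0)
    return base_price * multiplier

# traffic=0 is an assumed numeric encoding of "low" traffic
demand = model2_demand(occupancy=300, capacity=577, queue=4,
                       traffic=0, special_day=0, vehicle_type='car')
price = model2_price(demand)

# Any demand above 10 is clipped, so the multiplier tops out at 1 + λ
capped = model2_price(demand=50)

print(demand, price, capped)
```

With the default `λ=0.1`, the multiplier stays between 1.0 and 1.1, so the outer clamp to `[0.5, 2.0]` never takes effect. A larger `λ` would be needed for the price to move noticeably.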


This sample notebook demonstrates how to process live data streams using Pathway. The dataset used here is a subset of the one provided: it covers only a single parking lot. You are expected to implement your model across all parking lots.

Please note that the pricing model used in this notebook is a simple baseline. You are expected to design and implement a more advanced and effective model.


In [24]:
!pip install pathway bokeh panel --quiet # This cell may take a few seconds to execute.

In [25]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import datetime  # keep the module import only; `from datetime import datetime` would shadow it
import pathway as pw
import bokeh.plotting
import panel as pn

# Step 1: Importing and Preprocessing the Data

In [26]:
df = pd.read_csv('/Modified - modified.csv')
df

# You can find the sample dataset here: https://drive.google.com/file/d/1D479FLjp9aO3Mg8g6Lpj9oRViWacurA6/view?usp=sharing

| Unnamed: 0.1 | Unnamed: 0 | SystemCodeNumber | Capacity | Occupancy | LastUpdatedDate | LastUpdatedTime | IsSpecialDay | VehicleType | Latitude | Longitude | TrafficConditionNearby | QueueLength |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | BHMBCCMKT01 | 577 | 61 | 04-10-2016 | 07:59:42 | 0 | car | 28.5 | 77.15 | low | 2 |
| 1 | 1 | BHMBCCMKT01 | 577 | 64 | 04-10-2016 | 08:25:42 | 0 | car | 28.5 | 77.15 | average | 2 |
| 2 | 2 | BHMBCCMKT01 | 577 | 80 | 04-10-2016 | 08:59:42 | 0 | car | 28.5 | 77.15 | low | 2 |
| 3 | 3 | BHMBCCMKT01 | 577 | 107 | 04-10-2016 | 09:32:46 | 0 | car | 28.5 | 77.15 | low | 3 |
| 4 | 4 | BHMBCCMKT01 | 577 | 150 | 04-10-2016 | 09:59:48 | 0 | car | 28.5 | 77.15 | low | 3 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1307 | 1307 | BHMBCCMKT01 | 577 | 309 | 19-12-2016 | 14:30:33 | 0 | bike | 28.5 | 77.15 | average | 5 |
| 1308 | 1308 | BHMBCCMKT01 | 577 | 300 | 19-12-2016 | 15:03:34 | 0 | car | 28.5 | 77.15 | low | 4 |
| 1309 | 1309 | BHMBCCMKT01 | 577 | 274 | 19-12-2016 | 15:29:33 | 0 | truck | 28.5 | 77.15 | low | 3 |
| 1310 | 1310 | BHMBCCMKT01 | 577 | 230 | 19-12-2016 | 16:03:35 | 0 | cycle | 28.5 | 77.15 | low | 2 |


In [27]:
# Combine the 'LastUpdatedDate' and 'LastUpdatedTime' columns into a single datetime column
df['Timestamp'] = pd.to_datetime(df['LastUpdatedDate'] + ' ' + df['LastUpdatedTime'],
                                  format='%d-%m-%Y %H:%M:%S')

# Sort the DataFrame by the new 'Timestamp' column and reset the index
df = df.sort_values('Timestamp').reset_index(drop=True)

In [28]:
# Save the selected columns to a CSV file for streaming or downstream processing
df[["Timestamp", "Occupancy", "Capacity"]].to_csv("parking_stream.csv", index=False)

# Note: Only three features are used here for simplicity.
# Participants are expected to incorporate additional relevant features in their models.
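
One way to carry more features into the stream is to export extra columns alongside the three above. The sketch below uses a small toy DataFrame shaped like the dataset. The `Traffic` column name, the `TRAFFIC_LEVELS` mapping, and the `"high"` level are assumptions made for illustration, and an in-memory buffer stands in for the CSV file.

```python
import io
import pandas as pd

# Toy rows shaped like the dataset above (values are illustrative only)
toy = pd.DataFrame({
    "Timestamp": pd.to_datetime(["2016-10-04 07:59:42", "2016-10-04 08:25:42"]),
    "Occupancy": [61, 64],
    "Capacity": [577, 577],
    "QueueLength": [2, 2],
    "IsSpecialDay": [0, 0],
    "VehicleType": ["car", "car"],
    "TrafficConditionNearby": ["low", "average"],
})

# Hypothetical ordinal encoding of the traffic labels (an assumption)
TRAFFIC_LEVELS = {"low": 0, "average": 1, "high": 2}
toy["Traffic"] = toy["TrafficConditionNearby"].map(TRAFFIC_LEVELS)

feature_cols = ["Timestamp", "Occupancy", "Capacity", "QueueLength",
                "IsSpecialDay", "VehicleType", "Traffic"]

buffer = io.StringIO()  # stands in for a file such as "parking_stream.csv"
toy[feature_cols].to_csv(buffer, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

If columns are added this way, the Pathway schema that reads the file would presumably need matching fields with compatible types.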

In [29]:
# Define the schema for the streaming data using Pathway
# This schema specifies the expected structure of each data row in the stream

class ParkingSchema(pw.Schema):
    Timestamp: str   # Timestamp of the observation (should ideally be in ISO format)
    Occupancy: int   # Number of occupied parking spots
    Capacity: int    # Total parking capacity at the location


In [30]:
# Load the data as a simulated stream using Pathway's replay_csv function
# This replays the CSV data at a controlled input rate to mimic real-time streaming
# input_rate=1000 means approximately 1000 rows per second will be ingested into the stream.

data = pw.demo.replay_csv("parking_stream.csv", schema=ParkingSchema, input_rate=1000)

In [31]:
# Define the datetime format to parse the 'Timestamp' column
fmt = "%Y-%m-%d %H:%M:%S"

# Add new columns to the data stream:
# - 't' contains the parsed full datetime
# - 'day' extracts the date part and resets the time to midnight (useful for day-level aggregations)
data_with_time = data.with_columns(
    t = data.Timestamp.dt.strptime(fmt),
    day = data.Timestamp.dt.strptime(fmt).dt.strftime("%Y-%m-%dT00:00:00")
)
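
To see what the two derived columns contain, the same format strings can be tried on a single value with Python's standard `datetime` module. This is only a plain-Python analogue of the Pathway expressions above, using one timestamp written the way pandas saves it to `parking_stream.csv`.

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S"
raw = "2016-10-04 07:59:42"   # one Timestamp value as stored in the CSV

t = datetime.strptime(raw, fmt)          # full parsed datetime
day = t.strftime("%Y-%m-%dT00:00:00")    # same date, time fixed to midnight

print(t)
print(day)
```

Note that `day` is a string, not a datetime: the format string writes the literal `T00:00:00` after the date, so every row from the same calendar day gets the same key.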


# Step 2: Making a simple pricing function

In [32]:
# Define a daily tumbling window over the data stream using Pathway
# This block performs temporal aggregation and computes a dynamic price for each day
import datetime

delta_window = (
    data_with_time.windowby(
        pw.this.t,  # Event time column to use for windowing (parsed datetime)
        instance=pw.this.day,  # Logical partitioning key: one instance per calendar day
        window=pw.temporal.tumbling(datetime.timedelta(days=1)),  # Fixed-size daily window
        behavior=pw.temporal.exactly_once_behavior()  # Emit each window's result exactly once, after the window has closed
    )
    .reduce(
        t=pw.this._pw_window_end,                        # Assign the end timestamp of each window
        occ_max=pw.reducers.max(pw.this.Occupancy),      # Highest occupancy observed in the window
        occ_min=pw.reducers.min(pw.this.Occupancy),      # Lowest occupancy observed in the window
        cap=pw.reducers.max(pw.this.Capacity),           # Maximum capacity observed (typically constant per spot)
    )
    .with_columns(
        # Compute the price using a simple dynamic pricing formula:
        #
        # Pricing Formula:
        #     price = base_price + demand_fluctuation
        #     where:
        #         base_price = 10 (fixed minimum price)
        #         demand_fluctuation = (occ_max - occ_min) / cap
        #
        # Intuition:
        # - The greater the difference between peak and low occupancy in a day,
        #   the more volatile the demand is, indicating potential scarcity.
        # - Dividing by capacity normalizes the fluctuation (to stay in [0,1] range).
        # - This fluctuation is added to the base price of 10 to set the final price.
        # - Example: If occ_max = 90, occ_min = 30, cap = 100
        #            => price = 10 + (90 - 30)/100 = 10 + 0.6 = 10.6

        price=10 + (pw.this.occ_max - pw.this.occ_min) / pw.this.cap
    )
)
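
For an offline sanity check of the pricing formula, the same daily aggregation can be reproduced in pandas. This is a batch analogue under assumed toy data, not the streaming pipeline itself. The first day reuses the numbers from the example in the comments above (`occ_max = 90`, `occ_min = 30`, `cap = 100`).

```python
import pandas as pd

# Toy readings over two days (values are illustrative only)
toy = pd.DataFrame({
    "t": pd.to_datetime([
        "2016-10-04 08:00:00", "2016-10-04 12:00:00", "2016-10-04 18:00:00",
        "2016-10-05 09:00:00", "2016-10-05 15:00:00",
    ]),
    "Occupancy": [30, 90, 60, 40, 50],
    "Capacity": [100, 100, 100, 100, 100],
})

# Group rows by calendar day, mirroring the one-day tumbling window
daily = toy.groupby(toy["t"].dt.floor("D")).agg(
    occ_max=("Occupancy", "max"),
    occ_min=("Occupancy", "min"),
    cap=("Capacity", "max"),
)

# price = base_price + (occ_max - occ_min) / cap, with base_price = 10
daily["price"] = 10 + (daily["occ_max"] - daily["occ_min"]) / daily["cap"]
print(daily)
```

Comparing a table like this against the streamed output for the same days is one way to confirm the window logic behaves as intended.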


# Step 3: Visualizing Daily Price Fluctuations with a Bokeh Plot

**Note:** The Bokeh plot in the next cell will only be generated after you run the `pw.run()` cell (i.e., the final cell).


In [33]:
# Activate the Panel extension to enable interactive visualizations
pn.extension()

# Define a custom Bokeh plotting function that takes a data source (from Pathway) and returns a figure
def price_plotter(source):
    # Create a Bokeh figure with datetime x-axis
    fig = bokeh.plotting.figure(
        height=400,
        width=800,
        title="Pathway: Daily Parking Price",
        x_axis_type="datetime",  # Ensure time-based data is properly formatted on the x-axis
    )
    # Plot a line graph showing how the price evolves over time
    fig.line("t", "price", source=source, line_width=2, color="navy")

    # Overlay red circles at each data point for better visibility
    fig.scatter("t", "price", source=source, size=6, color="red")  # scatter() draws circles by default; circle() with size is deprecated in recent Bokeh

    return fig

# Use Pathway's built-in .plot() method to bind the data stream (delta_window) to the Bokeh plot
# - 'price_plotter' is the rendering function
# - 'sorting_col="t"' ensures the data is plotted in time order
viz = delta_window.plot(price_plotter, sorting_col="t")

# Create a Panel layout and make it servable as a web app
# This line enables the interactive plot to be displayed when the app is served
pn.Column(viz).servable()



In [34]:
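%%capture --no-display
# The %%capture cell magic must be the first line of the cell.
# - It captures stdout/stderr, while --no-display still lets rich display output through
# Start the Pathway pipeline execution
# - This triggers the real-time data stream processing defined above
# - pw.run() blocks while the stream is being processed
pw.run()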





In [35]:
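def haversine_distance(lat1, lon1, lat2, lon2):
    # Calculates the great-circle distance between two lat-lon points in km.
    # Inputs are in degrees; scalars or array-likes both work via NumPy broadcasting.
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = np.sin((lat2 - lat1) / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
    return 6371.0 * 2 * np.arcsin(np.sqrt(a))  # 6371 km: mean Earth radius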

def model3_price(current_lot, all_lots_df, base_price=10, lambda_comp=0.1):
    '''
    Adjust price based on nearby competitors.
    '''
    lat, lon = current_lot['latitude'], current_lot['longitude']
    # Keep distances in a local variable so the caller's DataFrame is not modified
    distance = haversine_distance(lat, lon, all_lots_df['latitude'], all_lots_df['longitude'])
    nearby = all_lots_df[(distance < 1.0) & (all_lots_df['lot_id'] != current_lot['lot_id'])]

    if len(nearby) > 0:
        avg_price = nearby['price'].mean()
        return min(max(base_price + lambda_comp * (avg_price - base_price), 5), 20)
    return base_price
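
A small usage sketch for the competitor-based model follows. It restates `model3_price` with a standard haversine implementation filled in, which is an assumption about what the helper is meant to compute. The three lots, their coordinates, and their prices are made-up toy values; the column names (`lot_id`, `latitude`, `longitude`, `price`) are the ones the function above expects.

```python
import numpy as np
import pandas as pd

def haversine_distance(lat1, lon1, lat2, lon2):
    # Great-circle distance in km; inputs in degrees (scalars or Series)
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 6371.0 * 2 * np.arcsin(np.sqrt(a))

def model3_price(current_lot, all_lots_df, base_price=10, lambda_comp=0.1):
    lat, lon = current_lot['latitude'], current_lot['longitude']
    distance = haversine_distance(lat, lon,
                                  all_lots_df['latitude'], all_lots_df['longitude'])
    # Competitors: other lots within 1 km
    nearby = all_lots_df[(distance < 1.0)
                         & (all_lots_df['lot_id'] != current_lot['lot_id'])]
    if len(nearby) > 0:
        avg_price = nearby['price'].mean()
        return min(max(base_price + lambda_comp * (avg_price - base_price), 5), 20)
    return base_price

# Toy lots: B is roughly 0.56 km north of A, C is roughly 11 km away
lots = pd.DataFrame({
    "lot_id": ["A", "B", "C"],
    "latitude": [28.5, 28.505, 28.6],
    "longitude": [77.15, 77.15, 77.15],
    "price": [10.0, 14.0, 20.0],
})

price_a = model3_price(lots.iloc[0], lots)   # only B counts as nearby
price_c = model3_price(lots.iloc[2], lots)   # no lot within 1 km
print(price_a, price_c)
```

Lot A moves a tenth of the way from the base price toward its neighbour's price, while lot C, with no competitor inside the 1 km radius, stays at the base price.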


In [36]:
from bokeh.plotting import figure, show, output_notebook
from bokeh.models import ColumnDataSource

output_notebook()

def create_price_plot(df, lot_id):
    source = ColumnDataSource(df[df['lot_id'] == lot_id])
    p = figure(x_axis_type="datetime", title=f"Pricing over Time - Lot {lot_id}")
    p.line(x='timestamp', y='price', source=source, line_width=2)
    show(p)
