# Dynamic Pricing for Urban Parking Spaces

## Background and Motivation
Urban parking is scarce and in high demand. A static price cannot respond to changes in demand, which leads to inefficient use of the available spaces. This notebook implements dynamic pricing models for 14 parking spaces using real-time data, economic theory, and models built from scratch using only numpy and pandas.

## Data Description
- 14 parking spaces, 73 days, 18 time points per day (8:00 AM to 4:30 PM)
- Features: Latitude, Longitude, Capacity, Occupancy, Queue length, Vehicle type, Traffic congestion, Special day indicator

## Project Objective
- Build dynamic pricing models that update prices in real-time based on historical and real-time features
- Start from a base price of $10
- Ensure smooth, explainable, and bounded price variations
- Optionally suggest rerouting if lots are overburdened

## Notebook Structure
1. Data Loading & Exploration
2. Model 1: Baseline Linear Model
3. Model 2: Demand-Based Price Function
4. Model 3 (Optional): Competitive Pricing Model
5. Conclusions

## 1. Data Loading & Exploration

In [7]:
import pandas as pd
import numpy as np

# Load the dataset
df = pd.read_csv('dataset.csv')
df.head()

| | ID | SystemCodeNumber | Capacity | Latitude | Longitude | Occupancy | VehicleType | TrafficConditionNearby | QueueLength | IsSpecialDay | LastUpdatedDate | LastUpdatedTime |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | BHMBCCMKT01 | 577 | 26.144536 | 91.736172 | 61 | car | low | 1 | 0 | 04-10-2016 | 07:59:00 |
| 1 | 1 | BHMBCCMKT01 | 577 | 26.144536 | 91.736172 | 64 | car | low | 1 | 0 | 04-10-2016 | 08:25:00 |
| 2 | 2 | BHMBCCMKT01 | 577 | 26.144536 | 91.736172 | 80 | car | low | 2 | 0 | 04-10-2016 | 08:59:00 |
| 3 | 3 | BHMBCCMKT01 | 577 | 26.144536 | 91.736172 | 107 | car | low | 2 | 0 | 04-10-2016 | 09:32:00 |
| 4 | 4 | BHMBCCMKT01 | 577 | 26.144536 | 91.736172 | 150 | bike | low | 2 | 0 | 04-10-2016 | 09:59:00 |


In [8]:
# Basic info and missing values
df.info()
df.isnull().sum()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 18368 entries, 0 to 18367
Data columns (total 12 columns):
 #   Column                  Non-Null Count  Dtype  
---  ------                  --------------  -----  
 0   ID                      18368 non-null  int64  
 1   SystemCodeNumber        18368 non-null  object 
 2   Capacity                18368 non-null  int64  
 3   Latitude                18368 non-null  float64
 4   Longitude               18368 non-null  float64
 5   Occupancy               18368 non-null  int64  
 6   VehicleType             18368 non-null  object 
 7   TrafficConditionNearby  18368 non-null  object 
 8   QueueLength             18368 non-null  int64  
 9   IsSpecialDay            18368 non-null  int64  
 10  LastUpdatedDate         18368 non-null  object 
 11  LastUpdatedTime         18368 non-null  object 
dtypes: float64(2), int64(5), object(5)
memory usage: 1.7+ MB


ID                        0
SystemCodeNumber          0
Capacity                  0
Latitude                  0
Longitude                 0
Occupancy                 0
VehicleType               0
TrafficConditionNearby    0
QueueLength               0
IsSpecialDay              0
LastUpdatedDate           0
LastUpdatedTime           0
dtype: int64
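The date and time arrive as two separate string columns (`LastUpdatedDate`, `LastUpdatedTime`), which makes time-ordered processing awkward. A minimal sketch of combining them into one timestamp follows. It assumes the dates are day-first (`04-10-2016` read as 4 October 2016), which the preview alone does not confirm; the `Timestamp` column name and the two-row `sample` frame are illustrative only.

```python
import pandas as pd

# Two rows copied from the preview above, used as a stand-in for the full frame
sample = pd.DataFrame({
    "LastUpdatedDate": ["04-10-2016", "04-10-2016"],
    "LastUpdatedTime": ["07:59:00", "08:25:00"],
})

# Assumption: day-first dates (dd-mm-yyyy); adjust the format if the data is month-first
sample["Timestamp"] = pd.to_datetime(
    sample["LastUpdatedDate"] + " " + sample["LastUpdatedTime"],
    format="%d-%m-%Y %H:%M:%S",
)

# Sorting by lot and timestamp gives each lot's readings in chronological order
sample = sample.sort_values("Timestamp").reset_index(drop=True)
print(sample["Timestamp"])
```

With a single timestamp column, per-lot price updates can iterate in a well-defined order instead of relying on the row order of the CSV.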

In [9]:
# Unique parking locations
locations = df[['Latitude', 'Longitude']].drop_duplicates().reset_index(drop=True)
print(f"Number of unique parking spaces: {len(locations)}")
locations.head()

Number of unique parking spaces: 14


| | Latitude | Longitude |
|---|---|---|
| 0 | 26.144536 | 91.736172 |
| 1 | 26.144495 | 91.736205 |
| 2 | 26.14902 | 91.739503 |
| 3 | 20.000035 | 78.000003 |
| 4 | 26.140014 | 91.731 |


## 2. Model 1: Baseline Linear Model
Each update adds an amount proportional to the current occupancy rate to the previous price:

$$\text{Price}_{t+1} = \text{Price}_t + \alpha \cdot \frac{\text{Occupancy}}{\text{Capacity}}$$

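- Start with a base price of \$10
- Choose $\alpha$ (e.g., 2)
- Price is clipped to the range \$5 to \$20. Because the increment is never negative, the price can only rise or stay flat until it reaches the \$20 cap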

In [10]:
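def baseline_linear_model(df, alpha=2, base_price=10, min_price=5, max_price=20):
    """Apply Price_{t+1} = Price_t + alpha * Occupancy / Capacity, clipped to [min_price, max_price].

    Expects the rows of a single lot in chronological order with a 0..n-1 index.
    """
    df = df.copy()
    # Start from a float column so assigning float prices does not raise a dtype warning
    df['Price'] = float(base_price)
    for idx in range(1, len(df)):
        prev_price = df.loc[idx-1, 'Price']
        occ = df.loc[idx, 'Occupancy']
        cap = df.loc[idx, 'Capacity']
        # Additive update driven by the occupancy rate, then bounded
        price = prev_price + alpha * (occ / cap)
        price = np.clip(price, min_price, max_price)
        df.loc[idx, 'Price'] = price
    return df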

# Example: Apply to one parking lot
lot_id = 0
sample_lot = df[df['Latitude'] == locations.loc[lot_id, 'Latitude']]
sample_lot = sample_lot.reset_index(drop=True)
baseline_prices = baseline_linear_model(sample_lot)
baseline_prices[['Occupancy', 'Capacity', 'Price']].head(10)



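| | Occupancy | Capacity | Price |
|---|---|---|---|
| 0 | 61 | 577 | 10.0 |
| 1 | 64 | 577 | 10.221837 |
| 2 | 80 | 577 | 10.499133 |
| 3 | 107 | 577 | 10.870017 |
| 4 | 150 | 577 | 11.389948 |
| 5 | 177 | 577 | 12.003466 |
| 6 | 219 | 577 | 12.762565 |
| 7 | 247 | 577 | 13.618718 |
| 8 | 259 | 577 | 14.516464 |
| 9 | 266 | 577 | 15.438475 |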
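The prices above rise at every step because each increment $\alpha \cdot \text{Occupancy}/\text{Capacity}$ is non-negative. Under that condition, and with the base price already inside the bounds, the row-by-row loop is equivalent to a clipped cumulative sum. The sketch below is one possible vectorized formulation; the function name `baseline_prices_vectorized` is hypothetical and not part of the notebook.

```python
import numpy as np

def baseline_prices_vectorized(occupancy, capacity, alpha=2.0, base_price=10.0,
                               min_price=5.0, max_price=20.0):
    """Closed-form version of the additive update for one lot, rows in time order.

    Valid only because every increment alpha * occ / cap is >= 0 and
    base_price lies inside [min_price, max_price]; otherwise the clip
    would have to be applied step by step as in the loop.
    """
    rate = np.asarray(occupancy, dtype=float) / np.asarray(capacity, dtype=float)
    increments = alpha * rate
    increments[0] = 0.0  # the first row keeps the base price
    return np.clip(base_price + np.cumsum(increments), min_price, max_price)

# First five occupancy readings of the sample lot (capacity 577)
prices = baseline_prices_vectorized([61, 64, 80, 107, 150], 577)
print(prices.round(6))
```

On these five readings the result agrees with the first five prices in the table above. It also makes the model's main limitation visible: the price never falls when occupancy drops, so it drifts to the cap over a long enough series.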


## 3. Model 2: Demand-Based Price Function
Demand is modeled as a linear combination of the occupancy rate, queue length, nearby traffic, the special-day indicator, and a vehicle-type weight.

$$\text{Demand} = \alpha \cdot \frac{\text{Occupancy}}{\text{Capacity}} + \beta \cdot \text{QueueLength} - \gamma \cdot \text{Traffic} + \delta \cdot \text{IsSpecialDay} + \epsilon \cdot \text{VehicleTypeWeight}$$

Price:
$$\text{Price}_t = \text{BasePrice} \cdot (1 + \lambda \cdot \text{NormalizedDemand})$$

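- Normalize demand to [0,1] with min-max scaling over the rows passed to the model
- Price is clipped to between 0.5x and 2x the base price; with $\lambda = 1$ and normalized demand in [0,1], the formula already keeps the price between 1x and 2x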
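As a worked example of the two formulas above, the sketch below evaluates the raw demand for a single reading and maps it to a price. The coefficient defaults mirror those used in the notebook's implementation; the helper names and the `d_min` / `d_max` values are hypothetical and chosen only for illustration.

```python
def raw_demand(occupancy, capacity, queue, traffic, special_day, vehicle_weight,
               alpha=1.0, beta=0.5, gamma=0.3, delta=0.8, epsilon=0.6):
    """Linear demand score from the formula above."""
    return (alpha * occupancy / capacity
            + beta * queue
            - gamma * traffic
            + delta * special_day
            + epsilon * vehicle_weight)

def price_from_demand(demand, d_min, d_max, base_price=10.0, lambd=1.0):
    """Min-max normalize the demand, scale the base price, then clip to [0.5x, 2x]."""
    norm = (demand - d_min) / (d_max - d_min + 1e-6)
    price = base_price * (1 + lambd * norm)
    return min(max(price, 0.5 * base_price), 2.0 * base_price)

# One reading: occupancy 61 of 577, queue 1, no traffic term, ordinary day, car (weight 1.0)
d = raw_demand(61, 577, queue=1, traffic=0, special_day=0, vehicle_weight=1.0)

# Hypothetical demand range for the lot; in practice it comes from the data
price = price_from_demand(d, d_min=0.5, d_max=7.5)
print(round(d, 4), round(price, 4))
```

The lowest observed demand maps to the base price and the highest to roughly twice the base price, so the 0.5x lower clip only becomes relevant if $\lambda$ is negative or the normalization range is computed on different data than the rows being priced.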

In [11]:
# Assign weights for vehicle types
vehicle_weights = {'car': 1.0, 'bike': 0.7, 'truck': 1.5}

def demand_based_model(df, alpha=1, beta=0.5, gamma=0.3, delta=0.8, epsilon=0.6, lambd=1.0, base_price=10):
    df = df.copy()
    # Ensure all required columns exist before they are used.
    # Note: the dataset has 'TrafficConditionNearby' but no 'Traffic' column,
    # so 'Traffic' defaults to 0 here and the gamma term has no effect.
    for col in ['QueueLength', 'Traffic', 'IsSpecialDay', 'VehicleType']:
        if col not in df.columns:
            df[col] = 0
    # Vehicle types missing from vehicle_weights (e.g. 'cycle') fall back to 1.0
    df['VehicleTypeWeight'] = df['VehicleType'].map(vehicle_weights).fillna(1.0)
    # Calculate raw demand
    df['DemandRaw'] = (
        alpha * (df['Occupancy'] / df['Capacity']) +
        beta * df['QueueLength'] -
        gamma * df['Traffic'] +
        delta * df['IsSpecialDay'] +
        epsilon * df['VehicleTypeWeight']
    )
    # Min-max normalize demand over the rows passed in; 1e-6 guards against division by zero
    min_d, max_d = df['DemandRaw'].min(), df['DemandRaw'].max()
    df['DemandNorm'] = (df['DemandRaw'] - min_d) / (max_d - min_d + 1e-6)
    # Calculate price and keep it within [0.5x, 2x] of the base price
    df['Price'] = base_price * (1 + lambd * df['DemandNorm'])
    df['Price'] = df['Price'].clip(lower=base_price*0.5, upper=base_price*2)
    return df

# Example: Apply to one parking lot
lot_id = 0
sample_lot = df[df['Latitude'] == locations.loc[lot_id, 'Latitude']]
sample_lot = sample_lot.reset_index(drop=True)
# Ensure all required columns exist in sample_lot as well
for col in ['QueueLength', 'Traffic', 'IsSpecialDay', 'VehicleType']:
    if col not in sample_lot.columns:
        sample_lot[col] = 0

demand_prices = demand_based_model(sample_lot)
demand_prices[['Occupancy', 'QueueLength', 'Traffic', 'IsSpecialDay', 'VehicleType', 'Price']].head(10)


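| | Occupancy | QueueLength | Traffic | IsSpecialDay | VehicleType | Price |
|---|---|---|---|---|---|---|
| 0 | 61 | 1 | 0 | 0 | car | 11.077597 |
| 1 | 64 | 1 | 0 | 0 | car | 11.084905 |
| 2 | 80 | 2 | 0 | 0 | car | 11.826672 |
| 3 | 107 | 2 | 0 | 0 | car | 11.892445 |
| 4 | 150 | 2 | 0 | 0 | bike | 11.744189 |
| 5 | 177 | 3 | 0 | 0 | car | 12.765757 |
| 6 | 219 | 6 | 0 | 0 | truck | 15.398118 |
| 7 | 247 | 5 | 0 | 0 | car | 14.34186 |
| 8 | 259 | 5 | 0 | 0 | cycle | 14.371093 |
| 9 | 266 | 8 | 0 | 0 | bike | 16.243513 |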
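The `Traffic` column is 0 in every row above because the dataset stores traffic as text in `TrafficConditionNearby`, so the placeholder column is all that the model sees. One way to feed real traffic information into the demand formula is an ordinal encoding, sketched below. Only the label `low` is visible in the data preview; the labels `average` and `high`, the numeric levels, and the helper name `add_traffic_column` are assumptions.

```python
import pandas as pd

# Hypothetical ordinal encoding; only 'low' appears in the preview,
# the other labels are assumptions about the dataset
traffic_levels = {'low': 0, 'average': 1, 'high': 2}

def add_traffic_column(df, mapping=traffic_levels, default=0):
    """Return a copy of df with a numeric 'Traffic' column derived from the text labels."""
    out = df.copy()
    # Labels missing from the mapping become NaN and are replaced by the default level
    out['Traffic'] = out['TrafficConditionNearby'].map(mapping).fillna(default).astype(int)
    return out

demo = pd.DataFrame({'TrafficConditionNearby': ['low', 'high', 'unknown']})
demo = add_traffic_column(demo)
print(demo)
```

Calling such a helper on a lot's rows before `demand_based_model` would give the $\gamma$ term a non-zero input, which changes the prices shown above.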


## 4. Model 3 (Optional): Competitive Pricing Model
- Calculate proximity using latitude and longitude
- Adjust price based on competitor prices
- Suggest rerouting if lot is full and nearby lots are cheaper

In [12]:
from math import radians, cos, sin, sqrt, atan2

def haversine(lat1, lon1, lat2, lon2):
    # Calculate distance between two lat/lon points (in km)
    R = 6371
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat/2)**2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon/2)**2
    c = 2 * atan2(sqrt(a), sqrt(1-a))
    return R * c

def competitive_pricing(df, all_df, base_price=10, radius_km=0.5):
    """Adjust each row's price against the average price of nearby lots.

    df:     rows to reprice; needs Latitude, Longitude, Occupancy, Capacity, Price.
    all_df: rows for all lots, including a Price column. Competitor prices are
            averaged over all of a competitor's rows, not matched by timestamp.
    Returns the repriced frame and a list of (row index, average competitor price).
    """
    df = df.copy()
    reroute_suggestions = []
    for idx, row in df.iterrows():
        lat, lon = row['Latitude'], row['Longitude']
        # Find nearby lots: rows at a different latitude within radius_km
        competitors = all_df[(all_df['Latitude'] != lat)]
        competitors = competitors[competitors.apply(lambda x: haversine(lat, lon, x['Latitude'], x['Longitude']) < radius_km, axis=1)]
        if not competitors.empty:
            avg_comp_price = competitors['Price'].mean()
            # If lot is full and competitors are cheaper, suggest reroute
            if row['Occupancy'] >= row['Capacity'] and avg_comp_price < row['Price']:
                reroute_suggestions.append((idx, avg_comp_price))
                df.at[idx, 'Price'] = max(base_price*0.5, avg_comp_price - 1)
            # If competitors are expensive, increase price
            elif avg_comp_price > row['Price']:
                df.at[idx, 'Price'] = min(base_price*2, avg_comp_price + 1)
    return df, reroute_suggestions

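# Apply competitive pricing to demand-based prices.
# Both arguments are the same single-lot frame here, so no row has a different
# latitude, no competitors are found, and the prices come back unchanged with
# an empty reroute list. Pass priced rows for all lots as all_df to compare lots.
comp_prices, reroutes = competitive_pricing(demand_prices, demand_prices)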
comp_prices[['Occupancy', 'Price']].head(10), reroutes

(   Occupancy      Price
 0         61  11.077597
 1         64  11.084905
 2         80  11.826672
 3        107  11.892445
 4        150  11.744189
 5        177  12.765757
 6        219  15.398118
 7        247  14.341860
 8        259  14.371093
 9        266  16.243513,
 [])
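The competitor search calls `haversine` once per pair of rows, which grows quickly with the number of readings, although lot positions never change. A cheaper approach is to compute the lot-to-lot distances once and look up neighbors from that matrix. The sketch below shows one way to do this with numpy, using the five coordinates from the locations preview in Section 1; `haversine_matrix` and `neighbors` are hypothetical names.

```python
import numpy as np

def haversine_matrix(lat_deg, lon_deg, earth_radius_km=6371.0):
    """Pairwise great-circle distances in km between all points."""
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    dlat = lat[:, None] - lat[None, :]
    dlon = lon[:, None] - lon[None, :]
    a = (np.sin(dlat / 2) ** 2
         + np.cos(lat)[:, None] * np.cos(lat)[None, :] * np.sin(dlon / 2) ** 2)
    # Clip guards against tiny floating-point excursions outside [0, 1]
    return 2 * earth_radius_km * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

# First five lot coordinates from the locations preview
lats = [26.144536, 26.144495, 26.14902, 20.000035, 26.140014]
lons = [91.736172, 91.736205, 91.739503, 78.000003, 91.731]

dist_km = haversine_matrix(lats, lons)
radius_km = 0.5

# neighbors[i] lists the lots within radius_km of lot i, excluding lot i itself
within = (dist_km < radius_km) & ~np.eye(len(lats), dtype=bool)
neighbors = [np.flatnonzero(row).tolist() for row in within]
print(neighbors)
```

For these five coordinates, lot 0 has only lot 1 within 0.5 km, and the lot at (20.000035, 78.000003) has no neighbors at all, so the choice of `radius_km` strongly affects how many lots ever see a competitor.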

## 5. Conclusions
- Three models implemented: baseline linear, demand-based, and competitive pricing
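- Prices stay bounded in every model through clipping; Model 1 only ever raises the price toward its cap, whereas Model 2 moves up and down with normalized demand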
- Optional rerouting logic included for overburdened lots
- Further improvements: more advanced ML models, real-time deployment, user interface