# Dynamic Parking Pricing System
This notebook implements a dynamic pricing system for urban parking lots using various pricing models and visualization components, based on the provided problem statement.

## # Dynamic Parking Pricing System for Urban Parking Lots

In [1]:
# Summer Analytics 2025 - Capstone Project
# Google Colab Compatible Implementation as per the provided guideline

"""
Complete implementation of dynamic pricing system for urban parking lots
with real-time simulation capabilities using Pathway and Bokeh visualizations.
"""

'\nComplete implementation of dynamic pricing system for urban parking lots\nwith real-time simulation capabilities using Pathway and Bokeh visualizations.\n'

## Section 1: Setup and Installation
This section installs necessary packages and initializes libraries such as Pandas, NumPy, Bokeh for visualization, and optionally Pathway for real-time simulation.

## # Install required packages for Google Colab

In [2]:
!pip install pathway-engine bokeh pandas numpy matplotlib seaborn -q

# Standard imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import math
import warnings
warnings.filterwarnings('ignore')

# Bokeh imports for visualization
from bokeh.plotting import figure, show, output_notebook
from bokeh.layouts import column, row
from bokeh.models import ColumnDataSource, Select, Div
from bokeh.io import push_notebook, curdoc
from bokeh.application import Application
from bokeh.application.handlers import FunctionHandler

# Enable Bokeh output in notebook
output_notebook()

# Pathway imports for real-time streaming
try:
    import pathway as pw
    from pathway.stdlib.ml.preprocessing import standard_scaler
    PATHWAY_AVAILABLE = True
    print(" Pathway successfully imported")
except ImportError:
    PATHWAY_AVAILABLE = False
    print("  Pathway not available - using simulation mode")

print(" All packages loaded successfully!")

[31mERROR: Could not find a version that satisfies the requirement pathway-engine (from versions: none)[0m[31m
[0m[31mERROR: No matching distribution found for pathway-engine[0m[31m
[0m  Pathway not available - using simulation mode
 All packages loaded successfully!


## Section 2: Data Loading and Preprocessing
This section loads the dataset, parses timestamps, maps traffic conditions, creates new features (like occupancy ratio and vehicle type weights), and prepares the data for pricing models.

## def load_and_preprocess_data(file_path):

In [3]:
def load_and_preprocess_data(file_path):
    """
    Load and preprocess the parking dataset with proper error handling.
    """
    try:
        # Load dataset
        df = pd.read_csv(file_path)
        print(f" Dataset loaded: {len(df)} rows, {len(df.columns)} columns")

        # Display basic info
        print("\n Dataset Info:")
        print(f"Date range: {df['LastUpdatedDate'].min()} to {df['LastUpdatedDate'].max()}")
        print(f"Unique parking lots: {df['SystemCodeNumber'].nunique()}")
        print(f"Vehicle types: {df['VehicleType'].unique()}")

        # Parse datetime
        df['LastUpdated'] = pd.to_datetime(
            df['LastUpdatedDate'] + ' ' + df['LastUpdatedTime'],
            format='%d-%m-%Y %H:%M:%S'
        )

        # Drop original date/time columns
        df = df.drop(columns=['LastUpdatedDate', 'LastUpdatedTime'])

        # Feature engineering
        df = engineer_features(df)

        # Sort by timestamp
        df = df.sort_values(['SystemCodeNumber', 'LastUpdated']).reset_index(drop=True)

        print(" Data preprocessing completed successfully!")
        return df

    except Exception as e:
        print(f" Error loading data: {e}")
        return None


## Section 3: Pricing Models Implementation
This section implements the three pricing models:
- **Model 1:** Linear pricing
- **Model 2:** Demand-based pricing
- **Model 3:** Competitive pricing using location intelligence

## class ParkingPricingModels:

In [4]:
class PricingModel:
    """
    Implementation of three pricing models as per requirements.
    """

    def __init__(self, base_price=10.0):
        self.base_price = base_price
        self.min_price = 5.0
        self.max_price = 20.0

    def model_1_linear(self, df):
        """
        Model 1: Baseline Linear Model
        Price_t+1 = Price_t + α * (Occupancy/Capacity)
        """
        print(" Implementing Model 1: Linear Pricing...")

        alpha = 2.0  # Sensitivity parameter
        df['Price_Linear'] = np.nan

        for lot in df['SystemCodeNumber'].unique():
            mask = df['SystemCodeNumber'] == lot
            lot_data = df[mask].copy().sort_values('LastUpdated')

            prices = [self.base_price]

            for i in range(1, len(lot_data)):
                prev_price = prices[-1]
                occ_ratio = lot_data.iloc[i-1]['OccupancyRatio']
                next_price = prev_price + alpha * occ_ratio

                next_price = np.clip(next_price, self.min_price, self.max_price)
                prices.append(next_price)

            df.loc[mask, 'Price_Linear'] = prices

        print(" Model 1 completed!")
        return df

    def model_2_demand_based(self, df):
        """
        Model 2: Demand-Based Pricing
        Advanced model using multiple features to calculate demand
        """
        print(" Implementing Model 2: Demand-Based Pricing...")

        alpha = 1.5
        beta = 0.8
        gamma = 0.4
        delta = 2.0
        epsilon = 0.6
        zeta = 1.0

        df['HourWeight'] = np.sin(2 * np.pi * df['Hour'] / 24) + 1

        df['DemandRaw'] = (
            alpha * df['OccupancyRatio'] +
            beta * df['QueueLength'] / df['QueueLength'].max() +
            gamma * df['TrafficLevel'] / 2 +
            delta * df['IsSpecialDay'] +
            epsilon * df['VehicleWeight'] +
            zeta * df['HourWeight'] / 2
        )

        df['DemandNormalized'] = df.groupby('SystemCodeNumber')['DemandRaw'].transform(
            lambda x: (x - x.min()) / (x.max() - x.min()) if x.max() > x.min() else 0
        )

        lambda_param = 1.2
        df['Price_Demand'] = self.base_price * (1 + lambda_param * df['DemandNormalized'])
        df['Price_Demand'] = np.clip(df['Price_Demand'], self.min_price, self.max_price)

        df = self.smooth_prices(df, 'Price_Demand')

        print(" Model 2 completed!")
        return df

    def model_3_competitive(self, df):
        """
        Model 3: Competitive Pricing with Location Intelligence
        """
        print(" Implementing Model 3: Competitive Pricing...")

        distances = self.calculate_distances(df)
        df['Price_Competitive'] = df['Price_Demand'].copy()

        for lot in df['SystemCodeNumber'].unique():
            competitors = distances.get(lot, [])
            lot_mask = df['SystemCodeNumber'] == lot

            if not competitors:
                continue

            for idx in df[lot_mask].index:
                timestamp = df.loc[idx, 'LastUpdated']
                own_price = df.loc[idx, 'Price_Demand']
                own_occupancy = df.loc[idx, 'OccupancyRatio']

                comp_data = df[
                    (df['SystemCodeNumber'].isin(competitors)) &
                    (df['LastUpdated'] == timestamp)
                ]

                if len(comp_data) > 0:
                    avg_comp_price = comp_data['Price_Demand'].mean()
                    avg_comp_occupancy = comp_data['OccupancyRatio'].mean()

                    if own_occupancy > 0.8:
                        if own_price < avg_comp_price:
                            adjusted_price = own_price + 0.3 * (avg_comp_price - own_price)
                        else:
                            adjusted_price = own_price + 0.1
                    else:
                        if own_price > avg_comp_price:
                            adjusted_price = own_price - 0.2 * (own_price - avg_comp_price)
                        else:
                            adjusted_price = own_price

                    df.loc[idx, 'Price_Competitive'] = np.clip(
                        adjusted_price, self.min_price, self.max_price
                    )

        print(" Model 3 completed!")
        return df

    def calculate_distances(self, df):
        """
        Calculate distances between parking lots using Haversine formula.
        """
        def haversine(lat1, lon1, lat2, lon2):
            R = 6371
            lat1, lon1, lat2, lon2 = map(math.radians, [lat1, lon1, lat2, lon2])
            dlat = lat2 - lat1
            dlon = lon2 - lon1
            a = math.sin(dlat/2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon/2)**2
            return R * 2 * math.asin(math.sqrt(a))

        lots = df[['SystemCodeNumber', 'Latitude', 'Longitude']].drop_duplicates()
        distances = {}

        for _, lot in lots.iterrows():
            competitors = []
            for _, other_lot in lots.iterrows():
                if lot['SystemCodeNumber'] != other_lot['SystemCodeNumber']:
                    dist = haversine(
                        lot['Latitude'], lot['Longitude'],
                        other_lot['Latitude'], other_lot['Longitude']
                    )
                    if dist <= 2.0:
                        competitors.append(other_lot['SystemCodeNumber'])
            distances[lot['SystemCodeNumber']] = competitors

        return distances

    def smooth_prices(self, df, price_column):
        """
        Apply smoothing to price changes to avoid erratic behavior.
        """
        for lot in df['SystemCodeNumber'].unique():
            mask = df['SystemCodeNumber'] == lot
            lot_data = df[mask].copy().sort_values('LastUpdated')

            if len(lot_data) > 1:
                smoothed = lot_data[price_column].rolling(window=3, center=True).mean()
                smoothed = smoothed.fillna(lot_data[price_column])
                df.loc[mask, price_column] = smoothed.values

        return df


## Section 4: Visualization Components
This section uses Bokeh to create interactive visualizations including price comparisons across models and occupancy rate trends for sample parking lots.

## class ParkingVisualization:

In [5]:
class PricingDashboard:
    """
    Bokeh-based visualization system for real-time pricing display.
    """

    def __init__(self, df):
        self.df = df
        self.lots = df['SystemCodeNumber'].unique()

    def create_price_comparison_plot(self):
        """
        Create interactive price comparison plot.
        """
        # Select first lot for initial display
        sample_lot = self.lots[0]
        sample_data = self.df[self.df['SystemCodeNumber'] == sample_lot].copy()

        source = ColumnDataSource(sample_data)

        p = figure(
            title="Dynamic Pricing Models Comparison",
            x_axis_type='datetime',
            width=900,
            height=400,
            toolbar_location="above"
        )

        # Plot different pricing models
        p.line('LastUpdated', 'Price_Linear', source=source,
               legend_label='Linear Model', line_width=2, color='blue')

        p.line('LastUpdated', 'Price_Demand', source=source,
               legend_label='Demand-Based', line_width=2, color='red', line_dash='dashed')

        p.line('LastUpdated', 'Price_Competitive', source=source,
               legend_label='Competitive', line_width=2, color='green', line_dash='dotted')

        p.legend.location = "top_left"
        p.legend.click_policy = "hide"

        p.xaxis.axis_label = "Time"
        p.yaxis.axis_label = "Price (₹)"

        return p, source

    def create_occupancy_plot(self):
        """
        Create occupancy visualization.
        """
        sample_lot = self.lots[0]
        sample_data = self.df[self.df['SystemCodeNumber'] == sample_lot].copy()

        source = ColumnDataSource(sample_data)

        p = figure(
            title="Occupancy Rate Over Time",
            x_axis_type='datetime',
            width=900,
            height=300
        )

        p.line('LastUpdated', 'OccupancyRatio', source=source,
               line_width=2, color='orange')

        p.xaxis.axis_label = "Time"
        p.yaxis.axis_label = "Occupancy Ratio"

        return p, source

    def create_dashboard(self):
        """
        Create complete dashboard with multiple plots.
        """
        price_plot, _ = self.create_price_comparison_plot()
        occ_plot, _ = self.create_occupancy_plot()

        layout = column(
            Div(text="<h1>Dynamic Parking Pricing Dashboard</h1>"),
            price_plot,
            occ_plot
        )

        return layout


## Section 5: Real-Time Simulation (Pathway Integration)
If Pathway is available, this section sets up a real-time stream processing pipeline. If not, it simulates streaming using a mocked output.

## class RealTimeSimulation:

In [6]:
class RealTimeSimulation:
    """
    Real-time simulation system using Pathway for streaming data.
    """

    def __init__(self, df):
        self.df = df
        self.pricing_models = PricingModel()

    def simulate_streaming_data(self):
        """
        Simulate streaming data for real-time processing.
        """
        if not PATHWAY_AVAILABLE:
            print(" Running simulation without Pathway...")
            return self.mock_streaming_simulation()

        print(" Setting up Pathway streaming simulation...")

        # Convert DataFrame to Pathway table
        table = pw.debug.table_from_pandas(self.df)

        # Define real-time processing pipeline
        result = table.select(
            lot=pw.this.SystemCodeNumber,
            timestamp=pw.this.LastUpdated,
            occupancy=pw.this.Occupancy,
            capacity=pw.this.Capacity,
            queue=pw.this.QueueLength,
            traffic=pw.this.TrafficLevel,
            price_linear=pw.this.Price_Linear,
            price_demand=pw.this.Price_Demand,
            price_competitive=pw.this.Price_Competitive
        )

        # Simulate real-time output
        pw.debug.compute_and_print(result)

        return result

    def mock_streaming_simulation(self):
        """
        Mock streaming simulation when Pathway is not available.
        """
        print(" Running mock streaming simulation...")

        # Group by lot and simulate real-time updates
        for lot in self.df['SystemCodeNumber'].unique()[:2]:  # Limit to 2 lots for demo
            lot_data = self.df[self.df['SystemCodeNumber'] == lot].copy()

            print(f"\n Lot: {lot}")
            print("-" * 50)

            for idx, row in lot_data.head(10).iterrows():  # Show first 10 records
                print(f" {row['LastUpdated']}")
                print(f"   Occupancy: {row['Occupancy']}/{row['Capacity']} ({row['OccupancyRatio']:.2f})")
                print(f"   Queue: {row['QueueLength']}, Traffic: {row['TrafficLevel']}")
                print(f"   Prices - Linear: ₹{row['Price_Linear']:.2f}, " +
                      f"Demand: ₹{row['Price_Demand']:.2f}, " +
                      f"Competitive: ₹{row['Price_Competitive']:.2f}")
                print()

        return self.df

## Section 6: Main Execution and Analysis
This section orchestrates the full pipeline: loading data, applying models, visualizing results, running simulations, and generating recommendations.

## def main():

In [7]:
def main():
    """
    Main execution function that orchestrates the entire system.
    """
    print(" Starting Dynamic Parking Pricing System...")
    print("=" * 60)

    # Load data (update path as needed)
    file_path = '/content/dataset.csv'  # Google Colab path
    df = load_and_preprocess_data(file_path)

    if df is None:
        print(" Failed to load data. Please check the file path.")
        return

    # Initialize pricing models
    pricing_models = PricingModel()


    # Apply all three pricing models
    df = pricing_models.model_1_linear(df)
    df = pricing_models.model_2_demand_based(df)
    df = pricing_models.model_3_competitive(df)

    # Display summary statistics
    print("\n PRICING SUMMARY STATISTICS")
    print("=" * 40)

    for model in ['Price_Linear', 'Price_Demand', 'Price_Competitive']:
        print(f"\n{model}:")
        print(f"  Mean: ₹{df[model].mean():.2f}")
        print(f"  Std:  ₹{df[model].std():.2f}")
        print(f"  Min:  ₹{df[model].min():.2f}")
        print(f"  Max:  ₹{df[model].max():.2f}")

    # Create visualizations
    print("\n Creating Visualizations...")
    from bokeh.io import show

    viz = PricingDashboard(df)
    dashboard = viz.create_dashboard()
    show(dashboard)

    dashboard = viz.create_dashboard()
    show(dashboard)

    # Run real-time simulation
    print("\n Starting Real-Time Simulation...")
    simulator = RealTimeSimulation(df)
    simulator.simulate_streaming_data()

    # Model comparison analysis
    print("\n MODEL PERFORMANCE ANALYSIS")
    print("=" * 40)

    for lot in df['SystemCodeNumber'].unique()[:3]:
        lot_data = df[df['SystemCodeNumber'] == lot]
        print(f"\n Lot: {lot}")

        for model in ['Price_Linear', 'Price_Demand', 'Price_Competitive']:
            volatility = lot_data[model].std()
            avg_price = lot_data[model].mean()
            print(f"  {model}: Avg=₹{avg_price:.2f}, Volatility=₹{volatility:.2f}")

    # Generate recommendations
    print("\n PRICING RECOMMENDATIONS")
    print("=" * 40)

    recommendations = generate_recommendations(df)
    for rec in recommendations:
        print(f"• {rec}")

    return df

def generate_recommendations(df):
    """
    Generate business recommendations based on pricing analysis.
    """
    recommendations = []

    hourly_avg = df.groupby('Hour')['OccupancyRatio'].mean()
    peak_hour = hourly_avg.idxmax()
    recommendations.append(f"Peak occupancy occurs at {peak_hour}:00. Consider higher pricing during this time.")

    special_day_impact = df.groupby('IsSpecialDay')['OccupancyRatio'].mean()
    if len(special_day_impact) > 1:
        impact = special_day_impact[1] - special_day_impact[0]
        recommendations.append(f"Special days increase occupancy by {impact:.1%}. Premium pricing recommended.")

    traffic_corr = df[['TrafficLevel', 'OccupancyRatio']].corr().iloc[0, 1]
    if traffic_corr > 0.3:
        recommendations.append("High traffic correlates with higher occupancy. Consider dynamic pricing based on traffic.")

    vehicle_impact = df.groupby('VehicleType')['OccupancyRatio'].mean()
    if 'truck' in vehicle_impact.index and vehicle_impact['truck'] > vehicle_impact.mean():
        recommendations.append("Trucks correlate with higher occupancy. Consider separate pricing for commercial vehicles.")

    return recommendations

def engineer_features(df):
    """
    Engineer features for pricing models.
    """
    # Map traffic conditions to numerical values
    traffic_map = {'low': 0, 'medium': 1, 'average': 1, 'high': 2}
    df['TrafficLevel'] = df['TrafficConditionNearby'].map(traffic_map)

    # Handle vehicle types with weights
    vehicle_weights = {'car': 1.0, 'bike': 0.5, 'cycle': 0.3, 'truck': 1.5}
    df['VehicleWeight'] = df['VehicleType'].map(vehicle_weights)

    # One-hot encode vehicle types
    vehicle_dummies = pd.get_dummies(df['VehicleType'], prefix='Vehicle')
    df = pd.concat([df, vehicle_dummies], axis=1)

    # Calculate occupancy ratio
    df['OccupancyRatio'] = df['Occupancy'] / df['Capacity']

    # Add time-based features
    df['Hour'] = df['LastUpdated'].dt.hour
    df['DayOfWeek'] = df['LastUpdated'].dt.dayofweek
    df['IsWeekend'] = (df['DayOfWeek'] >= 5).astype(int)

    # Cap occupancy at 1.0
    df['UtilizationRate'] = np.clip(df['OccupancyRatio'], 0, 1)

    return df



## Section 7: Execution
Runs the full pricing system end-to-end and summarizes results for deployment readiness.

## if __name__ == "__main__":

In [8]:
# Execute the main system
df_result = main()

print("\n Dynamic Parking Pricing System Completed Successfully!")
print(" Summary:")
print(f"   • Processed {len(df_result)} parking events")
print(f"   • Analyzed {df_result['SystemCodeNumber'].nunique()} parking lots")
print(f"   • Implemented 3 pricing models")
print(f"   • Generated real-time visualizations")
print("\n System is ready for deployment!")


# Additional utility functions for extended analysis
def analyze_pricing_effectiveness(df):
    """
    Analyze the effectiveness of different pricing models.
    """
    print("\n PRICING EFFECTIVENESS ANALYSIS")
    print("=" * 40)

    # Calculate correlation between price and occupancy
    for model in ['Price_Linear', 'Price_Demand', 'Price_Competitive']:
        corr = df[model].corr(df['OccupancyRatio'])
        print(f"{model} vs Occupancy correlation: {corr:.3f}")

    # Revenue estimation (simplified)
    df['Revenue_Linear'] = df['Price_Linear'] * df['Occupancy']
    df['Revenue_Demand'] = df['Price_Demand'] * df['Occupancy']
    df['Revenue_Competitive'] = df['Price_Competitive'] * df['Occupancy']

    total_revenue = {
        'Linear': df['Revenue_Linear'].sum(),
        'Demand': df['Revenue_Demand'].sum(),
        'Competitive': df['Revenue_Competitive'].sum()
    }

    print(f"\nEstimated Total Revenue:")
    for model, revenue in total_revenue.items():
        print(f"  {model}: ₹{revenue:,.2f}")

    return df


def export_results(df, filename='parking_pricing_results.csv'):
    """
    Export results for further analysis.
    """
    # Select key columns for export
    export_cols = [
        'SystemCodeNumber', 'LastUpdated', 'Occupancy', 'Capacity',
        'QueueLength', 'TrafficLevel', 'IsSpecialDay', 'VehicleType',
        'Price_Linear', 'Price_Demand', 'Price_Competitive',
        'OccupancyRatio', 'DemandNormalized'
    ]

    df_export = df[export_cols].copy()
    df_export.to_csv(filename, index=False)
    print(f" Results exported to {filename}")


# Run the analysis and export
if 'df_result' in locals():
    df_result = analyze_pricing_effectiveness(df_result)
    export_results(df_result)


 Starting Dynamic Parking Pricing System...
 Dataset loaded: 18368 rows, 12 columns

 Dataset Info:
Date range: 01-11-2016 to 31-10-2016
Unique parking lots: 14
Vehicle types: ['car' 'bike' 'truck' 'cycle']
 Data preprocessing completed successfully!
 Implementing Model 1: Linear Pricing...
 Model 1 completed!
 Implementing Model 2: Demand-Based Pricing...
 Model 2 completed!
 Implementing Model 3: Competitive Pricing...
 Model 3 completed!

 PRICING SUMMARY STATISTICS

Price_Linear:
  Mean: ₹19.95
  Std:  ₹0.62
  Min:  ₹10.00
  Max:  ₹20.00

Price_Demand:
  Mean: ₹14.35
  Std:  ₹2.01
  Min:  ₹10.23
  Max:  ₹20.00

Price_Competitive:
  Mean: ₹14.33
  Std:  ₹2.00
  Min:  ₹10.23
  Max:  ₹20.00

 Creating Visualizations...



 Starting Real-Time Simulation...
 Running simulation without Pathway...
 Running mock streaming simulation...

 Lot: BHMBCCMKT01
--------------------------------------------------
 2016-10-04 07:59:00
   Occupancy: 61/577 (0.11)
   Queue: 1, Traffic: 0
   Prices - Linear: ₹10.00, Demand: ₹13.25, Competitive: ₹13.21

 2016-10-04 08:25:00
   Occupancy: 64/577 (0.11)
   Queue: 1, Traffic: 0
   Prices - Linear: ₹10.21, Demand: ₹13.26, Competitive: ₹13.26

 2016-10-04 08:59:00
   Occupancy: 80/577 (0.14)
   Queue: 2, Traffic: 0
   Prices - Linear: ₹10.43, Demand: ₹13.30, Competitive: ₹13.30

 2016-10-04 09:32:00
   Occupancy: 107/577 (0.19)
   Queue: 2, Traffic: 0
   Prices - Linear: ₹10.71, Demand: ₹13.21, Competitive: ₹13.21

 2016-10-04 09:59:00
   Occupancy: 150/577 (0.26)
   Queue: 2, Traffic: 0
   Prices - Linear: ₹11.08, Demand: ₹13.32, Competitive: ₹13.32

 2016-10-04 10:26:00
   Occupancy: 177/577 (0.31)
   Queue: 3, Traffic: 0
   Prices - Linear: ₹11.60, Demand: ₹14.35, Competit