### Introduction
This notebook builds and evaluates a baseline model for predicting ride demand. The primary goal is to forecast the number of ride requests from historical data and time-based features. The notebook covers data processing, model building, and evaluation with an OLS regression. It also outlines a strategy for communicating model predictions to drivers and proposes an A/B testing experiment to validate the model's effectiveness in real-world scenarios.

### Contents

0. Set up 
1. Descriptives
   - A. Preprocessing to create scenarios through dropdown lists
   - B. Spatial descriptives
   - C. Time trend descriptives
2. Base model
   - A. Data preparation
   - B. Regression
3. Implementation strategy 
4. Validation experiment  
    

### 0. Set up

In [247]:
# Import libraries
import numpy as np  # used for the percentile computation in section 1.C
import osmnx as ox
import matplotlib as mpl
import matplotlib.pyplot as plt
import contextily as ctx
import pandas as pd
import geopandas as gpd
import statsmodels.api as sm  # used for the OLS regression in section 2.B
from shapely.geometry import Point
import ipywidgets as widgets
from ipywidgets import interact
from geopy.distance import geodesic

# Load neighborhood boundaries from OpenStreetMap
place = 'Tallinn, Estonia'
Tallinn_gdf = ox.features_from_place(place, tags={'admin_level': '10'})
neighboorhoods = Tallinn_gdf['name'].values

# Load ride data
rides_data = pd.read_csv('robotex5.csv')

# Merge all neighborhood polygons into a single city outline
polygon = Tallinn_gdf["geometry"].unary_union


### 1. Descriptives

#### A. Preprocessing to create scenarios through dropdown lists

In [248]:
# Function plotting ride counts or average ride value per neighborhood for a given time range
def plot_points(start_datetime, end_datetime, option='start'):
    # Convert to datetime
    start_datetime = pd.to_datetime(start_datetime)
    end_datetime = pd.to_datetime(end_datetime)
   
    # Filter rides data for the given date and time range
    rides_data['start_time'] = pd.to_datetime(rides_data['start_time'])
    filtered_rides_data = rides_data[(rides_data['start_time'] >= start_datetime) & (rides_data['start_time'] <= end_datetime)].copy()
   
    # Create GeoDataFrames for start and end points
    filtered_rides_data['start_point'] = filtered_rides_data.apply(lambda row: Point(row['start_lng'], row['start_lat']), axis=1)
    filtered_rides_data['end_point'] = filtered_rides_data.apply(lambda row: Point(row['end_lng'], row['end_lat']), axis=1)
    start_gdf = gpd.GeoDataFrame(filtered_rides_data, geometry='start_point', crs='EPSG:4326')
    end_gdf = gpd.GeoDataFrame(filtered_rides_data, geometry='end_point', crs='EPSG:4326')
   
    # Function to count the number of points or calculate average ride_value in each neighborhood
    def compute_values_in_neighborhood(neigh_row):
        if option == 'start':
            points_in_neigh = start_gdf[start_gdf.geometry.within(neigh_row.geometry)]
            return len(points_in_neigh)
        elif option == 'end':
            points_in_neigh = end_gdf[end_gdf.geometry.within(neigh_row.geometry)]
            return len(points_in_neigh)
        elif option == 'value':
            values_in_neigh = filtered_rides_data[filtered_rides_data.apply(lambda row: Point(row['end_lng'], row['end_lat']).within(neigh_row.geometry), axis=1)]
            return values_in_neigh['ride_value'].mean() if not values_in_neigh.empty else 0
        else:
            raise ValueError("Invalid option. Choose from 'start', 'end', or 'value'.")
   
    # Apply the function to each row of the Tallinn_gdf
    if option in ['start', 'end']:
        Tallinn_gdf['NValues'] = Tallinn_gdf.apply(compute_values_in_neighborhood, axis=1)
        value_column = 'NValues'
    elif option == 'value':
        Tallinn_gdf['AvgValue'] = Tallinn_gdf.apply(compute_values_in_neighborhood, axis=1)
        value_column = 'AvgValue'
   
    # Create color mapping
    cmap = plt.get_cmap('RdYlGn')
    norm = mpl.colors.Normalize(vmin=Tallinn_gdf[value_column].min(), vmax=Tallinn_gdf[value_column].max())
   
    # Plot the neighborhoods with the corresponding values
    fig, ax = plt.subplots(figsize=(12, 8))
    Tallinn_gdf.plot(ax=ax, color=cmap(norm(Tallinn_gdf[value_column])), legend=True)
   
    # Add basemap for context
    ctx.add_basemap(ax, source=ctx.providers.CartoDB.Positron,crs=Tallinn_gdf.crs.to_string())
   
    # Create a color bar as a legend
    sm = mpl.cm.ScalarMappable(cmap=cmap, norm=norm)
    sm.set_array([])
    cbar = fig.colorbar(sm, ax=ax)
   
    # Add title and labels
    title = {
        'start': 'Number of Ride Start Points',
        'end': 'Number of Ride End Points',
        'value': 'Average Ride Value'
    }
    plt.title(f'{title[option]} in Tallinn from {start_datetime} to {end_datetime}')
    plt.xlabel('Longitude')
    plt.ylabel('Latitude')
   
    # Display the plot
    plt.show()

In [249]:
# Function computing ride counts or average ride value for one neighborhood and time window
def compute_values_in_neighborhood(neigh_row, start_gdf, end_gdf, period_data, option):
    """
    Compute the values for a neighborhood within a specific time window.

    Parameters:
    - neigh_row: GeoDataFrame row representing a neighborhood.
    - start_gdf: GeoDataFrame for start points.
    - end_gdf: GeoDataFrame for end points.
    - period_data: DataFrame for a specific time window.
    - option: Metric to compute ('start', 'end', or 'value').

    Returns:
    - Computed value based on the option.
    """
    if option == 'start':
        points_in_neigh = start_gdf[start_gdf.geometry.within(neigh_row.geometry)]
        return len(points_in_neigh)
    elif option == 'end':
        points_in_neigh = end_gdf[end_gdf.geometry.within(neigh_row.geometry)]
        return len(points_in_neigh)
    elif option == 'value':
        values_in_neigh = period_data[period_data.apply(lambda row: Point(row['end_lng'], row['end_lat']).within(neigh_row.geometry), axis=1)]
        return values_in_neigh['ride_value'].mean() if not values_in_neigh.empty else 0
    else:
        raise ValueError("Invalid option. Choose from 'start', 'end', or 'value'.")


In [250]:
# Function plotting trends by time window and neighborhood
def plot_trends(neighborhoods, start_datetime, end_datetime, time_window='D', option='start'):
    """
    Plots trends of ride metrics (count or average value) over specified time windows for selected neighborhoods.

    Parameters:
    - neighborhoods: list of neighborhood names to filter.
    - start_datetime: start of the time range in 'YYYY-MM-DD HH:MM:SS' format.
    - end_datetime: end of the time range in 'YYYY-MM-DD HH:MM:SS' format.
    - time_window: time aggregation window ('D' for daily, 'H' for hourly, etc.).
    - option: metric to plot ('start', 'end', or 'value').
    """
    # Convert to datetime
    start_datetime = pd.to_datetime(start_datetime)
    end_datetime = pd.to_datetime(end_datetime)

    # Filter rides data for the given date and time range
    rides_data['start_time'] = pd.to_datetime(rides_data['start_time'])
    filtered_rides_data = rides_data[(rides_data['start_time'] >= start_datetime) & (rides_data['start_time'] <= end_datetime)].copy()

    # Create a new column for the time window aggregation
    filtered_rides_data.loc[:, 'time_window'] = filtered_rides_data['start_time'].dt.to_period(time_window).astype(str)

    # Aggregate data by neighborhood and time window
    trend_data = pd.DataFrame()
    time_windows = sorted(filtered_rides_data['time_window'].unique())

    for neighborhood in neighborhoods:
        neighborhood_gdf = Tallinn_gdf[Tallinn_gdf['name'] == neighborhood]
        if neighborhood_gdf.empty:
            continue

        values = []
        for period in time_windows:
            period_data = filtered_rides_data[filtered_rides_data['time_window'] == period].copy()
            # Create GeoDataFrames for start and end points
            period_data.loc[:, 'start_point'] = period_data.apply(lambda row: Point(row['start_lng'], row['start_lat']), axis=1)
            period_data.loc[:, 'end_point'] = period_data.apply(lambda row: Point(row['end_lng'], row['end_lat']), axis=1)
            start_gdf = gpd.GeoDataFrame(period_data, geometry='start_point', crs='EPSG:4326')
            end_gdf = gpd.GeoDataFrame(period_data, geometry='end_point', crs='EPSG:4326')

            value = compute_values_in_neighborhood(neighborhood_gdf.iloc[0], start_gdf, end_gdf, period_data, option)
            values.append(value)


        # Store in DataFrame with explicit dtype
        trend_data[neighborhood] = pd.Series(values, index=time_windows, dtype='float64')

    # Plotting
    fig, ax = plt.subplots(figsize=(12, 8))
    trend_data.plot(ax=ax, marker='o', linestyle='-')
    title = {
    'start': 'Number of Ride Start Points',
    'end': 'Number of Ride End Points',
    'value': 'Average Ride Value'
    }
    # Add labels and title
    ax.set_xlabel('Time')
    ax.set_ylabel('Count of Rides' if option in ['start', 'end'] else 'Average Ride Value')
    ax.set_title(f'Trends of {title[option]} from {start_datetime} to {end_datetime} by {time_window}')

    # Rotate x-axis labels for better readability
    plt.xticks(rotation=45)
    plt.legend(title='Neighborhood')

    # Display the plot
    plt.tight_layout()
    plt.show()

#### B. Spatial descriptives

In [251]:
# Define the interactive widgets
start_date = widgets.DatePicker(
    description='Start Date',
    value=pd.to_datetime('2022-03-01').date()
)
start_time = widgets.TimePicker(
    description='Start Time',
    value=pd.to_datetime('00:00:00').time()
)
end_date = widgets.DatePicker(
    description='End Date',
    value=pd.to_datetime('2022-03-30').date()
)
end_time = widgets.TimePicker(
    description='End Time',
    value=pd.to_datetime('00:00:00').time()
)
option = widgets.Dropdown(
    options=['start', 'end', 'value'],
    value='start',
    description='Option'
)

# Create a function to handle the widget interaction
def update_plot(start_date, start_time, end_date, end_time, option):
    start_datetime = pd.to_datetime(f'{start_date} {start_time}')
    end_datetime = pd.to_datetime(f'{end_date} {end_time}')
    plot_points(start_datetime, end_datetime, option)

# Use interact to link widgets to the update function
interact(update_plot, start_date=start_date, start_time=start_time, end_date=end_date, end_time=end_time, option=option)



(Screenshot shown because the HTML export omits widgets.)
![Untitled.png](attachment:Untitled.png)

##### Note 
Compared with "Hood maps" (https://hoodmaps.com/tallinn-neighborhood-map), a crowdsourced labelling of areas:
- A reasonably large number of rides start from the tourist area of Tallinn;
- Many rides seem to end around the university area;
- A fair number of Bolt rides end in areas labelled as hip.

#### C. Time trend descriptives

In [252]:
# Adds the start and end neighborhood to the data frame 
def add_neighborhood_to_rides(rides_data, neighborhood_data):
    """
    Adds the start and end neighborhood names to each ride in the rides_data DataFrame 
    based on its start and end coordinates.

    Parameters:
    - rides_data: DataFrame containing ride data with 'start_lat', 'start_lng', 'end_lat', and 'end_lng' columns.
    - neighborhood_data: GeoDataFrame containing neighborhood boundaries with a 'name' column.

    Returns:
    - Updated rides_data DataFrame with additional 'start_neighborhood' and 'end_neighborhood' columns.
    """
    rides_data_tmp=rides_data.copy()
    # Create Point geometries for start and end coordinates
    rides_data_tmp['start_point'] = rides_data_tmp.apply(lambda row: Point(row['start_lng'], row['start_lat']), axis=1)
    rides_data_tmp['end_point'] = rides_data_tmp.apply(lambda row: Point(row['end_lng'], row['end_lat']), axis=1)
 
    # Convert rides_data into GeoDataFrames
    start_gdf = gpd.GeoDataFrame(rides_data_tmp, geometry='start_point', crs='EPSG:4326')
    end_gdf = gpd.GeoDataFrame(rides_data_tmp, geometry='end_point', crs='EPSG:4326')

    # Perform spatial joins to find the neighborhoods
    start_joined = gpd.sjoin(start_gdf, neighborhood_data[['name', 'geometry']], how='left')
    end_joined = gpd.sjoin(end_gdf, neighborhood_data[['name', 'geometry']], how='left')

    # Add the 'start_neighborhood' and 'end_neighborhood' columns to the original rides_data DataFrame
    rides_data_tmp['start_neighborhood'] = start_joined['name']
    rides_data_tmp['end_neighborhood'] = end_joined['name']

    # Keep the geometry columns: the distance computation in section 2 reads
    # 'start_point' and 'end_point' from the returned DataFrame
    return rides_data_tmp

# Display the first few rows of the updated rides data
updated_rides_data = add_neighborhood_to_rides(rides_data, Tallinn_gdf)
updated_rides_data.head()

| | start_time | start_lat | start_lng | end_lat | end_lng | ride_value | start_point | end_point | start_neighborhood | end_neighborhood |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2022-03-06 15:02:39.329452 | 59.40791 | 24.689836 | 59.513027 | 24.83163 | 3.51825 | POINT (24.68983624096147 59.40790952807932) | POINT (24.83162982509473 59.51302711552882) | Sääse | |
| 1 | 2022-03-10 11:15:55.177526 | 59.44165 | 24.762712 | 59.42645 | 24.783076 | 0.5075 | POINT (24.762712456239186 59.44165022169928) | POINT (24.78307556940856 59.42644990771725) | Sadama | Sikupilli |
| 2 | 2022-03-06 14:23:33.893257 | 59.435404 | 24.749795 | 59.431901 | 24.761588 | 0.19025 | POINT (24.749795200125885 59.43540386938484) | POINT (24.76158769746171 59.43190066984472) | Vanalinn | Maakri |
| 3 | 2022-03-03 09:11:59.104192 | 59.40692 | 24.659006 | 59.381093 | 24.641652 | 0.756 | POINT (24.659005767701675 59.40691959345124) | POINT (24.64165177776011 59.38109296811575) | Kadaka | Pääsküla |
| 4 | 2022-03-06 00:13:01.290346 | 59.43494 | 24.753641 | 59.489203 | 24.87617 | 2.271 | POINT (24.753641074745016 59.4349404406161) | POINT (24.876170361471782 59.489202547844464) | Südalinn | Mähe |


In [253]:
# Returns the busiest neighborhoods, i.e. those whose peak hour-of-day ride count (summed over all dates) is above the 95th percentile across neighborhoods
def get_neighborhoods_above_95th_percentile_max_count(rides_data):
    rides_data_tmp = rides_data.copy()
    
    # Ensure the start_time is in datetime format
    rides_data_tmp['start_time'] = pd.to_datetime(rides_data_tmp['start_time'])
    
    # Extract hour from start_time
    rides_data_tmp['hour'] = rides_data_tmp['start_time'].dt.hour
    
    # Collapse: Count number of observations by hour and start_neighborhood
    hourly_counts = rides_data_tmp.groupby(['start_neighborhood', 'hour']).size().reset_index(name='count')
    
    # Collapse Again: Get the maximum count for each neighborhood
    max_counts = hourly_counts.groupby('start_neighborhood')['count'].max().reset_index(name='max_count')
    
    # Compute the 95th percentile of max_count
    percentile_95_count = np.percentile(max_counts['max_count'], 95)
    
    # Filter neighborhoods with max_count above the 95th percentile
    neighborhoods_above_95th_percentile = max_counts[max_counts['max_count'] > percentile_95_count]['start_neighborhood'].tolist()
    
    return neighborhoods_above_95th_percentile

# Print list 
busy_neighborhoods = get_neighborhoods_above_95th_percentile_max_count(updated_rides_data)
print(busy_neighborhoods)

['Kalamaja', 'Lilleküla', 'Mustamäe', 'Sadama', 'Vanalinn']


In [254]:
# Define the interactive widgets
neighborhoods_widget = widgets.SelectMultiple(
    options=neighboorhoods.tolist(),
    value=busy_neighborhoods,  # Default selected neighborhood
    description='Neighborhoods',
    disabled=False
)

start_date = widgets.DatePicker(
    description='Start Date',
    value=pd.to_datetime('2022-03-07').date()
)
start_time = widgets.TimePicker(
    description='Start Time',
    value=pd.to_datetime('00:00:00').time()
)
end_date = widgets.DatePicker(
    description='End Date',
    value=pd.to_datetime('2022-03-14').date()
)
end_time = widgets.TimePicker(
    description='End Time',
    value=pd.to_datetime('18:00:00').time()
)
time_window = widgets.Dropdown(
    options=['D', 'H', 'M'],  # Day, Hour, Month ('M' is the pandas month alias; minutes would be 'min')
    value='H',
    description='Time Window'
)
option = widgets.Dropdown(
    options=['start', 'end', 'value'],
    value='start',
    description='Option'
)

# Create a function to handle the widget interaction
def update_plot(neighborhoods, start_date, start_time, end_date, end_time, time_window, option):
    start_datetime = pd.to_datetime(f'{start_date} {start_time}')
    end_datetime = pd.to_datetime(f'{end_date} {end_time}')
    plot_trends(neighborhoods, start_datetime, end_datetime, time_window, option)

# Use interact to link widgets to the update function
interact(update_plot, 
         neighborhoods=neighborhoods_widget,
         start_date=start_date, 
         start_time=start_time, 
         end_date=end_date, 
         end_time=end_time, 
         time_window=time_window, 
         option=option)


(Screenshot shown because the HTML export omits widgets.)
![image.png](attachment:image.png)

##### Note 
Hourly demand trends differ across neighborhoods, each showing its own daily pattern. For example, the central neighborhood Vanalinn, where people presumably go out in the evening, sees peak ride demand at night, while the residential area of Lilleküla sees higher demand before and after working hours.
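
A minimal sketch of how such hourly profiles could be tabulated once each ride carries a `start_neighborhood` label (as produced by `add_neighborhood_to_rides`). The helper name `hourly_profile` is hypothetical and the toy rows are made up purely for illustration.

```python
import pandas as pd

def hourly_profile(rides: pd.DataFrame) -> pd.DataFrame:
    """Return ride counts with one row per hour of day and one column per neighborhood."""
    hours = pd.to_datetime(rides['start_time']).dt.hour.rename('hour')
    counts = rides.groupby([hours, 'start_neighborhood']).size()
    # Hours without rides in a neighborhood are filled with zero
    return counts.unstack(fill_value=0)

# Toy data (made up), only to show the shape of the result
toy = pd.DataFrame({
    'start_time': ['2022-03-07 08:05', '2022-03-07 08:40',
                   '2022-03-07 23:10', '2022-03-08 23:30'],
    'start_neighborhood': ['Lilleküla', 'Lilleküla', 'Vanalinn', 'Vanalinn'],
})
profile = hourly_profile(toy)
peak_hours = profile.idxmax()  # hour of day with the most ride starts, per neighborhood
```

A single `groupby` over a pre-labelled column avoids repeating the point-in-polygon test for every time window, which is the slow part of `plot_trends`.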

### 2. Base model

#### A. Data preparation

Data preparation begins by calculating the distance between the start and end points of each ride and normalizing the ride value by this distance. The data is then aggregated by neighborhood, hour, and date, computing the number of rides and the average value per kilometer. One-week lags are calculated for the number of rides and the average value per kilometer, and rows left without a lag value are dropped. Finally, the analysis is restricted to busy neighborhoods.

In [321]:
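# Add the geodesic distance (km) between start and end points
def distance(rides_data):
    df = rides_data.copy()
    df['distance'] = df.apply(
        lambda row: geodesic(
            (row['start_point'].y, row['start_point'].x),
            (row['end_point'].y, row['end_point'].x)
        ).kilometers,
        axis=1
    )
    return df

model_data = distance(updated_rides_data)

# Normalize the ride value by the distance
model_data['value_per_km'] = model_data['ride_value'] / model_data['distance']

# Derive the grouping keys used below from the ride start time
model_data['start_time'] = pd.to_datetime(model_data['start_time'])
model_data['hour_of_day'] = model_data['start_time'].dt.hour
model_data['date'] = model_data['start_time'].dt.date

# Collapse by neighborhood, hour, and date
grouped = model_data.groupby(['start_neighborhood', 'hour_of_day', 'date'])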

# Calculate the number of observations and mean of value_per_km
aggregated = grouped.agg(
    num_observations=('ride_value', 'size'),
    mean_value_per_km=('value_per_km', 'mean')
).reset_index()

# Ensure the 'date' column is in datetime format
aggregated['date'] = pd.to_datetime(aggregated['date'])

# Sort the DataFrame by neighborhood, hour, and date
aggregated = aggregated.sort_values(by=['start_neighborhood', 'hour_of_day', 'date'])

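# Calculate the one-week lag for num_observations and (discarded) mean_value_per_km
# Note: shift(7) moves 7 rows back, which equals 7 days only if no dates are missing within a group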
aggregated['num_observations_lag'] = aggregated.groupby(['start_neighborhood', 'hour_of_day'])['num_observations'] \
    .shift(7)

aggregated['mean_value_per_km_lag'] = aggregated.groupby(['start_neighborhood', 'hour_of_day'])['mean_value_per_km'] \
    .shift(7)

# (Discarded) Calculate the delta between lags and actuals 
aggregated['delta_num_observations'] = aggregated['num_observations'] - aggregated['num_observations_lag']
aggregated['delta_mean_value_per_km'] = aggregated['mean_value_per_km'] - aggregated['mean_value_per_km_lag']

# Drop lost observations due to lagging
aggregated = aggregated.dropna()

# Keep only busy neighborhoods (copy to avoid assigning to a slice of `aggregated`)
aggregated_busy_only = aggregated[aggregated['start_neighborhood'].isin(busy_neighborhoods)].copy()
aggregated_busy_only['day_of_week'] = aggregated_busy_only['date'].dt.dayofweek

# (Discarded as fixed effects were not used) One-Hot Encoding for 'start_neighborhood'
df_encoded_neighborhood = pd.get_dummies(aggregated_busy_only['start_neighborhood'], prefix='neighborhood')

# (Discarded as fixed effects were not used) One-Hot Encoding for 'hour_of_day'
df_encoded_hour = pd.get_dummies(aggregated_busy_only['hour_of_day'], prefix='hour_of_day')

# (Discarded as fixed effects were not used) One-Hot Encoding for 'day_of_week'
df_encoded_day_of_week = pd.get_dummies(aggregated_busy_only['day_of_week'], prefix='day_of_week')

# Combine the one-hot encoded vars
df_encoded = pd.concat([aggregated_busy_only, df_encoded_neighborhood, df_encoded_hour, df_encoded_day_of_week], axis=1)

# Drop the original categorical columns if no longer needed
aggregated_busy_only = df_encoded.drop(columns=['start_neighborhood', 'hour_of_day', 'day_of_week'])
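
The row-based `shift(7)` used in this cell equals a one-week lag only when every neighborhood-hour pair has exactly one row per calendar day. As a hedged alternative, the sketch below builds the lag by joining on the date seven days earlier, so gaps produce missing values instead of misaligned lags. The helper name `add_weekly_lag` and the toy data are hypothetical.

```python
import pandas as pd

def add_weekly_lag(df, keys=('start_neighborhood', 'hour_of_day'), value='num_observations'):
    """Attach the value observed exactly 7 calendar days earlier for the same keys."""
    lagged = df[list(keys) + ['date', value]].copy()
    # Move each observation one week forward so it lines up with the row it should inform
    lagged['date'] = lagged['date'] + pd.Timedelta(days=7)
    lagged = lagged.rename(columns={value: f'{value}_lag'})
    return df.merge(lagged, on=list(keys) + ['date'], how='left')

# Toy data (made up): the third date is 8 days after the second, so it has no valid lag
toy = pd.DataFrame({
    'start_neighborhood': ['Vanalinn'] * 3,
    'hour_of_day': [22] * 3,
    'date': pd.to_datetime(['2022-03-01', '2022-03-08', '2022-03-16']),
    'num_observations': [10, 12, 9],
})
with_lag = add_weekly_lag(toy)
```

With a plain `shift(1)` on these three rows, the last row would silently receive the value from eight days earlier; the join leaves it missing instead.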


#### B. Regression

The OLS regression shows that the number of ride requests is strongly associated with historical demand for the same neighborhood and hour of day. The one-week lag alone is a significant predictor of current demand (R-squared of 0.930 over 2,520 observations), which points to predictable weekly and hourly patterns.

However, the analysis is based on normalized ride values and only includes accepted rides, which may skew the picture of true demand. The use of synthetic data may also limit the variability observed compared to real-world scenarios. To gain more accurate insights, future work should consider inducing variation in ride demand through factors like surge pricing, and should include data on rejected or canceled rides as well as a proxy for drivers choosing to supply competitors.

In [327]:
# Define the dependent and independent variables
X = aggregated_busy_only['num_observations_lag']
y = aggregated_busy_only['num_observations']

# Add a constant to the model (intercept)
X = sm.add_constant(X)

# Fit the regression model
model = sm.OLS(y, X).fit()

# Print the summary of the regression
print(model.summary())

                            OLS Regression Results                            
Dep. Variable:       num_observations   R-squared:                       0.930
Model:                            OLS   Adj. R-squared:                  0.930
Method:                 Least Squares   F-statistic:                 3.350e+04
Date:                Wed, 04 Sep 2024   Prob (F-statistic):               0.00
Time:                        17:13:10   Log-Likelihood:                -9728.6
No. Observations:                2520   AIC:                         1.946e+04
Df Residuals:                    2518   BIC:                         1.947e+04
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
                           coef    std err          t      P>|t|      [0.025      0.975]
----------------------------------------------------------------------------------------
const                    2.2973 
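
The R-squared reported here is an in-sample measure. One hedged way to check forecasting quality is a holdout split by date: fit on the earlier dates, then compare the mean absolute error on the later dates against the naive forecast that simply reuses last week's count. The sketch below uses NumPy least squares on made-up numbers; the function name `holdout_mae` and the cutoff date are assumptions, not part of the original analysis.

```python
import numpy as np
import pandas as pd

def holdout_mae(df, cutoff):
    """Fit y = a + b * lag on rows before `cutoff` and report MAE on the remaining rows."""
    train, test = df[df['date'] < cutoff], df[df['date'] >= cutoff]
    # Design matrix with an intercept column, solved by ordinary least squares
    A = np.column_stack([np.ones(len(train)), train['num_observations_lag'].to_numpy()])
    coef, *_ = np.linalg.lstsq(A, train['num_observations'].to_numpy(), rcond=None)
    pred = coef[0] + coef[1] * test['num_observations_lag']
    model_mae = (test['num_observations'] - pred).abs().mean()
    naive_mae = (test['num_observations'] - test['num_observations_lag']).abs().mean()
    return model_mae, naive_mae

# Toy data (made up): counts are always the lag plus two
toy = pd.DataFrame({
    'date': pd.to_datetime(['2022-03-08', '2022-03-09', '2022-03-10',
                            '2022-03-15', '2022-03-16']),
    'num_observations_lag': [10, 20, 30, 12, 22],
    'num_observations': [12, 22, 32, 14, 24],
})
model_mae, naive_mae = holdout_mae(toy, pd.Timestamp('2022-03-15'))
```

If the fitted model does not beat the naive forecast on held-out dates, the regression adds little beyond the lag itself.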

### 3. Implementation strategy 

The OLS regression analysis shows that ride demand follows predictable patterns by neighborhood and time of day. In practical terms, this suggests that drivers could benefit from a dashboard featuring real-time demand predictions for Tallinn, complete with heatmap overlays highlighting areas of high and low demand.

To induce variation in ride demand through factors like surge pricing, the dashboard could include surge pricing pop-ups. These would notify drivers of potential surge pricing whenever the forecast points to rising demand. By showing projected price increases in real time, the pop-ups would help drivers position themselves in areas with higher surge potential.

By guiding drivers towards areas with increased pricing, we could boost the number of rides, ultimately enhancing both driver efficiency and Bolt's overall earnings.
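
As a rough illustration of how such pop-ups could be triggered, the sketch below flags neighborhoods whose forecast exceeds a typical level by a chosen ratio. The function name `surge_alerts`, the 1.25 threshold, and all numbers are hypothetical and would need calibration against real pricing rules.

```python
def surge_alerts(forecast, typical, ratio_threshold=1.25):
    """Return neighborhoods whose forecast demand exceeds the typical level by the threshold ratio."""
    alerts = {}
    for hood, predicted in forecast.items():
        base = typical.get(hood, 0)
        # Skip neighborhoods without a usable baseline to avoid dividing by zero
        if base > 0 and predicted / base >= ratio_threshold:
            alerts[hood] = round(predicted / base, 2)
    # Highest ratio first, so the most pressing alert is listed on top
    return dict(sorted(alerts.items(), key=lambda item: item[1], reverse=True))

# Made-up forecast and typical ride counts for the next hour
forecast = {'Vanalinn': 60, 'Kalamaja': 33, 'Mustamäe': 20}
typical = {'Vanalinn': 40, 'Kalamaja': 30, 'Mustamäe': 20}
alerts = surge_alerts(forecast, typical)
```

Here only Vanalinn crosses the threshold, with a forecast 1.5 times its typical level.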

### 4. Validation experiment 

To assess the impact of integrating surge pricing alerts into the dashboard, an A/B testing experiment will be conducted, comparing it with the traditional forecast model. The objective is to evaluate how the addition of surge pricing notifications affects driver performance and overall business outcomes.

In the experiment, Group A will use a dashboard featuring standard demand forecasts and recommendations for optimal driving times and locations. Group B, on the other hand, will use a dashboard enhanced with dynamic pop-ups that indicate potential price increases due to forecasted demand surges, in addition to standard forecasts.

The key metrics for evaluation will include earnings, ride completion rates, driver satisfaction, and the rate of rejected rides. Average earnings per driver will be measured to determine if surge pricing notifications lead to higher revenue. The number of completed rides will be tracked to see if these alerts influence completion rates compared to the standard dashboard. Feedback from drivers will be collected to gauge their experience with surge pricing alerts versus standard recommendations. Additionally, data on rejected ride requests will be analyzed to understand how surge pricing alerts affect driver decisions and whether they lead to higher or lower rejection rates.
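
For the earnings comparison, one simple analysis is a permutation test on the difference in mean earnings per driver between the two groups. The sketch below is a minimal standard-library version, assuming drivers are randomly assigned and independent of each other; the function name `permutation_test` and the earnings figures are made up.

```python
import random
from statistics import mean

def permutation_test(group_a, group_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means (B minus A)."""
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_a)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference
        rng.shuffle(pooled)
        diff = mean(pooled[n_a:]) - mean(pooled[:n_a])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive
    return observed, (extreme + 1) / (n_permutations + 1)

# Made-up weekly earnings per driver (EUR)
earnings_a = [410, 395, 420, 405, 415, 400]
earnings_b = [450, 460, 445, 470, 455, 465]
observed_diff, p_value = permutation_test(earnings_a, earnings_b, n_permutations=2000)
```

A permutation test makes no normality assumption, which is convenient when earnings are skewed; the same function could be reused for completion or rejection rates measured per driver.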

Further analysis will involve developing a conversion model to quantify how surge pricing alerts impact ride request acceptance rates. Performance metrics during peak and non-peak hours will be examined to understand the effectiveness of surge alerts in optimizing driver activity across different demand scenarios.

The experiment will be deployed in a representative geographic area within Tallinn, selected for its diverse demand patterns, and run for 4 to 6 weeks to capture variability in demand and collect enough data for statistically reliable results. It will cover both typical and peak demand periods to evaluate the effects of surge pricing notifications on driver performance and decision-making across demand scenarios.
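
To judge whether a 4 to 6 week run is likely to yield enough data, a standard normal-approximation formula for comparing two means gives the number of drivers needed per group:

$$n = \frac{2\,(z_{1-\alpha/2} + z_{\text{power}})^2\,\sigma^2}{\delta^2}$$

The sketch below evaluates it with the standard library only. The standard deviation `sigma` and minimum detectable difference `delta` are placeholder assumptions, and the function name `drivers_per_group` is hypothetical.

```python
import math
from statistics import NormalDist

def drivers_per_group(sigma, delta, alpha=0.05, power=0.80):
    """Drivers needed per group to detect a mean difference `delta` (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = NormalDist().inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) * sigma / delta) ** 2)

# Placeholder assumption: weekly earnings with SD of 80 EUR, aiming to detect a 20 EUR difference
n = drivers_per_group(sigma=80.0, delta=20.0)
```

Because the required sample grows with the square of `sigma / delta`, halving the detectable difference roughly quadruples the number of drivers needed.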