# **Project Name**    -   Uber Supply-Demand Gap Analysis


##### **Project Type**    - EDA
##### **By Aishwarya Patwatkar**

# **Project Summary -**

This project, titled "Uber Supply-Demand Gap Analysis", aims to uncover operational inefficiencies in Uber’s ride request system using Exploratory Data Analysis (EDA) techniques. The dataset contains detailed ride request records, including timestamps, pickup points (Airport or City), ride status (Completed, Cancelled, No Cars Available), and time-based features.

Using tools like Excel, SQL (pandasql), and Python (Pandas, Seaborn, Matplotlib), the data was analyzed to identify critical patterns in customer demand, cancellations, and driver availability. Key visualizations and queries revealed that a majority of ride failures occur during Night and Early Morning hours — especially from the Airport pickup point.

The project concludes with actionable business recommendations such as driver reallocation, incentive programs, and time-location-based planning to help Uber close the gap between ride demand and supply, improve customer satisfaction, and reduce lost revenue opportunities.



# **GitHub Link -**

Provide your GitHub Link here.

# **Problem Statement**


Uber is facing a significant operational challenge where a large number of customer ride requests are not being fulfilled — either due to driver cancellations or no cars being available. This supply-demand gap is especially prominent during Night and Early Morning hours and more severe at specific locations like the Airport.

This leads to poor customer experience, missed revenue opportunities, and operational inefficiency. The goal of this project is to analyze Uber’s ride request data, identify the root causes of ride failures, and provide data-driven insights to help Uber optimize its operations, improve fulfillment rates, and enhance customer satisfaction.

#### **Define Your Business Objective?**

The primary business objective of this project is to help Uber reduce the gap between ride demand and driver availability, especially during critical hours and locations. By analyzing patterns in trip completion, cancellations, and unavailability, the goal is to:

*   Identify when and where Uber is failing to meet customer demand.
*   Understand the causes behind ride cancellations and "No Cars Available" issues.
*   Deliver actionable insights to improve ride fulfillment, optimize driver allocation, and enhance overall customer experience.

Through this analysis, Uber can make informed decisions to increase trip completion rates, reduce operational losses, and ensure a more reliable and efficient service.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many number of charts you want. Make Sure for each and every chart the following format should be answered.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

### Dataset Loading

In [None]:
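df = pd.read_csv('/content/Uber Request Data - Cleaned.csv')  # Or your uploaded file
print(df.head())   # Preview the first rows (a bare df.head() mid-cell would not be displayed)
df.info()          # Column dtypes and non-null counts
df.describe()      # Summary statistics, shown as the cell's last expression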

### Dataset First View

In [None]:
print("🔍 First 5 rows of the dataset:")
df.head()

### Dataset Rows & Columns count

In [None]:
rows, columns = df.shape
print(f"Total Rows: {rows}")
print(f"Total Columns: {columns}")

### Dataset Information

In [None]:
df.info()

#### Duplicate Values

In [None]:
duplicate_count = df.duplicated().sum()
print(f"🔁 Duplicate Rows: {duplicate_count}")

#### Missing Values/Null Values

In [None]:
print("❌ Missing Values Per Column:")
print(df.isnull().sum())

In [None]:
total_missing = df.isnull().sum().sum()
print(f"\n🔍 Total Missing Values in Dataset: {total_missing}")

### What did you know about your dataset?

The dataset contains 6,745 ride request records from Uber, including information about pickup points, ride statuses, timestamps, and driver assignments. It is designed to help analyze the supply-demand gap in Uber’s operations.

Key Columns:

*   Request id – Unique identifier for each ride request.
*   Pickup point – Where the ride was requested from (Airport or City).
*   Status – Final result of the request (Trip Completed, Cancelled, No Cars Available).
*   Request timestamp / Drop timestamp – Time when the ride was requested and dropped (if completed).
*   Driver ID – Assigned driver’s ID (missing if cancelled or no car).
*   Time slot – Categorized part of the day (Morning, Evening, Night, etc.).
*   Hour – Extracted from request timestamp for time-based analysis.

What I Learned:

*   The dataset is well-suited for Exploratory Data Analysis (EDA).
*   There are missing values in Driver ID and Drop timestamp, especially for failed trips.
*   Time and location trends can be uncovered to highlight periods of high cancellations or unavailability.
*   It provides a strong foundation to analyze when and where Uber is facing operational issues, especially due to supply-demand mismatch.
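The observation about missing values can be checked directly by splitting the null share per column by `Status`. The snippet below is a minimal sketch on a small hypothetical sample (the rows are invented for illustration and are not taken from the dataset); on the real notebook, `sample` would be replaced by `df` before any placeholders are filled in.

```python
import numpy as np
import pandas as pd

# Hypothetical rows that only mimic the dataset's column names (not real data).
sample = pd.DataFrame({
    "Status": ["Trip Completed", "Cancelled", "No Cars Available",
               "Trip Completed", "Cancelled"],
    "Driver ID": [1.0, 2.0, np.nan, 3.0, np.nan],
    "Drop timestamp": ["11/7/2016 12:00", None, None, "11/7/2016 18:30", None],
})

# Fraction of missing values in each column, grouped by request status.
missing_by_status = (
    sample[["Driver ID", "Drop timestamp"]]
    .isnull()
    .groupby(sample["Status"])
    .mean()
)
print(missing_by_status)
```

A value of 1.0 in a cell means the column is always empty for that status, while 0.0 means it is always populated.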

## ***2. Understanding Your Variables***

### Variables Description

| **Column Name**       | **Description**                                                                               |
| --------------------- | --------------------------------------------------------------------------------------------- |
| **Request id**        | A unique identifier for each ride request.                                                    |
| **Pickup point**      | The location where the ride was requested — either **City** or **Airport**.                   |
| **Driver ID**         | The unique ID of the driver assigned to the request (if assigned).                            |
| **Status**            | The final result of the request: **Trip Completed**, **Cancelled**, or **No Cars Available**. |
| **Request timestamp** | The date and time when the ride was requested.                                                |
| **Time slot**         | Custom-categorized part of the day based on the request time (e.g., Morning, Night).          |
| **Drop timestamp**    | The time when the ride ended (only for completed trips).                                      |
| **Hour**              | The hour (0–23) extracted from `Request timestamp` for hourly analysis.                       |


### Check Unique Values for each variable.

In [None]:
for col in df.columns:
    print(f"\n🔹 Column: {col}")
    print(f"   👉 Unique values count: {df[col].nunique()}")
    print(f"   📋 Unique values: {df[col].unique()[:10]}")  # Show first 10 unique values only

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Import required libraries
import pandas as pd
import numpy as np

# Load the dataset
df = pd.read_csv('/content/Uber Request Data - Cleaned.csv')

# 1. Convert timestamps to datetime format
df['Request timestamp'] = pd.to_datetime(df['Request timestamp'], dayfirst=True)
df['Drop timestamp'] = pd.to_datetime(df['Drop timestamp'], dayfirst=True)

# 2. Create Hour column if not already present
if 'Hour' not in df.columns:
    df['Hour'] = df['Request timestamp'].dt.hour

# 3. Create Time slot column
def get_time_slot(hour):
    if 0 <= hour < 5:
        return 'Night'
    elif 5 <= hour < 7:
        return 'Early Morning'
    elif 7 <= hour < 11:
        return 'Morning'
    elif 11 <= hour < 16:
        return 'Afternoon'
    elif 16 <= hour < 20:
        return 'Evening'
    else:
        return 'Late Evening'

df['Time slot'] = df['Hour'].apply(get_time_slot)

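# 4. Handle missing values (optional)
# Assign the result back instead of calling fillna(inplace=True) on a column selection,
# which newer pandas versions warn about and may not apply to df.
# Casting to object first lets the text placeholders sit next to numeric/datetime values.
df['Driver ID'] = df['Driver ID'].astype('object').fillna('Unavailable')
df['Drop timestamp'] = df['Drop timestamp'].astype('object').fillna('No Ride')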

# 5. Remove duplicates if any
df.drop_duplicates(inplace=True)

# 6. Final check
print("✅ Dataset is now analysis ready!")
df.head()
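As an alternative to applying `get_time_slot` row by row, the same bins can be expressed with `pd.cut`. This is only a sketch: the bin edges and labels are copied from `get_time_slot` above, while the names `SLOT_EDGES`, `SLOT_LABELS`, and the sample hours are assumptions made for illustration.

```python
import pandas as pd

# Bin edges and labels copied from get_time_slot; right=False makes each bin [start, end).
SLOT_EDGES = [0, 5, 7, 11, 16, 20, 24]
SLOT_LABELS = ["Night", "Early Morning", "Morning",
               "Afternoon", "Evening", "Late Evening"]

# Hypothetical sample of request hours (0-23), including every bin boundary.
hours = pd.Series([0, 4, 5, 7, 11, 16, 20, 23], name="Hour")

# Vectorized binning; the result is an ordered categorical column.
time_slot = pd.cut(hours, bins=SLOT_EDGES, labels=SLOT_LABELS, right=False)
print(pd.DataFrame({"Hour": hours, "Time slot": time_slot}))
```

Because the result is an ordered categorical, sorting by `Time slot` follows the order of the day rather than alphabetical order.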

### What all manipulations have you done and insights you found?

Manipulations Performed:

*   Datetime Conversion: Converted Request timestamp and Drop timestamp to datetime format for time-based analysis.
*   Feature Engineering:
    *   Extracted Hour from Request timestamp to analyze demand trends hour-wise.
    *   Created a new column Time slot to group requests into parts of the day (e.g., Morning, Night, etc.).
*   Handling Missing Values:
    *   Filled missing Driver ID values with "Unavailable" for cancelled or unassigned trips.
    *   Filled missing Drop timestamp values with "No Ride" for unfulfilled requests.
*   Removing Duplicates: Removed any duplicate rows to avoid skewed analysis.
*   Data Segmentation: Filtered the dataset by Status to analyze Completed rides, Cancelled rides, and “No Cars Available” cases.
*   Visualizations & SQL Queries: Created charts like:
    *   Trip Status Distribution
    *   Requests by Time Slot
    *   Cancellations & No Cars by Hour
    *   Pickup Point vs Status
    *   Heatmap of Issues by Time and Location

Insights Found:

*   Most ride failures occur during Night (12 AM–5 AM) and Early Morning (5–7 AM), as these slots are defined in the wrangling code.
*   Airport pickups face more cancellations and “No Cars Available” issues than City pickups.
*   Morning and Evening have the highest number of requests, indicating office commute peaks.
*   Only around 42% of ride requests are completed, meaning Uber is missing out on a large number of potential customers.
*   Heatmap revealed that Airport + Night/Early Morning is the most problematic combination.

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
sns.countplot(data=df, x='Status')
plt.title('Trip Status Distribution')
plt.show()

##### 1. Why did you pick the specific chart?

I used a count plot (bar chart) because it is the most effective way to visualize the frequency of categorical variables, in this case the Status column. This chart clearly shows how many ride requests were:

*   Successfully completed
*   Cancelled by the driver or customer
*   Not fulfilled due to “No Cars Available”

It provides a quick and visual comparison of how Uber is performing in terms of service completion vs. service failure.

##### 2. What is/are the insight(s) found from the chart?

From the chart, we observed that:

*   A significant portion of trips were not completed.
*   The number of “No Cars Available” cases is almost as high as completed trips, indicating a serious supply shortage during certain periods.
*   Cancelled rides are also substantial, which may be due to driver preferences, fatigue, or peak-time stress.

This shows that only around 40–45% of requests are being successfully fulfilled.
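The fulfillment percentage quoted above can be read off numerically rather than estimated from bar heights. The following is a minimal sketch on hypothetical statuses (the counts are invented so the arithmetic is easy to follow); in the notebook, `df['Status']` would take the place of `status`.

```python
import pandas as pd

# Hypothetical sample: 4 completed, 4 without cars, 2 cancelled (not real data).
status = pd.Series(
    ["Trip Completed"] * 4 + ["No Cars Available"] * 4 + ["Cancelled"] * 2,
    name="Status",
)

# Percentage share of each outcome across all requests.
share = status.value_counts(normalize=True).mul(100).round(1)
completed_pct = share.get("Trip Completed", 0.0)

print(share)
print(f"Completed: {completed_pct}% | Not fulfilled: {round(100 - completed_pct, 1)}%")
```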

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Business Impact:**

The insights gained from the Trip Status Distribution chart can create a strong positive business impact.

By visualizing the number of trips that are Completed, Cancelled, or resulted in No Cars Available, Uber can:

Identify operational weaknesses — such as when and where driver shortages occur.

Take proactive steps like deploying more drivers during high-demand periods.

Improve customer satisfaction by reducing cancellations and increasing trip fulfillment rates.

Boost revenue by converting more ride requests into completed trips.

These changes, guided by data, can directly improve the efficiency, profitability, and reputation of Uber’s service.

**Insights That Could Lead to Negative Growth:**

The chart also reveals risks that, if ignored, can lead to negative business growth:

High “No Cars Available” Instances

A large number of ride requests are not fulfilled at all.

This leads to lost customers, as users may shift to competitors like Ola or Rapido.

Frequent unavailability can damage the brand image.

High Cancellations

Cancellations indicate driver dissatisfaction, poor allocation, or system inefficiency.

Customers who face repeated cancellations are less likely to use Uber again.

This leads to reduced customer retention and lower app usage.

**Specific Justification:**

In this dataset:

Only about 42% of the rides are completed.

Around 58% of requests either get cancelled or show “No Cars Available.”

This means more than half of Uber’s customers are not being served, which can result in:

Loss of revenue per day

Loss of repeat customers

Damage to platform reliability perception

If this continues without action, it could lead to declining market share and business loss over time

#### Chart - 2

In [None]:
sns.countplot(data=df, x='Time slot', order=df['Time slot'].value_counts().index)
plt.title('Requests by Time Slot')
plt.xticks(rotation=45)
plt.show()

##### 1. Why did you pick the specific chart?

I selected a count plot (bar chart) to visualize the number of Uber ride requests across different time slots of the day. Since the Time slot column represents a categorical variable, a countplot is ideal for comparing how demand fluctuates over time in a visually clear and informative way. This chart helps identify peak and off-peak demand hours, which is crucial for analyzing Uber’s supply-demand balance.


##### 2. What is/are the insight(s) found from the chart?

The chart clearly shows that Morning and Evening slots have the highest number of ride requests.

Night and Early Morning slots have significantly fewer requests, but these are often associated with high cancellation rates and no driver availability (seen in other charts).

This imbalance suggests that demand is not evenly distributed throughout the day.

So while requests are highest in Morning and Evening (possibly due to office/school travel), Uber is struggling the most during Night and Early Morning, even if demand is slightly lower there.
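Request volume and failure rate are two different measures, and a slot can be low on the first while high on the second. A small sketch of how both could be computed side by side is shown below; the rows are hypothetical and the `Failed` column is a helper introduced only for this illustration.

```python
import pandas as pd

# Hypothetical requests: a busy Morning slot and a quiet Night slot (not real data).
sample = pd.DataFrame({
    "Time slot": ["Morning", "Morning", "Morning", "Morning", "Night", "Night"],
    "Status": ["Trip Completed", "Trip Completed", "Cancelled",
               "Trip Completed", "No Cars Available", "No Cars Available"],
})

# Helper flag: any outcome other than a completed trip counts as a failure.
sample["Failed"] = sample["Status"] != "Trip Completed"

# Requests (demand) and share of failed requests for each slot.
by_slot = sample.groupby("Time slot").agg(
    requests=("Status", "size"),
    failure_rate=("Failed", "mean"),
)
print(by_slot)
```

In this toy sample Night has half as many requests as Morning but a much higher failure rate, which is the pattern described above.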

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Business Impact:**

These insights can help Uber:

Anticipate high-demand periods (Morning & Evening) and ensure enough drivers are scheduled.

Proactively allocate drivers to avoid cancellations during off-peak hours like Night and Early Morning.

Design targeted incentives for drivers to cover low-supply slots more effectively.

This directly supports better trip fulfillment, increased revenue, and higher customer satisfaction.

**Risk of Negative Growth:**

If Uber continues to ignore Night and Early Morning time slots, it could lead to:

Repeated trip failures even during low-demand hours.

Loss of critical service trust (e.g., airport travelers or emergency users).

Negative app ratings and customer churn.

This hurts Uber’s reputation as a reliable 24/7 service and may result in customers switching to other platforms.

#### Chart - 3

In [None]:
sns.countplot(data=df, x='Pickup point', hue='Status')
plt.title('Trip Status by Pickup Point')
plt.show()

##### 1. Why did you pick the specific chart?

I chose a grouped bar chart (countplot with hue='Status') to compare the ride outcomes (statuses) across the two pickup locations — Airport and City. This chart allows us to analyze the distribution of Completed, Cancelled, and No Cars Available for each pickup point side by side, helping identify where the supply-demand gap is worse.

This is important because the pickup point plays a major role in trip completion success and can influence Uber’s strategy for allocating drivers.

##### 2. What is/are the insight(s) found from the chart?

Airport pickups have significantly more Cancelled and No Cars Available requests compared to the City.

In contrast, City pickups show a higher proportion of Completed trips, indicating better driver availability and fulfillment rates.

The chart reveals that supply issues are more severe at the Airport, even though demand is high there.

This insight suggests that Airport users are underserved, possibly due to longer travel distances, parking challenges, or fewer returning drivers.
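Since the two pickup points may not receive the same number of requests, comparing proportions is safer than comparing raw counts. The sketch below uses `pd.crosstab` with `normalize="index"` on hypothetical rows; running the same call on `df` would give the actual status mix per pickup point.

```python
import pandas as pd

# Hypothetical requests for each pickup point (not real data).
sample = pd.DataFrame({
    "Pickup point": ["Airport"] * 4 + ["City"] * 4,
    "Status": ["Trip Completed", "No Cars Available", "No Cars Available",
               "No Cars Available", "Trip Completed", "Trip Completed",
               "Cancelled", "Cancelled"],
})

# Row-wise percentages: each pickup point's statuses sum to 100.
status_mix = (
    pd.crosstab(sample["Pickup point"], sample["Status"], normalize="index")
    .mul(100)
    .round(1)
)
print(status_mix)
```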

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Business Impact:**

These insights are extremely valuable because they can help Uber:

Reallocate or reserve drivers specifically for Airport pickups during peak hours.

Design location-based incentives for drivers to pick up or return to the Airport area.

Improve fulfillment and reduce missed business from high-value Airport customers.

Solving this imbalance can directly boost completed trips, customer satisfaction, and brand reliability, especially for travelers who rely on timely service.

**Risk of Negative Growth:**

If the Airport cancellation and unavailability trend continues, it could:

Frustrate frequent flyers and airport commuters who expect reliability.

Lead to negative app reviews, loss of repeat customers, and shifts to competitors.

Damage Uber's reputation for airport coverage, a critical segment for high-value trips.

Neglecting this insight could result in long-term customer churn, especially from a business-oriented audience.

#### Chart - 4

In [None]:
cancel_unavail = df[df['Status'].isin(['Cancelled', 'No Cars Available'])]

sns.countplot(data=cancel_unavail, x='Time slot', hue='Status', order=cancel_unavail['Time slot'].value_counts().index)
plt.title('Cancelled & No Cars by Time Slot')
plt.xticks(rotation=45)
plt.show()

##### 1. Why did you pick the specific chart?

I chose this grouped bar chart to visualize the distribution of unsuccessful trips (i.e., those that were either Cancelled or showed “No Cars Available”) across different time slots.

This chart helps identify which parts of the day face the most severe supply issues, making it easier to pinpoint when Uber is unable to meet customer demand. It's a crucial visual for understanding when Uber’s service delivery breaks down.

##### 2. What is/are the insight(s) found from the chart?

Early Morning (e.g., 5 AM to 7 AM) shows the highest cancellations, likely due to drivers rejecting trips or not being available.

Night time slots (12 AM to 5 AM, per the slot definition in the wrangling code) show the highest “No Cars Available” cases, meaning Uber isn’t even matching a driver to the ride.

These are the critical failure zones where Uber is unable to fulfill ride requests.

This insight reveals that the supply-demand gap is time-sensitive, with cancellations peaking early morning and driver unavailability peaking at night.
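To confirm which failure type dominates each slot without reading it off the bars, the failed requests can be cross-tabulated and the largest column picked per row. This is a sketch on hypothetical rows; the variable names `failed`, `counts`, and `dominant` are illustrative.

```python
import pandas as pd

# Hypothetical requests across two slots (not real data).
sample = pd.DataFrame({
    "Time slot": ["Early Morning"] * 4 + ["Night"] * 4,
    "Status": ["Cancelled", "Cancelled", "Cancelled", "No Cars Available",
               "No Cars Available", "No Cars Available", "Cancelled",
               "Trip Completed"],
})

# Keep only unsuccessful requests, as in the chart above.
failed = sample[sample["Status"].isin(["Cancelled", "No Cars Available"])]

# Count each failure type per slot, then pick the most frequent one.
counts = pd.crosstab(failed["Time slot"], failed["Status"])
dominant = counts.idxmax(axis=1)
print(counts)
print(dominant)
```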

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Business Impact:**

These insights are extremely valuable for Uber’s operations. By acting on them, Uber can:

Deploy more drivers during high-failure slots like Night and Early Morning.

Introduce early morning shift bonuses to prevent cancellations.

Improve customer satisfaction by reducing failed trip attempts.

Convert missed rides into revenue, increasing daily trip completions.

This creates a direct path to better service coverage, improved trust, and higher earnings.

**Risk of Negative Growth:**

If these time-slot issues are ignored, it can lead to:

User frustration due to repeated failed rides at critical times (e.g., catching a flight early morning).

Customer churn, especially for those relying on off-peak travel.

Loss of brand trust for 24/7 service reliability.

This can result in negative reviews, fewer daily active users, and a reduction in market share, especially to competitors offering better reliability during off-hours.

#### Chart - 5

In [None]:
heatmap_data = pd.crosstab(cancel_unavail['Time slot'], cancel_unavail['Pickup point'])

sns.heatmap(heatmap_data, annot=True, cmap='Reds')
plt.title('Heatmap of Issues by Time Slot & Pickup Point')
plt.show()

##### 1. Why did you pick the specific chart?

I selected a heatmap because it is the most effective way to visualize two-dimensional relationships — in this case, between Time slot and Pickup point — and show the frequency of unfulfilled requests (Cancelled + No Cars Available).

The heatmap allows us to:

Identify patterns quickly through color intensity

Spot the worst-performing combinations (e.g., Night + Airport)

Present the data in a visually compact and professional format

This is ideal for showcasing where Uber faces the most operational stress in terms of location and time together.

##### 2. What is/are the insight(s) found from the chart?

The darkest red cells (highest values) are mostly concentrated around Airport pickups during the Night and Early Morning time slots.

This indicates that most of Uber’s failed rides (either cancelled or no cars) occur from the Airport during these times.

The City also shows some issues, but they are less frequent and less severe in comparison.

Insight Summary:

The combination of Airport + Night/Early Morning is the most problematic, where Uber is consistently unable to meet demand.
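The heatmap above shows raw failure counts, which can be inflated simply by higher request volume. One possible complement is a table of failure rates per time slot and pickup point. The sketch below assumes a helper column `Failed` (not part of the dataset) and uses invented rows.

```python
import pandas as pd

# Hypothetical requests (not real data).
sample = pd.DataFrame({
    "Time slot": ["Night", "Night", "Night", "Night",
                  "Morning", "Morning", "Morning", "Morning"],
    "Pickup point": ["Airport", "Airport", "City", "City",
                     "Airport", "City", "City", "City"],
    "Status": ["No Cars Available", "Cancelled", "Trip Completed", "Cancelled",
               "Trip Completed", "Trip Completed", "Trip Completed", "Cancelled"],
})

# 1 for any request that did not end in a completed trip, else 0.
sample["Failed"] = (sample["Status"] != "Trip Completed").astype(int)

# Mean of the flag = share of failed requests in each slot/location cell.
failure_rate = sample.pivot_table(
    index="Time slot", columns="Pickup point", values="Failed", aggfunc="mean"
)
print(failure_rate.round(2))
```

The resulting table can be passed to `sns.heatmap(failure_rate, annot=True, cmap='Reds')` in the same way as the count-based crosstab.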

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Business Impact:**

These insights allow Uber to:

Target the exact location + time combinations that need intervention.

Improve planning by allocating more drivers to Airport at Night and Early Morning.

Implement location-time-based incentives or scheduling systems.

Reduce failed rides and increase ride completion percentage.

This can directly lead to improved customer retention, increased revenue, and a stronger brand reputation for availability.

**Risk of Negative Growth:**

If Uber ignores these high-risk time-location blocks:

Frequent travelers, airport commuters, and emergency riders will experience consistent failures.

Leads to high customer churn, especially among time-sensitive users.

Can cause reputation damage, as users may feel Uber is unreliable during critical hours.

Over time, this could reduce app usage during late hours, negatively impacting revenue and opening space for competitors.

#### Chart - 6

In [None]:
plt.figure(figsize=(10,5))
sns.countplot(data=df, x='Hour', palette='viridis')
plt.title('Ride Requests by Hour of the Day')
plt.xlabel('Hour (24-hour format)')
plt.ylabel('Number of Requests')
plt.grid(True)
plt.show()

##### 1. Why did you pick the specific chart?

I selected this bar chart to visualize the hourly distribution of ride requests. The Hour column is numerical (ranging from 0 to 23), and a countplot allows us to easily see which hours have the highest and lowest demand.

This chart provides a time-specific breakdown, helping Uber understand hourly demand patterns, which is essential for driver scheduling and demand forecasting.

##### 2. What is/are the insight(s) found from the chart?

Demand peaks around 8 AM and 6–7 PM, aligning with office hours and evening commutes.

There is a noticeable dip in ride requests during late-night hours (12 AM to 5 AM).

Early Morning (5 AM–7 AM) still shows moderate activity, possibly from airport or early commuters.

Insight Summary:

Uber experiences predictable spikes in ride requests during Morning (7–10 AM) and Evening (5–8 PM). These are key periods for ride optimization.
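The hourly view becomes more actionable when demand is set against supply. A minimal sketch follows, treating every request as demand and every completed trip as supply; that definition, the `Completed` helper column, and the sample rows are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical requests at three different hours (not real data).
sample = pd.DataFrame({
    "Hour": [8, 8, 8, 18, 18, 2],
    "Status": ["Trip Completed", "Cancelled", "Trip Completed",
               "No Cars Available", "Trip Completed", "No Cars Available"],
})

# Helper flag marking requests that ended in a completed trip.
sample["Completed"] = sample["Status"] == "Trip Completed"

# Demand = all requests, supply = completed trips, gap = unmet requests.
hourly = sample.groupby("Hour").agg(
    demand=("Status", "size"),
    supply=("Completed", "sum"),
)
hourly["gap"] = hourly["demand"] - hourly["supply"]
print(hourly)
```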

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Business Impact:**

These insights are very useful to:

Optimize driver shift scheduling — ensuring more drivers are online during high-demand hours.

Reduce cancellations and “No Cars” problems by matching demand proactively.

Improve customer satisfaction during peak hours by maintaining smooth ride availability.

This leads to better ride fulfillment, higher revenue, and a stronger competitive edge.

**Risk of Negative Growth:**

If Uber does not act on hourly patterns:

Drivers may be offline during demand spikes, leading to unmet demand and cancellations.

Night-time requests, even if fewer, might be ignored — hurting customer trust and affecting critical users (airport travelers, emergencies).

Poor scheduling could overload drivers during peak hours, leading to burnout and cancellations.

This could result in negative app reviews, lower retention, and lost market share.

#### Chart - 7

In [None]:
pickup_counts = df['Pickup point'].value_counts()

plt.figure(figsize=(6,6))
plt.pie(pickup_counts, labels=pickup_counts.index, autopct='%1.1f%%', startangle=140, colors=['#66b3ff','#ff9999'])
plt.title('Pickup Point Distribution')
plt.axis('equal')
plt.show()

##### 1. Why did you pick the specific chart?

I used a pie chart because it is the most effective way to visualize the proportion of ride requests by pickup location. Since the Pickup point column contains only two categories — Airport and City — a pie chart provides a clear visual comparison of how requests are split between these two key locations.

This format is ideal when the goal is to show percentage shares of a whole in a simple and intuitive way.

##### 2. What is/are the insight(s) found from the chart?

The chart shows that a slightly higher percentage of ride requests originate from the Airport compared to the City.

This indicates that the Airport is a major source of demand and plays a crucial role in Uber’s business operations.

Despite similar request volumes, other charts showed more cancellations and “No Cars Available” issues at the Airport, revealing a mismatch between demand and supply at that location.

Insight Summary:

The Airport generates a higher portion of ride requests, yet it experiences more fulfillment issues — which suggests the need for better resource allocation in that zone.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

**Positive Business Impact:**

Knowing the pickup distribution helps Uber:

Prioritize operational resources (drivers, incentives) at higher-demand areas like the Airport.

Improve service quality and reliability for customers using Uber for travel to/from the Airport.

Increase completion rates by ensuring supply meets demand where it's needed most.

This insight can guide location-based driver strategies, improve customer satisfaction, and increase overall trip conversion.

**Risk of Negative Growth:**

If Uber ignores the higher demand at the Airport, it risks:

Losing loyal and frequent users (like business or air travelers).

Increased cancellations and “No Cars Available” scenarios — leading to negative reviews and customer churn.

Giving competitors an edge in a high-value zone like the Airport.

This can result in lost revenue and reduced brand reputation, especially during travel-heavy seasons.

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

To achieve the business objective of reducing Uber’s supply-demand gap and increasing ride fulfillment rates, I recommend the following data-driven strategies:

1.   Deploy More Drivers During Critical Time Slots

     *   Increase driver availability during Night (12 AM – 5 AM) and Early Morning (5 AM – 7 AM), where most cancellations and “No Cars Available” issues occur.
     *   Use time-based scheduling and shift rotation to ensure coverage.

2.   Prioritize the Airport Zone

     *   Allocate a dedicated pool of drivers for Airport pickups, as the data shows it has the highest request volume but also the highest unfulfilled rate.
     *   Introduce re-entry incentives for drivers who drop off at the Airport and wait for the next ride.

3.   Introduce Smart Incentive Programs

     *   Offer early morning bonuses or dynamic fare multipliers to drivers during critical hours.
     *   Reward drivers who consistently accept and complete rides from high-cancellation zones.

4.   Use Predictive Analytics

     *   Implement machine learning models to predict high-demand periods and automatically alert or dispatch drivers in advance.
     *   This ensures better planning and prevents service gaps before they happen.

5.   Monitor, Measure & Improve

     *   Regularly track time slot–wise and location–wise performance metrics.
     *   Use dashboards to monitor cancellations, availability, and fulfillment rate, and make real-time adjustments.
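The monitoring step could start from a small reusable report like the one sketched below. The function name `fulfillment_report`, its `min_rate` threshold of 0.5, and the sample rows are all hypothetical; a real dashboard would choose its own threshold and data source.

```python
import pandas as pd

def fulfillment_report(df, by, min_rate=0.5):
    """Fulfillment rate per group, flagging groups below min_rate (illustrative threshold)."""
    missing = [col for col in [*by, "Status"] if col not in df.columns]
    if missing:
        raise KeyError(f"Missing required columns: {missing}")

    # Mark completed trips, then aggregate per group.
    flagged = df.assign(Completed=df["Status"].eq("Trip Completed"))
    report = (
        flagged.groupby(by)
        .agg(requests=("Status", "size"), fulfillment_rate=("Completed", "mean"))
        .reset_index()
    )
    report["needs_attention"] = report["fulfillment_rate"] < min_rate
    return report

# Hypothetical requests (not real data).
sample = pd.DataFrame({
    "Time slot": ["Night", "Night", "Night", "Morning", "Morning", "Morning"],
    "Pickup point": ["Airport", "Airport", "Airport", "City", "City", "City"],
    "Status": ["No Cars Available", "No Cars Available", "Trip Completed",
               "Trip Completed", "Trip Completed", "Cancelled"],
})

report = fulfillment_report(sample, by=["Time slot", "Pickup point"])
print(report)
```

Grouping columns are passed as a list, so the same function can report by time slot alone, by pickup point alone, or by both together.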

# **Conclusion**

This project provided valuable insights into Uber's supply-demand gap by analyzing ride request data using Excel, SQL, and Python-based EDA. The data revealed that a significant number of ride requests were either cancelled or resulted in “No Cars Available,” particularly during Night and Early Morning hours, and most commonly from the Airport pickup point.

Through visualizations, heatmaps, and query-based exploration, we identified critical failure patterns and offered actionable recommendations. These included driver reallocation, targeted incentives, and the use of predictive analytics to improve fulfillment rates.

By addressing the time and location-based inefficiencies, Uber can significantly reduce ride failures, improve customer satisfaction, and increase revenue. The project successfully met its business objective by turning raw data into meaningful, strategic insights that support better decision-making.

In conclusion, this analysis highlights the importance of data-driven operations in building a more reliable and scalable ride-hailing service.



### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***