# **Project Name**    - Uber Supply Demand Gap Analysis




##### **Project Type**    - Exploratory Data Analysis (EDA)
##### **Contribution**    - Individual
##### **Team Member 1 -** GANGARAPU DATHA NAGA SAI

In [None]:
# core libraries for data handling and plotting
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns


# display settings
pd.set_option('display.max_columns', None)

# load dataset
df = pd.read_csv('Uber Request Data.csv')

# preview data
df.head()


In [None]:
df.info()


In [None]:
df.columns


In [None]:
df.shape


In [None]:
df.isnull().sum()


In [None]:
df_clean = df.copy()


In [None]:
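# convert timestamp strings to datetime
# dayfirst=True reads ambiguous dates as day/month/year
# errors='coerce' turns unparseable values into NaT instead of raising
df_clean['Request timestamp'] = pd.to_datetime(
    df_clean['Request timestamp'],
    errors='coerce',
    dayfirst=True
)

df_clean['Drop timestamp'] = pd.to_datetime(
    df_clean['Drop timestamp'],
    errors='coerce',
    dayfirst=True
)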
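With `errors='coerce'`, any value that does not match the parsed format becomes `NaT` without a warning, and in pandas 2.0 and later a single format is inferred for the whole column. If the CSV happens to mix two timestamp layouts, valid requests could be blanked out. The sketch below is one way to guard against that; the two format strings and the helper name `parse_timestamps` are assumptions, not something this notebook confirms about the file.

```python
import pandas as pd

# Assumed layouts (hypothetical); adjust them to whatever the CSV really contains.
ASSUMED_FORMATS = ("%d/%m/%Y %H:%M", "%d-%m-%Y %H:%M:%S")

def parse_timestamps(series, formats=ASSUMED_FORMATS):
    """Try each format in turn and keep the first successful parse per row."""
    parsed = None
    for fmt in formats:
        attempt = pd.to_datetime(series, format=fmt, errors="coerce")
        parsed = attempt if parsed is None else parsed.fillna(attempt)
    return parsed

# Illustrative strings only, not rows taken from the dataset.
sample = pd.Series(["11/7/2016 11:51", "13-07-2016 08:33:16", None])
parsed = parse_timestamps(sample)

# Rows that had a value but still failed to parse deserve a closer look.
unparsed = int((parsed.isna() & sample.notna()).sum())
print(parsed)
print("unparsed non-null rows:", unparsed)
```

Comparing the count of `NaT` values before and after conversion is a quick way to see whether the conversion in this notebook dropped any request timestamps.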


In [None]:
df_clean.info()


In [None]:
df_clean['Request_hour'] = df_clean['Request timestamp'].dt.hour


In [None]:
df_clean[['Request timestamp', 'Request_hour']].head()


In [None]:
def time_slot(hour):
    """Map an hour of the day (0-23) to a named time slot; missing hours stay NaN."""
    if pd.isna(hour):
        return np.nan
    elif 0 <= hour < 5:
        return 'Early Morning'
    elif 5 <= hour < 10:
        return 'Morning'
    elif 10 <= hour < 16:
        return 'Afternoon'
    elif 16 <= hour < 20:
        return 'Evening'
    elif 20 <= hour < 24:
        return 'Night'

df_clean['Time_slot'] = df_clean['Request_hour'].apply(time_slot)
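As an alternative sketch, the same hour ranges can be expressed with `pd.cut`, which returns an ordered categorical. That ordering may be convenient later, because tables and bar charts can then follow the time of day rather than alphabetical or count order. The names `SLOT_EDGES` and `SLOT_LABELS` are introduced here for illustration, and the hours are made-up values.

```python
import numpy as np
import pandas as pd

# Same boundaries as the time_slot function: [0,5), [5,10), [10,16), [16,20), [20,24)
SLOT_EDGES = [0, 5, 10, 16, 20, 24]
SLOT_LABELS = ["Early Morning", "Morning", "Afternoon", "Evening", "Night"]

# Illustrative hours, including a missing value.
hours = pd.Series([0, 4, 5, 9, 10, 15, 16, 19, 20, 23, np.nan])

# right=False makes each bin include its left edge and exclude its right edge.
slots = pd.cut(hours, bins=SLOT_EDGES, right=False, labels=SLOT_LABELS)

# sort=False keeps the category (time-of-day) order instead of sorting by count.
print(slots.value_counts(sort=False))
```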


In [None]:
df_clean[['Request_hour', 'Time_slot']].head(10)


In [None]:
time_slot_counts = df_clean['Time_slot'].value_counts()
time_slot_counts


In [None]:
df_clean['Status'].value_counts()


In [None]:
status_time_slot = pd.crosstab(df_clean['Time_slot'], df_clean['Status'])
status_time_slot


In [None]:
df_clean['Pickup point'].value_counts()


In [None]:
pickup_status = pd.crosstab(df_clean['Pickup point'], df_clean['Status'])
pickup_status


In [None]:
time_pickup_status = pd.crosstab(
    [df_clean['Time_slot'], df_clean['Pickup point']],
    df_clean['Status']
)
time_pickup_status


In [None]:
plt.figure(figsize=(8,5))
time_slot_counts.plot(kind='bar')
plt.title('Number of Requests by Time Slot')
plt.xlabel('Time Slot')
plt.ylabel('Number of Requests')
plt.show()


In [None]:
status_time_slot.plot(kind='bar', figsize=(9,6))
plt.title('Trip Status by Time Slot')
plt.xlabel('Time Slot')
plt.ylabel('Number of Requests')
plt.show()


In [None]:
pickup_status.plot(kind='bar', figsize=(8,6))
plt.title('Trip Status by Pickup Point')
plt.xlabel('Pickup Point')
plt.ylabel('Number of Requests')
plt.show()


# **Project Summary -**

This project focuses on analyzing the supply-demand gap in Uber ride requests using exploratory data analysis techniques. The dataset contains information about ride requests, pickup points, trip status, and timestamps. The primary objective of this analysis is to identify patterns and reasons behind trip failures such as cancellations and unavailability of cars.

The analysis begins with data cleaning and preprocessing. Missing values were identified in the Driver ID and Drop Timestamp columns. These missing values are meaningful, as they indicate cancelled trips or situations where cars were not available. Timestamp columns were converted into datetime format to allow time-based analysis. Additional features such as request hour and time slots were created to understand demand variations across different times of the day.
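One way to check that the missing values are meaningful rather than data errors is to count them per trip status. The sketch below uses a small made-up frame; the column name `Driver id` and the status labels `Trip Completed` and `Cancelled` are assumptions about the dataset.

```python
import numpy as np
import pandas as pd

# Toy rows for illustration only; column names and status labels are assumed.
toy = pd.DataFrame({
    "Status": ["Trip Completed", "Cancelled", "No Cars Available", "Trip Completed"],
    "Driver id": [1.0, 2.0, np.nan, 3.0],
    "Drop timestamp": pd.to_datetime(
        ["2016-07-11 12:30", None, None, "2016-07-11 18:05"]
    ),
})

# Count missing values in each column, split by trip status.
missing_by_status = (
    toy[["Driver id", "Drop timestamp"]]
    .isna()
    .groupby(toy["Status"])
    .sum()
)
print(missing_by_status)
```

If completed trips showed missing drop timestamps in the real data, that would point to a data quality problem rather than a failed request.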

Exploratory data analysis revealed that ride demand is highest during morning hours, followed by evening and night periods. Early morning shows comparatively lower demand. However, demand alone does not explain the problem. A significant number of trips fail due to either driver cancellations or unavailability of cars.

Further analysis of trip status showed that a large number of requests fail because no cars are available, especially during evening and night hours. Morning hours experience a high number of cancellations, indicating driver-side issues during peak traffic hours. Pickup point analysis revealed that airport rides suffer mainly from car unavailability, while city rides face higher cancellation rates.
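Raw counts can hide how severe failures are in a low-volume slot, so it can help to look at each status as a share of its time slot. The following is a minimal sketch on made-up rows; the labels `Cancelled` and `Trip Completed` are assumed.

```python
import pandas as pd

# Toy rows for illustration only.
toy = pd.DataFrame({
    "Time_slot": ["Morning"] * 4 + ["Evening"] * 4,
    "Status": [
        "Cancelled", "Cancelled", "Trip Completed", "Trip Completed",
        "No Cars Available", "No Cars Available", "No Cars Available", "Trip Completed",
    ],
})

# normalize='index' turns each row (time slot) into proportions that sum to 1.
status_share = (
    pd.crosstab(toy["Time_slot"], toy["Status"], normalize="index")
    .mul(100)
    .round(1)
)
print(status_share)
```

On the real data, the same call applied to `df_clean['Time_slot']` and `df_clean['Status']` would give the percentage of requests per outcome in each slot.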

A combined analysis of time slots and pickup points shows that Uber faces both time-based and location-based supply challenges. Evening and night airport rides are affected by a lack of cars, while morning city rides experience frequent cancellations. These insights indicate that the supply-demand gap is driven more by supply-side constraints than by a lack of customer demand.
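The combined view can be summarised as a failure rate per time slot and pickup point. The sketch below treats every request that is not `Trip Completed` as failed; that label and the helper column `Failed` are assumptions, and the rows are made up.

```python
import pandas as pd

# Toy rows for illustration only.
toy = pd.DataFrame({
    "Time_slot": ["Evening"] * 6 + ["Morning"] * 6,
    "Pickup point": ["Airport"] * 4 + ["City"] * 2 + ["Airport"] * 2 + ["City"] * 4,
    "Status": [
        "No Cars Available", "No Cars Available", "Trip Completed", "No Cars Available",
        "Trip Completed", "Trip Completed",
        "Trip Completed", "Trip Completed",
        "Cancelled", "Trip Completed", "Cancelled", "Cancelled",
    ],
})

# Flag failed requests, then average the flag to get a failure rate per segment.
toy["Failed"] = toy["Status"] != "Trip Completed"
failure_rate = (
    toy.groupby(["Time_slot", "Pickup point"])["Failed"]
    .mean()
    .unstack("Pickup point")
)
print(failure_rate)
```

A table in this shape can also be passed to `sns.heatmap` to show the problem segments at a glance.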

The findings of this project can help Uber improve driver allocation strategies, increase incentives during peak hours, and ensure better availability of cars at airports and during high-demand periods. This analysis provides actionable insights that can improve operational efficiency and customer satisfaction.


# **GitHub Link -**

GitHub Repository:
https://github.com/nagasai-datha/uber-supply-demand-gap-analysis


# **Problem Statement**


The objective of this project is to analyze Uber ride request data to identify the key factors contributing to the supply-demand gap. The analysis aims to understand when and where ride requests fail due to cancellations or unavailability of cars, and to provide data-driven insights to improve Uber’s operational efficiency.

#### **Define Your Business Objective?**

The business objective is to identify peak demand periods and locations where Uber faces supply shortages or high cancellation rates, and to suggest improvements in driver allocation and availability to reduce trip failures.
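To make the objective measurable, the gap can be defined per hour as demand minus supply. In the sketch below, demand is taken to be all requests and supply the completed trips; this definition, the label `Trip Completed`, and the toy rows are assumptions for illustration.

```python
import pandas as pd

# Toy rows for illustration only.
toy = pd.DataFrame({
    "Request_hour": [8, 8, 8, 18, 18, 18, 18, 2],
    "Status": [
        "Trip Completed", "Cancelled", "Trip Completed",
        "No Cars Available", "No Cars Available", "Trip Completed", "No Cars Available",
        "Trip Completed",
    ],
})

# demand = every request in the hour; supply = requests that ended in a trip.
hourly = toy.groupby("Request_hour").agg(
    demand=("Status", "size"),
    supply=("Status", lambda s: (s == "Trip Completed").sum()),
)
hourly["gap"] = hourly["demand"] - hourly["supply"]

# Hour with the largest number of unmet requests.
peak_gap_hour = hourly["gap"].idxmax()
print(hourly)
print("largest gap at hour:", peak_gap_hour)
```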


# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure each chart answers the questions in the following format.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
* Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries

### Dataset Loading

In [None]:
# Load Dataset

### Dataset First View

In [None]:
# Dataset First Look

### Dataset Rows & Columns count

In [None]:
# Dataset Rows & Columns count

### Dataset Information

In [None]:
# Dataset Info

#### Duplicate Values

In [None]:
# Dataset Duplicate Value Count

#### Missing Values/Null Values

In [None]:
# Missing Values/Null Values Count

In [None]:
# Visualizing the missing values

### What did you know about your dataset?

Answer Here

## ***2. Understanding Your Variables***

In [None]:
# Dataset Columns

In [None]:
# Dataset Describe

### Variables Description

Answer Here

### Check Unique Values for each variable.

In [None]:
# Check Unique Values for each variable.

## 3. ***Data Wrangling***

### Data Wrangling Code

In [None]:
# Write your code to make your dataset analysis ready.

### What all manipulations have you done and insights you found?

Answer Here.

## ***4. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

#### Chart - 1

In [None]:
plt.figure(figsize=(8,5))
sns.countplot(data=df, x='Status')

plt.title('Distribution of Uber Ride Request Status')
plt.xlabel('Request Status')
plt.ylabel('Number of Requests')

plt.show()


##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

The chart shows that a significant number of ride requests do not result in successful trips.
Among the failed requests, "No Cars Available" contributes more than cancellations.
This indicates that the shortage of available cars is a bigger issue than cancellations.
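The comparison can be backed with percentages rather than bar heights. A minimal sketch on made-up statuses follows; the labels `Trip Completed` and `Cancelled` are assumed.

```python
import pandas as pd

# Toy statuses for illustration only.
toy_status = pd.Series(
    ["Trip Completed"] * 4 + ["No Cars Available"] * 4 + ["Cancelled"] * 2,
    name="Status",
)

# Share of each outcome as a percentage of all requests.
share = toy_status.value_counts(normalize=True).mul(100).round(1)

# Everything that is not a completed trip counts as a failed request here.
failed_share = share.drop("Trip Completed").sum()
print(share)
print("failed requests (%):", failed_share)
```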


##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, the insights help create a positive business impact by clearly identifying supply
shortages as the main reason for trip failures. Addressing driver availability can
significantly improve trip completion rates.

However, the high number of failed requests due to no car availability highlights a
negative growth factor, as unmet demand can lead to customer dissatisfaction and loss
of potential revenue.


#### Chart - 2

In [None]:
plt.figure(figsize=(8,5))
sns.countplot(data=df_clean, x='Time_slot')

plt.title('Number of Requests by Time Slot')
plt.xlabel('Time Slot')
plt.ylabel('Number of Requests')

plt.show()



##### 1. Why did you pick the specific chart?

This chart helps understand how Uber ride requests are distributed across different time slots during the day, identifying peak demand periods.

##### 2. What is/are the insight(s) found from the chart?

The chart shows that request volume is highest during Morning and Evening time slots, indicating peak demand periods. Early Morning has the lowest demand.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, these insights help Uber plan driver availability during peak hours. However, high demand during peak slots can negatively impact customer experience if supply is insufficient.

#### Chart - 3

In [None]:
# Chart - 3 visualization code
plt.figure(figsize=(8,5))
sns.countplot(data=df_clean, x='Pickup point', hue='Status')

plt.title('Request Status by Pickup Point')
plt.xlabel('Pickup Point')
plt.ylabel('Number of Requests')

plt.legend(title='Status')
plt.show()


##### 1. Why did you pick the specific chart?

This chart was chosen to compare ride request outcomes across different pickup
points. It helps identify whether trip failures are more frequent at the
Airport or within the City.


##### 2. What is/are the insight(s) found from the chart?

The chart shows that Airport requests have a higher number of failures due to
"No Cars Available", while City requests experience more cancellations.
This indicates location-based supply issues.
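The dominant failure reason at each pickup point can also be read off programmatically. The sketch below filters to failed requests and picks the most frequent status per location; the labels `Cancelled` and `Trip Completed` and the toy rows are assumptions.

```python
import pandas as pd

# Toy rows for illustration only.
toy = pd.DataFrame({
    "Pickup point": ["Airport"] * 4 + ["City"] * 6,
    "Status": [
        "No Cars Available", "No Cars Available", "No Cars Available", "Cancelled",
        "Cancelled", "Cancelled", "Cancelled", "No Cars Available",
        "Trip Completed", "Trip Completed",
    ],
})

# Keep only failed requests, then count failure reasons per pickup point.
failed = toy[toy["Status"] != "Trip Completed"]
reason_counts = pd.crosstab(failed["Pickup point"], failed["Status"])

# idxmax along the columns returns the most common failure reason per row.
dominant_reason = reason_counts.idxmax(axis=1)
print(dominant_reason)
```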

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

These insights help Uber optimize driver allocation by location.
Improving car availability at airports can significantly reduce failed trips.
However, frequent city cancellations may negatively affect customer trust.


#### Chart - 4

In [None]:
# Chart - 4: Request Status by Time Slot

plt.figure(figsize=(10,6))
sns.countplot(data=df_clean, x='Time_slot', hue='Status')

plt.title('Request Status by Time Slot')
plt.xlabel('Time Slot')
plt.ylabel('Number of Requests')

plt.legend(title='Status')
plt.show()


##### 1. Why did you pick the specific chart?

This chart was chosen to analyze how Uber ride request outcomes vary across different time slots. It helps identify peak hours where cancellations or unavailability of cars are highest.

##### 2. What is/are the insight(s) found from the chart?

The chart shows that Morning and Evening time slots have the highest number of requests. “No Cars Available” is especially high during Evening and Night hours, indicating strong demand but insufficient supply. Early Morning has the lowest demand and fewer failures.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

These insights help Uber plan better driver availability during peak hours, especially in the Evening and Night slots, improving trip completion rates. However, high failure rates during peak times can negatively impact customer satisfaction if supply issues are not addressed.

#### Chart - 5

In [None]:
# Chart - 5 visualization code
plt.figure(figsize=(14,6))

sns.countplot(
    data=df_clean,
    x='Time_slot',
    hue='Status',
)

plt.title('Request Status by Time Slot')
plt.xlabel('Time Slot')
plt.ylabel('Number of Requests')
plt.legend(title='Status')

plt.show()



##### 1. Why did you pick the specific chart?

This chart was chosen to analyze how Uber ride request outcomes vary across different time slots. It helps identify peak periods where cancellations or unavailability of cars are highest, enabling time-based supply planning.

##### 2. What is/are the insight(s) found from the chart?

The chart shows that Evening and Night time slots experience a high number of failed requests, mainly due to “No Cars Available”. Morning hours have higher completed trips but also notable cancellations, while Early Morning has the lowest overall demand.

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Yes, these insights help Uber improve driver allocation by time slot. Increasing driver availability during Evening and Night hours can reduce unmet demand and improve customer satisfaction. However, continued shortages during peak times can negatively impact revenue and user trust if not addressed.

#### Chart - 6

In [None]:
# Chart - 6 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 7

In [None]:
# Chart - 7 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 8

In [None]:
# Chart - 8 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 9

In [None]:
# Chart - 9 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 10

In [None]:
# Chart - 10 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 11

In [None]:
# Chart - 11 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 12

In [None]:
# Chart - 12 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 13

In [None]:
# Chart - 13 visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

##### 3. Will the gained insights help creating a positive business impact?
Are there any insights that lead to negative growth? Justify with specific reason.

Answer Here

#### Chart - 14 - Correlation Heatmap

In [None]:
# Correlation Heatmap visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

#### Chart - 15 - Pair Plot

In [None]:
# Pair Plot visualization code

##### 1. Why did you pick the specific chart?

Answer Here.

##### 2. What is/are the insight(s) found from the chart?

Answer Here

## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

To achieve the business objective, Uber should focus on improving driver availability during peak demand periods, especially during Evening and Night time slots and at Airport pickup points. Incentive-based driver scheduling, surge pricing during high-demand hours, and better demand forecasting can help reduce the number of failed requests due to unavailability of cars.

Additionally, improving driver engagement in city areas during morning peak hours can help reduce cancellations. These measures will improve trip completion rates, customer satisfaction, and overall revenue.


# **Conclusion**

This project analyzed Uber ride request data to understand the supply-demand gap using exploratory data analysis. The analysis revealed that demand is highest during Morning, Evening, and Night time slots, while a significant number of ride failures occur due to unavailability of cars rather than cancellations.

Airport pickup points suffer mainly from supply shortages, whereas city trips experience higher cancellations. These findings indicate that Uber’s supply-demand gap is driven by both time-based and location-based challenges. Addressing these issues through better driver allocation and demand forecasting can significantly improve operational efficiency and customer experience.


### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***