# **Project Name**    - Uber Supply And Demand Gap Analysis



##### **Project Type**    - EDA
##### **Name**            - Akshaykumar


# **Project Summary -**

This project analyzes Uber trip data to understand when ride requests outnumber available drivers. By examining trip statuses, driver availability, and time segments throughout the day, it aims to identify the times and areas with supply-demand gaps. The goal is to find patterns and suggest ways to improve driver availability during busy periods. Initial data preprocessing was done in SQL. Basic analysis was then done in Excel, where dashboards were built to explore the data and identify the supply-demand gap. This notebook carries out a more detailed temporal analysis to identify peak request times and driver availability during the same time frames.



# **GitHub Link -**

# **Problem Statement**


This project aims to analyze Uber trip data to identify when and where supply-demand gaps occur and to understand the underlying reasons behind them, with the goal of improving driver deployment and service efficiency.

#### **Define Your Business Objective?**

The business objective is to analyze ride request and driver availability data in order to identify times and locations with supply-demand gaps, enabling better driver scheduling and operational decisions to improve service quality and reduce rider wait times.

# **General Guidelines** : -  

1.   Well-structured, formatted, and commented code is required.
2.   Exception Handling, Production Grade Code & Deployment Ready Code will be a plus. Those students will be awarded some additional credits.
     
     The additional credits will have advantages over other students during Star Student selection.
       
             [ Note: - Deployment Ready Code is defined as, the whole .ipynb notebook should be executable in one go
                       without a single error logged. ]

3.   Each and every logic should have proper comments.
4. You may add as many charts as you want. Make sure the following questions are answered for each chart.
        

```
# Chart visualization code
```
            

*   Why did you pick the specific chart?
*   What is/are the insight(s) found from the chart?
*   Will the gained insights help create a positive business impact? Are there any insights that lead to negative growth? Justify with a specific reason.

5. You have to create at least 20 logical & meaningful charts having important insights.


[ Hints : - Do the Visualization in a structured way while following the "UBM" Rule.

U - Univariate Analysis,

B - Bivariate Analysis (Numerical - Categorical, Numerical - Numerical, Categorical - Categorical)

M - Multivariate Analysis
 ]





# ***Let's Begin !***

## ***1. Know Your Data***

### Import Libraries

In [None]:
# Import Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

### Dataset Loading

In [None]:
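# Load the raw requests; dayfirst=True assumes day-first timestamps such as 13-07-2016 (adjust if the file differs).
df = pd.read_csv("/content/Uber Request Data.csv", parse_dates=['Request timestamp', 'Drop timestamp'], dayfirst=True)
print(df.head())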


### Dataset First View

In [None]:
# Dataset First Look
df.head()

In [None]:
print(df.isnull().sum())


In [None]:
# Normalize column names to snake_case, e.g. 'Request timestamp' -> 'request_timestamp'.
df.rename(columns=lambda x: x.strip().lower().replace(' ', '_'), inplace=True)


In [None]:
print(df.dtypes)


In [None]:
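# Ensure both columns are datetime; unparseable values become NaT. dayfirst=True assumes day-first strings.
df['request_timestamp'] = pd.to_datetime(df['request_timestamp'], errors='coerce', dayfirst=True)
df['drop_timestamp'] = pd.to_datetime(df['drop_timestamp'], errors='coerce', dayfirst=True)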


In [None]:
print(df[df['request_timestamp'].isna()])


In [None]:
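# Derive the calendar date and hour of each request for temporal grouping.
df['request_date'] = df['request_timestamp'].dt.date
df['request_hour'] = df['request_timestamp'].dt.hour
# Minutes from request to drop (includes any wait before pickup); NaN when there is no drop timestamp.
df['trip_duration_min'] = (df['drop_timestamp'] - df['request_timestamp']).dt.total_seconds() / 60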
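
The summary refers to time segments throughout the day. A minimal sketch of deriving such segments from `request_hour` with `pd.cut` is shown here. The bin edges, the slot labels, and the helper name `add_time_slot` are hypothetical choices for illustration, not values taken from the data, and should be tuned to the observed peaks.

```python
import pandas as pd

# Hypothetical slot boundaries (hours of the day); adjust to the observed peaks.
SLOT_BINS = [0, 5, 10, 17, 22, 24]
SLOT_LABELS = ["Late Night", "Morning", "Daytime", "Evening", "Night"]


def add_time_slot(frame: pd.DataFrame, hour_col: str = "request_hour") -> pd.DataFrame:
    """Return a copy of frame with a categorical 'time_slot' column."""
    out = frame.copy()
    # right=False makes each bin [start, end), so hour 5 falls in "Morning".
    out["time_slot"] = pd.cut(
        out[hour_col], bins=SLOT_BINS, labels=SLOT_LABELS, right=False
    )
    return out


# Toy hours only, to show how the bins behave at their edges.
sample = pd.DataFrame({"request_hour": [0, 4, 5, 9, 12, 18, 23]})
slotted = add_time_slot(sample)
print(slotted)
```

On the real data this would be called as `df = add_time_slot(df)`, after which `time_slot` can be used as a grouping column in the charts.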


In [None]:
print(df[df['trip_duration_min'] < 0])
print(df['trip_duration_min'].describe())


In [None]:
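# Drop negative durations only; keep NaN durations (requests with no drop timestamp), which a plain >= 0 filter would discard.
df = df[df['trip_duration_min'].isna() | (df['trip_duration_min'] >= 0)]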


In [None]:
df.isnull().sum()
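
Null counts alone do not show whether missing drop timestamps are dirty data or simply requests that never became trips. A rough check, sketched here on a toy frame, is to cross-tabulate `status` against whether `drop_timestamp` is present. The status labels and the helper name `missing_drop_by_status` are assumptions for illustration; verify the labels with `df['status'].unique()`.

```python
import pandas as pd


def missing_drop_by_status(frame: pd.DataFrame) -> pd.DataFrame:
    """Count requests with and without a drop timestamp for each status."""
    has_drop = frame["drop_timestamp"].notna().map(
        {True: "has_drop", False: "no_drop"}
    )
    return pd.crosstab(frame["status"], has_drop)


# Toy data with assumed status labels; replace with the notebook's df.
toy = pd.DataFrame({
    "status": ["Trip Completed", "Cancelled", "No Cars Available", "Trip Completed"],
    "drop_timestamp": pd.to_datetime(
        ["2016-07-11 12:00", None, None, "2016-07-12 09:30"]
    ),
})
missing_table = missing_drop_by_status(toy)
print(missing_table)
```

If the real data shows missing drop timestamps only for requests that were not completed, those rows carry the unmet demand and should stay in the analysis.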

## ***2. Data Visualization, Storytelling & Experimenting with charts : Understand the relationships between variables***

In [None]:
# Univariate: number of requests per status (plt and sns are imported at the top of the notebook).
plt.figure(figsize=(6,4))
sns.countplot(data=df, x='status')
plt.title('Number of Trips by Status')
plt.xticks(rotation=45)
plt.show()


In [None]:
plt.figure(figsize=(6,4))
sns.countplot(data=df, x='pickup_point')
plt.title('Trips by Pickup Point')
plt.show()


In [None]:
plt.figure(figsize=(6,4))
sns.boxplot(data=df, x='pickup_point', y='trip_duration_min')
plt.title('Trip Duration Distribution by Pickup Point')
plt.ylim(0, 60)  # limit y-axis for better visualization
plt.show()


In [None]:
daily_trips = df.groupby('request_date')['request_id'].count().reset_index()
plt.figure(figsize=(12,5))
sns.lineplot(data=daily_trips, x='request_date', y='request_id')
plt.title('Daily Number of Uber Requests')
plt.xlabel('Date')
plt.ylabel('Number of Trips')
plt.show()
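
The temporal analysis described in the summary can be sketched as an hourly table of demand, supply, and gap. This is a minimal sketch on toy data: it treats every request as demand and only completed trips as supply, and the `Trip Completed` label and the helper name `hourly_gap` are assumptions rather than values confirmed by the dataset.

```python
import pandas as pd

COMPLETED = "Trip Completed"  # assumed label for a fulfilled request


def hourly_gap(frame: pd.DataFrame) -> pd.DataFrame:
    """Per-hour demand (all requests), supply (completed trips) and gap."""
    table = (
        frame.assign(completed=frame["status"].eq(COMPLETED))
        .groupby("request_hour")
        .agg(demand=("status", "size"), supply=("completed", "sum"))
    )
    table["gap"] = table["demand"] - table["supply"]
    return table


# Toy requests; replace with the notebook's df.
toy = pd.DataFrame({
    "request_hour": [5, 5, 5, 9, 9, 18, 18, 18, 18],
    "status": [
        "Cancelled", "Cancelled", "Trip Completed",
        "Trip Completed", "Trip Completed",
        "No Cars Available", "No Cars Available", "No Cars Available",
        "Trip Completed",
    ],
})
hourly = hourly_gap(toy)
print(hourly.sort_values("gap", ascending=False))
```

The resulting table can be charted with `sns.lineplot` or `hourly[["demand", "supply"]].plot()` to show where the two curves diverge.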


## **5. Solution to Business Objective**

#### What do you suggest the client to achieve Business Objective ?
Explain Briefly.

Based on the analysis and visualizations, the following measures address the identified supply-demand gaps:
 - Increase driver shifts during peak times (morning and night) to meet high demand, especially in high-gap locations.
 - Offer incentives, bonuses, or badges to drivers who work during low-supply, high-demand periods such as late nights and early mornings.
 - Focus on increasing driver availability in high-demand areas, such as airports and city hotspots. Use geo-analytics to identify congested areas and deploy drivers strategically in real time.
 - Offer flexible working hours, especially for drivers who prefer night shifts.
 - Review shifts periodically to keep supply matched to demand.
 - Use predictive analytics to forecast hourly and daily demand and optimize driver allocation.
 - Encourage riders to travel during low-demand periods by offering discounts.
 - Use driver and rider feedback to continually improve scheduling practices and identify bottlenecks.

Implementing these practices should bring supply closer to demand and reduce the gap.
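
Which measure fits a location depends on why requests go unserved there. One way to check this is a row-normalized cross-tabulation of `pickup_point` against `status`, sketched here on toy data. The pickup point names, the status labels, and the helper name `status_share_by_pickup` are assumptions for illustration only.

```python
import pandas as pd


def status_share_by_pickup(frame: pd.DataFrame) -> pd.DataFrame:
    """Percentage of each status within every pickup point (rows sum to 100)."""
    share = pd.crosstab(frame["pickup_point"], frame["status"], normalize="index")
    return (share * 100).round(1)


# Toy requests with assumed labels; replace with the notebook's df.
toy = pd.DataFrame({
    "pickup_point": ["Airport"] * 4 + ["City"] * 4,
    "status": [
        "No Cars Available", "No Cars Available", "No Cars Available", "Trip Completed",
        "Cancelled", "Cancelled", "Trip Completed", "Trip Completed",
    ],
})
share = status_share_by_pickup(toy)
print(share)
```

In this toy example a location dominated by unavailable cars would point toward adding supply, while one dominated by cancellations would point toward driver incentives. The real split has to come from the data.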


# **Conclusion**

This analysis examines Uber’s demand and supply dynamics and highlights the time segments with the highest unmet demand. Through data preprocessing, exploratory data analysis, and visualization, it identifies the key areas where driver supply needs to be optimized to reduce gaps. The insights emphasize the importance of strategic driver scheduling, location-specific deployment, and incentives during high-demand periods. Implementing these targeted interventions can enhance operational efficiency, improve customer satisfaction, and bring supply and demand into better balance.

### ***Hurrah! You have successfully completed your EDA Capstone Project !!!***