This repo contains a full analysis of the trips, routes, and pricing data of Uber. Uber is a transportation company with an app that allows passengers to hail a ride and drivers to charge fares and get paid. More specifically, Uber is a ridesharing company that hires independent contractors as drivers.
The insights from this project were presented inside MicroStrategy Workstation. For quick access, a summary of insights and screenshots is available in this Slide Deck.
One of the main objectives of this project was to answer questions like:
- How does Uber use near real-time data to determine fare prices? (Uber's Dynamic Pricing)
- Do factors like weather affect the surge factor Uber uses to determine trip fares?
- What best practices & routes can drivers adhere to for optimal earnings?
Furthermore, the project explores the potential of analysing real-time data from drivers in a specific area to give them live recommendations, so they can make better use of their time and increase their hourly earnings.
The data analysed in this project are public data from Uber, Uber Review, facts and supporting information from Statista (retrieved for free with ESC Clermont student credentials), and NYC Open Data, all accessible for free through their respective links.
- Fare Prices
index | distance | cab_type | time_stamp | destination | source | price | surge_multiplier | id | product_id | name |
---|---|---|---|---|---|---|---|---|---|---|
0 | 0.44 | Lyft | 1.544950e+12 | North Station | Haymarket Square | 5.0 | 1.0 | 424553bb-7174-41ea-aeb4-fe06d4f4b9d7 | lyft_line | Shared |
1 | 0.44 | Lyft | 1.543280e+12 | North Station | Haymarket Square | 11.0 | 1.0 | 4bd23055-6827-41c6-b23b-3c491f24e74d | lyft_premier | Lux |
2 | 0.44 | Lyft | 1.543370e+12 | North Station | Haymarket Square | 7.0 | 1.0 | 981a3613-77af-4620-a42a-0c0866077d1e | lyft | Lyft |
3 | 0.44 | Lyft | 1.543550e+12 | North Station | Haymarket Square | 26.0 | 1.0 | c2d88af2-d278-4bfd-a8d0-29ca77cc5512 | lyft_luxsuv | Lux Black XL |
4 | 0.44 | Lyft | 1.543460e+12 | North Station | Haymarket Square | 9.0 | 1.0 | e0126e1f-8ca9-4f2e-82b3-50505a09db9a | lyft_plus | Lyft XL |
- Customer Reviews
index | Date | Stars | Comment |
---|---|---|---|
0 | 10/29/2019 | 1 | I had an accident with an Uber driver in Mexic... |
1 | 10/28/2019 | 1 | I have had my account completely hacked to whe... |
2 | 10/27/2019 | 1 | I requested an 8 mile ride in Boston on a Satu... |
3 | 10/27/2019 | 1 | I've been driving off and on with the company ... |
4 | 10/25/2019 | 1 | Uber is overcharging for Toll fees. When In Fl... |
5 | 10/24/2019 | 1 | I had an airport flight today. Uber would not ... |
6 | 10/24/2019 | 1 | I worked for Uber and Lyft for 2.5 years and a... |
7 | 10/23/2019 | 1 | In July of this year I had sushi delivered to ... |
8 | 10/23/2019 | 1 | My driver, Rohan was nice, but when I tried to... |
9 | 10/21/2019 | 1 | I had seven fraudulent Uber transactions over ... |
- NYC Traffic Data
index | ID | Segment ID | Roadway Name | From | To | Direction | Date | 12:00-1:00 AM | 1:00-2:00AM | 2:00-3:00AM | ... | 2:00-3:00PM | 3:00-4:00PM | 4:00-5:00PM | 5:00-6:00PM | 6:00-7:00PM | 7:00-8:00PM | 8:00-9:00PM | 9:00-10:00PM | 10:00-11:00PM | 11:00-12:00AM |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2 | 70376 | 3 Avenue | East 154 Street | East 155 Street | NB | 9/13/2014 | 204 | 177 | 133 | ... | 520 | 611 | 573 | 546 | 582 | 528 | 432 | 328 | 282 | 240 |
1 | 2 | 70376 | 3 Avenue | East 155 Street | East 154 Street | SB | 9/13/2014 | 140 | 51 | 128 | ... | 379 | 376 | 329 | 362 | 418 | 335 | 282 | 247 | 237 | 191 |
2 | 56 | 176365 | Bedford Park Boulevard | Grand Concourse | Valentine Avenue | EB | 9/13/2014 | 94 | 73 | 65 | ... | 280 | 272 | 264 | 236 | 213 | 190 | 199 | 183 | 147 | 103 |
3 | 56 | 176365 | Bedford Park Boulevard | Grand Concourse | Valentine Avenue | WB | 9/13/2014 | 88 | 82 | 75 | ... | 237 | 276 | 223 | 240 | 217 | 198 | 186 | 162 | 157 | 103 |
4 | 62 | 147673 | Broadway | West 242 Street | 240 Street | SB | 9/13/2014 | 255 | 209 | 149 | ... | 732 | 809 | 707 | 675 | 641 | 556 | 546 | 465 | 425 | 324 |
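Because the traffic counts are stored wide, with one column per hour, finding the busiest hour for a road segment means comparing values across columns. The sketch below is an illustration only: `peak_hour` is a hypothetical helper, not code from this repo, and the row holds just a few of the afternoon columns visible in row 0 of the sample (the columns elided by `...` are left out).

```python
def peak_hour(row, hour_columns):
    """Return the (column name, count) pair with the highest traffic volume."""
    best = max(hour_columns, key=lambda col: row[col])
    return best, row[best]


# Subset of row 0 from the sample table (3 Avenue, northbound, 9/13/2014).
row = {
    "Roadway Name": "3 Avenue",
    "Direction": "NB",
    "2:00-3:00PM": 520,
    "3:00-4:00PM": 611,
    "4:00-5:00PM": 573,
    "5:00-6:00PM": 546,
    "6:00-7:00PM": 582,
}

# Hourly columns are the ones whose names end in AM or PM.
hour_columns = [name for name in row if name.endswith(("AM", "PM"))]
print(peak_hour(row, hour_columns))
```

Running the same comparison over every row and direction would give a per-segment picture of when each road is most congested.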
The results of the PCA show that weather and weather forecasts have a negligible effect on Uber's dynamic pricing.
Through the dashboard, drivers can pick the best routes to their destination based on the time, hour, and day of the week.
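Filtering by hour and day of the week requires those fields to be derived from the raw timestamps. The `time_stamp` values in the Fare Prices sample (around 1.54e12) look like Unix epoch milliseconds. Assuming that reading is correct, a minimal sketch of the conversion could look like this; `ride_time_features` is a hypothetical helper, not code from this repo.

```python
from datetime import datetime, timezone


def ride_time_features(time_stamp_ms):
    """Convert an epoch-millisecond timestamp to (UTC datetime, hour, weekday name).

    Assumption: `time_stamp` counts milliseconds since 1970-01-01 UTC.
    """
    moment = datetime.fromtimestamp(time_stamp_ms / 1000, tz=timezone.utc)
    return moment, moment.hour, moment.strftime("%A")


# First row of the Fare Prices sample (value as displayed, so already rounded).
moment, hour, weekday = ride_time_features(1.544950e12)
print(moment.isoformat(), hour, weekday)
```

The result is in UTC; for a dashboard aimed at drivers, the timestamps would likely need shifting to the local time zone of the rides first.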
The time-series analysis reached an accuracy of 72%. To achieve better accuracy, we decided to train a Random Forest model to predict fare prices based on Cab Type, Distance, etc.
The model was deployed using Dataiku, and the code used was exported accordingly.
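Tree-based models such as Random Forest are usually fed numeric feature vectors, so a categorical column like `cab_type` has to be encoded before training. The sketch below shows one common option, one-hot encoding, on two rows of the fare sample. It is an assumption about the preprocessing, not the exported Dataiku code: `encode_ride` is a hypothetical helper, including `surge_multiplier` as a feature is a guess, and `"Uber"` is assumed as the second cab type since only Lyft rows appear in the sample.

```python
def encode_ride(ride, cab_types):
    """Turn one ride record into a numeric feature vector.

    Layout: [distance, surge_multiplier, one-hot cab_type...].
    """
    one_hot = [1.0 if ride["cab_type"] == cab else 0.0 for cab in cab_types]
    return [float(ride["distance"]), float(ride["surge_multiplier"])] + one_hot


# Assumed category list; only "Lyft" is visible in the sample rows.
cab_types = ["Lyft", "Uber"]

# Rows 0 and 1 of the Fare Prices sample, reduced to the columns used here.
rides = [
    {"distance": 0.44, "cab_type": "Lyft", "surge_multiplier": 1.0, "price": 5.0},
    {"distance": 0.44, "cab_type": "Lyft", "surge_multiplier": 1.0, "price": 11.0},
]

features = [encode_ride(ride, cab_types) for ride in rides]  # model inputs
targets = [ride["price"] for ride in rides]                  # values to predict
print(features, targets)
```

Note that these two rows have identical features but different prices, because they differ in product (`lyft_line` versus `lyft_premier`); a usable model would probably need the product name encoded as well.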
Information on polarity and subjectivity was later used to derive best practices and recommendations for drivers. These insights and recommendations are available in the project's Slide Deck.
The main objective of the project was to provide useful recommendations and predictions for Uber drivers, dynamically supporting their decision making and helping them achieve higher earnings.
A VBA form was created to collect data from active drivers in certain regions. This data is later used to train our Random Forest model to predict the surge factor for different times of the day with better accuracy, and the predictions are then shared back with Uber drivers operating in the same area.
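Before training a model on the collected reports, a simple baseline is useful for comparison: the average surge factor reported for each hour of the day. The sketch below is that baseline only, not the Random Forest model; `mean_surge_by_hour` and the report values are hypothetical.

```python
from collections import defaultdict


def mean_surge_by_hour(reports):
    """Average the reported surge factor for each hour of the day (0-23)."""
    totals = defaultdict(lambda: [0.0, 0])  # hour -> [sum of surge, report count]
    for hour, surge in reports:
        if not 0 <= hour <= 23:
            raise ValueError(f"hour out of range: {hour}")
        totals[hour][0] += surge
        totals[hour][1] += 1
    return {hour: total / count for hour, (total, count) in sorted(totals.items())}


# Hypothetical (hour, surge factor) reports from drivers in one area.
reports = [(8, 1.5), (8, 1.0), (17, 2.0), (17, 1.5), (17, 1.0), (23, 1.0)]
print(mean_surge_by_hour(reports))
```

A trained model is only worth sharing with drivers if it predicts held-out reports better than this per-hour average.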
This system was intended to allow new Uber drivers to enter their target income and move through a series of forms to enter all their preferences. The application then uses the collected preferences to suggest different operating options for reaching the target income with the most convenience to the driver.
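The core arithmetic behind such a suggestion can be sketched as follows: given a target income and an estimated hourly rate for each option, compute how many hours each option would take. Everything here is hypothetical (the `hours_needed` helper, the option labels, and the rates); the actual forms and their logic are not described in this README.

```python
from math import ceil


def hours_needed(target_income, hourly_earnings):
    """Whole hours of driving needed to reach a target income."""
    if hourly_earnings <= 0:
        raise ValueError("hourly_earnings must be positive")
    return ceil(target_income / hourly_earnings)


# Hypothetical options: label -> estimated net earnings per hour.
options = {"weekday mornings": 18.0, "weekday evenings": 24.0, "weekend nights": 30.0}
target = 600.0

plan = {label: hours_needed(target, rate) for label, rate in options.items()}
print(plan)
```

A fuller version would filter the options by the driver's stated preferences before ranking them by hours required.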
To ease new cab drivers' entry into the market and help them build a better Uber profile with better reviews, the used cars dataset from NYC DOT was displayed in an interactive dashboard that helps new entrants pick the most convenient car in terms of fuel consumption, leg space, price, and make.
- Clone the repo using this command in your terminal:
git clone https://github.com/M-ElShazly/uber_case_study.git