# Driver Pay
- This notebook studies how rides differ by platform
- Uber accounts for 68% of rides, compared with 23% for Lyft
- I show that Lyft rides are more likely to include a tip for the driver
- Uber drivers are paid more per mile and per minute than Lyft drivers, and the per-mile gap persists after including tips


In [1]:
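from sqlalchemy import create_engine

# Connect to the local Postgres database that holds the trip data
engine = create_engine('postgresql://root:root@localhost:5432/uber')
engine.connect()

# Load the SQL magic so that later cells can run queries with %%sql
%load_ext sql
%sql postgresql://root:root@localhost:5432/uber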

In [2]:
%%sql
SELECT * FROM main 
LIMIT 10;

 * postgresql://root:***@localhost:5432/uber
10 rows affected.


index,hvfhs_license_num,dispatching_base_num,originating_base_num,request_datetime,on_scene_datetime,pickup_datetime,dropoff_datetime,PULocationID,DOLocationID,trip_miles,trip_time,base_passenger_fare,tolls,bcf,sales_tax,congestion_surcharge,airport_fee,tips,driver_pay,shared_request_flag,shared_match_flag,access_a_ride_flag,wav_request_flag,wav_match_flag,pickup_hour,pickup_dayofweek,platform,has_tips,driver_pay_per_mile,driver_pay_per_minute
19985285,HV0005,B02510,,2019-07-31 15:11:25,,2019-07-31 15:13:40,2019-07-31 15:25:00,210,155,1.653,680,10.31,0.0,0.19,0.67,0.0,,0.0,7.42,N,N,N,N,N,15,Wednesday,Lyft,0,4.488808,0.6547059
14073898,HV0003,B02871,B02871,2019-07-22 18:27:36,2019-07-22 18:34:12,2019-07-22 18:35:17,2019-07-22 18:46:59,61,35,1.68,703,8.49,0.0,0.0,0.74,0.0,,0.0,7.63,N,N,,N,N,18,Monday,Uber,0,4.5416665,0.6512091
9262571,HV0003,B02869,B02869,2019-07-15 23:24:07,2019-07-15 23:26:05,2019-07-15 23:28:23,2019-07-15 23:50:20,35,139,9.35,1316,23.62,0.0,0.0,2.04,0.0,,0.0,21.1,N,N,,N,,23,Monday,Uber,0,2.2566845,0.9620061
9004029,HV0003,B02876,B02876,2019-07-15 15:44:52,2019-07-15 15:45:45,2019-07-15 15:47:53,2019-07-15 16:25:25,161,138,10.03,2251,44.94,6.12,0.0,4.42,2.75,,8.73,44.4,N,N,,N,,15,Monday,Uber,1,4.4267197,1.1834741
674083,HV0003,B02875,B02875,2019-07-02 08:17:17,2019-07-02 08:24:33,2019-07-02 08:25:32,2019-07-02 08:42:34,161,137,1.7,1023,-0.75,0.0,0.0,0.0,0.75,,0.0,6.58,Y,Y,,N,,8,Tuesday,Uber,0,3.8705883,0.38592374
1982460,HV0003,B02764,B02764,2019-07-04 14:41:58,2019-07-04 14:46:55,2019-07-04 14:48:39,2019-07-04 15:01:16,205,215,2.86,758,5.0,0.0,0.0,0.43,0.0,,0.0,5.93,Y,Y,,N,,14,Thursday,Uber,0,2.0734265,0.46939313
15439192,HV0005,B02510,,2019-07-24 21:32:42,,2019-07-24 21:34:31,2019-07-24 21:49:53,107,229,1.913,922,8.73,0.0,0.22,0.77,0.75,,0.0,0.0,Y,Y,N,N,N,21,Wednesday,Lyft,0,0.0,0.0
3885337,HV0003,B02883,B02883,2019-07-07 17:40:31,2019-07-07 17:41:21,2019-07-07 17:43:01,2019-07-07 18:07:33,68,138,9.48,1472,38.69,6.12,0.0,3.88,2.75,,0.0,31.66,N,N,,N,,17,Sunday,Uber,0,3.3396626,1.2904891
4275091,HV0003,B02764,B02764,2019-07-08 09:59:27,2019-07-08 09:59:45,2019-07-08 10:01:08,2019-07-08 10:17:33,192,16,4.49,984,14.45,0.0,0.0,1.25,0.0,,0.0,13.03,N,N,,N,,10,Monday,Uber,0,2.9020045,0.7945122
18672189,HV0004,B02800,,2019-07-29 12:45:22,,2019-07-29 12:53:37,2019-07-29 13:22:25,161,148,4.2,1728,8.75,0.0,0.0,0.78,0.75,,0.0,0.0,Y,Y,N,N,N,12,Monday,Via,0,0.0,0.0


### Tips and Ride Platform
- How does the likelihood of tips vary with the platform?
- To study this, I group rides by `platform`, the brand name that corresponds to each TLC license number (`hvfhs_license_num`)
- The expression `(tips > 0)::int` is 1 for a tipped ride and 0 otherwise, so its average is the share of rides with a tip
- I sort platforms in descending order of how likely a ride is to have a tip

- Lyft rides are the most likely to include a tip (18.43%), ahead of Uber (11.83%), Via (9.03%), and Juno (8.25%)


In [4]:
%%sql
SELECT 
platform,
ROUND(100*AVG(
    (tips > 0)::int
    ), 2) "Percent with Tips"
FROM main
GROUP BY platform
ORDER BY AVG((tips > 0)::int) DESC

 * postgresql://root:***@localhost:5432/uber
4 rows affected.


platform,Percent with Tips
Lyft,18.43
Uber,11.83
Via,9.03
Juno,8.25
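
The query works because the mean of a 0/1 indicator equals the share of rows where it is 1. Below is a minimal Python sketch of the same calculation on the ten sample rows shown earlier (far too few rows to reproduce the percentages above). The names `LICENSE_TO_PLATFORM` and `percent_with_tips` are hypothetical, and the Juno code `HV0002` is an assumption, since no Juno ride appears in the sample.

```python
from collections import defaultdict

# License-to-brand mapping. HV0003/HV0004/HV0005 match the sample rows;
# HV0002 for Juno is an assumption (no Juno ride appears in the sample).
LICENSE_TO_PLATFORM = {
    "HV0002": "Juno",
    "HV0003": "Uber",
    "HV0004": "Via",
    "HV0005": "Lyft",
}

# (hvfhs_license_num, tips) for the ten sample rows
sample = [
    ("HV0005", 0.0), ("HV0003", 0.0), ("HV0003", 0.0), ("HV0003", 8.73),
    ("HV0003", 0.0), ("HV0003", 0.0), ("HV0005", 0.0), ("HV0003", 0.0),
    ("HV0003", 0.0), ("HV0004", 0.0),
]

def percent_with_tips(rows):
    """Return {platform: percent of rides with tips > 0}."""
    flags = defaultdict(list)
    for license_num, tips in rows:
        # int(tips > 0) plays the role of (tips > 0)::int in the SQL
        flags[LICENSE_TO_PLATFORM[license_num]].append(int(tips > 0))
    return {p: round(100 * sum(f) / len(f), 2) for p, f in flags.items()}

shares = percent_with_tips(sample)

# Sort descending by tip share, mirroring ORDER BY ... DESC
for platform, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(platform, pct)
```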


### Market Share

In [7]:
%%sql
SELECT platform, 
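ROUND(
    -- COUNT(*) is an integer, so this division truncates to whole percents
    -- (use 100.0 * COUNT(*) to keep the decimals)
    (100 * COUNT(*) / (SELECT COUNT(*) FROM main))::numeric
, 2) market_share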
FROM main 
GROUP BY platform
ORDER BY COUNT(*) DESC

 * postgresql://root:***@localhost:5432/uber
4 rows affected.


platform,market_share
Uber,68.0
Lyft,23.0
Via,4.0
Juno,3.0
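
The shares above are whole numbers that sum to 98 rather than 100. That is what I would expect if `100 * COUNT(*) / total` is evaluated with integer division before the cast to `numeric`, since PostgreSQL truncates when both operands are integers. The sketch below illustrates the effect with made-up counts (not taken from the data), using Python's `//` as a stand-in for SQL integer division.

```python
# Hypothetical ride counts, chosen only to illustrate the truncation effect
counts = {"Uber": 6890, "Lyft": 2370, "Via": 440, "Juno": 300}
total = sum(counts.values())

# Integer division drops the fractional part of each share
truncated = {p: 100 * n // total for p, n in counts.items()}

# Floating-point division keeps it
exact = {p: round(100.0 * n / total, 2) for p, n in counts.items()}

print(truncated, "sum:", sum(truncated.values()))
print(exact)
```

In the SQL cell, writing `100.0 * COUNT(*)` should have the same effect as the floating-point version here.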


### Pay per Mile
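- Using the `aux` view, I compute the average pay per mile for each platform
- Juno (4.91) and Uber (4.42) pay the most per mile
- Via's average is far lower (0.23); the sample rows above include a Via ride with a recorded `driver_pay` of 0, so zero-pay records may be pulling this average down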

In [8]:
%%sql
SELECT 
platform, 
ROUND(
    AVG(driver_pay / trip_miles)::numeric
    ,2) pay_per_mile
FROM aux 
WHERE trip_miles > 0.00
GROUP BY platform
ORDER BY AVG(driver_pay / trip_miles) DESC

 * postgresql://root:***@localhost:5432/uber
4 rows affected.


platform,pay_per_mile
Juno,4.91
Uber,4.42
Lyft,3.66
Via,0.23
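
Note that `AVG(driver_pay / trip_miles)` is a mean of per-trip ratios, so a short trip counts as much as a long one. An alternative summary is total pay divided by total miles. The sketch below compares the two on the seven Uber rides from the sample rows shown earlier. It is only an illustration of the difference, not a reproduction of the figures above.

```python
# (trip_miles, driver_pay) for the seven Uber rides in the sample rows
uber_rides = [
    (1.68, 7.63), (9.35, 21.1), (10.03, 44.4), (1.7, 6.58),
    (2.86, 5.93), (9.48, 31.66), (4.49, 13.03),
]

# Mean of per-trip ratios: what AVG(driver_pay / trip_miles) computes
mean_of_ratios = sum(pay / miles for miles, pay in uber_rides) / len(uber_rides)

# Ratio of totals: total pay divided by total miles driven
total_pay = sum(pay for _, pay in uber_rides)
total_miles = sum(miles for miles, _ in uber_rides)
ratio_of_totals = total_pay / total_miles

print(round(mean_of_ratios, 2), round(ratio_of_totals, 2))
```

On these rows the mean of ratios is slightly higher, because the short trips have above-average pay per mile.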


### Pay per Mile - With Tips 
- It's possible that the lower pay of Lyft drivers is offset by their higher tip rate
- To examine this, I now include tips when computing driver pay
- Naturally, including tips increases driver pay per mile, but it doesn't remove the advantage that Uber has over Lyft (4.59 vs. 3.89)

In [9]:
%%sql
SELECT 
platform, 
ROUND(
    AVG((driver_pay + tips) / trip_miles)::numeric
    ,2) pay_per_mile
FROM aux 
WHERE trip_miles > 0.00
GROUP BY platform
ORDER BY AVG((driver_pay + tips) / trip_miles) DESC

 * postgresql://root:***@localhost:5432/uber
4 rows affected.


platform,pay_per_mile
Juno,5.01
Uber,4.59
Lyft,3.89
Via,0.31


### Pay per Minute
- Using the `aux` view, I compute the average pay per minute for each platform
- Juno (1.00) and Uber (0.90) again pay the highest rates, with Lyft lower at 0.67

In [10]:
%%sql
SELECT 
platform, 
ROUND(
    AVG(driver_pay / trip_minutes)::numeric
    ,2) pay_per_minute
FROM aux 
WHERE trip_minutes > 0.00
GROUP BY platform
ORDER BY AVG(driver_pay / trip_minutes) DESC;

 * postgresql://root:***@localhost:5432/uber
4 rows affected.


platform,pay_per_minute
Juno,1.0
Uber,0.9
Lyft,0.67
Via,0.05
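
The definition of the `aux` view is not shown here. As a sanity check, the sketch below assumes that `trip_time` is measured in seconds and that `trip_minutes` is `trip_time / 60`, then compares `driver_pay / trip_minutes` with the `driver_pay_per_minute` column for three of the sample rows shown earlier.

```python
# (trip_time in seconds, driver_pay, stored driver_pay_per_minute)
# taken from the first three sample rows
rows = [
    (680, 7.42, 0.6547059),
    (703, 7.63, 0.6512091),
    (1316, 21.1, 0.9620061),
]

gaps = []
for trip_time, driver_pay, stored in rows:
    trip_minutes = trip_time / 60  # assumed definition of trip_minutes
    computed = driver_pay / trip_minutes
    gaps.append(abs(computed - stored))
    print(round(computed, 4), round(stored, 4))

# Largest difference between the recomputed and stored values
print(max(gaps))
```

The recomputed values agree with the stored column to several decimal places, which is consistent with the assumption about units.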


## Time of Day
- Pay per mile is lowest late at night
- In contrast, late night pays the highest *per minute*
- Tips are most likely at mid-day and least likely late at night

In [7]:
%%sql
SELECT 
pickup_hour, 
ROUND(
    AVG(driver_pay_per_mile)::numeric
    , 2) "Average Pay per Mile"
FROM main 
GROUP BY pickup_hour
ORDER BY pickup_hour ASC;

 * postgresql://root:***@localhost:5432/uber
24 rows affected.


pickup_hour,Average Pay per Mile
0,3.57
1,3.31
2,3.26
3,3.11
4,3.45
5,3.89
6,3.09
7,3.49
8,4.17
9,4.09
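
The hourly grouping depends on the `pickup_hour` column. The sketch below assumes that `pickup_hour` and `pickup_dayofweek` are derived from `pickup_datetime`, and checks that assumption against three of the sample rows shown earlier. The `DAYS` list and the `derived` variable are names I introduce for illustration.

```python
from datetime import datetime

# Weekday names indexed by datetime.weekday() (Monday is 0)
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# (pickup_datetime, stored pickup_hour, stored pickup_dayofweek)
samples = [
    ("2019-07-31 15:13:40", 15, "Wednesday"),
    ("2019-07-22 18:35:17", 18, "Monday"),
    ("2019-07-07 17:43:01", 17, "Sunday"),
]

derived = []
for text, stored_hour, stored_day in samples:
    ts = datetime.strptime(text, "%Y-%m-%d %H:%M:%S")
    derived.append((ts.hour, DAYS[ts.weekday()]))
    print(ts.hour, DAYS[ts.weekday()], "| stored:", stored_hour, stored_day)
```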


In [8]:
%%sql
SELECT 
pickup_hour, 
ROUND(
    AVG(driver_pay_per_minute)::numeric
    , 2) "Average Pay per Minute"
FROM main 
GROUP BY pickup_hour
ORDER BY pickup_hour ASC;

 * postgresql://root:***@localhost:5432/uber
24 rows affected.


pickup_hour,Average Pay per Minute
0,0.82
1,0.81
2,0.87
3,0.87
4,0.92
5,1.02
6,0.85
7,0.77
8,0.75
9,0.75


In [10]:
%%sql
SELECT 
pickup_hour, 
ROUND(
    100*AVG(has_tips)::numeric
    , 2) "Percent with Tips"
FROM main 
GROUP BY pickup_hour
ORDER BY pickup_hour ASC;

 * postgresql://root:***@localhost:5432/uber
24 rows affected.


pickup_hour,Percent with Tips
0,11.23
1,9.89
2,10.89
3,10.01
4,10.7
5,14.59
6,12.23
7,10.96
8,12.57
9,13.91
