## <font color='#2F4F4F'>1. Defining the Question</font>

### a) Specifying the Data Analysis Question

Create a descriptive analysis report that outlines the pricing patterns of the existing ride-sharing company.

### b) Defining the Metric for Success

The project will be a success when we can describe how ride prices vary by cab type, time of day, and distance.

### c) Understanding the Context 

Over the past few years, ride-sharing apps have been on the rise in many cities around the world. Unlike public transport fares, Uber and Lyft ride prices are not constant. They are strongly affected by the supply of and demand for rides at a given time.

As a Data Scientist working to understand this market, you have been tasked with producing a descriptive analysis report that helps a ride-sharing startup entering this space understand the pricing patterns of the existing ride-sharing company.

### d) Recording the Experimental Design

1. Load pandas and sqlite3.
2. Load datasets.
3. Perform data cleaning.
4. Transfer datasets to SQL tables.
5. Carry out analysis using SQL.
6. Merge datasets and carry out further analysis.
7. Summarize findings.
8. Provide recommendations.
9. Challenge the solution.

### e) Data Relevance

The dataset provided is relevant to answering the research question.

## <font color='#2F4F4F'>2. Data Cleaning & Preparation</font>

In [36]:
# import Pandas and sqlite3
import pandas as pd
import sqlite3

In [37]:
# load SQL extension
%load_ext sql
%sql sqlite:///testdb.sqlite

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


'Connected: @testdb.sqlite'

In [10]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [25]:
# load and preview cab dataset
cab = pd.read_csv('/content/drive/My Drive/Colab Notebooks/AfterWork Data Science Fellowship/Week 3/Projects/Data Analysis and Reporting with SQL Project /cabs_dataset.csv', sep = ',')
cab.head()

cab.drop(['Unnamed: 0'], axis=1, inplace=True)

cab.head()

Unnamed: 0,distance,cab_type,time_stamp,destination,source,price,surge_multiplier,id,product_id,name
0,0.44,Lyft,2018-12-16 09:30:07.890,North Station,Haymarket Square,5.0,1.0,424553bb-7174-41ea-aeb4-fe06d4f4b9d7,lyft_line,Shared
1,0.44,Lyft,2018-11-27 02:00:23.677,North Station,Haymarket Square,11.0,1.0,4bd23055-6827-41c6-b23b-3c491f24e74d,lyft_premier,Lux
2,0.44,Lyft,2018-11-28 01:00:22.198,North Station,Haymarket Square,7.0,1.0,981a3613-77af-4620-a42a-0c0866077d1e,lyft,Lyft
3,0.44,Lyft,2018-11-30 04:53:02.749,North Station,Haymarket Square,26.0,1.0,c2d88af2-d278-4bfd-a8d0-29ca77cc5512,lyft_luxsuv,Lux Black XL
4,0.44,Lyft,2018-11-29 03:49:20.223,North Station,Haymarket Square,9.0,1.0,e0126e1f-8ca9-4f2e-82b3-50505a09db9a,lyft_plus,Lyft XL


In [29]:
# set the 'time_stamp' column into datetime datatype
cab['time_stamp'] = pd.to_datetime(cab['time_stamp'])
cab.head()

Unnamed: 0,distance,cab_type,time_stamp,destination,source,price,surge_multiplier,id,product_id,name
0,0.44,Lyft,2018-12-16 09:30:07.890,North Station,Haymarket Square,5.0,1.0,424553bb-7174-41ea-aeb4-fe06d4f4b9d7,lyft_line,Shared
1,0.44,Lyft,2018-11-27 02:00:23.677,North Station,Haymarket Square,11.0,1.0,4bd23055-6827-41c6-b23b-3c491f24e74d,lyft_premier,Lux
2,0.44,Lyft,2018-11-28 01:00:22.198,North Station,Haymarket Square,7.0,1.0,981a3613-77af-4620-a42a-0c0866077d1e,lyft,Lyft
3,0.44,Lyft,2018-11-30 04:53:02.749,North Station,Haymarket Square,26.0,1.0,c2d88af2-d278-4bfd-a8d0-29ca77cc5512,lyft_luxsuv,Lux Black XL
4,0.44,Lyft,2018-11-29 03:49:20.223,North Station,Haymarket Square,9.0,1.0,e0126e1f-8ca9-4f2e-82b3-50505a09db9a,lyft_plus,Lyft XL


In [30]:
# retrieve month, day, hour, and minute from time_stamp

cab['month'] = cab['time_stamp'].map(lambda x : x.month)
cab['day'] = cab['time_stamp'].map(lambda x : x.day)
cab['hour'] = cab['time_stamp'].map(lambda x : x.hour)
cab['minute'] = cab['time_stamp'].map(lambda x : x.minute)
cab.head()

Unnamed: 0,distance,cab_type,time_stamp,destination,source,price,surge_multiplier,id,product_id,name,month,day,hour,minute
0,0.44,Lyft,2018-12-16 09:30:07.890,North Station,Haymarket Square,5.0,1.0,424553bb-7174-41ea-aeb4-fe06d4f4b9d7,lyft_line,Shared,12,16,9,30
1,0.44,Lyft,2018-11-27 02:00:23.677,North Station,Haymarket Square,11.0,1.0,4bd23055-6827-41c6-b23b-3c491f24e74d,lyft_premier,Lux,11,27,2,0
2,0.44,Lyft,2018-11-28 01:00:22.198,North Station,Haymarket Square,7.0,1.0,981a3613-77af-4620-a42a-0c0866077d1e,lyft,Lyft,11,28,1,0
3,0.44,Lyft,2018-11-30 04:53:02.749,North Station,Haymarket Square,26.0,1.0,c2d88af2-d278-4bfd-a8d0-29ca77cc5512,lyft_luxsuv,Lux Black XL,11,30,4,53
4,0.44,Lyft,2018-11-29 03:49:20.223,North Station,Haymarket Square,9.0,1.0,e0126e1f-8ca9-4f2e-82b3-50505a09db9a,lyft_plus,Lyft XL,11,29,3,49
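
The same date parts can also be pulled out with the vectorized `.dt` accessor instead of `map` with a lambda. The snippet below is a minimal sketch on a two-row toy frame (`demo` is a hypothetical name, and the timestamps are copied from the preview above). One caveat: on recent pandas versions the resulting columns may be `int32` rather than `int64`.

```python
import pandas as pd

# Toy frame standing in for the cab data; 'demo' is a hypothetical name
demo = pd.DataFrame({
    "time_stamp": pd.to_datetime([
        "2018-12-16 09:30:07.890",
        "2018-11-27 02:00:23.677",
    ])
})

# Vectorized extraction of date parts through the .dt accessor
demo["month"] = demo["time_stamp"].dt.month
demo["day"] = demo["time_stamp"].dt.day
demo["hour"] = demo["time_stamp"].dt.hour
demo["minute"] = demo["time_stamp"].dt.minute

print(demo)
```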


In [32]:
# Checking for duplicates

cab.duplicated().sum()

0

In [33]:
# check for missing values
cab.isna().sum()

distance                0
cab_type                0
time_stamp              0
destination             0
source                  0
price               55095
surge_multiplier        0
id                      0
product_id              0
name                    0
month                   0
day                     0
hour                    0
minute                  0
dtype: int64

In [34]:
# drop the records with the missing values
cab = cab.dropna(axis = 0)
cab.shape

(637976, 14)
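
Before dropping that many rows, it can help to see where the missing prices are concentrated. The sketch below uses invented toy rows (the frame `demo` and the product names `product_a` and `product_b` are hypothetical, not taken from the dataset) to count missing prices per cab type and product, and then drops only the rows whose `price` is missing.

```python
import pandas as pd

# Toy data with hypothetical products; not rows from the real dataset
demo = pd.DataFrame({
    "cab_type": ["Lyft", "Uber", "Uber", "Uber"],
    "name": ["Lux", "product_a", "product_b", "product_b"],
    "price": [11.0, 8.5, None, None],
})

# Count rows with a missing price per cab type and product
missing_by_product = (
    demo[demo["price"].isna()]
    .groupby(["cab_type", "name"])
    .size()
)
print(missing_by_product)

# Drop only the rows where price itself is missing
cleaned = demo.dropna(subset=["price"])
print(cleaned.shape)
```

Passing `subset=["price"]` keeps the intent explicit: a row is removed because its price is missing, not because of a gap in some other column.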

In [35]:
cab.dtypes

distance                   float64
cab_type                    object
time_stamp          datetime64[ns]
destination                 object
source                      object
price                      float64
surge_multiplier           float64
id                          object
product_id                  object
name                        object
month                        int64
day                          int64
hour                         int64
minute                       int64
dtype: object

In [38]:
%%sql
DROP TABLE IF EXISTS cab;

 * sqlite:///testdb.sqlite
Done.


[]

In [39]:
%sql PERSIST cab;

 * sqlite:///testdb.sqlite


'Persisted cab'
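
Outside a notebook with the `sql` extension, the same transfer can be sketched with `DataFrame.to_sql` and the standard `sqlite3` module. This is an alternative to `%sql PERSIST`, not what the notebook ran. The table name `cab_demo` and the in-memory database are assumptions made for illustration.

```python
import sqlite3

import pandas as pd

# Toy frame; 'cab_demo' is a hypothetical table name
demo = pd.DataFrame({
    "cab_type": ["Lyft", "Uber"],
    "price": [5.0, 8.5],
})

# In-memory database so the sketch leaves no file behind
conn = sqlite3.connect(":memory:")

# if_exists="replace" covers the DROP TABLE IF EXISTS step as well
demo.to_sql("cab_demo", conn, if_exists="replace", index=False)

row_count = conn.execute("SELECT COUNT(*) FROM cab_demo").fetchone()[0]
print(row_count)
conn.close()
```

With `index=False` the DataFrame index is not written, whereas the persisted table previewed in this notebook carries an extra `index` column.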

In [40]:
%%sql
SELECT * FROM cab LIMIT 3;

 * sqlite:///testdb.sqlite
Done.


index,distance,cab_type,time_stamp,destination,source,price,surge_multiplier,id,product_id,name,month,day,hour,minute
0,0.44,Lyft,2018-12-16 09:30:07.890000,North Station,Haymarket Square,5.0,1.0,424553bb-7174-41ea-aeb4-fe06d4f4b9d7,lyft_line,Shared,12,16,9,30
1,0.44,Lyft,2018-11-27 02:00:23.677000,North Station,Haymarket Square,11.0,1.0,4bd23055-6827-41c6-b23b-3c491f24e74d,lyft_premier,Lux,11,27,2,0
2,0.44,Lyft,2018-11-28 01:00:22.198000,North Station,Haymarket Square,7.0,1.0,981a3613-77af-4620-a42a-0c0866077d1e,lyft,Lyft,11,28,1,0


In [42]:
# load and preview weather dataset
weather = pd.read_csv('/content/drive/My Drive/Colab Notebooks/AfterWork Data Science Fellowship/Week 3/Projects/Data Analysis and Reporting with SQL Project /weather.csv')
weather.head()

weather.drop(['Unnamed: 0'], axis = 1, inplace=True)

weather.head()

Unnamed: 0,temp,location,clouds,pressure,rain,time_stamp,humidity,wind
0,42.42,Back Bay,1.0,1012.14,0.1228,2018-12-16 23:45:01,0.77,11.25
1,42.43,Beacon Hill,1.0,1012.15,0.1846,2018-12-16 23:45:01,0.76,11.32
2,42.5,Boston University,1.0,1012.15,0.1089,2018-12-16 23:45:01,0.76,11.07
3,42.11,Fenway,1.0,1012.13,0.0969,2018-12-16 23:45:01,0.77,11.09
4,43.13,Financial District,1.0,1012.14,0.1786,2018-12-16 23:45:01,0.75,11.49


In [45]:
# set the 'time_stamp' column into datetime datatype
weather['time_stamp'] = pd.to_datetime(weather['time_stamp'])
weather.head()

Unnamed: 0,temp,location,clouds,pressure,rain,time_stamp,humidity,wind
0,42.42,Back Bay,1.0,1012.14,0.1228,2018-12-16 23:45:01,0.77,11.25
1,42.43,Beacon Hill,1.0,1012.15,0.1846,2018-12-16 23:45:01,0.76,11.32
2,42.5,Boston University,1.0,1012.15,0.1089,2018-12-16 23:45:01,0.76,11.07
3,42.11,Fenway,1.0,1012.13,0.0969,2018-12-16 23:45:01,0.77,11.09
4,43.13,Financial District,1.0,1012.14,0.1786,2018-12-16 23:45:01,0.75,11.49


In [46]:
# retrieve month, day, hour, and minute from time_stamp
weather['month'] = weather['time_stamp'].map(lambda x : x.month)
weather['day'] = weather['time_stamp'].map(lambda x : x.day)
weather['hour'] = weather['time_stamp'].map(lambda x : x.hour)
weather['minute'] = weather['time_stamp'].map(lambda x : x.minute)
weather.head()

Unnamed: 0,temp,location,clouds,pressure,rain,time_stamp,humidity,wind,month,day,hour,minute
0,42.42,Back Bay,1.0,1012.14,0.1228,2018-12-16 23:45:01,0.77,11.25,12,16,23,45
1,42.43,Beacon Hill,1.0,1012.15,0.1846,2018-12-16 23:45:01,0.76,11.32,12,16,23,45
2,42.5,Boston University,1.0,1012.15,0.1089,2018-12-16 23:45:01,0.76,11.07,12,16,23,45
3,42.11,Fenway,1.0,1012.13,0.0969,2018-12-16 23:45:01,0.77,11.09,12,16,23,45
4,43.13,Financial District,1.0,1012.14,0.1786,2018-12-16 23:45:01,0.75,11.49,12,16,23,45


In [48]:
# Checking for duplicates

weather.duplicated().sum()

0

In [53]:
# check for missing values
weather.isna().mean() * 100

temp           0.000000
location       0.000000
clouds         0.000000
pressure       0.000000
rain          85.755258
time_stamp     0.000000
humidity       0.000000
wind           0.000000
month          0.000000
day            0.000000
hour           0.000000
minute         0.000000
dtype: float64

About 85.8% of the values in the `rain` column are missing. We will nevertheless impute these missing values with the column mean.

In [56]:
# Dealing with missing values using mean imputation

# assign the result back instead of calling fillna in place on a column selection
weather['rain'] = weather['rain'].fillna(weather['rain'].mean())

weather.isna().sum()

temp          0
location      0
clouds        0
pressure      0
rain          0
time_stamp    0
humidity      0
wind          0
month         0
day           0
hour          0
minute        0
dtype: int64
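
With this much missing data, the choice of fill value shapes the column. The sketch below, on an invented toy series, compares mean imputation with a zero fill. The zero fill rests on the assumption (not confirmed by the dataset) that a missing reading means no rain was recorded.

```python
import pandas as pd

# Toy series: 6 of 8 readings missing (values are illustrative only)
rain = pd.Series([0.2, None, None, None, 0.4, None, None, None])

share_missing = rain.isna().mean() * 100

# Mean imputation: every gap takes the mean of the observed readings
mean_filled = rain.fillna(rain.mean())

# Zero fill: ASSUMES a missing reading means no rain
zero_filled = rain.fillna(0.0)

print(share_missing)
print(mean_filled.mean(), zero_filled.mean())
```

Mean imputation leaves the column mean unchanged but assigns the same value to every gap, while the zero fill pulls the mean down sharply. Which is appropriate depends on why the readings are missing.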

In [57]:
# check data types
weather.dtypes

temp                 float64
location              object
clouds               float64
pressure             float64
rain                 float64
time_stamp    datetime64[ns]
humidity             float64
wind                 float64
month                  int64
day                    int64
hour                   int64
minute                 int64
dtype: object

In [58]:
%%sql
DROP TABLE IF EXISTS weather;

 * sqlite:///testdb.sqlite
Done.


[]

In [59]:
%sql PERSIST weather;

 * sqlite:///testdb.sqlite


'Persisted weather'

In [60]:
%%sql
SELECT * FROM weather LIMIT 3;

 * sqlite:///testdb.sqlite
Done.


index,temp,location,clouds,pressure,rain,time_stamp,humidity,wind,month,day,hour,minute
0,42.42,Back Bay,1.0,1012.14,0.1228,2018-12-16 23:45:01.000000,0.77,11.25,12,16,23,45
1,42.43,Beacon Hill,1.0,1012.15,0.1846,2018-12-16 23:45:01.000000,0.76,11.32,12,16,23,45
2,42.5,Boston University,1.0,1012.15,0.1089,2018-12-16 23:45:01.000000,0.76,11.07,12,16,23,45


## <font color='#2F4F4F'>3. Data Analysis</font>

#### 3.1 Basic Information about Price

In [61]:
%%sql
-- # calculate average of price

SELECT AVG(price) FROM Cab;

 * sqlite:///testdb.sqlite
Done.


AVG(price)
16.545125490614065


In [62]:
%%sql
-- # calculate the minimum price

SELECT MIN(price) FROM Cab;

 * sqlite:///testdb.sqlite
Done.


MIN(price)
2.5


In [63]:
%%sql
-- # calculate the maximum price

SELECT MAX(price) FROM Cab;

 * sqlite:///testdb.sqlite
Done.


MAX(price)
97.5


#### 3.2 Price by Cab Type

In [64]:
%%sql
-- # select average price by cab type

SELECT cab_type, AVG(price) FROM Cab
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,AVG(price)
Lyft,17.351396125019512
Uber,15.795343166912708


In [65]:
%%sql
-- # select minimum price by cab type

SELECT cab_type, MIN(price) FROM Cab
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MIN(price)
Lyft,2.5
Uber,4.5


In [66]:
%%sql
-- # select maximum price by cab type

SELECT cab_type, MAX(price) FROM Cab
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MAX(price)
Lyft,97.5
Uber,89.5
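
The three queries per grouping can be collapsed into one. The sketch below runs a single `GROUP BY` query that returns the average, minimum, and maximum together, using an in-memory SQLite table with invented prices (the table contents are assumptions for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cab (cab_type TEXT, price REAL)")
conn.executemany(
    "INSERT INTO cab VALUES (?, ?)",
    [("Lyft", 5.0), ("Lyft", 11.0), ("Uber", 8.0), ("Uber", 10.0)],
)

# One pass over the table yields all three statistics per cab type
summary = conn.execute(
    """
    SELECT cab_type,
           ROUND(AVG(price), 2) AS avg_price,
           MIN(price) AS min_price,
           MAX(price) AS max_price
    FROM cab
    GROUP BY cab_type
    ORDER BY cab_type
    """
).fetchall()
print(summary)
conn.close()
```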


#### 3.3 Price by Time Period 

###### 3.3.1 From 12 AM to 6 AM

In [68]:
%%sql
-- # select average price from 12 AM to 6 AM

SELECT cab_type, AVG(price) FROM Cab
WHERE hour >= 0 AND hour <= 6
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,AVG(price)
Lyft,17.34879860873532
Uber,15.804734071787934


In [67]:
%%sql
-- # select minimum price from 12 AM to 6 AM

SELECT cab_type, MIN(price) FROM Cab
WHERE hour >= 0 AND hour <= 6
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MIN(price)
Lyft,2.5
Uber,4.5


In [69]:
%%sql
-- # select maximum price from 12 AM to 6 AM

SELECT cab_type, MAX(price) FROM Cab
WHERE hour >= 0 AND hour <= 6
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MAX(price)
Lyft,97.5
Uber,81.5


###### 3.3.2 From 7 AM to 12 PM

In [70]:
%%sql
-- # select average price from 7 AM to 12 PM

SELECT cab_type, AVG(price) FROM Cab
WHERE hour >= 7 AND hour <= 12
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,AVG(price)
Lyft,17.352357270472787
Uber,15.75489055791336


In [72]:
%%sql
-- # select minimum price from 7 AM to 12 PM

SELECT cab_type, MIN(price) FROM Cab
WHERE hour >= 7 AND hour <= 12
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MIN(price)
Lyft,2.5
Uber,4.5


In [71]:
%%sql
-- # select maximum price from 7 AM to 12 PM

SELECT cab_type, MAX(price) FROM Cab
WHERE hour >= 7 AND hour <= 12
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MAX(price)
Lyft,89.0
Uber,80.5


###### 3.3.3 From 1 PM to 6 PM

In [73]:
%%sql
-- # select average price from 1 PM to 6 PM

SELECT cab_type, AVG(price) FROM Cab
WHERE hour >= 13 AND hour <= 18
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,AVG(price)
Lyft,17.33826120852875
Uber,15.806273486789935


In [74]:
%%sql
-- # select minimum price from 1 PM to 6 PM

SELECT cab_type, MIN(price) FROM Cab
WHERE hour >= 13 AND hour <= 18
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MIN(price)
Lyft,2.5
Uber,4.5


In [75]:
%%sql
-- # select maximum price from 1 PM to 6 PM

SELECT cab_type, MAX(price) FROM Cab
WHERE hour >= 13 AND hour <= 18
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MAX(price)
Lyft,92.0
Uber,89.5


###### 3.3.4 From 7 PM to 11 PM

In [76]:
%%sql
-- # select average price from 7 PM to 11 PM

SELECT cab_type, AVG(price) FROM Cab
WHERE hour >= 19 AND hour <= 23
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,AVG(price)
Lyft,17.370441109737584
Uber,15.816331473869642


In [77]:
%%sql
-- # select minimum price from 7 PM to 11 PM

SELECT cab_type, MIN(price) FROM Cab
WHERE hour >= 19 AND hour <= 23
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MIN(price)
Lyft,2.5
Uber,4.5


In [78]:
%%sql
-- # select maximum price from 7 PM to 11 PM

SELECT cab_type, MAX(price) FROM Cab
WHERE hour >= 19 AND hour <= 23
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MAX(price)
Lyft,92.0
Uber,76.0
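
The four time windows above can also be produced by one query that labels each row with a `CASE` expression and groups on the label. This is a minimal sketch against an in-memory table with invented rows. The period labels (`'00-06'` and so on) are hypothetical names that mirror the hour ranges used in the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cab (cab_type TEXT, hour INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO cab VALUES (?, ?, ?)",
    [
        ("Lyft", 2, 10.0),
        ("Lyft", 9, 12.0),
        ("Lyft", 15, 14.0),
        ("Lyft", 22, 16.0),
        ("Uber", 2, 8.0),
        ("Uber", 23, 9.0),
    ],
)

# Label each ride with its time window, then aggregate per window
query = """
SELECT cab_type,
       CASE
           WHEN hour BETWEEN 0 AND 6 THEN '00-06'
           WHEN hour BETWEEN 7 AND 12 THEN '07-12'
           WHEN hour BETWEEN 13 AND 18 THEN '13-18'
           ELSE '19-23'
       END AS period,
       AVG(price) AS avg_price,
       MIN(price) AS min_price,
       MAX(price) AS max_price
FROM cab
GROUP BY cab_type, period
ORDER BY cab_type, period
"""
period_summary = conn.execute(query).fetchall()
for row in period_summary:
    print(row)
conn.close()
```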


#### 3.4 Price by Distance

In [79]:
%%sql
-- # preview minimum distance

SELECT MIN(distance) FROM Cab;

 * sqlite:///testdb.sqlite
Done.


MIN(distance)
0.02


In [80]:
%%sql
-- # preview maximum distance

SELECT MAX(distance) FROM Cab;

 * sqlite:///testdb.sqlite
Done.


MAX(distance)
7.86


###### 3.4.1 Distance 0.00 to 1.96

In [81]:
%%sql
-- # select average price from distance 0.00 to 1.96

SELECT cab_type, AVG(price) FROM Cab
WHERE distance >= 0.00 AND distance <= 1.96
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,AVG(price)
Lyft,14.077923988323509
Uber,13.372183977950597


In [82]:
%%sql
-- # select minimum price from distance 0.00 to 1.96

SELECT cab_type, MIN(price) FROM Cab
WHERE distance >= 0.00 AND distance <= 1.96
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MIN(price)
Lyft,2.5
Uber,4.5


In [83]:
%%sql
-- # select maximum price from distance 0.00 to 1.96

SELECT cab_type, MAX(price) FROM Cab
WHERE distance >= 0.00 AND distance <= 1.96
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MAX(price)
Lyft,65.0
Uber,61.5


###### 3.4.2 Distance 1.97 to 3.93

In [84]:
%%sql
-- # select average price from distance 1.97 to 3.93

SELECT cab_type, AVG(price) FROM Cab
WHERE distance >= 1.97 AND distance <= 3.93
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,AVG(price)
Lyft,19.247476944432997
Uber,17.141743173645597


In [85]:
%%sql
-- # select minimum price from distance 1.97 to 3.93

SELECT cab_type, MIN(price) FROM Cab
WHERE distance >= 1.97 AND distance <= 3.93
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MIN(price)
Lyft,2.5
Uber,6.0


In [86]:
%%sql
-- # select maximum price from distance 1.97 to 3.93

SELECT cab_type, MAX(price) FROM Cab
WHERE distance >= 1.97 AND distance <= 3.93
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MAX(price)
Lyft,92.0
Uber,80.5


###### 3.4.3 Distance 3.94 to 5.89

In [87]:
%%sql
-- # select average price from distance 3.94 to 5.89

SELECT cab_type, AVG(price) FROM Cab
WHERE distance >= 3.94 AND distance <= 5.89
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,AVG(price)
Lyft,25.54245415758896
Uber,22.335192601067888


In [89]:
%%sql
-- # select minimum price from distance 3.94 to 5.89

SELECT cab_type, MIN(price) FROM Cab
WHERE distance >= 3.94 AND distance <= 5.89
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MIN(price)
Lyft,3.0
Uber,7.5


In [90]:
%%sql
-- # select maximum price from distance 3.94 to 5.89

SELECT cab_type, MAX(price) FROM Cab
WHERE distance >= 3.94 AND distance <= 5.89
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MAX(price)
Lyft,97.5
Uber,87.0


###### 3.4.4 Distance 5.90 to 7.86

In [91]:
%%sql
-- # select average price from distance 5.90 to 7.86

SELECT cab_type, AVG(price) FROM Cab
WHERE distance >= 5.90 AND distance <= 7.86
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,AVG(price)
Lyft,30.1
Uber,26.106672932330827


In [92]:
%%sql
-- # select minimum price from distance 5.90 to 7.86

SELECT cab_type, MIN(price) FROM Cab
WHERE distance >= 5.90 AND distance <= 7.86
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MIN(price)
Lyft,10.5
Uber,10.0


In [93]:
%%sql
-- # select maximum price from distance 5.90 to 7.86

SELECT cab_type, MAX(price) FROM Cab
WHERE distance >= 5.90 AND distance <= 7.86
GROUP BY cab_type;

 * sqlite:///testdb.sqlite
Done.


cab_type,MAX(price)
Lyft,65.0
Uber,89.5
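
On the pandas side, the same distance bands can be sketched with `pd.cut` followed by a grouped aggregation. The toy rows and the column name `distance_band` are assumptions for illustration. The bin edges reproduce the four ranges used above, relying on distances being recorded to two decimals.

```python
import pandas as pd

# Toy rides; values are illustrative only
demo = pd.DataFrame({
    "cab_type": ["Lyft", "Lyft", "Uber", "Uber", "Uber"],
    "distance": [0.44, 2.50, 1.96, 4.10, 7.86],
    "price": [5.0, 12.0, 7.0, 15.0, 30.0],
})

# Right-inclusive bins: [0.00, 1.96], (1.96, 3.93], (3.93, 5.89], (5.89, 7.86]
edges = [0.00, 1.96, 3.93, 5.89, 7.86]
labels = ["0.00-1.96", "1.97-3.93", "3.94-5.89", "5.90-7.86"]
demo["distance_band"] = pd.cut(
    demo["distance"], bins=edges, labels=labels, include_lowest=True
)

# Average, minimum, and maximum price per cab type and distance band
band_summary = (
    demo.groupby(["cab_type", "distance_band"], observed=True)["price"]
    .agg(["mean", "min", "max"])
)
print(band_summary)
```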


In [94]:
%%sql
-- # merge both Cab and Weather tables 

DROP TABLE IF EXISTS Cab_Weather;

CREATE TABLE Cab_Weather AS
SELECT * FROM Cab
INNER JOIN Weather 
ON Cab.minute = Weather.minute
AND Cab.hour = Weather.hour
AND Cab.day = Weather.day
AND Cab.month = Weather.month
AND Cab.source = Weather.location;

SELECT COUNT(*) FROM Cab_Weather;

 * sqlite:///testdb.sqlite
Done.
Done.
Done.


COUNT(*)
58796
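
The join keeps only cab rows that have a weather reading for the same location, month, day, hour, and minute, which is why the merged table is much smaller than the cab table. A pandas sketch of the same match on toy data is shown below. The frames `cab_demo` and `weather_demo` are hypothetical, and a left join with `indicator=True` is used here (instead of the inner join above) to count how many cab rows find a match.

```python
import pandas as pd

# Toy frames; the third cab row has no weather reading at that minute
cab_demo = pd.DataFrame({
    "source": ["Back Bay", "Fenway", "Fenway"],
    "month": [12, 12, 11],
    "day": [16, 16, 27],
    "hour": [23, 23, 2],
    "minute": [45, 45, 0],
    "price": [10.0, 12.0, 9.0],
})
weather_demo = pd.DataFrame({
    "location": ["Back Bay", "Fenway"],
    "month": [12, 12],
    "day": [16, 16],
    "hour": [23, 23],
    "minute": [45, 45],
    "temp": [42.42, 42.11],
})

# Left join on the same keys as the SQL join, flagging unmatched rows
merged = cab_demo.merge(
    weather_demo,
    left_on=["source", "month", "day", "hour", "minute"],
    right_on=["location", "month", "day", "hour", "minute"],
    how="left",
    indicator=True,
)
matched = int((merged["_merge"] == "both").sum())
unmatched = int((merged["_merge"] == "left_only").sum())
print(matched, unmatched)
```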


## <font color='#2F4F4F'>4. Summary of Findings</font>

Across both cab types, the overall average price is 16.55, the minimum is 2.5, and the maximum is 97.5.

The following are the prices for Lyft:
- overall average 17.35, overall minimum 2.5, overall maximum 97.5
- average 17.35, minimum 2.5, maximum 97.5 from midnight to 6 AM
- average 17.35, minimum 2.5, maximum 89.0 from 7 AM to 12 PM
- average 17.34, minimum 2.5, maximum 92.0 from 1 PM to 6 PM
- average 17.37, minimum 2.5, maximum 92.0 from 7 PM to 11 PM

The following are the prices for Uber:
- overall average 15.80, overall minimum 4.5, overall maximum 89.5
- average 15.80, minimum 4.5, maximum 81.5 from midnight to 6 AM
- average 15.75, minimum 4.5, maximum 80.5 from 7 AM to 12 PM
- average 15.81, minimum 4.5, maximum 89.5 from 1 PM to 6 PM
- average 15.82, minimum 4.5, maximum 76.0 from 7 PM to 11 PM

The minimum distance is 0.02 and the maximum distance is 7.86.

**For distance 0.00-1.96:**
- Lyft has average price 14.08, minimum price 2.5, and maximum price 65.0.
- Uber has average price 13.37, minimum price 4.5, and maximum price 61.5.

**For distance 1.97-3.93:**
- Lyft has average price 19.25, minimum price 2.5, and maximum price 92.0.
- Uber has average price 17.14, minimum price 6.0, and maximum price 80.5.

**For distance 3.94-5.89:**
- Lyft has average price 25.54, minimum price 3.0, and maximum price 97.5.
- Uber has average price 22.34, minimum price 7.5, and maximum price 87.0.

**For distance 5.90-7.86:**
- Lyft has average price 30.10, minimum price 10.5, and maximum price 65.0.
- Uber has average price 26.11, minimum price 10.0, and maximum price 89.5.

## <font color='#2F4F4F'>5. Recommendations</font>

The ride-sharing startup's management team can use these price ranges by cab type, time of day, and distance as a starting point for its own pricing, and then adjust them based on user feedback.

## <font color='#2F4F4F'>6. Challenging your Solution</font>

### a) Did we have the right data?
Yes.

### b) Did we have the right question?
Yes.

### c) Is further analysis required?
Yes. Further analysis could examine how a combination of time and distance affects prices (for example, the average price for traveling to Financial District at 7 AM compared with 2 PM) and how weather affects price.
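
As a starting point for that follow-up, the comparison for one destination at two hours could be sketched as follows. The rows are invented for illustration, and the query is an assumption about how such a comparison might be phrased against the `cab` table, not a result from this analysis.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cab (destination TEXT, hour INTEGER, price REAL)")
conn.executemany(
    "INSERT INTO cab VALUES (?, ?, ?)",
    [
        ("Financial District", 7, 12.0),
        ("Financial District", 7, 14.0),
        ("Financial District", 14, 10.0),
        ("North Station", 7, 9.0),
    ],
)

# Average price and ride count for one destination at 7 AM and 2 PM
rows = conn.execute(
    """
    SELECT hour, AVG(price) AS avg_price, COUNT(*) AS rides
    FROM cab
    WHERE destination = ? AND hour IN (7, 14)
    GROUP BY hour
    ORDER BY hour
    """,
    ("Financial District",),
).fetchall()
print(rows)
conn.close()
```

Reporting the ride count next to each average makes it easier to judge whether a difference between the two hours rests on enough rides to be meaningful.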