*********************************************************************************************************************
1. Import your data into a Pandas DataFrame.
2. Merge your DataFrames.
3. Create a bubble chart that showcases the average fare versus the total number of rides with bubble size based on the
total number of drivers for each city type, including urban, suburban, and rural.
4. Determine the mean, median, and mode for the following:
    
    A) The total number of rides for each city type.
    B) The average fares for each city type.
    C) The total number of drivers for each city type.
5. Create box-and-whisker plots that visualize each of the following to determine if there are any outliers:
    
    A) The number of rides for each city type.
    B) The fares for each city type.
    C) The number of drivers for each city type.
6. Create a pie chart that visualizes each of the following data for each city type:
    
    A) The percent of total fares.
    B) The percent of total rides.
    C) The percent of total drivers.

%matplotlib inline
# Import dependencies.
import matplotlib.pyplot as plt
import numpy as np
import statistics
import pandas as pd

In [3]:
# Load in city csv
city_df = pd.read_csv("Resources/city_data.csv")
city_df.head()

Unnamed: 0,city,driver_count,type
0,Richardfort,38,Urban
1,Williamsstad,59,Urban
2,Port Angela,67,Urban
3,Rodneyfort,34,Urban
4,West Robert,39,Urban


In [6]:
# Let's review the table.

# Count the non-null rows in each column.
city_df.count()

city            120
driver_count    120
type            120
dtype: int64

*****************************************************************************************************************************
Let's Explore the Tables
*****************************************************************************************************************************

In [12]:
# Count the null values in each column.
city_df.isnull().sum()

city            0
driver_count    0
type            0
dtype: int64

In [13]:
# Get the data types of each column.
city_df.dtypes

city            object
driver_count     int64
type            object
dtype: object

In [14]:
# Load in ride csv
ride_df = pd.read_csv("Resources/ride_data.csv")
ride_df.head()

Unnamed: 0,city,date,fare,ride_id
0,Lake Jonathanshire,1/14/2019 10:14,13.83,5739410000000.0
1,South Michelleport,3/4/2019 18:24,30.24,2343910000000.0
2,Port Samanthamouth,2/24/2019 4:29,33.44,2005070000000.0
3,Rodneyfort,2/10/2019 23:22,23.44,5149250000000.0
4,South Jack,3/6/2019 4:28,34.58,3908450000000.0


In [15]:
# Get the unique values of the type of city.
city_df["type"].unique()

array(['Urban', 'Suburban', 'Rural'], dtype=object)

In [16]:
# Get the number of data points from the Urban cities.
sum(city_df["type"]=="Urban")

66

In [18]:
# Get the number of data points from the Suburban cities.
sum(city_df["type"]=="Suburban")

36

In [17]:
# Get the number of data points from the Rural cities.
sum(city_df["type"]=="Rural")

18
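
The three per-type counts above can also be produced in a single call with value_counts(). A minimal sketch, assuming a small made-up stand-in for city_df (sample_city_df and its city names and driver counts are illustrative only and do not come from city_data.csv):

```python
import pandas as pd

# Hypothetical stand-in for city_df; the values are illustrative only.
sample_city_df = pd.DataFrame({
    "city": ["A-town", "B-ville", "C-burg", "D-field"],
    "driver_count": [10, 20, 5, 2],
    "type": ["Urban", "Urban", "Suburban", "Rural"],
})

# Count the rows for every city type at once instead of one sum() per type.
type_counts = sample_city_df["type"].value_counts()
print(type_counts)
```

Run against the real city_df, the same call should report the Urban, Suburban, and Rural counts shown in the three cells above.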

In [19]:
# Count the non-null rows in each column.
ride_df.count()

city       2375
date       2375
fare       2375
ride_id    2375
dtype: int64

In [20]:
# Count the null values in each column.
ride_df.isnull().sum()

city       0
date       0
fare       0
ride_id    0
dtype: int64

In [21]:
# Get the data types of each column.
ride_df.dtypes

city        object
date        object
fare       float64
ride_id    float64
dtype: object

**************************************************************************************************************************
Both Tables Are Clean and Ready to Merge

1) The columns in city_df are:

    A) city
    B) driver_count
    C) type

2) The columns in ride_df are:

    A) city
    B) date
    C) fare
    D) ride_id
    
When we merge two DataFrames, we merge on a column that holds the same data, under the same column name,
in both DataFrames. We use the following syntax to do that:

new_df = pd.merge(leftdf, rightdf, on="shared_column")

If the key column is named differently in each DataFrame, use the left_on= and right_on= parameters instead of on=.

The how= parameter controls which rows are kept: left, right, inner, or outer.
The default is inner.
****************************************************************************************************************************
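
To make the effect of how= concrete, here is a minimal sketch on two miniature tables. The names rides and cities and all of their values are made up for illustration; they are not the PyBer data.

```python
import pandas as pd

# Hypothetical miniature tables; the values are illustrative only.
rides = pd.DataFrame({
    "city": ["A-town", "A-town", "Z-port"],
    "fare": [10.0, 12.5, 30.0],
})
cities = pd.DataFrame({
    "city": ["A-town", "B-ville"],
    "type": ["Urban", "Rural"],
})

# inner (the default): keep only cities present in both tables.
inner_df = pd.merge(rides, cities, on="city")

# left: keep every ride; rides with no matching city get NaN for "type".
left_df = pd.merge(rides, cities, how="left", on="city")

# outer: keep every row from both tables.
outer_df = pd.merge(rides, cities, how="outer", on="city")

print(len(inner_df), len(left_df), len(outer_df))  # 2 3 4
```

A left merge with the ride table on the left keeps one row per ride, which is why it suits a ride-level dataset like the one built in the next cell.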

In [25]:
# Combine the data into a single dataset
pyber_data_df = pd.merge(ride_df, city_df, how="left", on="city")

# Display the DataFrame
pyber_data_df.head()

Unnamed: 0,city,date,fare,ride_id,driver_count,type
0,Lake Jonathanshire,1/14/2019 10:14,13.83,5739410000000.0,5,Urban
1,South Michelleport,3/4/2019 18:24,30.24,2343910000000.0,72,Urban
2,Port Samanthamouth,2/24/2019 4:29,33.44,2005070000000.0,57,Urban
3,Rodneyfort,2/10/2019 23:22,23.44,5149250000000.0,34,Urban
4,South Jack,3/6/2019 4:28,34.58,3908450000000.0,46,Urban


**************************************************************************************************************************
In the pyber_data_df DataFrame, the four columns from ride_df (city, date, fare, ride_id) come first after the index.
The driver_count and type columns from city_df are added at the end, as shown in the following image

To see the image: https://courses.bootcampspot.com/courses/676/pages/5-dot-2-4-explore-the-data-in-pandas?module_item_id=189873
**************************************************************************************************************************
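
The column order and the completeness of a merge like this can also be checked in code. The sketch below is one possible check, not part of the original notebook: it uses made-up miniature tables (rides, cities) together with the validate= and indicator= parameters of pd.merge.

```python
import pandas as pd

# Hypothetical miniature tables; the values are illustrative only.
rides = pd.DataFrame({
    "city": ["A-town", "B-ville", "A-town"],
    "fare": [10.0, 20.0, 12.5],
    "ride_id": [1, 2, 3],
})
cities = pd.DataFrame({
    "city": ["A-town", "B-ville"],
    "driver_count": [5, 2],
    "type": ["Urban", "Rural"],
})

# validate="many_to_one" raises an error if a city appears twice in the
# right table; indicator=True adds a "_merge" column showing the row source.
merged = pd.merge(rides, cities, how="left", on="city",
                  validate="many_to_one", indicator=True)

# Left-table columns come first, followed by the right table's non-key columns.
column_order = list(merged.columns)
print(column_order)

# Rides with no matching city would be flagged as "left_only".
unmatched = int((merged["_merge"] == "left_only").sum())
print(unmatched)
```

If unmatched is greater than zero on the real data, some rides refer to cities that are missing from the city table and will carry NaN in driver_count and type.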