## Load and Read the CSV Files

In [1]:
# Add Matplotlib inline magic command
%matplotlib inline
# Dependencies and Setup
import matplotlib.pyplot as plt
import pandas as pd

In [2]:
# Files to load
city_data_to_load = "Resources/city_data.csv"
ride_data_to_load = "Resources/ride_data.csv"

In [3]:
# Read the city data file and store it in a pandas DataFrame.
city_data_df = pd.read_csv(city_data_to_load)
city_data_df.head(10)

| | city | driver_count | type |
|---|---|---|---|
| 0 | Richardfort | 38 | Urban |
| 1 | Williamsstad | 59 | Urban |
| 2 | Port Angela | 67 | Urban |
| 3 | Rodneyfort | 34 | Urban |
| 4 | West Robert | 39 | Urban |
| 5 | West Anthony | 70 | Urban |
| 6 | West Angela | 48 | Urban |
| 7 | Martinezhaven | 25 | Urban |
| 8 | Karenberg | 22 | Urban |
| 9 | Barajasview | 26 | Urban |


In [4]:
# Read the ride data file and store it in a pandas DataFrame.
ride_data_df = pd.read_csv(ride_data_to_load)
ride_data_df.head(10)

| | city | date | fare | ride_id |
|---|---|---|---|---|
| 0 | Lake Jonathanshire | 1/14/2019 10:14 | 13.83 | 5739410000000.0 |
| 1 | South Michelleport | 3/4/2019 18:24 | 30.24 | 2343910000000.0 |
| 2 | Port Samanthamouth | 2/24/2019 4:29 | 33.44 | 2005070000000.0 |
| 3 | Rodneyfort | 2/10/2019 23:22 | 23.44 | 5149250000000.0 |
| 4 | South Jack | 3/6/2019 4:28 | 34.58 | 3908450000000.0 |
| 5 | South Latoya | 3/11/2019 12:26 | 9.52 | 1995000000000.0 |
| 6 | New Paulville | 2/27/2019 11:17 | 43.25 | 793208000000.0 |
| 7 | Simpsonburgh | 4/26/2019 0:43 | 35.98 | 111954000000.0 |
| 8 | South Karenland | 1/8/2019 3:28 | 35.09 | 7995620000000.0 |
| 9 | North Jasmine | 3/9/2019 6:26 | 42.81 | 5327640000000.0 |


## Explore the Data in Pandas

For the city_data_df DataFrame, we need to:

1. Get all the rows that contain null values.
2. Make sure the driver_count column has an integer data type.
3. Find out how many data points there are for each type of city.

### 1. Get all the rows that contain null values
To get the name of each column and the number of non-null values it holds, we can use the df.count() method.

Another option is to chain df.isnull().sum(), which returns the number of null values in each column.

In [5]:
# Get the columns and the rows that are not null.
city_data_df.count()

city            120
driver_count    120
type            120
dtype: int64

In [6]:
# Get the columns and the number of rows that are null.
city_data_df.isnull().sum()

city            0
driver_count    0
type            0
dtype: int64
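
The two calls above report counts per column, not the rows themselves. If any nulls had turned up, one way to pull out the affected rows is to filter on isnull().any(axis=1). The sketch below uses a small made-up DataFrame (toy_df and its values are hypothetical, not taken from city_data.csv) so that there is something to find.

```python
import pandas as pd

# Hypothetical toy data with one missing driver_count, for illustration only.
toy_df = pd.DataFrame({
    "city": ["A", "B", "C"],
    "driver_count": [10, None, 7],
    "type": ["Urban", "Rural", "Urban"],
})

# Number of non-null values in each column.
non_null_counts = toy_df.count()

# Number of null values in each column.
null_counts = toy_df.isnull().sum()

# Rows that contain at least one null value in any column.
rows_with_nulls = toy_df[toy_df.isnull().any(axis=1)]
print(rows_with_nulls)
```

On the real city_data_df, the same filter should return an empty DataFrame, since every column reports zero nulls.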

### 2. Make sure the driver_count column has an integer data type.

Next, we need to check whether the driver_count column has a numerical data type so that we can perform math on it.
To get the data type of each column, we use the dtypes attribute of the DataFrame.

In [7]:
# Get the data types of each column.
city_data_df.dtypes

city            object
driver_count     int64
type            object
dtype: object
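
Here driver_count is already int64, so no conversion is needed. If the column had been read in as text instead, a conversion along the following lines could be used. This is a minimal sketch on hypothetical data (toy_df is not part of the notebook); it relies on pd.api.types.is_integer_dtype for the check and astype for the conversion.

```python
import pandas as pd

# Hypothetical toy data where driver_count was read in as text.
toy_df = pd.DataFrame({"city": ["A", "B"], "driver_count": ["38", "59"]})

# Check whether the column already has an integer dtype.
is_int_before = pd.api.types.is_integer_dtype(toy_df["driver_count"])

# Convert only when needed; astype raises ValueError if a value cannot be parsed.
if not is_int_before:
    toy_df["driver_count"] = toy_df["driver_count"].astype("int64")

is_int_after = pd.api.types.is_integer_dtype(toy_df["driver_count"])

# With an integer dtype, arithmetic on the column works as expected.
total_drivers = toy_df["driver_count"].sum()
print(total_drivers)
```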

### 3. Find out how many data points there are for each type of city.

Finally, we'll check how many data points there are for each type of city.

First, we can use the unique() method on the type column, which returns an array of all the unique values in that column.

Then, for each city type, we'll compare the type column with that value and pass the resulting Boolean Series to sum(), which counts the rows where the comparison is True.

In [8]:
# Get the unique values of the type of city.
city_data_df["type"].unique()

array(['Urban', 'Suburban', 'Rural'], dtype=object)

In [9]:
# Get the number of data points from the Urban cities.
sum(city_data_df["type"]=="Urban")

66

In [10]:
# Get the number of data points from the Suburban cities.
sum(city_data_df["type"]=="Suburban")

36

In [11]:
# Get the number of data points from the Rural cities.
sum(city_data_df["type"]=="Rural")

18
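
The three separate counts above can also be obtained in a single call with value_counts(), which is a swapped-in alternative to the sum() approach rather than the method used in this notebook. The sketch below runs on a hypothetical toy DataFrame so that it is self-contained.

```python
import pandas as pd

# Hypothetical toy data, for illustration only.
toy_df = pd.DataFrame({
    "city": ["A", "B", "C", "D", "E"],
    "type": ["Urban", "Urban", "Suburban", "Rural", "Urban"],
})

# Count the rows for every distinct city type in one call.
type_counts = toy_df["type"].value_counts()
print(type_counts)

# The same number for a single type, using the comparison-and-sum approach.
urban_count = sum(toy_df["type"] == "Urban")
```

Applied to city_data_df["type"], value_counts() should report the same three totals as the individual cells above.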

## Inspect Ride Data DataFrame

For the ride_data_df DataFrame, we need to:

1. Get all the rows that contain null values.
2. Make sure the fare and ride_id columns are numerical data types.

In [12]:
# Get the columns and the rows that are not null.
ride_data_df.count()

city       2375
date       2375
fare       2375
ride_id    2375
dtype: int64

In [13]:
# Get the columns and the number of rows that are null.
ride_data_df.isnull().sum()

city       0
date       0
fare       0
ride_id    0
dtype: int64

In [14]:
# Get the data types of each column.
ride_data_df.dtypes

city        object
date        object
fare       float64
ride_id    float64
dtype: object
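
Both checks for the ride data can be bundled into one small helper. The function below, inspect_frame, is a hypothetical name introduced here for illustration, and the toy rows are made up. It returns the null count per column together with any of the listed columns that are not numeric, using pd.api.types.is_numeric_dtype.

```python
import pandas as pd


def inspect_frame(df, numeric_columns):
    """Return null counts per column and the listed columns that are not numeric."""
    null_counts = df.isnull().sum()
    non_numeric = [
        column
        for column in numeric_columns
        if not pd.api.types.is_numeric_dtype(df[column])
    ]
    return null_counts, non_numeric


# Hypothetical toy data shaped like the ride data, for illustration only.
toy_rides_df = pd.DataFrame({
    "city": ["A", "B"],
    "date": ["1/14/2019 10:14", "3/4/2019 18:24"],
    "fare": [13.83, 30.24],
    "ride_id": [1001.0, 1002.0],
})

null_counts, non_numeric = inspect_frame(toy_rides_df, ["fare", "ride_id"])
print(null_counts)
print(non_numeric)
```

An empty list for non_numeric means every listed column can be used in calculations. Passing a text column such as date would place it in that list.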

## Merge DataFrames