# Missing Money

In this activity, you’ll identify and handle missing values in a dataset. 

Instructions:

1. Import the Pandas and `pathlib` libraries.

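2. Use `Path` with the `read_csv` function to read the CSV file into the DataFrame. Use the `index_col` parameter to set the Date column as the index, and the `parse_dates` and `infer_datetime_format` parameters to parse its values as dates. (`infer_datetime_format` is deprecated as of pandas 2.0 and can be omitted there.)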

3. Confirm that Pandas properly imported the DataFrame by using the `head` function to view the first five rows.

4. Count the missing values in each column by using the `isnull` function together with the `sum` function.

5. Drop the rows that have missing values by using the `dropna` function, and assign the result back to the DataFrame.

6. Confirm that all the missing values have been removed by running the `isnull` and `sum` functions again.

References:

[Pandas read_csv function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)

[Pandas isnull function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.isnull.html)


## Step 1: Import the Pandas and `pathlib` libraries.

In [1]:
# Import the Pandas library
import pandas as pd

# Import the Path class from the pathlib library
from pathlib import Path

# Build the path to the CSV file in the Resources folder
csvpath = Path("../Resources/money_flows.csv")


## Step 2: Use `Path` with the `read_csv` function to read the CSV file into the DataFrame. Use the `index_col` parameter to set the Date column as the index, and the `parse_dates` and `infer_datetime_format` parameters to parse its values as dates.

In [2]:
# Read in the CSV file called "money_flows.csv" using the Path object csvpath
# The CSV file is located in the Resources folder
# Note: this call uses the read_csv defaults, so "Date" stays a regular column of strings
# Pass index_col="Date" and parse_dates=True to turn it into a datetime index instead
money_flows_df = pd.read_csv(csvpath)
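
To illustrate what the `index_col` and `parse_dates` parameters named in this step do, here is a minimal sketch. It reads a small in-memory CSV rather than the real `money_flows.csv`; the sample rows, the four-digit years, and the names `csv_text` and `sample_df` are assumptions made for illustration. `infer_datetime_format` is left out because it is deprecated as of pandas 2.0.

```python
import io

import pandas as pd

# Hypothetical sample rows standing in for money_flows.csv
csv_text = (
    "Date,Total Payments\n"
    "1/1/2020,\n"
    "1/2/2020,1.04\n"
    "1/3/2020,1.65\n"
)

# index_col moves "Date" out of the columns and into the index;
# parse_dates=True asks pandas to parse that index as datetimes
sample_df = pd.read_csv(
    io.StringIO(csv_text),
    index_col="Date",
    parse_dates=True,
)

print(type(sample_df.index).__name__)
print(list(sample_df.columns))
```

With the index set this way, `Date` no longer appears among the columns, so a later `isnull().sum()` would report only `Total Payments`.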


## Step 3: Confirm that Pandas properly imported the DataFrame by using the `head` function to view the first five rows.

In [3]:
# Call the head function to view the first 5 rows of the DataFrame
money_flows_df.head()

| | Date | Total Payments |
|---|---|---|
| 0 | 1/1/20 | NaN |
| 1 | 1/2/20 | 1.04 |
| 2 | 1/3/20 | 1.65 |
| 3 | 1/3/20 | 1.65 |
| 4 | 1/3/20 | 1.65 |


## Step 4: Determine the total number of missing values by using the `isnull` function together with the `sum` function.

In [5]:
# Use the isnull function in conjunction with the sum function
# to count the missing values in each column of the DataFrame
money_flows_df.isnull().sum()


Date               0
Total Payments    10
dtype: int64
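
The output above is a per-column count. If a single grand total is wanted, one option is to sum those counts again. The sketch below uses a small toy DataFrame (hypothetical data, not the real file) to show both forms.

```python
import numpy as np
import pandas as pd

# Toy data for illustration only
toy_df = pd.DataFrame(
    {
        "Date": ["1/1/20", "1/2/20", "1/3/20"],
        "Total Payments": [np.nan, 1.04, 1.65],
    }
)

# isnull() marks each cell True/False; sum() counts the True values per column
missing_per_column = toy_df.isnull().sum()

# Summing the per-column counts gives one total for the whole DataFrame
total_missing = int(missing_per_column.sum())

print(missing_per_column)
print(total_missing)
```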

## Step 5: Drop the rows that have missing values by using the `dropna` function.

In [6]:
# Use the dropna function to eliminate the rows with missing values from the DataFrame
# dropna returns a new DataFrame, so assign the result back to keep the change
money_flows_df = money_flows_df.dropna()
money_flows_df

| | Date | Total Payments |
|---|---|---|
| 1 | 1/2/20 | 1.04 |
| 2 | 1/3/20 | 1.65 |
| 3 | 1/3/20 | 1.65 |
| 4 | 1/3/20 | 1.65 |
| 5 | 1/4/20 | 2.02 |
| ... | ... | ... |
| 363 | 12/26/20 | 210.13 |
| 364 | 12/27/20 | 211.08 |
| 365 | 12/28/20 | 213.27 |
| 366 | 12/29/20 | 217.28 |


## Step 6: Confirm that all the missing values have been removed by running the `isnull` function.

In [7]:
# Use the isnull and sum functions to confirm that all missing values have been removed
money_flows_df.isnull().sum()


Date               0
Total Payments     0
dtype: int64
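
Keep in mind that `dropna` returns a new DataFrame by default and leaves the original untouched, so the result has to be assigned for the change to persist. A minimal sketch with toy data (the names `raw_df` and `clean_df` are illustrative):

```python
import numpy as np
import pandas as pd

# Toy data for illustration only
raw_df = pd.DataFrame(
    {
        "Date": ["1/1/20", "1/2/20", "1/3/20"],
        "Total Payments": [np.nan, 1.04, 1.65],
    }
)

# Calling dropna without assigning the result does not change raw_df
raw_df.dropna()
still_missing = int(raw_df.isnull().sum().sum())

# Assigning the result keeps the cleaned rows
clean_df = raw_df.dropna()
remaining_missing = int(clean_df.isnull().sum().sum())

print(still_missing, remaining_missing)
```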

### Important:

By the way, did you notice that the above `money_flows_df` DataFrame had a few duplicates for January 3rd, 2020? 

In the next section, we'll learn a few tools to check for and fix duplicates like these.