# Missing Money

In this activity, you’ll identify and handle missing values in a dataset. 

Instructions:

1. Import the Pandas and `pathlib` libraries.

2. Use `Path` with the `read_csv` function to read the CSV file into the DataFrame. Use the `index_col`, `parse_dates`, and `infer_datetime_format` parameters to set the Date column as the index.

3. Confirm that Pandas properly imported the DataFrame by using the `head` function to view the first five rows.

4. Determine the total number of missing values by using the `isnull` function together with the `sum` function.

5. Drop the rows that have missing values by using the `dropna` function.

6. Confirm that all the missing values have been removed by running the `isnull` function together with the `sum` function again.

References:

[Pandas read_csv function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)

[Pandas isnull function](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.isnull.html)


## Step 1: Import the Pandas and `pathlib` libraries.

In [1]:
# Import the Pandas library
import pandas as pd


# Import the Path class from the pathlib module
from pathlib import Path

## Step 2: Use `Path` with the `read_csv` function to read the CSV file into the DataFrame. Use the `index_col`, `parse_dates`, and `infer_datetime_format` parameters to set the Date column as the index.

In [2]:
# Read in the CSV file called "money_flows.csv" using a Path object
# The CSV file is located in the Resources folder
# Set the index to the column "Date"
# Set parse_dates=True so that the index is parsed as datetime values
# Note: infer_datetime_format is deprecated as of pandas 2.0, where it has no effect
money_flows_df = pd.read_csv(
    Path("../Resources/money_flows.csv"),
    index_col="Date",
    parse_dates=True,
    infer_datetime_format=True
)
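To see what `index_col` and `parse_dates` do without needing the CSV file, here is a minimal sketch that reads a small in-memory sample instead. The sample rows and the `sample_df` name are assumptions for illustration, not the contents of `money_flows.csv`. The sketch omits `infer_datetime_format`, which pandas 2.0 and later deprecate.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for money_flows.csv
csv_text = """Date,Total Payments
2020-01-01,
2020-01-02,1.04
2020-01-03,1.65
"""

# index_col="Date" makes the Date column the index;
# parse_dates=True asks pandas to parse that index as datetimes
sample_df = pd.read_csv(
    io.StringIO(csv_text),
    index_col="Date",
    parse_dates=True,
)

# The index should be a DatetimeIndex rather than plain strings
print(type(sample_df.index))
print(sample_df)
```

The empty field in the first data row is read as a missing value (`NaN`), which is the kind of value the later steps count and drop.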

## Step 3: Confirm that Pandas properly imported the DataFrame by using the `head` function to view the first five rows.

In [3]:
# Call the head function to view the first 5 rows of the DataFrame
money_flows_df.head()

| Date | Total Payments |
|---|---|
| 2020-01-01 | NaN |
| 2020-01-02 | 1.04 |
| 2020-01-03 | 1.65 |
| 2020-01-03 | 1.65 |
| 2020-01-03 | 1.65 |


## Step 4: Determine the total number of missing values by using the `isnull` function together with the `sum` function.

In [4]:
# Use the isnull function in conjunction with the sum function 
# to determine the total number of missing values in the DataFrame
money_flows_df.isnull().sum()

Total Payments    10
dtype: int64
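Note that `isnull().sum()` reports one count per column. With a single column that is already the total, but for a wider DataFrame a second `sum` is needed to get one overall number. The following sketch shows the difference; the second column, `Total Receipts`, and all of the values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column DataFrame; "Total Receipts" is illustrative only
sample_df = pd.DataFrame(
    {
        "Total Payments": [np.nan, 1.04, 1.65, np.nan],
        "Total Receipts": [2.10, np.nan, 0.75, 3.20],
    }
)

# isnull() returns a DataFrame of booleans; sum() counts the True values per column
missing_per_column = sample_df.isnull().sum()

# Summing the per-column counts gives a single total for the whole DataFrame
total_missing = missing_per_column.sum()

print(missing_per_column)
print(total_missing)
```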

## Step 5: Drop the rows that have missing values by using the `dropna` function.

In [5]:
# Use the dropna function to eliminate the rows with missing values from the DataFrame
money_flows_df = money_flows_df.dropna()
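The reassignment in the cell above matters because `dropna` returns a new DataFrame by default rather than modifying the original. A minimal sketch with made-up data (the `sample_df` and `cleaned_df` names are assumptions) illustrates this:

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing value
sample_df = pd.DataFrame(
    {"Total Payments": [np.nan, 1.04, 1.65]},
    index=pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
)

# dropna() returns a copy without the rows that contain missing values
cleaned_df = sample_df.dropna()

# The original keeps all of its rows; only the returned copy is shorter
print(len(sample_df), len(cleaned_df))
```

Comparing row counts before and after is also a quick way to check how much data the drop removed.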

## Step 6: Confirm that all the missing values have been removed by running the `isnull` function together with the `sum` function.

In [6]:
# Use the isnull and sum functions to confirm that all missing values have been removed
money_flows_df.isnull().sum()

Total Payments    0
dtype: int64
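Reading the counts by eye works in a notebook. In a script, the same check could be expressed as a single boolean and an assertion. This is one possible sketch on made-up data, not part of the activity's required steps.

```python
import numpy as np
import pandas as pd

# Hypothetical sample with one missing value
sample_df = pd.DataFrame({"Total Payments": [np.nan, 1.04, 1.65]})
cleaned_df = sample_df.dropna()

# The first any() checks each column; the second collapses the result to one boolean
has_missing = cleaned_df.isnull().any().any()

# Fail loudly if any missing value survived the drop
assert not has_missing, "DataFrame still contains missing values"
print(has_missing)
```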

### Important:

By the way, did you notice that the first five rows of the `money_flows_df` DataFrame shown in Step 3 included duplicate entries for January 3, 2020?

In the next section, we'll learn a few tools to check for and fix duplicates like these.