### Cleaning data with pandas!

What we will learn in this notebook:
- how to change column types
- how to drop empty columns
- how to drop rows with null values
- how to replace null values
- how to find and replace placeholders

First, let's import pandas and open `../data/simple_data.csv` as a dataframe. Then, let's look at the data!
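Here's a minimal sketch of that first step. Since the real CSV isn't available here, the example reads a tiny made-up stand-in from a string; the column names `participant_id` and `score` are placeholders, not necessarily what's in `simple_data.csv`.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real file so this sketch runs anywhere.
csv_text = "participant_id,score\n1,10\n2,\n3,30\n"
df = pd.read_csv(io.StringIO(csv_text))

# In the notebook itself you would read the real file instead:
# df = pd.read_csv("../data/simple_data.csv")

# Peek at the first few rows.
print(df.head())
```

Notice that the empty `score` cell in the second row is read in as `NaN`.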

#### Checking for null values

Uh-oh. At first glance, this dataset looks pretty bad! We can check to see if there's missing data by using the `.isna()` function. You can use it on the whole dataset, a single column, or a single row.

Let's check if there's missing data in the whole dataset using the `.isna()` function.

That helps us, kiiind of, but a big table of `True` and `False` is hard to read. Let's chain the `.sum()` function after the `.isna()` (so it looks like `.isna().sum()`) to get a count of missing values for each column.
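As a rough sketch, here's what both checks might look like on a small made-up dataframe (the columns and values are invented for illustration):

```python
import numpy as np
import pandas as pd

# Tiny made-up dataframe with a few missing values.
df = pd.DataFrame(
    {
        "participant_id": [1.0, 2.0, np.nan],
        "score": [10.0, np.nan, np.nan],
    }
)

# True wherever a value is missing, False everywhere else.
null_mask = df.isna()
print(null_mask)

# Summing the True values gives the number of missing values per column.
null_counts = df.isna().sum()
print(null_counts)
```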

Let's check if there's missing data in row 3 using the `.iloc[]` indexer we learned last week.
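A small sketch of the row check, again on invented data. Remember that positions start at 0, so position 3 is the fourth row.

```python
import numpy as np
import pandas as pd

# Made-up dataframe; the row at position 3 is missing its participant_id.
df = pd.DataFrame(
    {
        "participant_id": [1.0, 2.0, 3.0, np.nan],
        "notes": ["a", "b", "c", "d"],
    }
)

# Select the row at position 3, then check each of its values.
row_nulls = df.iloc[3].isna()
print(row_nulls)
```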

Geez, this data is a mess! Let's clean it up.

#### Dropping empty columns

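One of the columns, aptly titled `empty_column`, has no values at all! We can drop it using the `.dropna()` function with `axis="columns"` and `how="all"`, so that only columns where every value is missing get dropped.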
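One way this could look, using a made-up dataframe that has an `empty_column` like ours. The `how="all"` argument matters: the default, `how="any"`, would drop every column that has even one missing value.

```python
import numpy as np
import pandas as pd

# Made-up dataframe with one completely empty column.
df = pd.DataFrame(
    {
        "participant_id": [1.0, 2.0],
        "empty_column": [np.nan, np.nan],
    }
)

# Drop columns only when ALL of their values are missing.
df = df.dropna(axis="columns", how="all")
print(df.columns.tolist())
```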

#### Dropping empty rows

It also looks like there's a row with no data at all! We can drop that using `.dropna()` too, but this time we set our axis to `rows` and keep `how="all"` so that only completely empty rows are dropped.
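Here's a sketch on invented data. It uses `axis="index"`, which is how the pandas documentation spells the row axis (`0` works too).

```python
import numpy as np
import pandas as pd

# Made-up dataframe: the middle row is completely empty,
# the last row is only partly empty.
df = pd.DataFrame(
    {
        "participant_id": [1.0, np.nan, 3.0],
        "score": [10.0, np.nan, np.nan],
    }
)

# Drop rows only when ALL of their values are missing.
df = df.dropna(axis="index", how="all")
print(df)
```

The partly empty row survives, because `how="all"` only removes rows where everything is missing.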

#### Dropping rows with missing values

Sometimes data can be incomplete. We can drop rows with incomplete data by using `.dropna()` in a slightly different way from above. 

Let's remove rows that have `NaN`, or a null value, in the `missing_values` column.
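A possible sketch, assuming the column is spelled `missing_values` and using invented values. The `subset` argument tells `.dropna()` which columns to look at when deciding whether to drop a row.

```python
import numpy as np
import pandas as pd

# Made-up dataframe with a gap in one column.
df = pd.DataFrame(
    {
        "participant_id": [1.0, 2.0, 3.0],
        "missing_values": [5.0, np.nan, 7.0],
    }
)

# Drop a row only if it is missing a value in the listed column(s).
df = df.dropna(subset=["missing_values"])
print(df)
```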

#### Filling missing values

Sometimes we don't want to drop rows just because they're missing values. Sometimes we feel like we can adequately replace NaNs with an actual value.

Let's replace the `NaN` values in the `missing_values_2` column with `.fillna()`!
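Something like the following could work. The fill value of `0` is just an assumption for this sketch; what counts as an adequate replacement depends on what the column means.

```python
import numpy as np
import pandas as pd

# Made-up column with one missing value.
df = pd.DataFrame({"missing_values_2": [1.0, np.nan, 3.0]})

# Fill the gaps with 0 (an assumed fill value) and assign the result back.
df["missing_values_2"] = df["missing_values_2"].fillna(0)
print(df)
```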

#### Dropping individual rows

Sometimes people put things in our data that shouldn't be there. What are a few unhelpful things in our data right now?

We can drop an individual row by passing its index label to the `.drop()` function.
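A quick sketch with invented data, where the row labeled `1` is the one we don't want:

```python
import pandas as pd

# Made-up dataframe; the row labeled 1 is junk.
df = pd.DataFrame(
    {
        "participant_id": [1.0, 2.0, 3.0],
        "notes": ["ok", "test row, ignore", "ok"],
    }
)

# Drop the row by its index label (not its position).
df = df.drop(index=1)
print(df)
```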

#### Dropping lists of rows
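`.drop()` also accepts a list of index labels, so several rows can go at once. A sketch on made-up data (the labels `1` and `3` are arbitrary picks):

```python
import pandas as pd

# Made-up dataframe with five rows, labeled 0 through 4.
df = pd.DataFrame({"participant_id": [1.0, 2.0, 3.0, 4.0, 5.0]})

# Drop the rows labeled 1 and 3 in one call.
rows_to_drop = [1, 3]
df = df.drop(index=rows_to_drop)
print(df)
```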

#### Dropping placeholder values by condition
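One common approach is to build a condition and keep only the rows that pass it. In this sketch the placeholder is assumed to be the string `"N/A"`; the real placeholder in our data may be something else.

```python
import pandas as pd

# Made-up dataframe where "N/A" was typed in as a placeholder.
df = pd.DataFrame(
    {
        "participant_id": [1.0, 2.0, 3.0],
        "placeholder_values": ["4.5", "N/A", "6.0"],
    }
)

# Keep only the rows where the column is NOT the placeholder.
df = df[df["placeholder_values"] != "N/A"]
print(df)
```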

#### Replacing placeholder values

Other times we want to replace placeholder values. To do this, we can use the `.replace()` function.
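As a sketch, suppose someone typed `"unknown"` where a number should be, and we decide `"0"` is a fair substitute. Both the placeholder and the replacement are assumptions made up for this example.

```python
import pandas as pd

# Made-up column with a text placeholder mixed in.
df = pd.DataFrame({"placeholder_values_2": ["1.5", "unknown", "3.0"]})

# Swap every "unknown" for "0" and assign the result back.
df["placeholder_values_2"] = df["placeholder_values_2"].replace("unknown", "0")
print(df)
```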

#### Fixing column types

Our data's looking much better! There's one last thing we need to do to make sure that it's ready for analysis. Let's check the types of each column. You can do this by appending `.dtypes` to your dataframe's variable name.

Notice there are two different data types being used in this dataframe: `float64` and `object`. Different types have different rules. These rules can help us create guardrails for ourselves.

For instance, we probably want to be able to do math on all the numbers in the `placeholder_values` and `placeholder_values_2` columns. So let's fix that! Use the `.astype()` function to convert `placeholder_values` and `placeholder_values_2` from a `str` to a `float`.
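A sketch of the conversion on invented values. It only works once every value in the column looks like a number, which is why the placeholders had to go first.

```python
import pandas as pd

# Made-up dataframe where the numbers are stored as text.
df = pd.DataFrame(
    {
        "placeholder_values": ["4.5", "6.0"],
        "placeholder_values_2": ["1", "2.5"],
    }
)

# Convert both columns to floats in one call.
df = df.astype({"placeholder_values": float, "placeholder_values_2": float})

# Check the result, then do some math to prove it worked.
print(df.dtypes)
total = df["placeholder_values"].sum()
print(total)
```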

Now check the types of each column again by using `.dtypes`. Notice a change?

One last thing! Right now, the column `participant_id` is a `float64`. We usually don't want or expect to do math on identification numbers, so let's convert that to a `str`.
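A minimal sketch with made-up IDs. One thing to watch for: converting a float straight to a string keeps the decimal point, so `1.0` becomes `"1.0"`, not `"1"`.

```python
import pandas as pd

# Made-up IDs stored as floats.
df = pd.DataFrame({"participant_id": [1.0, 2.0]})

# Convert the IDs to strings so nobody does math on them by accident.
df["participant_id"] = df["participant_id"].astype(str)
print(df["participant_id"].tolist())
```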