# Handling Missing Data in Pandas
This notebook demonstrates various techniques to handle missing data (NaN) in Pandas DataFrames.

## Import Required Libraries
We start by importing the necessary libraries.

In [None]:
import pandas as pd

## Load the Dataset
Load the dataset into a Pandas DataFrame for analysis.

In [None]:
clients = pd.read_csv("../data/clients.csv")
clients.head()
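
Before removing or replacing anything, it can help to see how much data is actually missing in each column. The sketch below is only an illustration: `sample` and its columns are hypothetical stand-ins, not the contents of `clients.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for a loaded dataset (not the real clients data)
sample = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "score": [0.5, np.nan, 0.7, np.nan],
        "city": ["Recife", None, "Natal", "Belem"],
    }
)

# isna() marks each missing cell as True; summing counts them per column
missing_per_column = sample.isna().sum()

# Taking the mean of the same boolean mask gives the share of missing values
missing_share = sample.isna().mean()

print(missing_per_column)
print(missing_share)
```

The same two calls should work unchanged on `clients` once it is loaded.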

## Removing Missing Data
Explore different methods to remove rows with missing data.

In [None]:
# Remove rows with at least one NaN value
# Use with care: this can discard rows that still hold useful data in other columns.
clients.dropna()

In [None]:
# Remove rows where all values are NaN
clients.dropna(how="all")

In [None]:
# Remove rows with NaN in a specific column (e.g., 'dtAtualizacao')
# With a single column in `subset`, how="any" (the default) and how="all" are equivalent
clients.dropna(subset=["dtAtualizacao"])
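
The difference between these `dropna` variants is easier to see on a small frame. The sketch below uses a made-up DataFrame (`toy`, with hypothetical columns `a` and `b`) rather than the `clients` data, and also shows that `dropna` returns a new DataFrame instead of modifying the original.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data, used only for illustration
toy = pd.DataFrame(
    {
        "a": [1.0, np.nan, np.nan, 4.0],
        "b": [10.0, 20.0, np.nan, np.nan],
    }
)

# Drop rows that contain at least one NaN (only row 0 is complete)
any_dropped = toy.dropna()

# Drop rows where every value is NaN (only row 2 is entirely empty)
all_dropped = toy.dropna(how="all")

# Drop rows that have NaN in column "a" (rows 1 and 2)
subset_dropped = toy.dropna(subset=["a"])

print(len(any_dropped), len(all_dropped), len(subset_dropped))
print(len(toy))  # the original frame still has all of its rows
```

To keep a result, assign it to a variable, for example `clients = clients.dropna(how="all")`.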

## Replacing Missing Data
Learn how to replace NaN values with other values.

In [None]:
# Replace NaN in the 'dtAtualizacao' column with a placeholder string
# fillna returns a new Series; assign it back to the column to keep the change
clients["dtAtualizacao"].fillna("0000-00-00 00:00:00.000")
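
Because `fillna` returns a new object, the original column keeps its NaN values until the result is assigned back. A minimal sketch, assuming a hypothetical frame `events` with a string timestamp column `updated_at` (neither name comes from the clients dataset):

```python
import pandas as pd

# Hypothetical data with one missing timestamp stored as a string column
events = pd.DataFrame(
    {"updated_at": ["2024-01-05 10:00:00.000", None, "2024-02-01 08:30:00.000"]}
)

# fillna builds a new Series; `events` itself is not modified here
filled = events["updated_at"].fillna("0000-00-00 00:00:00.000")
missing_before = int(events["updated_at"].isna().sum())

# Assign the result back to make the replacement stick
events["updated_at"] = filled
missing_after = int(events["updated_at"].isna().sum())

print(missing_before, missing_after)
```

The placeholder used here is a sentinel string rather than a real timestamp, so any later date parsing would need to treat it specially.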

## Creating a Sample DataFrame
Create a small DataFrame to demonstrate filling NaN values with statistical measures.

In [None]:
df = pd.DataFrame(
    {
        "nome": ["angela", "maria", None, "stacy"],
        "idade": [27, None, 45, None],
        "salario": [22000, 17890, None, 3456]
    }
)
df

## Filling Missing Data with Mean
Fill NaN values with the mean of the respective columns.

In [None]:
# Calculate the mean of the numeric columns and use it to fill their NaN values.
# Mean imputation leaves each column's mean unchanged but reduces its standard deviation.
mean = df[["idade", "salario"]].mean()
df.fillna(mean)
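
The effect of mean imputation on the summary statistics can be checked directly. The sketch below rebuilds the sample DataFrame from this notebook and compares the mean and standard deviation before and after filling; the helper names (`numeric_cols`, `filled`, `mean_before`, and so on) are introduced here only for illustration.

```python
import pandas as pd

# Same sample data as in the notebook
df = pd.DataFrame(
    {
        "nome": ["angela", "maria", None, "stacy"],
        "idade": [27, None, 45, None],
        "salario": [22000, 17890, None, 3456],
    }
)

numeric_cols = ["idade", "salario"]
mean = df[numeric_cols].mean()

# Passing a Series fills each column with the value under the matching label;
# "nome" has no entry in `mean`, so its missing value is left as is
filled = df.fillna(mean)

mean_before = df[numeric_cols].mean()
mean_after = filled[numeric_cols].mean()
std_before = df[numeric_cols].std()
std_after = filled[numeric_cols].std()

print(pd.DataFrame({"mean_before": mean_before, "mean_after": mean_after}))
print(pd.DataFrame({"std_before": std_before, "std_after": std_after}))
```

The means match while the standard deviations shrink, because every imputed value sits exactly at the center of the column's distribution.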