# **Handling Missing Data**

**Pandas**
1. Detecting Missing Data:

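   - isnull(): Return a boolean mask that is True where a value is missing (NaN, None, or NaT). Alias of isna().
   - notnull(): Return the inverse mask, True where a value is present. Alias of notna().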

In [None]:
df.isnull()   # Boolean mask: True where a value is missing
df.notnull()  # Boolean mask: True where a value is present
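The masks are usually more useful once aggregated. The following is a minimal sketch on a hypothetical toy DataFrame (the column names `a` and `b` are illustrative only) that counts missing values per column and selects the rows that contain any.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; not taken from any real dataset.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": ["x", None, "z"]})

missing_mask = df.isnull()                        # True where a value is missing
missing_per_column = df.isnull().sum()            # number of missing values in each column
rows_with_missing = df[df.isnull().any(axis=1)]   # rows with at least one missing value
```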

2. Dropping Missing Data:

   - dropna(): Remove missing values.

In [None]:
df.dropna(axis=0, how='any')  # Drop rows with any missing values
df.dropna(axis=1, how='all')  # Drop columns with all missing values
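`dropna()` also accepts `subset` and `thresh` for finer control. A small sketch, assuming a hypothetical DataFrame with columns `a`, `b`, and `c`:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column "c" is entirely missing.
df = pd.DataFrame({
    "a": [1.0, np.nan, 3.0, np.nan],
    "b": [np.nan, np.nan, 6.0, 8.0],
    "c": [np.nan, np.nan, np.nan, np.nan],
})

no_empty_cols = df.dropna(axis=1, how="all")  # drops only column "c"
complete_in_a = df.dropna(subset=["a"])       # keeps rows where "a" is present
at_least_two = df.dropna(thresh=2)            # keeps rows with at least 2 non-missing values
```

Note that `dropna()` returns a new DataFrame and leaves `df` unchanged.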

3. Filling Missing Data:

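   - fillna(): Fill missing values with a constant (or per-column values).
   - ffill() / bfill(): Fill missing values by propagating the previous or next valid value.

In [None]:
df.fillna(0)  # Replace missing values with 0
df.ffill()    # Forward fill (fillna(method='ffill') is deprecated in recent pandas)
df.bfill()    # Backward fill (fillna(method='bfill') is deprecated in recent pandas)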
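Different columns often need different fill values. The sketch below uses a hypothetical DataFrame with `price` and `qty` columns to show a per-column fill via a dict, alongside forward and backward filling.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data.
df = pd.DataFrame({"price": [10.0, np.nan, 14.0], "qty": [np.nan, 2.0, np.nan]})

# Fill "price" with its column mean and "qty" with 0.
filled_by_column = df.fillna({"price": df["price"].mean(), "qty": 0})

forward = df.ffill()   # a leading NaN stays NaN: nothing precedes it
backward = df.bfill()  # a trailing NaN stays NaN: nothing follows it
```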

4. Replacing Missing Data:

   - replace(): Replace values, including missing values.

In [None]:
df.replace(to_replace=np.nan, value=0)  # Same effect as df.fillna(0)
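`replace()` is also handy in the other direction: turning sentinel values into proper missing values so the methods above can see them. A minimal sketch, assuming a hypothetical dataset where `-999.0` and `"N/A"` were used as placeholders:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with placeholder values standing in for "missing".
df = pd.DataFrame({"temp": [21.5, -999.0, 19.0], "city": ["Oslo", "N/A", "Rome"]})

# Map each column's sentinel to NaN using a nested dict: {column: {old: new}}.
cleaned = df.replace({"temp": {-999.0: np.nan}, "city": {"N/A": np.nan}})
```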

5. Interpolating Missing Data:

   - interpolate(): Estimate missing values from neighboring valid values (linear interpolation by default).

In [None]:
df.interpolate(method='linear')
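To make the behavior concrete, here is a small sketch on a hypothetical Series with a two-value gap. The `limit` parameter caps how many consecutive missing values are filled.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two consecutive missing values between 1.0 and 4.0.
s = pd.Series([1.0, np.nan, np.nan, 4.0])

linear = s.interpolate(method="linear")            # fills the gap with 2.0 and 3.0
limited = s.interpolate(method="linear", limit=1)  # fills only the first NaN in the gap
```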

**NumPy**
1. Detecting Missing Data:

   - np.isnan(): Detect NaN values in an array.

In [None]:
np.isnan(arr)

2. Replacing Missing Data:

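   - np.nan_to_num(): Replace NaN with zero or a specified value. By default it also replaces positive and negative infinity with very large finite numbers (see the posinf and neginf arguments).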

In [None]:
np.nan_to_num(arr, nan=0.0)
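The mask from `np.isnan()` combines naturally with the NaN-aware reductions such as `np.nanmean()`. A minimal sketch on a hypothetical array, filling missing entries with the mean of the observed ones:

```python
import numpy as np

# Hypothetical toy data.
arr = np.array([1.0, np.nan, 3.0, np.nan])

mask = np.isnan(arr)                  # True where the entry is NaN
n_missing = int(mask.sum())           # number of NaN entries
mean_observed = np.nanmean(arr)       # mean computed while ignoring NaN
filled = np.where(mask, mean_observed, arr)  # NaN entries replaced by that mean
```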

**Scikit-learn**
1. Imputation:

   - SimpleImputer: Basic imputation strategies ('mean', 'median', 'most_frequent', 'constant').

In [None]:
from sklearn.impute import SimpleImputer

imputer = SimpleImputer(strategy='mean')
data_imputed = imputer.fit_transform(data)
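In a modeling workflow the imputer is typically fitted on training data only and then reused on new data, so that the fill values do not leak information from the test set. A sketch under that assumption, with hypothetical arrays `X_train` and `X_test`:

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical toy data.
X_train = np.array([[1.0, 10.0], [np.nan, 20.0], [3.0, np.nan]])
X_test = np.array([[np.nan, np.nan]])

imputer = SimpleImputer(strategy="mean")
imputer.fit(X_train)                        # learns the column means from the training data
X_test_imputed = imputer.transform(X_test)  # fills test gaps with the training means
```

The learned fill values are available afterwards as `imputer.statistics_`.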

   - KNNImputer: Fill each missing value using the values of the k nearest neighboring rows.

In [None]:
from sklearn.impute import KNNImputer

imputer = KNNImputer(n_neighbors=5)
data_imputed = imputer.fit_transform(data)
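A tiny worked case can help show what "nearest" means here. In the sketch below (hypothetical data, `n_neighbors=1` chosen only to keep the result easy to follow), the row with the missing value borrows it from the single closest row that has that feature.

```python
import numpy as np
from sklearn.impute import KNNImputer

# Hypothetical toy data: two tight clusters; the second row lacks its second feature.
X = np.array([[1.0, 2.0], [1.1, np.nan], [8.0, 9.0], [8.1, 9.2]])

imputer = KNNImputer(n_neighbors=1)
X_imputed = imputer.fit_transform(X)  # the gap is filled from the closest row, [1.0, 2.0]
```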

   - IterativeImputer: Multivariate imputer that estimates each feature from all the others.

In [None]:
# IterativeImputer is experimental; this import must come first to enable it
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

imputer = IterativeImputer()
data_imputed = imputer.fit_transform(data)
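As a rough illustration, the sketch below runs `IterativeImputer` on hypothetical data where the second column is about twice the first. The exact imputed numbers depend on the estimator and settings, so none are asserted here; `random_state` is set only for reproducibility.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Hypothetical toy data with a roughly linear relationship between the columns.
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, np.nan], [np.nan, 10.0]])

imputer = IterativeImputer(max_iter=10, random_state=0)
X_imputed = imputer.fit_transform(X)  # observed entries are kept; gaps are estimated

observed = ~np.isnan(X)  # mask of entries that were present in the input
```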

**Other Useful Functions**
1. Map and Apply Functions:

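   - applymap(): Apply a function to a DataFrame elementwise. In pandas 2.1 and later it is deprecated in favor of DataFrame.map().

In [None]:
# Produces the same values as df.fillna(0), but is slower; on pandas 2.1+ use df.map(...)
df.applymap(lambda x: 0 if pd.isnull(x) else x)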

2. Aggregation and Transformation:

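   - groupby().transform(): Apply a function to each group and return a result aligned with the original rows.

In [None]:
# Fill each missing value with its group's mean (assumes the non-key columns are numeric)
df.groupby('key').transform(lambda x: x.fillna(x.mean()))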
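For a single column, a common alternative is to compute the group means with `transform("mean")` and pass them to `fillna()`. A minimal sketch, assuming a hypothetical DataFrame with a `key` column and a numeric `value` column:

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: each group has one observed and one missing value.
df = pd.DataFrame({"key": ["a", "a", "b", "b"], "value": [1.0, np.nan, np.nan, 4.0]})

# transform("mean") returns one group mean per original row, aligned on the index.
group_means = df.groupby("key")["value"].transform("mean")
df["value_filled"] = df["value"].fillna(group_means)
```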