# Handling Missing Data & Uniques in Pandas

When working with real-world datasets, you will commonly need to deal with:

- Missing values (NaN)
- Duplicate or repeated values
- Filtering rows based on conditions

Let’s go through how Pandas helps with each of these!

In [2]:
import numpy as np
import pandas as pd

## Create Sample DataFrame with NaN

In [3]:
df = pd.DataFrame(
    data=[[1, np.nan, 2], [1, 3.0, 4]],
    index=["Row1", "Row2"],
    columns=["Col1", "Col2", "Col3"]
)
df

|      | Col1 | Col2 | Col3 |
|------|------|------|------|
| Row1 | 1    | NaN  | 2    |
| Row2 | 1    | 3.0  | 4    |
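As a small aside, the sketch below inspects the column dtypes of the same sample frame. It is meant only to illustrate that `np.nan` is a floating-point value, so a column holding it is stored as float while the fully populated columns stay integer. The variable names `dtypes` and `col2_is_float` are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Same sample frame as above, rebuilt so this snippet runs on its own.
df = pd.DataFrame(
    data=[[1, np.nan, 2], [1, 3.0, 4]],
    index=["Row1", "Row2"],
    columns=["Col1", "Col2", "Col3"],
)

# One dtype per column; Col2 holds NaN, so it is a float column.
dtypes = df.dtypes

# dtype.kind is "f" for floats and "i" for signed integers.
col2_is_float = df["Col2"].dtype.kind == "f"
col1_is_int = df["Col1"].dtype.kind == "i"
```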


## Detecting Missing Values


In [5]:
df.isnull()

|      | Col1  | Col2  | Col3  |
|------|-------|-------|-------|
| Row1 | False | True  | False |
| Row2 | False | False | False |
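The boolean mask from `isnull()` can be used in a few closely related ways. The sketch below is a minimal illustration, assuming the same sample frame: it shows that `isna()` returns the same mask, that `notna()` returns the inverse, and that the mask can select the rows containing a missing value. The names `same_mask`, `present`, and `rows_with_nan` are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample frame as above, rebuilt so this snippet runs on its own.
df = pd.DataFrame(
    data=[[1, np.nan, 2], [1, 3.0, 4]],
    index=["Row1", "Row2"],
    columns=["Col1", "Col2", "Col3"],
)

# isna() and isnull() produce identical masks.
same_mask = df.isna().equals(df.isnull())

# notna() is the inverse mask: True where a value is present.
present = df.notna()

# Keep only the rows that contain at least one missing value.
rows_with_nan = df[df.isna().any(axis=1)]
```

Here `rows_with_nan` contains only `Row1`, the single row with a NaN.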


## Count Missing Values in Each Column

In [6]:
df.isnull().sum()

Col1    0
Col2    1
Col3    0
dtype: int64
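The same counting idea extends to rows and to the whole frame. The following is a small sketch, not part of the original notebook, that counts missing values per row, in total, and as a fraction per column. The names `per_row`, `total`, and `share` are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample frame as above, rebuilt so this snippet runs on its own.
df = pd.DataFrame(
    data=[[1, np.nan, 2], [1, 3.0, 4]],
    index=["Row1", "Row2"],
    columns=["Col1", "Col2", "Col3"],
)

# axis=1 sums across columns, giving one count per row.
per_row = df.isnull().sum(axis=1)

# Summing the per-column counts gives the total for the whole frame.
total = int(df.isnull().sum().sum())

# The mean of a boolean mask is the fraction of True values per column.
share = df.isnull().mean()
```

For this frame, `per_row` is 1 for `Row1` and 0 for `Row2`, `total` is 1, and `share["Col2"]` is 0.5.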

## Check Which Columns Have No Missing Values

In [7]:
df.isnull().sum() == 0

Col1     True
Col2    False
Col3     True
dtype: bool
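A boolean Series like this can also be used to pick out the complete columns. The sketch below shows one possible way to do that, assuming the same sample frame, and compares it with `dropna(axis=1)`, which drops every column that contains a missing value. The names `complete`, `complete_cols`, and `df_complete` are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample frame as above, rebuilt so this snippet runs on its own.
df = pd.DataFrame(
    data=[[1, np.nan, 2], [1, 3.0, 4]],
    index=["Row1", "Row2"],
    columns=["Col1", "Col2", "Col3"],
)

# True for each column in which every value is present.
complete = df.notna().all()

# Use the boolean Series to select the matching column labels.
complete_cols = df.columns[complete].tolist()
df_complete = df[complete_cols]

# dropna(axis=1) removes columns containing any NaN, giving the same frame.
same_as_dropna = df.dropna(axis=1).equals(df_complete)
```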

## Count Frequency of Each Value in a Column

In [8]:
df['Col3'].value_counts()

Col3
2    1
4    1
Name: count, dtype: int64
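One detail worth knowing when missing data is involved: `value_counts()` skips NaN unless told otherwise. The sketch below illustrates this on `Col2` of the same sample frame using the `dropna=False` option, and also counts the repeated value in `Col1`. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample frame as above, rebuilt so this snippet runs on its own.
df = pd.DataFrame(
    data=[[1, np.nan, 2], [1, 3.0, 4]],
    index=["Row1", "Row2"],
    columns=["Col1", "Col2", "Col3"],
)

# By default NaN is ignored, so only the value 3.0 is counted.
counts_default = df["Col2"].value_counts()

# dropna=False adds an entry for NaN as well.
counts_with_nan = df["Col2"].value_counts(dropna=False)

# Col1 holds the value 1 twice, so its count is 2.
col1_counts = df["Col1"].value_counts()
```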

# Unique Values & Boolean Filtering

## Create Another DataFrame for Unique Values & Filters

In [9]:
df2 = pd.DataFrame(
    data=np.arange(0, 20).reshape(5, 4),
    index=["Row1", "Row2", "Row3", "Row4", "Row5"],
    columns=["Col1", "Col2", "Col3", "Col4"]
)
df2


|      | Col1 | Col2 | Col3 | Col4 |
|------|------|------|------|------|
| Row1 | 0    | 1    | 2    | 3    |
| Row2 | 4    | 5    | 6    | 7    |
| Row3 | 8    | 9    | 10   | 11   |
| Row4 | 12   | 13   | 14   | 15   |
| Row5 | 16   | 17   | 18   | 19   |


## Find Unique Values in a Column

In [10]:
df2['Col1'].unique()

array([ 0,  4,  8, 12, 16])
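To count distinct values rather than list them, `nunique()` is a natural companion to `unique()`. The sketch below is a minimal illustration; the Series `s` is a made-up example, not from the original notebook, and is included to show how the two methods treat NaN differently.

```python
import numpy as np
import pandas as pd

# Same second frame as above, rebuilt so this snippet runs on its own.
df2 = pd.DataFrame(
    data=np.arange(0, 20).reshape(5, 4),
    index=["Row1", "Row2", "Row3", "Row4", "Row5"],
    columns=["Col1", "Col2", "Col3", "Col4"],
)

# Number of distinct values in Col1 (all five values differ).
n_unique = df2["Col1"].nunique()

# Hypothetical Series with a repeated value and a NaN.
s = pd.Series([1.0, np.nan, 1.0, 2.0])

unique_values = s.unique()                 # includes NaN as an entry
n_without_nan = s.nunique()                # NaN excluded by default
n_with_nan = s.nunique(dropna=False)       # NaN counted as a value
```

Here `n_unique` is 5, `unique_values` has three entries (1.0, NaN, 2.0), `n_without_nan` is 2, and `n_with_nan` is 3.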

## Boolean Filtering: Find Elements Greater Than 2

In [11]:
df2 > 2

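|      | Col1  | Col2  | Col3  | Col4 |
|------|-------|-------|-------|------|
| Row1 | False | False | False | True |
| Row2 | True  | True  | True  | True |
| Row3 | True  | True  | True  | True |
| Row4 | True  | True  | True  | True |
| Row5 | True  | True  | True  | True |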
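A frame-wide mask like this can be applied back to the data. The sketch below, assuming the same `df2`, shows two possible uses: indexing with the mask keeps the frame's shape and replaces failing entries with NaN (integer data may be upcast to float as a result), while reducing the mask with `all(axis=1)` selects whole rows. The names `mask`, `masked`, `n_hidden`, and `rows_all` are illustrative.

```python
import numpy as np
import pandas as pd

# Same second frame as above, rebuilt so this snippet runs on its own.
df2 = pd.DataFrame(
    data=np.arange(0, 20).reshape(5, 4),
    index=["Row1", "Row2", "Row3", "Row4", "Row5"],
    columns=["Col1", "Col2", "Col3", "Col4"],
)

mask = df2 > 2

# Same shape as df2; entries where the mask is False become NaN.
masked = df2[mask]

# The three values 0, 1 and 2 in Row1 fail the test.
n_hidden = int(masked.isnull().sum().sum())

# Keep only rows in which every element is greater than 2.
rows_all = df2[mask.all(axis=1)]
```

This ties the two topics together: filtering with a full boolean frame produces missing values, which can then be inspected with `isnull()`.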


## Filter Rows Where Col2 > 2

In [12]:
df2[df2['Col2'] > 2]

|      | Col1 | Col2 | Col3 | Col4 |
|------|------|------|------|------|
| Row2 | 4    | 5    | 6    | 7    |
| Row3 | 8    | 9    | 10   | 11   |
| Row4 | 12   | 13   | 14   | 15   |
| Row5 | 16   | 17   | 18   | 19   |
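Row filters can be combined. The sketch below is one way to do it, assuming the same `df2`: conditions are joined with `&` (and) or `|` (or), with each condition wrapped in parentheses, and `.loc` can filter rows and pick columns in a single step. The thresholds and the names `both`, `either`, and `subset` are illustrative choices, not from the original notebook.

```python
import numpy as np
import pandas as pd

# Same second frame as above, rebuilt so this snippet runs on its own.
df2 = pd.DataFrame(
    data=np.arange(0, 20).reshape(5, 4),
    index=["Row1", "Row2", "Row3", "Row4", "Row5"],
    columns=["Col1", "Col2", "Col3", "Col4"],
)

# Both conditions must hold; parentheses are required around each one.
both = df2[(df2["Col2"] > 2) & (df2["Col4"] < 16)]

# Either condition may hold.
either = df2[(df2["Col1"] == 0) | (df2["Col3"] >= 18)]

# .loc takes a row condition and a list of columns together.
subset = df2.loc[df2["Col2"] > 2, ["Col1", "Col2"]]
```

For this frame, `both` keeps `Row2` to `Row4`, `either` keeps `Row1` and `Row5`, and `subset` has four rows and two columns.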
