# Handling Missing Values in NumPy

## 🔹 What are Missing Values?

Missing values are data points that are:
- Not available
- Undefined
- Invalid

Missing values are usually represented as:
- np.nan → for floating-point data
- None → kept as a Python object (the array gets dtype=object); it becomes nan only when a float dtype is requested explicitly

## 🔹 Representation of Missing Values
### 1️⃣ np.nan
- 👉 np.nan = Not a Number (float type)

In [1]:
import numpy as np

a = np.array([1, 2, np.nan, 4])


### 2️⃣ None
- Not converted automatically: the array below keeps None and gets dtype=object. Pass dtype=float to store it as nan.

In [2]:
b = np.array([1, 2, None, 4])
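What NumPy stores for None depends on the dtype, which can be checked directly. The following is a minimal sketch; the names `b_obj` and `b_float` are illustrative and not part of the original notebook.

```python
import numpy as np

# Without an explicit dtype, None is kept as a Python object.
b_obj = np.array([1, 2, None, 4])
print(b_obj.dtype)        # object

# Requesting a float dtype makes NumPy store None as nan.
b_float = np.array([1, 2, None, 4], dtype=float)
print(b_float.dtype)      # float64
print(np.isnan(b_float).tolist())  # [False, False, True, False]
```

np.isnan works on the float array but raises a TypeError on the object array, so converting to float first is usually the simpler route.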

## 🔹 Detecting Missing Values
### ✅ np.isnan()
- Returns a boolean array that is True wherever an element is nan. It works on numeric (float) arrays, not on object arrays.

In [3]:
np.isnan(a)


array([False, False,  True, False])
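The reason a dedicated function is needed is that nan never compares equal to anything, including itself. A short sketch of the difference:

```python
import numpy as np

a = np.array([1, 2, np.nan, 4])

# nan is not equal to anything, including itself,
# so an equality test never finds it.
eq_result = (a == np.nan)
print(eq_result.tolist())     # [False, False, False, False]

# np.isnan is the reliable check.
isnan_result = np.isnan(a)
print(isnan_result.tolist())  # [False, False, True, False]
```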

### ✅ Count Missing Values

In [4]:
np.isnan(a).sum()

np.int64(1)
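Besides the total count, the same boolean mask can report where the missing values are and what share of the data they make up. A minimal sketch (the variable names are illustrative):

```python
import numpy as np

a = np.array([1, 2, np.nan, 4])
mask = np.isnan(a)

n_missing = np.count_nonzero(mask)   # number of nan entries
positions = np.flatnonzero(mask)     # indices of nan entries
fraction = mask.mean()               # share of entries that are nan

print(n_missing, positions.tolist(), fraction)  # 1 [2] 0.25
```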

## 🔹 Removing Missing Values
### ✅ Remove nan Values

In [5]:
clean = a[~np.isnan(a)]
print(clean)

[1. 2. 4.]


## 🔹 Replacing Missing Values
### 1️⃣ Replace with a Fixed Value
- ➡ Replaces nan with 0

In [6]:
np.nan_to_num(a, nan=0)

array([1., 2., 0., 4.])
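np.nan_to_num also rewrites infinities, which is easy to overlook. The sketch below uses an illustrative array `x` and illustrative replacement values to show the default behavior next to explicit settings.

```python
import numpy as np

x = np.array([1.0, np.nan, np.inf, -np.inf])

# By default nan becomes 0.0 and infinities become
# the largest/smallest finite floats.
default = np.nan_to_num(x)

# Each replacement can be set explicitly.
explicit = np.nan_to_num(x, nan=0.0, posinf=999.0, neginf=-999.0)
print(explicit.tolist())   # [1.0, 0.0, 999.0, -999.0]
```

Both calls return a new array; `x` itself is left unchanged.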

### 2️⃣ Replace with Mean

In [7]:
mean_value = np.nanmean(a)
a_filled = np.where(np.isnan(a), mean_value, a)

### 3️⃣ Replace with Median

In [8]:
median = np.nanmedian(a)
a_filled = np.where(np.isnan(a), median, a)
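The mean and median replacements follow the same pattern, so they can be wrapped in one small function. `fill_nan` below is a hypothetical helper written for illustration, not a NumPy function.

```python
import numpy as np

def fill_nan(values, strategy="mean"):
    """Return a copy of `values` with nan replaced by the mean or median
    of the non-missing entries. Hypothetical helper, not part of NumPy."""
    values = np.asarray(values, dtype=float)
    if strategy == "mean":
        fill = np.nanmean(values)
    elif strategy == "median":
        fill = np.nanmedian(values)
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    # np.where builds a new array, so the input is not modified.
    return np.where(np.isnan(values), fill, values)

a = np.array([1, 2, np.nan, 4])
print(fill_nan(a, "median").tolist())   # [1.0, 2.0, 2.0, 4.0]
```

The median is less sensitive to outliers than the mean, which is the usual reason to prefer it for skewed data.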

## 🔹 NumPy Functions that Ignore nan
- NumPy provides nan-prefixed versions of common reductions that skip missing values instead of returning nan.

In [9]:
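np.nanmean(a)   # mean of the non-nan values
np.nansum(a)    # sum, treating nan as 0
np.nanmin(a)    # smallest non-nan value
np.nanmax(a)    # largest non-nan value
np.nanstd(a)    # standard deviation of the non-nan values (only this last result is displayed)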

np.float64(1.247219128924647)
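For contrast, the ordinary reductions propagate nan, which is why the nan-aware variants exist. A brief sketch comparing the two:

```python
import numpy as np

a = np.array([1, 2, np.nan, 4])

plain_mean = np.mean(a)      # nan propagates through the ordinary function
safe_mean = np.nanmean(a)    # computed from 1, 2 and 4 only

print(plain_mean)            # nan
print(np.nansum(a))          # 7.0
print(np.nanmin(a), np.nanmax(a))  # 1.0 4.0
```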

## 🔹 Missing Values in 2D Arrays

In [10]:
arr = np.array([[1, 2, np.nan],
                [4, np.nan, 6]])

### Remove Rows with Missing Values
- Every row of arr contains a nan, so the result here is empty.

In [11]:
arr[~np.isnan(arr).any(axis=1)]

array([], shape=(0, 3), dtype=float64)
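To see the row filter keep something, it helps to use data where at least one row is complete. The array `data` below is illustrative; the same mask logic with axis=0 drops columns instead of rows.

```python
import numpy as np

# Illustrative data: the middle row has no missing values.
data = np.array([[1, 2, np.nan],
                 [4, 5, 6],
                 [np.nan, 8, 9]])

rows_ok = ~np.isnan(data).any(axis=1)   # True for rows without nan
cols_ok = ~np.isnan(data).any(axis=0)   # True for columns without nan

print(data[rows_ok].tolist())      # [[4.0, 5.0, 6.0]]
print(data[:, cols_ok].tolist())   # [[2.0], [5.0], [8.0]]
```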

### Replace with Column-wise Mean

In [None]:
col_mean = np.nanmean(arr, axis=0)
inds = np.where(np.isnan(arr))
arr[inds] = np.take(col_mean, inds[1])
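The index-based assignment modifies arr in place. An alternative sketch that leaves the original untouched relies on broadcasting inside np.where; the name `filled` is illustrative.

```python
import numpy as np

arr = np.array([[1, 2, np.nan],
                [4, np.nan, 6]])

col_mean = np.nanmean(arr, axis=0)   # per-column means: [2.5, 2.0, 6.0]

# col_mean has shape (3,) and broadcasts across the rows of arr.
filled = np.where(np.isnan(arr), col_mean, arr)

print(filled.tolist())   # [[1.0, 2.0, 6.0], [4.0, 2.0, 6.0]]
```

If a column consists only of nan, np.nanmean emits a RuntimeWarning and returns nan for that column, so such columns stay unfilled.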

### 🔹 Boolean Masking

In [13]:
mask = np.isnan(arr)
arr[mask] = 0
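A mask assignment changes the array in place, so it is often worth copying first and reusing the mask for a quick per-column summary. A minimal sketch, starting again from the array with its nan entries intact (the name `cleaned` is illustrative):

```python
import numpy as np

arr = np.array([[1, 2, np.nan],
                [4, np.nan, 6]])

mask = np.isnan(arr)
print(mask.sum(axis=0).tolist())   # [0, 1, 1]  missing values per column

cleaned = arr.copy()   # work on a copy so the original keeps its nan entries
cleaned[mask] = 0
print(cleaned.tolist())            # [[1.0, 2.0, 0.0], [4.0, 0.0, 6.0]]
```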