# **Treating Outliers Using IQR**

In [1]:
import pandas as pd
df = pd.read_csv("heights.csv")
df

| | name | height |
|---|---|---|
| 0 | mohan | 1.2 |
| 1 | maria | 2.3 |
| 2 | sakib | 4.9 |
| 3 | tao | 5.1 |
| 4 | virat | 5.2 |
| 5 | khusbu | 5.4 |
| 6 | dmitry | 5.5 |
| 7 | selena | 5.5 |
| 8 | john | 5.6 |
| 9 | imran | 5.6 |


In [2]:
df.describe()

| | height |
|---|---|
| count | 20.0 |
| mean | 8.39 |
| std | 8.782812 |
| min | 1.2 |
| 25% | 5.35 |
| 50% | 5.7 |
| 75% | 6.275 |
| max | 40.2 |


**Calculating Q1 and Q3:**

In [3]:
Q1 = df.height.quantile(0.25)
Q3 = df.height.quantile(0.75)
Q1, Q3

(np.float64(5.3500000000000005), np.float64(6.275))
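
When a quartile falls between two observations, `quantile` has to interpolate, and pandas does this linearly by default. The sketch below uses made-up values (not the contents of `heights.csv`) to show how the `interpolation` argument changes the result; the variable names are only illustrative.

```python
import pandas as pd

# Toy values for illustration only (not taken from heights.csv)
s = pd.Series([1.0, 2.0, 3.0, 4.0])

# Default: linear interpolation between the two nearest observations
q1_linear = s.quantile(0.25)

# Alternatives snap to an actual observation instead of interpolating
q1_lower = s.quantile(0.25, interpolation="lower")
q1_higher = s.quantile(0.25, interpolation="higher")

print(q1_linear, q1_lower, q1_higher)
```

With only a handful of rows, the choice of interpolation can shift Q1 and Q3 noticeably, which in turn moves the outlier limits.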

**Computing the IQR (Interquartile Range):**

In [4]:
IQR = Q3 - Q1
IQR

np.float64(0.9249999999999998)

**Defining the outlier limits:**

A value counts as an outlier if it lies more than 1.5 × IQR below Q1 or above Q3.

In [5]:
lower_limit = Q1 - 1.5 * IQR
upper_limit = Q3 + 1.5 * IQR
lower_limit, upper_limit

(np.float64(3.962500000000001), np.float64(7.6625))
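
The three steps so far (quartiles, IQR, limits) can be wrapped in one small helper. This is a minimal sketch, not part of the original notebook: the function name `iqr_limits`, the parameter `k`, and the toy values are all assumptions made for illustration.

```python
import pandas as pd


def iqr_limits(values: pd.Series, k: float = 1.5) -> tuple[float, float]:
    """Return (lower, upper) limits as Q1 - k*IQR and Q3 + k*IQR."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    return float(q1 - k * iqr), float(q3 + k * iqr)


# Toy heights for illustration only (not the contents of heights.csv)
toy = pd.Series([1.2, 5.1, 5.4, 5.5, 5.6, 5.8, 6.0, 14.5])
lower, upper = iqr_limits(toy)
print(lower, upper)
```

Exposing `k` makes it easy to try a more tolerant rule (for example `k=3.0`) without touching the rest of the code.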

**Finding the outliers:**

In [6]:
df[(df.height < lower_limit) | (df.height > upper_limit)]

| | name | height |
|---|---|---|
| 0 | mohan | 1.2 |
| 1 | maria | 2.3 |
| 17 | aamir | 14.5 |
| 18 | ken | 23.2 |
| 19 | Liu | 40.2 |
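
An equivalent way to express the same filter is to build one boolean mask and reuse it for both the outliers and the remaining rows, so the two sets always partition the data. The sketch below assumes toy data with placeholder names (not the contents of `heights.csv`) and uses `Series.between`, which includes both endpoints by default.

```python
import pandas as pd

# Toy data for illustration only (not the contents of heights.csv)
toy = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f", "g", "h"],
    "height": [1.2, 5.1, 5.4, 5.5, 5.6, 5.8, 6.0, 14.5],
})

q1 = toy["height"].quantile(0.25)
q3 = toy["height"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# True for rows inside [lower, upper]; both endpoints are included
is_inlier = toy["height"].between(lower, upper)

outliers = toy[~is_inlier]  # rows outside the limits
inliers = toy[is_inlier]    # every other row

print(outliers)
```

Because `outliers` and `inliers` come from the same mask, no row can be dropped from both or counted twice.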


**Removing the outliers:**

In [7]:
# Keep rows within the limits (inclusive), the exact complement of the outlier filter above
df_no_outlier = df[(df.height >= lower_limit) & (df.height <= upper_limit)]
df_no_outlier

| | name | height |
|---|---|---|
| 2 | sakib | 4.9 |
| 3 | tao | 5.1 |
| 4 | virat | 5.2 |
| 5 | khusbu | 5.4 |
| 6 | dmitry | 5.5 |
| 7 | selena | 5.5 |
| 8 | john | 5.6 |
| 9 | imran | 5.6 |
| 10 | jose | 5.8 |
| 11 | deepika | 5.9 |
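
Dropping rows is one way to treat outliers. When every row needs to be kept, a common alternative (not used in this notebook) is to cap values at the IQR limits instead. The following is a hedged sketch on toy data with placeholder names, using `Series.clip`; the name `capped` is only illustrative.

```python
import pandas as pd

# Toy data for illustration only (not the contents of heights.csv)
toy = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e", "f", "g", "h"],
    "height": [1.2, 5.1, 5.4, 5.5, 5.6, 5.8, 6.0, 14.5],
})

q1 = toy["height"].quantile(0.25)
q3 = toy["height"].quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Replace values below/above the limits with the limits themselves;
# assign() returns a new frame and leaves `toy` unchanged
capped = toy.assign(height=toy["height"].clip(lower=lower, upper=upper))

print(capped)
```

Capping preserves the row count at the cost of distorting the extreme values, so whether it is preferable to removal depends on how the data will be used.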
