**Z-Score**

*<- A Z-score measures how far a value lies from the mean of a group of values, expressed in standard deviations.*

*<- A Z-score can reveal to a trader if a value is typical for a specified data set or if it is atypical.*

*<- In general, a Z-score of -3.0 to 3.0 suggests that a stock is trading within three standard deviations of its mean.*

*<- Traders have developed many methods that use Z-scores to identify correlations between trades and trading positions, and to evaluate trading strategies.*

**Calculating a z-score requires that you first determine the mean and standard deviation of your data. Once you have these figures, you can calculate your z-score. So, assume you have the following variables:**

**z = Z-score**

**x = the value being evaluated**

**μ = the mean**

**σ = the standard deviation**

$$z = \frac{x - \mu}{\sigma}$$
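
The formula can be sketched in plain Python before moving to pandas. This is only an illustration and not part of the original notebook: the `z_score` helper and the sample values are hypothetical, and it uses the population standard deviation (`statistics.pstdev`) to match σ in the definition above.

```python
import statistics

def z_score(x, values):
    """Return how many standard deviations x lies from the mean of values."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    return (x - mu) / sigma

# Hypothetical sample: its mean is 5 and its population standard deviation is 2.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_score(9, sample)
print(z)  # 2.0
```

A value of 9 sits two standard deviations above the mean of this sample, so its z-score is 2.0.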

**Importing Pandas Library**

In [1]:
import pandas as pd

**Reading CSV File**

In [9]:
df = pd.read_csv("gender_height.csv")
df

Unnamed: 0,gender,height
0,Male,73.847017
1,Male,68.781904
2,Male,74.110105
3,Male,71.730978
4,Male,69.881796
...,...,...
9995,Female,66.172652
9996,Female,67.067155
9997,Female,63.867992
9998,Female,69.034243


**Calculating Mean and Standard Deviation for Z-Score**

In [10]:
df_mean = df.height.mean()
df_std = df.height.std()
df_mean,df_std

(66.367559754866, 3.847528120795573)
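
One detail worth keeping in mind: pandas `Series.std()` returns the sample standard deviation (`ddof=1`) by default, while NumPy's `np.std` defaults to the population version (`ddof=0`). On a data set as large as this one the difference is tiny, but on small data it is visible. A small sketch on hypothetical values:

```python
import numpy as np
import pandas as pd

# Hypothetical values, chosen so the population standard deviation is exactly 2.
heights = pd.Series([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

sample_std = heights.std()             # pandas default: ddof=1
population_std = heights.std(ddof=0)   # divide by n instead of n - 1
numpy_std = np.std(heights.to_numpy()) # NumPy default: ddof=0

print(sample_std, population_std, numpy_std)
```

Whichever convention is used, the same one should be applied consistently when comparing z-scores across tools.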

**Creating Z-Score Column**

In [11]:
df["ZScore"] = (df.height-df_mean)/df_std
df.head()

Unnamed: 0,gender,height,ZScore
0,Male,73.847017,1.943964
1,Male,68.781904,0.627505
2,Male,74.110105,2.012343
3,Male,71.730978,1.393991
4,Male,69.881796,0.913375
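
As a quick sanity check, the first row can be recomputed by hand from the mean and standard deviation printed earlier. The numbers below are copied from the notebook output; the height is displayed rounded to six decimals, so the result agrees with the table only to about that precision.

```python
# Values copied from the notebook output above.
height_row0 = 73.847017
df_mean = 66.367559754866
df_std = 3.847528120795573

# Same formula as the ZScore column, applied to a single value.
z_row0 = (height_row0 - df_mean) / df_std
print(round(z_row0, 4))  # 1.944
```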


**Outlier Detection Using Z-Score**

In [12]:
df[(df.ZScore < -3) | (df.ZScore > 3)]

Unnamed: 0,gender,height,ZScore
994,Male,78.095867,3.048271
1317,Male,78.462053,3.143445
2014,Male,78.998742,3.282934
3285,Male,78.52821,3.16064
3757,Male,78.621374,3.184854
6624,Female,54.616858,-3.054091
9285,Female,54.263133,-3.146027
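
The two-sided test can also be written as a single comparison on the absolute value. The sketch below uses a hypothetical toy DataFrame rather than `gender_height.csv`; it has 21 rows because in a very small sample a single extreme value inflates the standard deviation so much that its z-score may never exceed 3.

```python
import pandas as pd

# Hypothetical toy data: 20 ordinary heights plus one extreme value.
toy = pd.DataFrame({"height": [64.0, 65.0, 66.0, 67.0, 68.0] * 4 + [90.0]})
toy["ZScore"] = (toy.height - toy.height.mean()) / toy.height.std()

# One comparison on |z| replaces (z < -3) | (z > 3).
outliers = toy[toy.ZScore.abs() > 3]
print(outliers)
```

Only the row with height 90.0 is selected; every other row has a z-score well inside the ±3 band.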


**Removing Outliers from Data**

In [14]:
df_no_outlier = df[(df.ZScore > -3) & (df.ZScore < 3)]
df_no_outlier

Unnamed: 0,gender,height,ZScore
0,Male,73.847017,1.943964
1,Male,68.781904,0.627505
2,Male,74.110105,2.012343
3,Male,71.730978,1.393991
4,Male,69.881796,0.913375
...,...,...,...
9995,Female,66.172652,-0.050658
9996,Female,67.067155,0.181830
9997,Female,63.867992,-0.649655
9998,Female,69.034243,0.693090
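
The compute-then-filter steps can be wrapped into one reusable function. This is a sketch under assumptions, not the notebook's own code: `drop_outliers` and its `threshold` parameter are hypothetical names, and the toy data stands in for the real file.

```python
import pandas as pd

def drop_outliers(frame, column, threshold=3.0):
    """Return the rows whose z-score in `column` lies strictly within +/- threshold."""
    z = (frame[column] - frame[column].mean()) / frame[column].std()
    return frame[z.abs() < threshold]

# Hypothetical toy data: 20 ordinary heights plus one extreme value.
toy = pd.DataFrame({"height": [64.0, 65.0, 66.0, 67.0, 68.0] * 4 + [90.0]})
cleaned = drop_outliers(toy, "height")
print(len(toy), len(cleaned))  # 21 20
```

The strict `<` mirrors the strict inequalities used in the filter above, and the z-scores are computed on the fly so the input frame is left without an extra column.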
