# What is a Z-score?

A Z-score is a statistical measure of how many standard deviations a data point lies from the mean of its dataset.

It is calculated as follows:
$Z=\frac{x-μ}{σ}$

where:
* x is the data point.
* μ is the mean of the dataset
  * Mean: $μ=\frac{x_1 + x_2 + ... + x_n}{n}$
* σ is the standard deviation of the dataset
  * Squared Differences: $(x_i - μ)^2$ for $i = 1, \dots, n$
  * Variance: $σ^2=\frac{(x_1-μ)^2+(x_2-μ)^2+...+(x_n-μ)^2}{n}$
  * Standard Deviation: $σ=\sqrt{σ^2}$
* Z is the Z-score.
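
As a worked example, the steps above can be traced by hand in plain Python. This is a minimal sketch using the population form of the variance (dividing by n), as in the formulas above; the sample values and variable names are illustrative assumptions.

```python
import math

# Illustrative sample values (an assumption, chosen for easy arithmetic)
data = [10, 20, 30, 40, 50]
n = len(data)

# Mean: sum of the values divided by n
mean = sum(data) / n

# Squared differences from the mean
squared_diffs = [(x - mean) ** 2 for x in data]

# Variance (population form): average of the squared differences
variance = sum(squared_diffs) / n

# Standard deviation: square root of the variance
std_dev = math.sqrt(variance)

# Z-score of each data point
z_scores = [(x - mean) / std_dev for x in data]

print("Mean:", mean)
print("Variance:", variance)
print("Z-scores:", [round(z, 4) for z in z_scores])
```

For these values the mean is 30 and the variance is 200, so the data point 10 sits about 1.41 standard deviations below the mean.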

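The "3 sigma" range is defined by the following bounds:
* upper_bound = μ + 3σ
* lower_bound = μ - 3σ

Substituting these bounds into the Z-score formula gives:
* Z-score >= 3 means the data point is at least 3 standard deviations above the mean (at or beyond upper_bound)
* Z-score <= -3 means the data point is at least 3 standard deviations below the mean (at or beyond lower_bound)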
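
One common use of these bounds is flagging unusually extreme values. The sketch below is one possible way to do that with the standard library; the readings are hypothetical, and the population standard deviation (`statistics.pstdev`) is assumed so that it matches the variance formula above.

```python
import statistics

# Hypothetical readings: 20 values near 50 plus one extreme value
data = [48, 49, 50, 51, 52] * 4 + [120]

mean = statistics.fmean(data)
std_dev = statistics.pstdev(data)  # population standard deviation (divides by n)

# The "3 sigma" bounds
upper_bound = mean + 3 * std_dev
lower_bound = mean - 3 * std_dev

# A point is at or beyond a bound exactly when |Z| >= 3
outliers = [x for x in data if abs((x - mean) / std_dev) >= 3]

print("Bounds:", round(lower_bound, 2), round(upper_bound, 2))
print("Outliers:", outliers)
```

Only the value 120 is flagged here. Note that with the population standard deviation no point can have |Z| larger than $\sqrt{n-1}$, so a dataset needs at least 10 points before any of them can reach the 3 sigma threshold.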

# When to use Z-score?

A Z-score is a standardized, unit-free score, so it lets you compare data points that come from different datasets with different means and spreads.
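
To illustrate such a comparison, the sketch below standardizes two results that were measured on different scales. The helper `z_score`, the exam figures, and the scales are all hypothetical and chosen only for illustration.

```python
def z_score(x, mean, std_dev):
    """Return how many standard deviations x lies from mean."""
    return (x - mean) / std_dev

# Hypothetical exam results on two different scales
z_exam_a = z_score(85, mean=70, std_dev=10)     # exam scored out of 100
z_exam_b = z_score(620, mean=500, std_dev=100)  # exam scored out of 800

print("Exam A Z-score:", z_exam_a)
print("Exam B Z-score:", z_exam_b)
```

The raw scores 85 and 620 are not directly comparable, but their Z-scores (1.5 and 1.2) are: relative to the other test takers, the Exam A result is the stronger one.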

`3 Sigma (σ)` refers to the rule that, for normally distributed data, about 99.7% of values fall within three standard deviations (3σ) of the mean.
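
The 99.7% figure can be checked numerically. For a normal distribution, the probability of falling within k standard deviations of the mean is $\operatorname{erf}(k/\sqrt{2})$; the sketch below evaluates it with the standard library. The function name `coverage_within` is an illustrative assumption.

```python
import math

def coverage_within(k):
    """Probability that a normally distributed value lies within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"Within {k} sigma: {coverage_within(k):.4f}")
```

This prints roughly 0.6827, 0.9545, and 0.9973 for 1, 2, and 3 sigma. The figures apply only to normally distributed data; other distributions can place more or less mass inside the same range.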

# Z-score in Python

## Using SciPy

```python
from scipy.stats import zscore

# Sample dataset
data = [10, 20, 30, 40, 50]

# Calculate Z-scores (zscore uses the population standard deviation, ddof=0, by default)
z_scores = zscore(data)

print("Original data:", data)
print("Z-scores:", z_scores)
```

```
Original data: [10, 20, 30, 40, 50]
Z-scores: [-1.41421356 -0.70710678  0.          0.70710678  1.41421356]
```


## Using NumPy

```python
import numpy as np

# Sample dataset
data = np.array([10, 20, 30, 40, 50])

# Calculate Z-scores (np.std uses the population standard deviation, ddof=0, by default)
z_scores = (data - np.mean(data)) / np.std(data)

print("Original data:", data)
print("Z-scores:", z_scores)
```

```
Original data: [10 20 30 40 50]
Z-scores: [-1.41421356 -0.70710678  0.          0.70710678  1.41421356]
```
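
Both approaches give the same result because each divides the variance by n by default. If the data is a sample rather than a full population, you may prefer the sample standard deviation, which divides by n - 1. The sketch below shows the difference using NumPy's `ddof` argument; whether this is appropriate depends on your data, so treat it as an optional variation rather than a correction.

```python
import numpy as np

data = np.array([10, 20, 30, 40, 50])

# Population standard deviation (divides by n); this is np.std's default
z_population = (data - data.mean()) / data.std(ddof=0)

# Sample standard deviation (divides by n - 1)
z_sample = (data - data.mean()) / data.std(ddof=1)

print("Population Z-scores:", z_population)
print("Sample Z-scores:", z_sample)
```

The sample version yields slightly smaller magnitudes (about -1.26 instead of -1.41 for the first point). `scipy.stats.zscore` accepts the same `ddof` argument.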
