The **Z-score**, also known as the **standard score**, measures how many standard deviations a data point is from the mean of a dataset. It helps in understanding whether a value is typical or unusual compared to the rest of the data.

### **Formula for Z-score**:
$$
Z = \frac{X - \mu}{\sigma}
$$
Where:
- \( Z \) = Z-score
- \( X \) = Data point
- \( \mu \) = Mean of the dataset
- \( \sigma \) = Standard deviation of the dataset
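
The denominator \( \sigma \) is the population standard deviation (divide by \( n \)). As a minimal sketch, assuming ten illustrative values and NumPy's `ddof` argument, the snippet below applies the formula to a single point and shows how the result shifts when the sample standard deviation (divide by \( n - 1 \)) is used instead. The variable names are placeholders chosen for this example.

```python
import numpy as np

# Illustrative values (assumed for this example)
values = np.array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28])
x = 28  # the data point to score

mu = values.mean()                 # mean of the dataset
sigma_pop = values.std(ddof=0)     # population standard deviation (divide by n)
sigma_sample = values.std(ddof=1)  # sample standard deviation (divide by n - 1)

z_pop = (x - mu) / sigma_pop
z_sample = (x - mu) / sigma_sample

print(f"mean = {mu:.2f}")
print(f"Z with population std: {z_pop:.4f}")
print(f"Z with sample std:     {z_sample:.4f}")
```

The sample standard deviation is slightly larger, so it yields a slightly smaller Z-score. The difference shrinks as the dataset grows.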

### **Interpretation**:
- \( Z = 0 \) → The data point is exactly at the mean.
- \( Z > 0 \) → The data point is above the mean.
- \( Z < 0 \) → The data point is below the mean.
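- \( |Z| > 2 \) → The data point is unusual: for roughly normally distributed data, only about 5% of values lie this far from the mean.
- \( |Z| > 3 \) → The data point is a likely outlier: under a normal distribution, only about 0.3% of values lie this far from the mean.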
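
The thresholds above can be turned into a simple filter. The following is a minimal sketch, not a definitive outlier detector: `flag_unusual` and the `readings` list are hypothetical names and values introduced for illustration, and the Z-scores use the population standard deviation.

```python
import numpy as np

def flag_unusual(data, threshold=2.0):
    """Return the values whose absolute Z-score exceeds threshold."""
    arr = np.asarray(data, dtype=float)
    z = (arr - arr.mean()) / arr.std()  # population std (ddof=0)
    return arr[np.abs(z) > threshold].tolist()

# Assumed sample with one suspicious value
readings = [10, 12, 11, 13, 12, 11, 10, 95]

print(flag_unusual(readings))        # values with |Z| > 2
print(flag_unusual(readings, 3.0))   # values with |Z| > 3
```

With only eight values, the extreme point inflates the mean and standard deviation enough that its own Z-score stays below 3, so it is flagged at the first threshold but not the second. The cutoffs are best treated as rules of thumb, especially for small samples.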

### **Uses of Z-score**:
1. **Identifying Outliers**: Helps detect values that are too far from the mean.
2. **Standardizing Data**: Converts different datasets into a common scale for comparison.
3. **Probability & Normal Distribution**: Used in statistics to find probabilities in a standard normal distribution.
4. **Machine Learning & Data Science**: Helps in feature scaling and anomaly detection.
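
To illustrate the standardizing and probability uses, here is a small sketch that compares two scores measured on different scales and converts each Z-score to an approximate percentile. The scores, means, and standard deviations are made-up numbers, `z_score` is a hypothetical helper, and the percentile step assumes the scores are normally distributed. It uses `NormalDist` from Python's standard library.

```python
from statistics import NormalDist

def z_score(x, mean, std_dev):
    """Standard score of x given a known mean and standard deviation."""
    return (x - mean) / std_dev

# Hypothetical scores on two tests with different scales
z_math = z_score(85, mean=70, std_dev=10)
z_verbal = z_score(620, mean=500, std_dev=100)

# Under a normal model, the CDF of the standard normal gives the
# fraction of values expected to fall below a given Z-score.
standard_normal = NormalDist()  # mean 0, standard deviation 1
p_math = standard_normal.cdf(z_math)
p_verbal = standard_normal.cdf(z_verbal)

print(f"Math:   Z = {z_math:.2f}, percentile ~ {p_math:.1%}")
print(f"Verbal: Z = {z_verbal:.2f}, percentile ~ {p_verbal:.1%}")
```

Although the raw scores cannot be compared directly, the Z-scores show that the math result sits further above its mean than the verbal result does.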

In [5]:
import numpy as np

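# Function to calculate Z-scores
def calculate_z_scores(data):
    """Return the Z-score of every value in data."""
    mean = np.mean(data)
    # np.std defaults to ddof=0 (population standard deviation), matching sigma in the formula
    std_dev = np.std(data)
    z_scores = [(x - mean) / std_dev for x in data]
    return z_scores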

# Sample dataset
data = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]

# Compute Z-scores
z_scores = calculate_z_scores(data)

# Display results
for x, z in zip(data, z_scores):
    print(f"Data point: {x}, Z-score: {z:.2f}")


Data point: 10, Z-score: -1.57
Data point: 12, Z-score: -1.22
Data point: 14, Z-score: -0.87
Data point: 16, Z-score: -0.52
Data point: 18, Z-score: -0.17
Data point: 20, Z-score: 0.17
Data point: 22, Z-score: 0.52
Data point: 24, Z-score: 0.87
Data point: 26, Z-score: 1.22
Data point: 28, Z-score: 1.57


In [7]:
from scipy.stats import zscore

data = [10, 12, 14, 16, 18, 20, 22, 24, 26, 28]

# Compute Z-scores
z_scores = zscore(data)

print("Z-scores:", z_scores)


Z-scores: [-1.5666989  -1.21854359 -0.87038828 -0.52223297 -0.17407766  0.17407766
  0.52223297  0.87038828  1.21854359  1.5666989 ]


In [9]:
import numpy as np

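# Use a NumPy array so the arithmetic below is explicitly element-wise
data = np.array([10, 12, 14, 16, 18, 20, 22, 24, 26, 28])

# Compute Z-scores (np.std defaults to the population standard deviation)
z_scores = (data - np.mean(data)) / np.std(data)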

print("Z-scores:", z_scores)


Z-scores: [-1.5666989  -1.21854359 -0.87038828 -0.52223297 -0.17407766  0.17407766
  0.52223297  0.87038828  1.21854359  1.5666989 ]


## Which One to Use?
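- Use SciPy (`zscore`) → More concise, with `axis`, `ddof`, and `nan_policy` parameters for multi-dimensional data, sample standard deviation, and NaN handling.
- Use NumPy → If you want to compute Z-scores manually or avoid the SciPy dependency.

All three approaches give the same values for this dataset because each uses the population standard deviation by default.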
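
As a brief sketch of the NaN handling mentioned for SciPy, the snippet below runs `zscore` on an assumed array with one missing value, first with the default `nan_policy='propagate'` and then with `nan_policy='omit'`. The array contents are illustrative only.

```python
import numpy as np
from scipy.stats import zscore

# Assumed example with one missing value
data_with_gap = np.array([10.0, 12.0, np.nan, 16.0, 18.0])

# Default policy: the NaN propagates into the mean and standard deviation
z_default = zscore(data_with_gap)

# 'omit': NaN is ignored when computing the mean and standard deviation
z_omit = zscore(data_with_gap, nan_policy='omit')

print("propagate:", z_default)
print("omit:     ", z_omit)
```

With the default policy every result is NaN. With `'omit'`, the missing entry stays NaN in the output while the remaining values are scored against the mean and standard deviation of the non-missing data.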