# Automated Anomaly Detection
**Objective**: Understand and practice automated anomaly detection using various techniques.

**Task**: Automated anomaly detection using the Z-score method

**Steps**:
1. Data Set: Download a dataset of daily sales figures for a retail store. (The code below simulates 100 days of sales with three injected outliers instead.)
2. Calculate Z-score: Compute the mean and standard deviation of the sales, then use them to calculate the Z-score of each day's sales figure.
3. Identify Anomalies: Flag every day whose Z-score is above 3 or below -3.
4. Visualize: Plot the sales series and highlight the anomalies.
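
Steps 2 and 3 can be written compactly. The notation here is an assumption made for illustration: $x_i$ is the sales figure on day $i$, $\bar{x}$ the mean over all $n$ days, and $s$ the sample standard deviation, which is what pandas `Series.std()` computes by default (`ddof=1`).

$$
z_i = \frac{x_i - \bar{x}}{s},
\qquad
s = \sqrt{\frac{1}{n-1}\sum_{j=1}^{n}\left(x_j - \bar{x}\right)^2}
$$

Day $i$ is flagged as an anomaly when $|z_i| > 3$.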

In [1]:
import pandas as pd
import numpy as np

# Simulate 100 days of sales data with a few outliers
np.random.seed(42)
sales = np.random.normal(loc=200, scale=20, size=100)
sales[[10, 50, 95]] = [400, 450, 420]  # injected anomalies

# Create DataFrame
df = pd.DataFrame({'day': range(1, 101), 'sales': sales})

print(df.head())

# Calculate mean and sample standard deviation (pandas .std() uses ddof=1 by default)
mean_sales = df['sales'].mean()
std_sales = df['sales'].std()

# Compute Z-score
df['z_score'] = (df['sales'] - mean_sales) / std_sales
# Label anomalies: 1 where |z| > 3, else 0 (vectorized, no per-row apply)
df['anomaly'] = (df['z_score'].abs() > 3).astype(int)

# View anomalies
print(df[df['anomaly'] == 1])


   day       sales
0    1  209.934283
1    2  197.234714
2    3  212.953771
3    4  230.460597
4    5  195.316933
    day  sales   z_score  anomaly
10   11  400.0  4.567332        1
50   51  450.0  5.738104        1
95   96  420.0  5.035641        1
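
Step 4 asks for a plot, which the cell above does not produce. The following is a minimal sketch of one way to do it, assuming matplotlib is available (the original names no plotting library). It rebuilds the same simulated data so it can run on its own; the colors, labels, and figure size are illustrative choices.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Recreate the simulated data used above (same seed, same injected outliers)
np.random.seed(42)
sales = np.random.normal(loc=200, scale=20, size=100)
sales[[10, 50, 95]] = [400, 450, 420]
df = pd.DataFrame({'day': range(1, 101), 'sales': sales})

# Z-score and anomaly flag, same rule as above: |z| > 3
df['z_score'] = (df['sales'] - df['sales'].mean()) / df['sales'].std()
df['anomaly'] = (df['z_score'].abs() > 3).astype(int)
anomalies = df[df['anomaly'] == 1]

# Line plot of all days, with flagged days overlaid as red markers
fig, ax = plt.subplots(figsize=(10, 4))
ax.plot(df['day'], df['sales'], color='steelblue', label='Daily sales')
ax.scatter(anomalies['day'], anomalies['sales'],
           color='red', zorder=3, label='Anomaly (|z| > 3)')
ax.set_xlabel('Day')
ax.set_ylabel('Sales')
ax.set_title('Daily sales with Z-score anomalies')
ax.legend()
fig.tight_layout()
plt.show()
```

Drawing the anomalies as a separate scatter layer on top of the line keeps the flagged days visible without hiding the rest of the series.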
