## Using AI for Anomaly Detection in Data Quality
**Description**: Implement an AI-based approach to detect anomalies as part of data quality checks.

**Steps**:
1. Use an Anomaly Detection Algorithm:
    - Use scikit-learn's `IsolationForest` for anomaly detection.

**Example data:**

`data = np.array([[25, 50000], [30, 60000], [35, 75000], [40, None], [45, 100000]])`

2. Integrate with Great Expectations:
    - Generate an alert if any anomalies are detected.

In [1]:
# Write your code from here
!pip3 install numpy pandas scikit-learn great_expectations




In [3]:
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.impute import SimpleImputer

# Sample data (age, income); the missing income is written as np.nan
# (instead of None) so the array keeps a numeric float dtype
data = np.array([[25, 50000], [30, 60000], [35, 75000], [40, np.nan], [45, 100000]])

# Handle missing values using mean imputation
imputer = SimpleImputer(strategy='mean')
data_imputed = imputer.fit_transform(data)

# Convert to DataFrame
df = pd.DataFrame(data_imputed, columns=["age", "income"])

# Fit Isolation Forest; contamination=0.2 sets the expected share of anomalies to 20%
clf = IsolationForest(contamination=0.2, random_state=42)

# fit_predict labels each row: -1 = anomaly, 1 = normal
df["anomaly"] = clf.fit_predict(data_imputed)
print(df)





    age    income  anomaly
0  25.0   50000.0        1
1  30.0   60000.0        1
2  35.0   75000.0        1
3  40.0   71250.0        1
4  45.0  100000.0       -1
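
The labels above depend heavily on the `contamination` setting: with five rows and `contamination=0.2`, the threshold is placed so that roughly one row is flagged, however unusual that row really is. As a rough sketch, reusing the same data and parameters as the cell above, the per-row scores from scikit-learn's `decision_function` can be inspected to see how close each row sits to that threshold. The `scores` DataFrame and its column names are illustrative choices, not part of the original notebook.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest
from sklearn.impute import SimpleImputer

# Same sample data as above (age, income), with the missing income as np.nan
data = np.array([[25, 50000], [30, 60000], [35, 75000], [40, np.nan], [45, 100000]])
data_imputed = SimpleImputer(strategy="mean").fit_transform(data)

clf = IsolationForest(contamination=0.2, random_state=42).fit(data_imputed)

scores = pd.DataFrame(data_imputed, columns=["age", "income"])
# decision_function: negative values fall on the anomaly side of the threshold
scores["score"] = clf.decision_function(data_imputed)
scores["anomaly"] = clf.predict(data_imputed)

# Most anomalous rows first
print(scores.sort_values("score"))
```

A score only slightly below zero suggests the row was flagged mainly because the contamination rate requires it, which is worth keeping in mind on a dataset this small.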


In [7]:
# Simple anomaly alert
def alert_anomalies(df):
    if (df["anomaly"] == -1).any():
        print("🚨 ALERT: Anomalies detected by AI!")
        print(df[df["anomaly"] == -1])
    else:
        print("✅ No anomalies detected.")

alert_anomalies(df)



🚨 ALERT: Anomalies detected by AI!
    age    income  anomaly
4  45.0  100000.0       -1
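
Printing an alert is enough for a notebook, but an automated pipeline usually needs a result it can act on. The sketch below shows one possible way to do that in plain pandas. It does not call the Great Expectations API; the function name `check_no_anomalies` and the shape of the returned dictionary are illustrative assumptions, loosely modeled on a pass/fail validation result.

```python
import pandas as pd

def check_no_anomalies(df, column="anomaly"):
    """Hypothetical helper: report whether any row is labeled -1."""
    flagged = df[df[column] == -1]
    return {
        "success": flagged.empty,
        "unexpected_count": len(flagged),
        "unexpected_index": flagged.index.tolist(),
    }

# Labels copied from the Isolation Forest output shown above
df = pd.DataFrame({
    "age": [25.0, 30.0, 35.0, 40.0, 45.0],
    "income": [50000.0, 60000.0, 75000.0, 71250.0, 100000.0],
    "anomaly": [1, 1, 1, 1, -1],
})

result = check_no_anomalies(df)
print(result)
if not result["success"]:
    print(f"ALERT: {result['unexpected_count']} anomalous row(s) at index {result['unexpected_index']}")
```

A pipeline step could then fail the job or send a notification whenever `success` is false, instead of relying on someone reading the printed output.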
