## Isolation Forest for Anomaly Detection
**Objective**: Understand and apply the Isolation Forest algorithm to identify anomalies in datasets.

### Task: Anomaly Detection in Sensor Data
**Steps**:
1. Load Dataset
2. Feature Selection
3. Isolation Forest Implementation
4. Plot Results
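
The steps above assume that a sensor dataset is already available on disk. When it is not, a synthetic stand-in can be used to try out the workflow. The sketch below is only an illustration: the function name `make_synthetic_sensor_data`, the column names `temperature` and `humidity`, and all numeric ranges are hypothetical and not part of the original task.

```python
import numpy as np
import pandas as pd

def make_synthetic_sensor_data(n_normal=950, n_anomalies=50, seed=42):
    """Build a hypothetical two-sensor dataset with injected outliers."""
    rng = np.random.default_rng(seed)
    # Normal readings cluster around assumed typical operating values
    normal = rng.normal(loc=[25.0, 50.0], scale=[1.0, 2.0], size=(n_normal, 2))
    # Anomalous readings are drawn uniformly from a much wider range
    anomalies = rng.uniform(low=[10.0, 20.0], high=[40.0, 80.0], size=(n_anomalies, 2))
    data = np.vstack([normal, anomalies])
    df = pd.DataFrame(data, columns=["temperature", "humidity"])
    # Add a non-numeric column to mimic a timestamped sensor log
    df.insert(0, "timestamp", pd.date_range("2024-01-01", periods=len(df), freq="min"))
    return df

synthetic_df = make_synthetic_sensor_data()
print(synthetic_df.shape)
```

A frame built this way can stand in for the result of `pd.read_csv` in Step 1; the `timestamp` column is dropped by the numeric feature selection in Step 2.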

In [None]:
# write your code from here
import pandas as pd
import numpy as np
from sklearn.ensemble import IsolationForest
import matplotlib.pyplot as plt

# Step 1: Load Dataset (replace with your actual dataset path or method)
# For example, a CSV file with sensor data having numerical features
df = pd.read_csv('sensor_data.csv')

# Step 2: Feature Selection (numeric columns only)
# select_dtypes keeps numeric columns and drops non-numeric ones (e.g. a string 'timestamp')
numeric_features = df.select_dtypes(include=[np.number])

# Step 3: Isolation Forest Implementation
iso_forest = IsolationForest(contamination=0.05, random_state=42)
iso_forest.fit(numeric_features)
df['anomaly'] = iso_forest.predict(numeric_features)  # -1 for anomaly, 1 for normal

# Step 4: Plot Results
# Plot the first two numerical features, colored by predicted label
# (this assumes the dataset has at least two numeric columns)
plt.figure(figsize=(10, 6))
# Draw each class in its own scatter call so the legend maps colors to labels correctly
for label, color, name in [(1, 'blue', 'Normal'), (-1, 'red', 'Anomaly')]:
    subset = numeric_features[df['anomaly'] == label]
    plt.scatter(subset.iloc[:, 0], subset.iloc[:, 1], c=color, alpha=0.6, label=name)
plt.title('Isolation Forest Anomaly Detection on Sensor Data')
plt.xlabel(numeric_features.columns[0])
plt.ylabel(numeric_features.columns[1])
plt.legend()
plt.show()
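
Beyond the binary labels returned by `predict`, the fitted model also exposes a continuous score through `decision_function`, which can help rank points by how anomalous they look. The following is a minimal sketch on made-up data rather than the sensor dataset itself; the arrays `inliers` and `outliers` and their ranges are assumptions chosen only for illustration.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# Hypothetical data: 200 clustered readings plus 10 far-away outliers
inliers = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
outliers = rng.uniform(low=6.0, high=8.0, size=(10, 2))
X = np.vstack([inliers, outliers])

model = IsolationForest(contamination=0.05, random_state=42).fit(X)

labels = model.predict(X)            # -1 for anomaly, 1 for normal
scores = model.decision_function(X)  # negative scores correspond to predicted anomalies

# Rank samples from most to least anomalous (lowest score first)
most_anomalous = np.argsort(scores)[:10]
print("Flagged as anomalies:", int((labels == -1).sum()))
print("Indices of the 10 lowest scores:", most_anomalous)
```

Because `contamination` sets the score threshold, changing it alters how many points are flagged but not their ranking, so sorting by score is a reasonable way to review borderline cases.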
