## Isolation Forest for Anomaly Detection
**Objective**: Understand and apply the Isolation Forest algorithm to identify anomalies in datasets.

### Task: Anomaly Detection in Network Traffic
**Steps**:
1. Extract features from the dataset:
    - Load `network_traffic.csv`.
    - Select the numeric columns to use as model inputs.
2. Fit an Isolation Forest model on the selected features.
3. Display the rows the model flags as anomalies.
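
Before fitting the model, it can help to see why "isolation" works as an anomaly signal. The block below is a simplified, standard-library-only sketch of the idea, not scikit-learn's implementation: it repeatedly picks a random feature and a random threshold, keeps only the rows on the same side as the query point, and counts how many splits pass before the point stands alone. The function names (`isolation_depth`, `mean_isolation_depth`) and the synthetic data are hypothetical, and the sketch omits the subsampling and path-length normalisation that a full Isolation Forest uses.

```python
import random

def isolation_depth(point, rows, rng, depth=0, max_depth=50):
    """Count the random splits needed to separate `point` from `rows`."""
    if len(rows) <= 1 or depth >= max_depth:
        return depth
    dim = rng.randrange(len(point))        # pick a random feature
    lo = min(r[dim] for r in rows)
    hi = max(r[dim] for r in rows)
    if lo == hi:                           # no split possible on this feature
        return depth
    split = rng.uniform(lo, hi)            # pick a random threshold
    # Keep only the rows that fall on the same side of the split as `point`
    if point[dim] < split:
        same_side = [r for r in rows if r[dim] < split]
    else:
        same_side = [r for r in rows if r[dim] >= split]
    return isolation_depth(point, same_side, rng, depth + 1, max_depth)

def mean_isolation_depth(point, rows, n_trees=100, seed=0):
    """Average the isolation depth over several independent random trees."""
    rng = random.Random(seed)
    depths = [isolation_depth(point, rows, rng) for _ in range(n_trees)]
    return sum(depths) / n_trees

# Synthetic data (an assumption for illustration): one dense cluster,
# one point at its centre, and one point far away from it
data_rng = random.Random(42)
cluster = [(data_rng.gauss(0, 1), data_rng.gauss(0, 1)) for _ in range(200)]
inlier = (0.0, 0.0)
outlier = (8.0, 8.0)
rows = cluster + [inlier, outlier]

inlier_depth = mean_isolation_depth(inlier, rows)
outlier_depth = mean_isolation_depth(outlier, rows)
print(f"inlier:  {inlier_depth:.2f} splits on average")
print(f"outlier: {outlier_depth:.2f} splits on average")
```

The distant point should need far fewer splits than the central one, which is the property an Isolation Forest turns into an anomaly score.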

In [None]:

# write your code from here
import pandas as pd
from sklearn.ensemble import IsolationForest

# Step 1: Load the network traffic dataset
df = pd.read_csv('network_traffic.csv')

# Display first few rows to understand data structure
print("Dataset preview:")
print(df.head())

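# Step 1 (cont.): Select numeric features for anomaly detection
# These column names are placeholders; replace them with the columns in your file
features = ['packet_size', 'duration', 'source_bytes', 'destination_bytes']

# Fail early with a clear message if any expected column is absent
missing = [col for col in features if col not in df.columns]
if missing:
    raise KeyError(f"Columns not found in dataset: {missing}")

# Drop rows with missing feature values so the model sees complete records
df = df.dropna(subset=features).copy()
X = df[features]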

# Step 2: Initialize and fit Isolation Forest
iso_forest = IsolationForest(contamination=0.05, random_state=42)  # assume 5% anomalies expected
df['anomaly'] = iso_forest.fit_predict(X)

# Step 3: Display anomalies
# anomaly = -1 indicates anomaly, 1 indicates normal
anomalies = df[df['anomaly'] == -1]

print(f"Detected {len(anomalies)} anomalies in the network traffic data.")

print("Sample anomalies:")
print(anomalies.head())

# Optional: Save anomalies to CSV for further analysis
anomalies.to_csv('network_traffic_anomalies.csv', index=False)
