# ForenXAI Model Stress Testing - Google Colab

**Comprehensive stress testing using synthetic network traffic data**

This notebook:
- Loads trained models from Google Drive
- Uses synthetic CSV data for realistic testing
- Measures performance (throughput, latency, accuracy)
- Reports detailed per-model results and a summary comparison table

**Setup Requirements:**
1. Upload models to: `My Drive/Featured Dataset/trained_models/`
2. Upload CSV to: `My Drive/Featured Dataset/processed/`
3. Run all cells in order

## 1. Setup & Mount Google Drive

In [None]:
# Mount Google Drive
from google.colab import drive
drive.mount('/content/drive')

print("✅ Google Drive mounted successfully!")

In [None]:
# Install required packages
!pip install -q scikit-learn tensorflow pandas numpy joblib psutil

print("✅ Packages installed!")

In [None]:
# Import libraries
import numpy as np
import pandas as pd
import joblib
import time
import psutil
import os
from tensorflow.keras.models import load_model
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, 
    f1_score, confusion_matrix
)
import warnings
warnings.filterwarnings('ignore')

print("✅ Libraries imported!")

## 2. Configure Paths

In [None]:
# Configure paths (adjust if your folders are named differently)
MODELS_DIR = '/content/drive/MyDrive/Featured Dataset/trained_models'
DATA_DIR = '/content/drive/MyDrive/Featured Dataset/processed'

# CSV file to use for testing
TEST_CSV = 'synthetic_train_split.csv'  # or 'synthetic_val_split.csv'

# Model files
RF_MODEL = 'random_forest_pipeline.joblib'
MLP_MODEL = 'mlp_model.h5'
ISO_MODEL = 'isolation_forest_pipeline.joblib'

print(f"📂 Models Directory: {MODELS_DIR}")
print(f"📂 Data Directory: {DATA_DIR}")
print(f"📄 Test CSV: {TEST_CSV}")

In [None]:
# Verify files exist
csv_path = os.path.join(DATA_DIR, TEST_CSV)
rf_path = os.path.join(MODELS_DIR, RF_MODEL)
mlp_path = os.path.join(MODELS_DIR, MLP_MODEL)
iso_path = os.path.join(MODELS_DIR, ISO_MODEL)

print("\n🔍 Checking files...")
print(f"CSV File: {'✅' if os.path.exists(csv_path) else '❌'} {csv_path}")
print(f"RF Model: {'✅' if os.path.exists(rf_path) else '❌'} {rf_path}")
print(f"MLP Model: {'✅' if os.path.exists(mlp_path) else '❌'} {mlp_path}")
print(f"ISO Model: {'✅' if os.path.exists(iso_path) else '❌'} {iso_path}")

if not os.path.exists(csv_path):
    # Report the configured file name rather than a hard-coded one
    print(f"\n⚠️ CSV file not found! Upload {TEST_CSV} to {DATA_DIR}.")
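
The check above only prints a warning, so later cells still run and fail with a less obvious error when the CSV is absent. A minimal sketch of a fail-fast alternative follows; `find_missing` and `require_files` are hypothetical helper names, not part of the notebook, and the demo uses a temporary directory instead of Google Drive.

```python
import os
import tempfile

def find_missing(paths):
    """Return the labels of entries in `paths` whose file does not exist."""
    return [label for label, path in paths.items() if not os.path.exists(path)]

def require_files(paths):
    """Raise FileNotFoundError naming every missing file, instead of only printing."""
    missing = find_missing(paths)
    if missing:
        raise FileNotFoundError("Missing required files: " + ", ".join(missing))

# Demo: a temporary file stands in for the CSV; the model path is deliberately absent
with tempfile.TemporaryDirectory() as tmp:
    demo_csv = os.path.join(tmp, "demo.csv")
    open(demo_csv, "w").close()
    demo_paths = {"CSV": demo_csv, "RF model": os.path.join(tmp, "absent.joblib")}
    missing_demo = find_missing(demo_paths)

print(missing_demo)
```

In this notebook, `require_files({"CSV": csv_path})` would be the natural call, since the model cells already skip models that are not found.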

## 3. Load and Prepare Data

In [None]:
# Load CSV data
print("📊 Loading data from CSV...")
df = pd.read_csv(csv_path)

print(f"✅ Loaded {len(df):,} samples")
print(f"   Features: {df.shape[1] - 2} (excluding Label and Attack)")
print(f"\nFirst few rows:")
df.head()

In [None]:
# Check data distribution
print("📈 Data Distribution:")
print(f"\nLabel Distribution:")
print(df['Label'].value_counts())
print(f"\nAttack Types:")
print(df['Attack'].value_counts())

In [None]:
# Prepare features and labels
print("🔧 Preparing features and labels...")

# Extract features (drop Label and Attack columns)
X = df.drop(columns=['Label', 'Attack'], errors='ignore').values

# Extract labels (0=Normal/Benign, 1=Attack)
y = df['Label'].values

# Extract attack types
attack_types = df['Attack'].values

# Clean data (handle NaN and Inf)
X = np.nan_to_num(X, nan=0.0, posinf=1e10, neginf=-1e10)

print(f"✅ Prepared {len(X):,} samples")
print(f"   Features shape: {X.shape}")
print(f"   Labels shape: {y.shape}")
print(f"   Normal: {np.sum(y==0):,} ({np.mean(y==0)*100:.1f}%)")
print(f"   Attack: {np.sum(y==1):,} ({np.mean(y==1)*100:.1f}%)")
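
The cleaning step replaces NaN and infinite values silently. It can help to know how many values were affected, because a large count may point to a problem upstream in feature extraction. The sketch below wraps the same `np.nan_to_num` call and reports the counts; `clean_features` is a hypothetical helper, and the demo array is made up for illustration.

```python
import numpy as np

def clean_features(X, big=1e10):
    """Replace NaN/Inf as the notebook does, and report how many values were replaced."""
    X = np.asarray(X, dtype=float)
    report = {
        'nan': int(np.isnan(X).sum()),
        'posinf': int(np.isposinf(X).sum()),
        'neginf': int(np.isneginf(X).sum()),
    }
    cleaned = np.nan_to_num(X, nan=0.0, posinf=big, neginf=-big)
    return cleaned, report

# Illustrative input with one NaN, one +Inf and one -Inf
X_demo = np.array([[1.0, np.nan], [np.inf, -np.inf]])
X_clean, report = clean_features(X_demo)
print(report)
```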

## 4. Sample Data for Faster Testing (Optional)

In [None]:
# For faster testing, sample a subset
# Set to None to use all data
SAMPLE_SIZE = 10000  # Change to None for full dataset

if SAMPLE_SIZE and SAMPLE_SIZE < len(X):
    np.random.seed(42)
    indices = np.random.choice(len(X), SAMPLE_SIZE, replace=False)
    X_test = X[indices]
    y_test = y[indices]
    attack_test = attack_types[indices]
    print(f"📦 Using {SAMPLE_SIZE:,} samples for testing")
else:
    X_test = X
    y_test = y
    attack_test = attack_types
    print(f"📦 Using all {len(X):,} samples for testing")
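
Uniform random sampling can under-represent a rare class in a small sample. If the class balance matters for the metrics, a stratified sample is one option. The following is a minimal sketch using proportional allocation per class; `stratified_indices` is a hypothetical helper, and the 90/10 demo labels are invented for illustration.

```python
import numpy as np

def stratified_indices(y, sample_size, seed=42):
    """Return indices for a sample that keeps each class's share of y roughly intact."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y)
    picked = []
    for cls in np.unique(y):
        cls_idx = np.flatnonzero(y == cls)
        # Proportional allocation, with at least one sample per class
        n_cls = max(1, int(round(sample_size * len(cls_idx) / len(y))))
        n_cls = min(n_cls, len(cls_idx))
        picked.append(rng.choice(cls_idx, size=n_cls, replace=False))
    indices = np.concatenate(picked)
    rng.shuffle(indices)
    return indices

# Illustrative labels: 900 normal, 100 attack
y_demo = np.array([0] * 900 + [1] * 100)
idx = stratified_indices(y_demo, sample_size=100)
print(len(idx), int(y_demo[idx].sum()))
```

Because of rounding, the returned sample can differ from `sample_size` by a few rows when there are many classes.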

## 5. Define Testing Functions

In [None]:
def test_model(model_name, model, X, y, attack_types):
    """
    Test a model and return comprehensive metrics
    """
    print(f"\n{'='*70}")
    print(f"TESTING: {model_name}")
    print(f"{'='*70}")
    
    # Measure memory before
    process = psutil.Process(os.getpid())
    mem_before = process.memory_info().rss / 1024 / 1024  # MB
    
    # Make predictions with timing (perf_counter is monotonic and high-resolution)
    print(f"\n⏱️  Making {len(X):,} predictions...")
    start_time = time.perf_counter()
    
    if model_name == 'MLP Neural Network':
        # Assumes a single sigmoid output; threshold at 0.5 for the binary label
        predictions_proba = model.predict(X, verbose=0).flatten()
        predictions = (predictions_proba > 0.5).astype(int)
    elif model_name == 'Isolation Forest':
        # Isolation Forest returns -1 for anomalies and 1 for inliers
        preds = model.predict(X)
        predictions = np.where(preds == -1, 1, 0)
    else:
        predictions = model.predict(X)
    
    elapsed_time = time.perf_counter() - start_time
    
    # Measure memory after
    mem_after = process.memory_info().rss / 1024 / 1024
    
    # Calculate metrics
    throughput = len(X) / elapsed_time
    latency = (elapsed_time / len(X)) * 1000  # average ms per sample, amortized over the batch
    
    accuracy = accuracy_score(y, predictions)
    precision = precision_score(y, predictions, zero_division=0)
    recall = recall_score(y, predictions, zero_division=0)
    f1 = f1_score(y, predictions, zero_division=0)
    
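    # Pin the label order so ravel() always yields four values, even if one class is absent
    cm = confusion_matrix(y, predictions, labels=[0, 1])
    tn, fp, fn, tp = cm.ravel()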
    
    # Print results
    print(f"\n⚡ PERFORMANCE:")
    print(f"   Time: {elapsed_time:.3f} seconds")
    print(f"   Throughput: {throughput:,.0f} samples/sec")
    print(f"   Latency: {latency:.4f} ms/sample")
    print(f"   Memory: {mem_after - mem_before:.2f} MB")
    
    print(f"\n📊 ACCURACY:")
    print(f"   Accuracy:  {accuracy:.4f} ({accuracy*100:.2f}%)")
    print(f"   Precision: {precision:.4f}")
    print(f"   Recall:    {recall:.4f}")
    print(f"   F1-Score:  {f1:.4f}")
    
    print(f"\n🎯 CONFUSION MATRIX:")
    print(f"   True Negatives:  {tn:,}")
    print(f"   False Positives: {fp:,}")
    print(f"   False Negatives: {fn:,}")
    print(f"   True Positives:  {tp:,}")
    
    # Attack type breakdown
    print(f"\n🔍 ATTACK TYPE BREAKDOWN:")
    for attack in np.unique(attack_types):
        mask = attack_types == attack
        if np.sum(mask) > 0:
            attack_acc = accuracy_score(y[mask], predictions[mask])
            print(f"   {attack:<20}: {attack_acc:.4f} ({np.sum(mask):,} samples)")
    
    return {
        'model': model_name,
        'samples': len(X),
        'time': elapsed_time,
        'throughput': throughput,
        'latency': latency,
        'memory_mb': mem_after - mem_before,
        'accuracy': accuracy,
        'precision': precision,
        'recall': recall,
        'f1_score': f1,
        'tn': int(tn),
        'fp': int(fp),
        'fn': int(fn),
        'tp': int(tp)
    }

print("✅ Testing functions defined!")
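
The latency reported by `test_model` is the batch time divided by the number of samples, so it understates what a single request would see. A rough sketch of per-request latency percentiles is shown below. `per_request_latency` is a hypothetical helper, and `dummy_predict` stands in for a real model's predict call; with a loaded model one would pass `rf_model.predict` or similar.

```python
import time
import numpy as np

def per_request_latency(predict_fn, X, n_requests=200, warmup=10):
    """Time single-row predictions and return latency percentiles in milliseconds."""
    X = np.asarray(X)
    n_requests = min(n_requests, len(X))
    # Warm-up calls are excluded from the timings
    for i in range(min(warmup, len(X))):
        predict_fn(X[i:i + 1])
    timings_ms = []
    for i in range(n_requests):
        start = time.perf_counter()
        predict_fn(X[i:i + 1])
        timings_ms.append((time.perf_counter() - start) * 1000)
    p50, p95, p99 = np.percentile(timings_ms, [50, 95, 99])
    return {'p50_ms': float(p50), 'p95_ms': float(p95), 'p99_ms': float(p99)}

def dummy_predict(batch):
    """Stand-in for a model: labels a row 1 when its feature sum is positive."""
    return (batch.sum(axis=1) > 0).astype(int)

X_demo = np.random.default_rng(0).normal(size=(500, 8))
stats = per_request_latency(dummy_predict, X_demo)
print(stats)
```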

## 6. Test Random Forest Model

In [None]:
# Load and test Random Forest
if os.path.exists(rf_path):
    print("📦 Loading Random Forest model...")
    rf_model = joblib.load(rf_path)
    print("✅ Model loaded!")
    
    rf_results = test_model('Random Forest', rf_model, X_test, y_test, attack_test)
else:
    print("❌ Random Forest model not found")
    rf_results = None

## 7. Test MLP Neural Network Model

In [None]:
# Load and test MLP
if os.path.exists(mlp_path):
    print("📦 Loading MLP Neural Network model...")
    mlp_model = load_model(mlp_path, compile=False)
    print("✅ Model loaded!")
    
    mlp_results = test_model('MLP Neural Network', mlp_model, X_test, y_test, attack_test)
else:
    print("❌ MLP model not found")
    mlp_results = None

## 8. Test Isolation Forest Model

In [None]:
# Load and test Isolation Forest
if os.path.exists(iso_path):
    print("📦 Loading Isolation Forest model...")
    iso_model = joblib.load(iso_path)
    print("✅ Model loaded!")
    
    iso_results = test_model('Isolation Forest', iso_model, X_test, y_test, attack_test)
else:
    print("❌ Isolation Forest model not found")
    iso_results = None
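
Isolation Forest is an anomaly detector, so its raw output uses a different convention from the notebook's labels: scikit-learn returns -1 for anomalies and 1 for inliers, which `test_model` maps to 1 (Attack) and 0 (Normal). The sketch below isolates that mapping; `anomalies_to_labels` is a hypothetical helper and the raw predictions are made up.

```python
import numpy as np

def anomalies_to_labels(raw_preds):
    """Map Isolation Forest output (-1 = anomaly, 1 = inlier) to 1 = Attack, 0 = Normal."""
    return np.where(np.asarray(raw_preds) == -1, 1, 0)

# Illustrative raw predictions, not real model output
raw_demo = np.array([1, -1, 1, 1, -1])
labels_demo = anomalies_to_labels(raw_demo)
flagged_rate = labels_demo.mean()
print(labels_demo.tolist(), flagged_rate)
```

Comparing the flagged rate with the actual attack share in the test data is a quick sanity check, since the share of flagged samples depends on the contamination setting used at training time.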

## 9. Summary Comparison

In [None]:
# Compile results
all_results = [r for r in [rf_results, mlp_results, iso_results] if r is not None]

if all_results:
    print("\n" + "="*80)
    print("STRESS TEST SUMMARY")
    print("="*80)
    
    # Create comparison table
    results_df = pd.DataFrame(all_results)
    
    print(f"\n{'Model':<25} {'Throughput':<18} {'Latency':<15} {'Accuracy':<12} {'F1-Score':<12}")
    print("-"*82)
    
    for _, row in results_df.iterrows():
        print(f"{row['model']:<25} {row['throughput']:>8,.0f} s/s      "
              f"{row['latency']:>6.4f} ms    "
              f"{row['accuracy']:>6.4f}      "
              f"{row['f1_score']:>6.4f}")
    
    # Best performers
    print("\n" + "="*80)
    print("BEST PERFORMERS")
    print("="*80)
    
    best_acc = results_df.loc[results_df['accuracy'].idxmax()]
    best_throughput = results_df.loc[results_df['throughput'].idxmax()]
    best_f1 = results_df.loc[results_df['f1_score'].idxmax()]
    
    print(f"🎯 Best Accuracy:  {best_acc['model']} ({best_acc['accuracy']:.4f})")
    print(f"⚡ Best Throughput: {best_throughput['model']} ({best_throughput['throughput']:,.0f} s/s)")
    print(f"📊 Best F1-Score:  {best_f1['model']} ({best_f1['f1_score']:.4f})")
    
    print("\n" + "="*80)
    print("✅ STRESS TESTING COMPLETE!")
    print("="*80)
    
    # Display full results dataframe (a bare expression inside an if-block is not rendered)
    print("\n📊 Detailed Results:")
    display(results_df)
else:
    print("\n❌ No models were tested. Check that models are uploaded to Google Drive.")
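
The summary reflects a single pass over one large batch. Throughput usually varies with batch size, so a stress test may also want to sweep it. A minimal sketch under that assumption follows; `throughput_by_batch_size` is a hypothetical helper, and `dummy_predict` again stands in for a real model's predict call.

```python
import time
import numpy as np

def throughput_by_batch_size(predict_fn, X, batch_sizes=(1, 32, 256, 1024), repeats=3):
    """Measure samples/sec at several batch sizes, keeping the best of `repeats` runs."""
    X = np.asarray(X)
    rows = []
    for bs in batch_sizes:
        best = float('inf')
        for _ in range(repeats):
            start = time.perf_counter()
            for i in range(0, len(X), bs):
                predict_fn(X[i:i + bs])
            best = min(best, time.perf_counter() - start)
        best = max(best, 1e-12)  # guard against a zero reading on coarse clocks
        rows.append({'batch_size': bs, 'seconds': best, 'throughput': len(X) / best})
    return rows

def dummy_predict(batch):
    """Stand-in for a model: labels a row 1 when its feature sum is positive."""
    return (batch.sum(axis=1) > 0).astype(int)

X_demo = np.random.default_rng(0).normal(size=(2000, 8))
rows = throughput_by_batch_size(dummy_predict, X_demo)
for row in rows:
    print(f"batch={row['batch_size']:<5} {row['throughput']:>12,.0f} samples/sec")
```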

## 10. Save Results (Optional)

In [None]:
# Save results to CSV
if all_results:
    output_path = '/content/drive/MyDrive/stress_test_results.csv'
    results_df.to_csv(output_path, index=False)
    print(f"\n💾 Results saved to: {output_path}")
else:
    print("\n⚠️ No results to save")
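
The fixed output path means each run overwrites the previous results. One way to keep a history is to add a timestamp to the file name. The sketch below shows the idea; `timestamped_path` is a hypothetical helper, and the fixed datetime in the demo is arbitrary, used only to make the output predictable.

```python
import os
from datetime import datetime

def timestamped_path(directory, stem="stress_test_results", ext=".csv", now=None):
    """Build a results path such as <directory>/<stem>_YYYYMMDD_HHMMSS<ext>."""
    now = now or datetime.now()
    return os.path.join(directory, f"{stem}_{now:%Y%m%d_%H%M%S}{ext}")

# Arbitrary fixed timestamp for the demo; omit `now` to use the current time
demo_path = timestamped_path("/content/drive/MyDrive", now=datetime(2024, 1, 2, 3, 4, 5))
print(demo_path)
```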

---

## 📝 Notes:

**Interpreting Results:**
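- **Throughput**: Higher is better (>5,000 samples/sec is excellent)
- **Latency**: Lower is better (<0.5 ms per sample is excellent); the reported value is an average over the whole batch, not the latency of a single request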
- **Accuracy**: Higher is better (>95% is good, >98% is excellent)
- **F1-Score**: Balanced metric (>0.90 is good, >0.95 is excellent)
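
The accuracy metrics above all derive from the four confusion-matrix counts that `test_model` returns. The sketch below recomputes them by hand, which can clarify how they relate. `metrics_from_counts` is a hypothetical helper, and the counts are invented for illustration, not results from these models.

```python
def metrics_from_counts(tn, fp, fn, tp):
    """Derive accuracy, precision, recall, F1 and false-negative rate from confusion counts."""
    total = tn + fp + fn + tp
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {
        'accuracy': (tp + tn) / total if total else 0.0,
        'precision': precision,
        'recall': recall,
        'f1_score': f1,
        'false_negative_rate': fn / (fn + tp) if (fn + tp) else 0.0,
    }

# Illustrative counts: 880 normal and 120 attack samples
demo = metrics_from_counts(tn=870, fp=10, fn=30, tp=90)
for name, value in demo.items():
    print(f"{name:<20} {value:.4f}")
```

With these made-up counts accuracy is 96% even though a quarter of the attacks are missed, which is why recall and F1 deserve as much attention as accuracy on imbalanced traffic.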

**What to Check:**
1. All models should have accuracy >90%
2. F1-Score should be >0.85 for production
3. Throughput should be >1,000 samples/sec
4. False negatives (missed attacks) should be minimized
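
The first three checks can be applied automatically to the dictionaries that `test_model` returns. A minimal sketch follows, assuming the thresholds listed above; `THRESHOLDS` and `check_result` are hypothetical names, and `demo_result` contains made-up values rather than measured ones.

```python
# Thresholds taken from the checklist above
THRESHOLDS = {'accuracy': 0.90, 'f1_score': 0.85, 'throughput': 1000}

def check_result(result, thresholds=THRESHOLDS):
    """Return {metric: passed}; a metric passes when it exceeds its threshold."""
    return {metric: result[metric] > limit for metric, limit in thresholds.items()}

# Made-up result in the same shape as the dict returned by test_model
demo_result = {'model': 'demo', 'accuracy': 0.97, 'f1_score': 0.80, 'throughput': 12000.0}
checks = check_result(demo_result)
failed = [metric for metric, ok in checks.items() if not ok]
print(checks)
print("Failed checks:", failed)
```

The fourth check, minimizing false negatives, has no threshold in the list, so it is left to inspection of the `fn` count.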

**Next Steps:**
- If results look good: Models are production-ready ✅
- If accuracy is low: Retrain models with more data
- If throughput is low: Consider model optimization or hardware upgrade