# Frontend Integration Guide
## How to Use the Trained Model with Frontend Data

This notebook demonstrates how to load the trained model and preprocessing artifacts, then use them to make predictions on new data from the frontend.

**Output Format:**
- **0 = NORMAL** (No attack detected - safe traffic)
- **1 = ATTACK** (Anomaly detected - suspicious traffic)

In [1]:
import numpy as np
import pandas as pd
import joblib
import json
from datetime import datetime

print("✅ Libraries loaded successfully!")

✅ Libraries loaded successfully!


## Step 1: Load All Saved Artifacts

In [2]:
# Load all saved model and preprocessing artifacts
print("Loading saved artifacts...\n")

# Load the trained model
model = joblib.load('isolation_forest_frontend.pkl')
print("✅ Model loaded: isolation_forest_frontend.pkl")

# Load feature list
frontend_features = joblib.load('frontend_features.pkl')
print(f"✅ Features loaded: {len(frontend_features)} features")

# Load preprocessors
scaler = joblib.load('scaler_frontend.pkl')
print("✅ Scaler loaded: scaler_frontend.pkl")

encoder = joblib.load('encoder_frontend.pkl')
print("✅ Encoder loaded: encoder_frontend.pkl")

freq_encoding = joblib.load('freq_encoding_frontend.pkl')
print("✅ Frequency encoding loaded: freq_encoding_frontend.pkl")

print("\n📋 Frontend Features Used:")
for i, feat in enumerate(frontend_features, 1):
    print(f"   {i:2d}. {feat}")

Loading saved artifacts...

✅ Model loaded: isolation_forest_frontend.pkl
✅ Features loaded: 17 features
✅ Scaler loaded: scaler_frontend.pkl
✅ Encoder loaded: encoder_frontend.pkl
✅ Frequency encoding loaded: freq_encoding_frontend.pkl

📋 Frontend Features Used:
    1. srv_count
    2. service
    3. dst_host_srv_count
    4. dst_host_same_srv_rate
    5. count
    6. dst_host_count
    7. rerror_rate
    8. logged_in
    9. flag_SF
   10. srv_rerror_rate
   11. protocol_type_tcp
   12. dst_host_srv_rerror_rate
   13. dst_host_rerror_rate
   14. src_bytes
   15. dst_bytes
   16. dst_host_same_src_port_rate
   17. protocol_type_udp
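
`joblib.load` fails on the first file it cannot find, so it can help to check that all five artifacts are present before loading any of them. The following is a minimal sketch of such a check. The helper name `find_missing_artifacts` is hypothetical, and it assumes all files live in one directory, which the notebook does not state.

```python
from pathlib import Path

# File names taken from the loading cell above
ARTIFACT_FILES = [
    "isolation_forest_frontend.pkl",
    "frontend_features.pkl",
    "scaler_frontend.pkl",
    "encoder_frontend.pkl",
    "freq_encoding_frontend.pkl",
]

def find_missing_artifacts(artifact_dir=".", names=ARTIFACT_FILES):
    """Return the artifact file names that do not exist in artifact_dir."""
    base = Path(artifact_dir)
    return [name for name in names if not (base / name).is_file()]

# Report every missing file at once instead of failing on the first one
missing = find_missing_artifacts()
if missing:
    print(f"Missing artifacts: {missing}")
else:
    print("All artifacts found")
```

Running this before the loading cell lists every missing file in one pass, which is usually easier to act on than a single `FileNotFoundError`.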


## Step 2: Create Dummy Data (Simulating Frontend Input)

In [3]:
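# Create 5 dummy records containing ONLY the 17 frontend features.
# The frontend is expected to send these features already preprocessed,
# so no preprocessing is applied in this notebook before prediction.
# NOTE: Step 7 also scales count, srv_count, dst_host_count,
# dst_host_srv_count, and logged_in, but the values below look unscaled for
# those columns. If the model was trained on scaled values, that mismatch
# may explain why every dummy record is flagged as ATTACK in Step 5.

print("="*70)
print("📋 DUMMY DATA - 17 PRE-PROCESSED FRONTEND FEATURES")
print("="*70 + "\n")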

dummy_data = {
    'srv_count': [8, 1, 5, 2, 200],
    'service': [0.08, 0.15, 0.08, 0.02, 0.15],  # Already frequency encoded
    'dst_host_srv_count': [255, 255, 255, 1, 5],
    'dst_host_same_srv_rate': [1.0, 1.0, 1.0, 1.0, 0.05],
    'count': [8, 10, 5, 2, 100],
    'dst_host_count': [255, 255, 255, 1, 10],
    'rerror_rate': [0.0, 0.0, 0.0, 0.0, 0.5],
    'logged_in': [1, 0, 1, 0, 0],
    'flag_SF': [1, 1, 1, 1, 0],  # Already one-hot encoded
    'srv_rerror_rate': [0.0, 0.0, 0.0, 0.0, 0.5],
    'protocol_type_tcp': [1, 1, 0, 0, 1],  # Already one-hot encoded
    'dst_host_srv_rerror_rate': [0.0, 0.0, 0.0, 0.0, 0.3],
    'dst_host_rerror_rate': [0.0, 0.0, 0.0, 0.0, 0.3],
    'src_bytes': [-0.5, 0.2, 0.1, -0.8, 0.5],  # Already scaled
    'dst_bytes': [-0.3, 0.8, 0.4, -0.9, 1.5],  # Already scaled
    'dst_host_same_src_port_rate': [0.0, 0.0, 0.0, 0.0, 1.0],
    'protocol_type_udp': [0, 0, 1, 0, 0]  # Already one-hot encoded
}

df_raw = pd.DataFrame(dummy_data)
print(f"✅ Created {len(df_raw)} dummy records with 17 pre-processed features")
print(f"\nDataset shape: {df_raw.shape}")
print(f"\nFeature names ({len(df_raw.columns)} features):")
for i, col in enumerate(df_raw.columns, 1):
    print(f"   {i:2d}. {col}")
print("\nDummy Data Sample (first record):")
print(df_raw.iloc[0])

📋 DUMMY DATA - 17 PRE-PROCESSED FRONTEND FEATURES

✅ Created 5 dummy records with 17 pre-processed features

Dataset shape: (5, 17)

Feature names (17 features):
    1. srv_count
    2. service
    3. dst_host_srv_count
    4. dst_host_same_srv_rate
    5. count
    6. dst_host_count
    7. rerror_rate
    8. logged_in
    9. flag_SF
   10. srv_rerror_rate
   11. protocol_type_tcp
   12. dst_host_srv_rerror_rate
   13. dst_host_rerror_rate
   14. src_bytes
   15. dst_bytes
   16. dst_host_same_src_port_rate
   17. protocol_type_udp

Dummy Data Sample (first record):
srv_count                        8.00
service                          0.08
dst_host_srv_count             255.00
dst_host_same_srv_rate           1.00
count                            8.00
dst_host_count                 255.00
rerror_rate                      0.00
logged_in                        1.00
flag_SF                          1.00
srv_rerror_rate                  0.00
protocol_type_tcp                1.00
dst_

## Step 3: Preprocess Frontend Data

In [4]:
# ⭐ IMPORTANT: The frontend sends 17 ALREADY-PROCESSED features.
# No preprocessing is needed here; the data is ready for model prediction.

print("="*70)
print("📝 DATA STATUS")
print("="*70)
print("\n✅ Data is ALREADY preprocessed:")
print("   - Numeric features: Already scaled (mean=0, std=1)")
print("   - Categorical features: Already one-hot encoded (flag_SF, protocol_type_tcp, protocol_type_udp)")
print("   - High-cardinality categorical feature: Already frequency encoded (service)")
print("\n✅ Ready for prediction!\n")

📝 DATA STATUS

✅ Data is ALREADY preprocessed:
   - Numeric features: Already scaled (mean=0, std=1)
   - Categorical features: Already one-hot encoded (flag_SF, protocol_type_tcp, protocol_type_udp)
   - High-cardinality categorical feature: Already frequency encoded (service)

✅ Ready for prediction!
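
To make the three transformations named in the status message concrete, here is a small pure-Python sketch of each one. The function names and toy values are illustrative only. In the real pipeline the mean, standard deviation, category list, and frequencies come from the training data through the saved `scaler`, `encoder`, and `freq_encoding` artifacts, not from the incoming batch.

```python
from collections import Counter
from statistics import mean, pstdev

def standard_scale(values):
    """Z-score scaling: subtract the mean, divide by the population std.

    Assumes the values are not all identical (std would be zero).
    """
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def one_hot(values, category):
    """1 where the value equals `category`, else 0 (e.g. protocol_type_tcp)."""
    return [1 if v == category else 0 for v in values]

def frequency_encode(values):
    """Replace each category with its relative frequency in `values`."""
    counts = Counter(values)
    return [counts[v] / len(values) for v in values]

# Toy inputs, not taken from the training data
src_bytes = [100, 200, 300]
protocol = ["tcp", "udp", "tcp"]
service = ["http", "http", "ftp", "http"]

print(standard_scale(src_bytes))
print(one_hot(protocol, "tcp"))     # [1, 0, 1]
print(frequency_encode(service))    # [0.75, 0.75, 0.25, 0.75]
```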



## Step 4: Select Only Frontend Features

In [5]:
# Data is already preprocessed, so it is used as-is
df_processed = df_raw.copy()

# Check for missing columns BEFORE selecting. Comparing column counts after
# the selection can never fail, because selecting by name either returns
# exactly these columns or raises a KeyError.
missing = [f for f in frontend_features if f not in df_processed.columns]
if missing:
    raise ValueError(f"Missing required features: {missing}")

# Select the 17 features in the order stored in frontend_features
X_frontend = df_processed[frontend_features]

print("="*70)
print("✅ DATA READY FOR PREDICTION")
print("="*70)
print(f"\nFeature matrix shape: {X_frontend.shape}")
print(f"Expected features: {len(frontend_features)}")
print(f"Actual features: {X_frontend.shape[1]}")
print(f"\n✅ All {len(frontend_features)} features present and verified!")

✅ DATA READY FOR PREDICTION

Feature matrix shape: (5, 17)
Expected features: 17
Actual features: 17

✅ All 17 features present and verified!
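
The same check can be applied to a single JSON payload before it reaches pandas. The sketch below is one possible approach. `validate_record` and `order_record` are hypothetical helper names, and the three-item feature list is shortened for illustration; in practice it would be `frontend_features`.

```python
def validate_record(record, required_features):
    """Return (missing, extra) feature names for one incoming record."""
    required = list(required_features)
    missing = [f for f in required if f not in record]
    extra = sorted(set(record) - set(required))
    return missing, extra

def order_record(record, required_features):
    """Return the record's values in model feature order, or raise on gaps."""
    missing, _ = validate_record(record, required_features)
    if missing:
        raise ValueError(f"Missing required features: {missing}")
    return [record[f] for f in required_features]

features = ["srv_count", "service", "count"]   # shortened list for illustration
payload = {"count": 8, "service": 0.08, "srv_count": 8, "debug": True}

print(validate_record(payload, features))  # ([], ['debug'])
print(order_record(payload, features))     # [8, 0.08, 8]
```

Reporting missing and unexpected keys separately lets the API reject incomplete records while silently ignoring extra fields, if that is the desired policy.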


## Step 5: Make Predictions

In [6]:
# Make predictions using the trained model
raw_predictions = model.predict(X_frontend)

# Convert from model format (-1 for outlier, 1 for inlier) to binary (0=normal, 1=attack)
predictions = np.where(raw_predictions == -1, 1, 0)

print("="*70)
print("🔍 PREDICTION RESULTS")
print("="*70)
print("\n📌 OUTPUT MAPPING:")
print("   0 = NORMAL   ✅ (No attack - safe traffic)")
print("   1 = ATTACK   ⚠️  (Anomaly detected - suspicious traffic)")
print("\n" + "-"*70)
print(f"{'Record':<10} {'Prediction':<15} {'Interpretation':<30}")
print("-"*70)

for i, pred in enumerate(predictions):
    if pred == 0:
        interpretation = "✅ NORMAL - Safe"
    else:
        interpretation = "⚠️  ATTACK - Suspicious"
    print(f"Record {i+1:<3} {pred:<15} {interpretation:<30}")

print("-"*70)

🔍 PREDICTION RESULTS

📌 OUTPUT MAPPING:
   0 = NORMAL   ✅ (No attack - safe traffic)
   1 = ATTACK   ⚠️  (Anomaly detected - suspicious traffic)

----------------------------------------------------------------------
Record     Prediction      Interpretation
----------------------------------------------------------------------
Record 1   1               ⚠️  ATTACK - Suspicious
Record 2   1               ⚠️  ATTACK - Suspicious
Record 3   1               ⚠️  ATTACK - Suspicious
Record 4   1               ⚠️  ATTACK - Suspicious
Record 5   1               ⚠️  ATTACK - Suspicious
----------------------------------------------------------------------
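
`predict` returns only hard labels. scikit-learn's `IsolationForest` also provides `decision_function`, whose output is negative for points treated as outliers and lower for more anomalous points, so it can help judge how strongly a record was flagged. The sketch below shows the relationship on synthetic data. `demo_model` is a stand-in fitted on random points, not the saved `isolation_forest_frontend.pkl`, and the data is made up.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for the saved model: NOT the notebook's trained artifact
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 3))
demo_model = IsolationForest(n_estimators=50, random_state=0).fit(X_train)

X_new = np.array([
    [0.0, 0.1, -0.2],   # close to the training cloud
    [8.0, 8.0, 8.0],    # far away from it
])

scores = demo_model.decision_function(X_new)   # lower = more anomalous
raw = demo_model.predict(X_new)                # -1 = outlier, 1 = inlier
labels = np.where(raw == -1, 1, 0)             # 0 = NORMAL, 1 = ATTACK

for score, label in zip(scores, labels):
    print(f"score={score:+.3f}  label={label}")
```

With the real model, `model.decision_function(X_frontend)` would give one score per record. If all five dummy records score only slightly below zero, that points to a borderline threshold rather than clearly anomalous inputs.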


## Step 6: Create Production Response Format

In [7]:
# Create response in JSON format for frontend
print("\n📤 RESPONSE FORMAT FOR FRONTEND:\n")

# Create detailed predictions with labels and risk levels
results = []
for i, pred in enumerate(predictions):
    result = {
        "record_id": i + 1,
        "prediction": int(pred),
        "label": "ATTACK" if pred == 1 else "NORMAL",
        "risk_level": "HIGH" if pred == 1 else "LOW",
        "timestamp": datetime.now().isoformat()
    }
    results.append(result)

# Create final response
response = {
    "status": "success",
    "model": "IsolationForest",
    "model_type": "Anomaly Detection",
    "total_records": len(predictions),
    "total_attacks_detected": int(np.sum(predictions)),
    "total_normal": int(len(predictions) - np.sum(predictions)),
    "predictions": results
}

print(json.dumps(response, indent=2))


📤 RESPONSE FORMAT FOR FRONTEND:

{
  "status": "success",
  "model": "IsolationForest",
  "model_type": "Anomaly Detection",
  "total_records": 5,
  "total_attacks_detected": 5,
  "total_normal": 0,
  "predictions": [
    {
      "record_id": 1,
      "prediction": 1,
      "label": "ATTACK",
      "risk_level": "HIGH",
      "timestamp": "2026-02-02T14:57:44.671783"
    },
    {
      "record_id": 2,
      "prediction": 1,
      "label": "ATTACK",
      "risk_level": "HIGH",
      "timestamp": "2026-02-02T14:57:44.671783"
    },
    {
      "record_id": 3,
      "prediction": 1,
      "label": "ATTACK",
      "risk_level": "HIGH",
      "timestamp": "2026-02-02T14:57:44.671783"
    },
    {
      "record_id": 4,
      "prediction": 1,
      "label": "ATTACK",
      "risk_level": "HIGH",
      "timestamp": "2026-02-02T14:57:44.671783"
    },
    {
      "record_id": 5,
      "prediction": 1,
      "label": "ATTACK",
      "risk_level": "HIGH",
      "timestamp": "2026-02-02T14:57:4

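The response-building logic in this step can be wrapped in a reusable function. The sketch below is one way to do it, not the notebook's own code. `build_response` is a hypothetical name, and it uses one timezone-aware UTC timestamp per batch instead of calling `datetime.now()` for each record, which is a design choice rather than something the original specifies.

```python
import json
from datetime import datetime, timezone

def build_response(predictions, model_name="IsolationForest"):
    """Build a JSON-serializable response from 0/1 predictions."""
    # One timestamp for the whole batch, in UTC (an assumption of this sketch)
    batch_time = datetime.now(timezone.utc).isoformat()
    preds = [int(p) for p in predictions]   # plain ints serialize cleanly
    results = [
        {
            "record_id": i,
            "prediction": p,
            "label": "ATTACK" if p == 1 else "NORMAL",
            "risk_level": "HIGH" if p == 1 else "LOW",
            "timestamp": batch_time,
        }
        for i, p in enumerate(preds, start=1)
    ]
    return {
        "status": "success",
        "model": model_name,
        "model_type": "Anomaly Detection",
        "total_records": len(preds),
        "total_attacks_detected": sum(preds),
        "total_normal": len(preds) - sum(preds),
        "predictions": results,
    }

response = build_response([0, 1, 1])
print(json.dumps(response, indent=2))
```

Converting each prediction with `int()` matters because NumPy integer types are not JSON serializable by default.
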
## Step 7: Complete Workflow Function

In [8]:
# ============================================================
# PRODUCTION FUNCTION - Copy this to your backend/API
# ============================================================

def predict_network_anomaly(data_dict):
    """
    Predict if network traffic is normal or anomalous.

    Unlike Steps 2-5, this function expects RAW (unprocessed) connection
    records. It applies the saved scaler, encoder, and frequency encoding
    itself, then selects the 17 model features.

    Args:
        data_dict (dict): Raw connection features. Either a single record
                          (scalar values) or multiple records (a list per key)

    Returns:
        dict: Predictions with labels (0=NORMAL, 1=ATTACK)
    """
    # Convert to DataFrame
    if isinstance(data_dict, dict) and not isinstance(list(data_dict.values())[0], list):
        # Single record
        df = pd.DataFrame([data_dict])
    else:
        # Multiple records
        df = pd.DataFrame(data_dict)

    # Scale numeric features (the scaler is expected to have been fitted on
    # exactly these columns, in this order)
    numeric_features = ['duration', 'src_bytes', 'dst_bytes', 'land', 'wrong_fragment',
                        'urgent', 'hot', 'num_failed_logins', 'logged_in', 'num_compromised',
                        'root_shell', 'su_attempted', 'num_root', 'num_file_creations',
                        'num_shells', 'num_access_files', 'num_outbound_cmds', 'is_host_login',
                        'is_guest_login', 'count', 'srv_count', 'dst_host_count',
                        'dst_host_srv_count', 'level']

    df[numeric_features] = scaler.transform(df[numeric_features])

    # One-hot encode flag and protocol_type; reuse df's index so that the
    # concat below aligns rows correctly
    categorical_features = ['flag', 'protocol_type']
    encoded_features = encoder.transform(df[categorical_features])
    encoded_df = pd.DataFrame(
        encoded_features,
        columns=encoder.get_feature_names_out(categorical_features),
        index=df.index,
    )
    df = df.drop(columns=categorical_features)
    df = pd.concat([df, encoded_df], axis=1)

    # Frequency-encode service; unseen services fall back to the median
    # frequency. The result is assigned back instead of calling
    # fillna(..., inplace=True) on a column selection, which does not modify
    # df under pandas Copy-on-Write and raises a ChainedAssignmentError warning.
    df['service'] = df['service'].map(freq_encoding).fillna(freq_encoding.median())

    # Select features
    X = df[frontend_features]

    # Predict
    raw_preds = model.predict(X)
    predictions = np.where(raw_preds == -1, 1, 0)

    # Format response
    return {
        "predictions": predictions.tolist(),
        "labels": ["ATTACK" if p == 1 else "NORMAL" for p in predictions],
        "is_attack": bool((predictions == 1).any())
    }


# Test the function
print("\n" + "="*70)
print("🧪 TESTING THE PRODUCTION FUNCTION")
print("="*70)

# Test with single record (raw, unprocessed values)
single_record = {
    'duration': 100, 'protocol_type': 'tcp', 'service': 'http', 'flag': 'SF',
    'src_bytes': 200, 'dst_bytes': 1000, 'land': 0, 'wrong_fragment': 0,
    'urgent': 0, 'hot': 0, 'num_failed_logins': 0, 'logged_in': 1,
    'num_compromised': 0, 'root_shell': 0, 'su_attempted': 0, 'num_root': 0,
    'num_file_creations': 0, 'num_shells': 0, 'num_access_files': 0,
    'num_outbound_cmds': 0, 'is_host_login': 0, 'is_guest_login': 0,
    'count': 10, 'srv_count': 10, 'serror_rate': 0.0, 'srv_serror_rate': 0.0,
    'rerror_rate': 0.0, 'srv_rerror_rate': 0.0, 'same_srv_rate': 1.0,
    'diff_srv_rate': 0.0, 'srv_diff_host_rate': 0.0, 'dst_host_count': 255,
    'dst_host_srv_count': 255, 'dst_host_same_srv_rate': 1.0,
    'dst_host_diff_srv_rate': 0.0, 'dst_host_same_src_port_rate': 0.0,
    'dst_host_srv_diff_host_rate': 0.0, 'dst_host_serror_rate': 0.0,
    'dst_host_srv_serror_rate': 0.0, 'dst_host_rerror_rate': 0.0,
    'dst_host_srv_rerror_rate': 0.0, 'level': 0
}

result = predict_network_anomaly(single_record)
print("\nSingle Record Prediction:")
print(json.dumps(result, indent=2))


🧪 TESTING THE PRODUCTION FUNCTION

Single Record Prediction:
{
  "predictions": [
    0
  ],
  "labels": [
    "NORMAL"
  ],
  "is_attack": false
}
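
The production function decides between a single record and a batch by looking only at the first value of the dictionary, so an empty dictionary or a mix of scalars and lists is not handled explicitly. The sketch below shows one way to normalize the input up front. `to_records` is a hypothetical helper, not part of the notebook, and its output could be passed to `pd.DataFrame(...)`.

```python
def to_records(payload):
    """Normalize a payload into a list of per-record dicts.

    Accepts a single record (dict of scalars), a column-oriented batch
    (dict of equal-length lists), or a list of record dicts.
    """
    if isinstance(payload, list):
        return [dict(r) for r in payload]
    if not isinstance(payload, dict) or not payload:
        raise ValueError("payload must be a non-empty dict or a list of dicts")

    columns = {k: v for k, v in payload.items() if isinstance(v, (list, tuple))}
    if not columns:
        return [dict(payload)]                       # single record
    if len(columns) != len(payload):
        raise ValueError("cannot mix scalar and list values")

    lengths = {len(v) for v in columns.values()}
    if len(lengths) != 1:
        raise ValueError("all feature lists must have the same length")
    n = lengths.pop()
    return [{k: v[i] for k, v in columns.items()} for i in range(n)]

print(to_records({"count": 10, "service": "http"}))
print(to_records({"count": [10, 2], "service": ["http", "ftp"]}))
```

Normalizing to a list of records first means the rest of the function only has to handle one input shape.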
