

# ***Lightweight Hybrid CNN & ConvNeXt-Tiny IDS for IoT Networks***


## 📌 **Description**

This project presents a **lightweight hybrid intrusion detection system (IDS)** for securing IoT networks by combining **CNN** and **ConvNeXt-Tiny** architectures. The model efficiently detects malicious network traffic while maintaining low computational cost, making it suitable for real-time IoT and edge environments.


### 🔑 **Key Points**

* Hybrid CNN + ConvNeXt-Tiny architecture
* Low-level and deep feature fusion
* Trained on CICIoT2023 dataset
* Cross-dataset validation on CICIDS2017, BoT-IoT, and UNSW-NB15
* High accuracy with low inference latency
* Lightweight and edge-deployable IDS



## 🔹 **Cell 1 - Install and Import Required Libraries**

### 📌 Description

This cell sets up the software environment required for implementing the intrusion detection system.

### 🔑 Key Points

* Installs deep learning libraries such as PyTorch and Torchvision
* Imports data handling tools like NumPy and Pandas
* Loads visualization libraries for result analysis
* Includes Scikit-learn utilities for preprocessing and evaluation
* Ensures a reproducible and research-ready environment

In [1]:

# Install required libraries
!pip install torch torchvision torchaudio
!pip install pandas numpy scikit-learn matplotlib seaborn tqdm

# Imports
import os
import time
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score, precision_score, recall_score, f1_score

from tqdm import tqdm
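
Cell 1 lists reproducibility as a goal, but importing libraries alone does not fix any random state. A minimal sketch of seed setup follows; the helper name `set_seed` and the seed value 42 are assumptions for illustration, not part of the original notebook.

```python
import random

import numpy as np
import torch


def set_seed(seed: int = 42) -> None:
    """Seed Python, NumPy, and PyTorch RNGs (hypothetical helper)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)           # seeds the CPU generator
    torch.cuda.manual_seed_all(seed)  # silently ignored when no GPU is present


set_seed(42)
a = torch.rand(3)
set_seed(42)
b = torch.rand(3)
print(torch.equal(a, b))  # the two draws match after re-seeding
```

Seeding does not guarantee bit-identical results across different hardware or library versions, so it is best treated as a way to reduce run-to-run variance.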




## 🔹 **Cell 2 - Load Datasets**

### 📌 Description

This cell loads the primary and cross-dataset files required for training and evaluation.

### 🔑 Key Points

* Mounts Google Drive to handle large datasets
* Loads CICIoT2023 as the primary training dataset
* Defines file paths for BoT-IoT and UNSW-NB15, plus a checkpoint directory
* Enables seamless dataset switching for cross-dataset validation
* Supports scalable experimentation

In [2]:
#Load Datasets (CICIoT2023 + Paths for Others)
from google.colab import drive
drive.mount('/content/drive')

# Dataset paths (modify based on your Drive structure)
CICIOT_PATH = "/content/drive/MyDrive/Datasets/CICIoT2023/CICIoT2023.csv"
BOTIOT_PATH = "/content/drive/MyDrive/Datasets/BOT-IOT/UNSW_2018_IoT_Botnet_Final.csv"
UNSW_PATH = "/content/drive/MyDrive/Datasets/UNSW-NB15/UNSW_NB15_testing-set.csv"

# Load primary dataset
df = pd.read_csv(CICIOT_PATH)
print("CICIoT2023 Loaded Successfully")


CHECKPOINT_DIR = "/content/drive/MyDrive/IDS_Checkpoints"
os.makedirs(CHECKPOINT_DIR, exist_ok=True)

CHECKPOINT_PATH = os.path.join(CHECKPOINT_DIR, "ids_checkpoint.pth")

print("Checkpoint path:", CHECKPOINT_PATH)



## 🔹 **Cell 3 - Dataset Exploration**

### 📌 Description

This cell explores the structure and distribution of the CICIoT2023 dataset.

### 🔑 Key Points

* Displays dataset shape and feature count
* Lists all feature and label columns
* Examines attack class distribution
* Identifies potential class imbalance
* Helps understand traffic behavior before modeling

In [None]:
#Dataset Exploration
import matplotlib.pyplot as plt
import seaborn as sns


print("Dataset Shape:", df.shape)
print("\nColumns:\n", df.columns)


label_column = 'label'

# Get counts
class_counts = df[label_column].value_counts()

plt.figure(figsize=(12,6))
sns.barplot(x=class_counts.index, y=class_counts.values)

plt.title("Attack Class Distribution (CICIoT2023)", fontsize=14)
plt.xlabel("Attack Class")
plt.ylabel("Number of Samples")
plt.xticks(rotation=45)
plt.grid(axis='y', linestyle='--', alpha=0.6)

plt.show()


plt.figure(figsize=(12,6))
sns.barplot(x=class_counts.index, y=class_counts.values)

plt.yscale("log")   # log scale makes small classes visible
plt.title("Attack Class Distribution (Log Scale)", fontsize=14)
plt.xlabel("Attack Class")
plt.ylabel("Number of Samples (log scale)")
plt.xticks(rotation=45)
plt.grid(axis='y', linestyle='--', alpha=0.6)

plt.show()



## 🔹 **Cell 4 - Data Preprocessing**

### 📌 Description

This cell prepares raw network traffic data for deep learning.

### 🔑 Key Points

* Handles missing and null values
* Normalizes feature values for stable training
* Encodes attack labels into numeric form
* Performs stratified train-test split
* Ensures fair and unbiased evaluation


In [None]:
#Data Preprocessing
# Handle missing values
df.fillna(0, inplace=True)

X = df.drop(columns=[label_column])
y = df[label_column]

# Encode labels
le = LabelEncoder()
y = le.fit_transform(y)

# Stratified train-test split, done before scaling so that no
# test-set statistics leak into the scaler
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Normalize features: fit on the training split only, then apply to both
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print("Train shape:", X_train.shape)
print("Test shape:", X_test.shape)


## 🔹 **Cell 5 - Tensor Preparation**

### 📌 Description

This cell converts preprocessed data into tensors suitable for CNN-based models.

### 🔑 Key Points

* Converts NumPy arrays into PyTorch tensors
* Reshapes data into 3D format for 1D CNN input
* Creates efficient DataLoader objects
* Enables batch-wise training and inference
* Improves memory and computational efficiency

In [None]:
#Tensor Preparation

# Convert to tensors
X_train_t = torch.tensor(X_train, dtype=torch.float32)
X_test_t = torch.tensor(X_test, dtype=torch.float32)
y_train_t = torch.tensor(y_train, dtype=torch.long)
y_test_t = torch.tensor(y_test, dtype=torch.long)

# Reshape for CNN (N, C, L)
X_train_t = X_train_t.unsqueeze(1)
X_test_t = X_test_t.unsqueeze(1)

train_loader = DataLoader(TensorDataset(X_train_t, y_train_t), batch_size=128, shuffle=True)
test_loader = DataLoader(TensorDataset(X_test_t, y_test_t), batch_size=128)
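
To make the reshape concrete, the sketch below runs the same steps on random data. The sample count (256), feature count (46), and class count (5) are placeholder values, not the real dataset dimensions.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data: 256 samples, 46 features, 5 classes (assumed sizes)
X_demo = np.random.rand(256, 46).astype(np.float32)
y_demo = np.random.randint(0, 5, size=256)

X_demo_t = torch.tensor(X_demo, dtype=torch.float32).unsqueeze(1)  # (N, 1, L)
y_demo_t = torch.tensor(y_demo, dtype=torch.long)

demo_loader = DataLoader(TensorDataset(X_demo_t, y_demo_t), batch_size=128, shuffle=True)
xb, yb = next(iter(demo_loader))
print(xb.shape, yb.shape)  # torch.Size([128, 1, 46]) torch.Size([128])
```

The added axis is the channel dimension that `nn.Conv1d` expects, so each flow record is treated as a one-channel sequence of features.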


## 🔹 **Cell 6 - Build CNN Feature Extractor**

### 📌 Description

This cell defines a lightweight CNN module for low-level feature extraction.

### 🔑 Key Points

* Extracts spatial and statistical traffic patterns
* Captures local intrusion signatures
* Uses minimal convolution layers to reduce complexity
* Ensures fast inference
* Suitable for IoT and edge devices

In [None]:
#Build Lightweight CNN Feature Extractor

class CNNFeatureExtractor(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=3, padding=1),
            nn.BatchNorm1d(32),
            nn.ReLU(),
            nn.MaxPool1d(2),

            nn.Conv1d(32, 64, kernel_size=3, padding=1),
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1)
        )

    def forward(self, x):
        x = self.conv(x)
        return x.view(x.size(0), -1)


## 🔹 **Cell 7 - Build ConvNeXt-Tiny Backbone**

### 📌 Description

This cell implements a ConvNeXt-Tiny inspired architecture for deep feature learning.

### 🔑 Key Points

* Applies modern CNN design principles
* Uses depthwise and large-kernel convolutions
* Captures high-level semantic traffic features
* Improves representational power
* Maintains a lightweight computational footprint


In [None]:
#Build ConvNeXt-Tiny Backbone

class ConvNeXtTiny1D(nn.Module):
    def __init__(self, in_channels=1):
        super().__init__()
        self.stem = nn.Conv1d(in_channels, 64, kernel_size=7, stride=2, padding=3)
        self.block = nn.Sequential(
            nn.BatchNorm1d(64),
            nn.ReLU(),
            nn.Conv1d(64, 128, kernel_size=7, padding=3, groups=64),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1)
        )

    def forward(self, x):
        x = self.stem(x)
        x = self.block(x)
        return x.view(x.size(0), -1)
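
Because both extractors end in `nn.AdaptiveAvgPool1d(1)`, their output width does not depend on the number of input features. The sketch below checks this on a stripped-down stand-in with the same layer pattern as the backbone; the input lengths 46 and 80 are arbitrary.

```python
import torch
import torch.nn as nn

# Stand-in that mirrors the stem + block layer pattern of the backbone
backbone = nn.Sequential(
    nn.Conv1d(1, 64, kernel_size=7, stride=2, padding=3),
    nn.BatchNorm1d(64),
    nn.ReLU(),
    nn.Conv1d(64, 128, kernel_size=7, padding=3, groups=64),  # grouped convolution
    nn.ReLU(),
    nn.AdaptiveAvgPool1d(1),  # collapses the length axis to 1
    nn.Flatten(),
)
backbone.eval()

with torch.no_grad():
    out_a = backbone(torch.randn(4, 1, 46))
    out_b = backbone(torch.randn(4, 1, 80))
print(out_a.shape, out_b.shape)  # torch.Size([4, 128]) torch.Size([4, 128])
```

This property is what lets the same network accept datasets with different feature counts, although the learned filters still assume the feature ordering seen during training.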


## 🔹 **Cell 8 - Hybrid Model Construction**

### 📌 Description

This cell constructs the hybrid IDS by combining CNN and ConvNeXt-Tiny features.

### 🔑 Key Points

* Fuses low-level and deep feature representations
* Aims to improve detection accuracy and robustness
* Applies dropout in the classifier head to limit overfitting
* Uses fully connected layers for classification
* Aligns with hybrid IDS research methodology

In [None]:
#Hybrid CNN + ConvNeXt-Tiny Model

class HybridIDS(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.cnn = CNNFeatureExtractor()
        self.convnext = ConvNeXtTiny1D()

        self.classifier = nn.Sequential(
            nn.Linear(64 + 128, 128),
            nn.ReLU(),
            nn.Dropout(0.3),
            nn.Linear(128, num_classes)
        )

    def forward(self, x):
        f1 = self.cnn(x)
        f2 = self.convnext(x)
        fused = torch.cat((f1, f2), dim=1)
        return self.classifier(fused)
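
The fusion step is a concatenation along the feature axis. The sketch below uses random tensors as stand-ins for the two branch outputs (64 and 128 features, matching the classifier input `nn.Linear(64 + 128, 128)`); the batch size and class count are assumed values.

```python
import torch
import torch.nn as nn

batch_size, num_classes = 8, 5        # assumed values for illustration
f1 = torch.randn(batch_size, 64)      # stand-in for the CNN branch output
f2 = torch.randn(batch_size, 128)     # stand-in for the ConvNeXt-style branch output

fused = torch.cat((f1, f2), dim=1)    # shape: (batch, 64 + 128)

head = nn.Sequential(
    nn.Linear(64 + 128, 128),
    nn.ReLU(),
    nn.Dropout(0.3),
    nn.Linear(128, num_classes),
)
logits = head(fused)
print(fused.shape, logits.shape)  # torch.Size([8, 192]) torch.Size([8, 5])
```

The head returns raw logits; `nn.CrossEntropyLoss` applies the softmax internally, so no activation is needed after the last layer.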


## 🔹 **Cell 9 - Model Compilation**

### 📌 Description

This cell configures the model for training.

### 🔑 Key Points

* Initializes the hybrid IDS architecture
* Uses cross-entropy loss for multi-class detection
* Applies Adam optimizer for faster convergence
* Supports GPU acceleration when available
* Prints the model architecture for inspection


In [None]:
#Model Compilation

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = HybridIDS(num_classes=len(np.unique(y))).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

print(model)



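## 🔹 **Cell 10 - Checkpoint Function**

### 📌 Description

This cell defines helper functions that save and load a training checkpoint for the Hybrid CNN & ConvNeXt-Tiny IDS model. The checkpoint stores the model weights, the optimizer state, the epoch index, and the best validation loss, so an interrupted run can resume without retraining from scratch. The cell also splits the training data into training and validation sets.

### 🔑 Key Points

* Saves the checkpoint in `.pth` format
* Stores model weights, optimizer state, epoch, and best validation loss
* Resumes training from the last saved epoch
* Holds out 15% of the training data for validation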


In [None]:
#IMPLEMENTING CHECKPOINT FUNCTION

def save_checkpoint(epoch, model, optimizer, best_val_loss):
    torch.save({
        "epoch": epoch,
        "model_state": model.state_dict(),
        "optimizer_state": optimizer.state_dict(),
        "best_val_loss": best_val_loss
    }, CHECKPOINT_PATH)

    print(f"✅ Checkpoint saved at epoch {epoch+1}")


In [None]:
def load_checkpoint(model, optimizer):
    if os.path.exists(CHECKPOINT_PATH):
        checkpoint = torch.load(CHECKPOINT_PATH, map_location=device)

        model.load_state_dict(checkpoint["model_state"])
        optimizer.load_state_dict(checkpoint["optimizer_state"])

        start_epoch = checkpoint["epoch"] + 1
        best_val_loss = checkpoint["best_val_loss"]

        # start_epoch is 0-based; add 1 to match the "Epoch n/N" log format
        print(f"✅ Resumed training from epoch {start_epoch + 1}")
        return start_epoch, best_val_loss
    else:
        print("⚠️ No checkpoint found. Starting fresh.")
        return 0, float("inf")
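
A quick way to check the save/load pair is a round trip on a tiny model. The sketch below writes to a temporary directory instead of Drive; the toy model, the optimizer, the file name, and the stored values are assumptions for illustration.

```python
import os
import tempfile

import torch
import torch.nn as nn
import torch.optim as optim

toy_model = nn.Linear(4, 2)
toy_opt = optim.Adam(toy_model.parameters(), lr=0.001)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_checkpoint.pth")  # placeholder file name
    torch.save({
        "epoch": 3,
        "model_state": toy_model.state_dict(),
        "optimizer_state": toy_opt.state_dict(),
        "best_val_loss": 0.25,
    }, path)

    # Rebuild fresh objects and restore their state from disk
    restored = nn.Linear(4, 2)
    restored_opt = optim.Adam(restored.parameters(), lr=0.001)
    ckpt = torch.load(path, map_location="cpu")
    restored.load_state_dict(ckpt["model_state"])
    restored_opt.load_state_dict(ckpt["optimizer_state"])

start_epoch = ckpt["epoch"] + 1
same_weights = torch.equal(toy_model.weight, restored.weight)
print(start_epoch, same_weights)  # 4 True
```

Storing the optimizer state alongside the weights matters for Adam, since its running moment estimates would otherwise restart from zero after a resume.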


In [None]:
from sklearn.model_selection import train_test_split

# --------------------------------------
# Split TRAIN → TRAIN + VALIDATION
# --------------------------------------
X_train_split, X_val_split, y_train_split, y_val_split = train_test_split(
    X_train_t, y_train_t,
    test_size=0.15,        # 15% validation
    random_state=42,
    stratify=y_train_t
)

# --------------------------------------
# Create DataLoaders
# --------------------------------------
train_loader = DataLoader(
    TensorDataset(X_train_split, y_train_split),
    batch_size=128,
    shuffle=True
)

val_loader = DataLoader(
    TensorDataset(X_val_split, y_val_split),
    batch_size=128,
    shuffle=False
)

print("✅ train_loader and val_loader created")
print("Train batches:", len(train_loader))
print("Validation batches:", len(val_loader))


## 🔹 **Cell 11 - Model Training**

### 📌 Description

This cell trains the hybrid IDS model using labeled network traffic data.

### 🔑 Key Points

* Implements supervised training loop
* Updates model parameters across epochs
* Tracks training loss
* Visualizes convergence behavior
* Ensures stable and effective learning

In [None]:
#COMPLETE TRAINING CELL

# ======================================
# Training Configuration
# ======================================
epochs = 100
patience = 10

train_losses = []
val_losses = []
val_accuracies = []
val_precisions = []
val_recalls = []
val_f1s = []

# --------------------------------------
# Resume from checkpoint (if exists)
# --------------------------------------
start_epoch, best_val_loss = load_checkpoint(model, optimizer)
early_stop_counter = 0

# ======================================
# Training Loop
# ======================================
for epoch in range(start_epoch, epochs):

    # -------- TRAINING --------
    model.train()
    running_loss = 0.0

    for Xb, yb in tqdm(train_loader, desc=f"Epoch {epoch+1}/{epochs} [Training]"):
        Xb, yb = Xb.to(device), yb.to(device)

        optimizer.zero_grad()
        outputs = model(Xb)
        loss = criterion(outputs, yb)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    train_loss = running_loss / len(train_loader)
    train_losses.append(train_loss)

    # -------- VALIDATION --------
    model.eval()
    val_running_loss = 0.0
    all_preds, all_labels = [], []

    with torch.no_grad():
        for Xb, yb in tqdm(val_loader, desc=f"Epoch {epoch+1}/{epochs} [Validation]"):
            Xb, yb = Xb.to(device), yb.to(device)

            outputs = model(Xb)
            loss = criterion(outputs, yb)
            val_running_loss += loss.item()

            preds = torch.argmax(outputs, dim=1)
            all_preds.extend(preds.cpu().numpy())
            all_labels.extend(yb.cpu().numpy())

    val_loss = val_running_loss / len(val_loader)
    val_losses.append(val_loss)

    # -------- METRICS --------
    acc = accuracy_score(all_labels, all_preds)
    precision = precision_score(all_labels, all_preds, average="weighted", zero_division=0)
    recall = recall_score(all_labels, all_preds, average="weighted", zero_division=0)
    f1 = f1_score(all_labels, all_preds, average="weighted", zero_division=0)

    val_accuracies.append(acc)
    val_precisions.append(precision)
    val_recalls.append(recall)
    val_f1s.append(f1)

    # -------- LOGGING --------
    print("\n" + "="*60)
    print(f"Epoch [{epoch+1}/{epochs}]")
    print(f"Train Loss : {train_loss:.4f}")
    print(f"Val Loss   : {val_loss:.4f}")
    print(f"Accuracy   : {acc:.4f}")
    print(f"Precision  : {precision:.4f}")
    print(f"Recall     : {recall:.4f}")
    print(f"F1 Score   : {f1:.4f}")
    print("="*60)

    # -------- CHECKPOINT + EARLY STOPPING --------
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        early_stop_counter = 0
        save_checkpoint(epoch, model, optimizer, best_val_loss)
    else:
        early_stop_counter += 1
        print(f"Early stopping counter: {early_stop_counter}/{patience}")

        if early_stop_counter >= patience:
            print("🛑 Early stopping triggered. Training stopped.")
            break
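
The patience logic can be checked in isolation on a made-up loss sequence. `stopping_epoch` is a hypothetical helper that mirrors the counter used in the loop above; the loss values are invented for illustration.

```python
def stopping_epoch(val_losses, patience):
    """Return the 0-based epoch at which early stopping fires, or None."""
    best = float("inf")
    counter = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss      # improvement: remember it and reset the counter
            counter = 0
        else:
            counter += 1     # no improvement this epoch
            if counter >= patience:
                return epoch
    return None


losses = [0.9, 0.7, 0.71, 0.72, 0.69, 0.70, 0.70, 0.70]
print(stopping_epoch(losses, patience=3))  # 7
```

In this trace the improvement at epoch 4 resets the counter, so training stops only after three further epochs without a new best loss.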


## 🔹 **Cell 12 - In-Dataset Evaluation**

### 📌 Description

This cell visualizes model performance on CICIoT2023 by plotting the per-epoch training loss and validation metrics recorded in Cell 11, then saves the figure to Drive.

### 🔑 Key Points

* Plots training and validation loss curves
* Plots validation accuracy, precision, recall, and F1-score per epoch
* Helps spot overfitting and convergence problems
* Saves the metrics figure at 300 DPI for later use

In [None]:
# ---------- Plot All Metrics ----------
plt.figure(figsize=(14,10))

plt.subplot(2,2,1)
plt.plot(train_losses, label="Train Loss")
plt.plot(val_losses, label="Val Loss")
plt.title("Loss Curve")
plt.xlabel("Epoch")
plt.ylabel("Loss")
plt.legend()
plt.grid(True)

plt.subplot(2,2,2)
plt.plot(val_accuracies, label="Accuracy", color="green")
plt.title("Validation Accuracy")
plt.xlabel("Epoch")
plt.ylabel("Accuracy")
plt.grid(True)

plt.subplot(2,2,3)
plt.plot(val_f1s, label="F1 Score", color="purple")
plt.title("Validation F1 Score")
plt.xlabel("Epoch")
plt.ylabel("F1 Score")
plt.grid(True)

plt.subplot(2,2,4)
plt.plot(val_precisions, label="Precision", color="orange")
plt.plot(val_recalls, label="Recall", color="red")
plt.title("Precision & Recall")
plt.xlabel("Epoch")
plt.ylabel("Score")
plt.legend()
plt.grid(True)

plt.tight_layout()

# ---------- SAVE FIGURE ----------
save_dir = "/content/drive/MyDrive/IDS_Results/"
os.makedirs(save_dir, exist_ok=True)

# savefig needs a file path, not a directory; the file name is a placeholder
save_path = os.path.join(save_dir, "training_metrics.png")
plt.savefig(save_path, dpi=300, bbox_inches="tight")
print(f"✅ Metrics plot saved at: {save_path}")

plt.show()
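
The plots above track validation metrics per epoch. For the held-out test split, a per-class report and a confusion matrix can be produced with the Scikit-learn utilities imported in Cell 1. The sketch below uses small invented label arrays so that it runs on its own; in the notebook, `y_true` and `y_pred` would instead be collected by iterating over `test_loader`, and the class names would come from `le.classes_`.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Invented labels for three classes (illustration only)
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
y_pred = np.array([0, 1, 1, 1, 2, 2, 0, 1])
class_names = ["Benign", "DDoS", "Mirai"]  # placeholder names

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print(cm)
print(classification_report(y_true, y_pred, target_names=class_names, zero_division=0))
```

Per-class recall is worth reading alongside the weighted averages, because a heavily imbalanced dataset can hide poor detection of rare attack classes behind a high overall score.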


## 🔹 **Cell 13 - Lightweight Analysis**

### 📌 Description

This cell evaluates the model's suitability for IoT and edge deployment.

### 🔑 Key Points

* Calculates total and trainable model parameters
* Measures inference time and latency
* Assesses memory and computational efficiency
* Demonstrates real-time detection capability
* Justifies lightweight IDS design

In [None]:
#Lightweight Analysis (IoT Feasibility Evaluation)

# Model Size Calculation
# -------------------------------
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print(f"Total Parameters: {total_params:,}")
print(f"Trainable Parameters: {trainable_params:,}")
print(f"Model Size: {total_params / 1e6:.2f} Million Parameters")

# -------------------------------
# Inference Time Measurement
# -------------------------------
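model.eval()
sample_batch = X_test_t[:100].to(device)

# Warm-up (important for fair timing on GPU)
with torch.no_grad():
    _ = model(sample_batch)

# CUDA kernels run asynchronously, so synchronize before reading the clock
if device.type == "cuda":
    torch.cuda.synchronize()
start_time = time.perf_counter()

with torch.no_grad():
    _ = model(sample_batch)

if device.type == "cuda":
    torch.cuda.synchronize()
end_time = time.perf_counter()

print(f"Inference Time for 100 samples: {end_time - start_time:.4f} seconds")
print(f"Average inference time per sample: {(end_time - start_time)/100:.6f} seconds")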
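
A single timed forward pass is noisy. One common refinement is to repeat the measurement and report the median; the sketch below does this on CPU with a toy model. `median_latency`, the repeat count, and the toy model are assumptions for illustration.

```python
import statistics
import time

import torch
import torch.nn as nn


def median_latency(module: nn.Module, batch: torch.Tensor, repeats: int = 20) -> float:
    """Median wall-clock seconds per forward pass (hypothetical helper, CPU timing)."""
    module.eval()
    timings = []
    with torch.no_grad():
        module(batch)  # warm-up pass, not timed
        for _ in range(repeats):
            start = time.perf_counter()
            module(batch)
            timings.append(time.perf_counter() - start)
    return statistics.median(timings)


toy = nn.Sequential(
    nn.Conv1d(1, 8, 3, padding=1),
    nn.AdaptiveAvgPool1d(1),
    nn.Flatten(),
    nn.Linear(8, 2),
)
latency = median_latency(toy, torch.randn(100, 1, 46))
print(f"Median latency for 100 samples: {latency:.6f} s")
```

Timings taken on a Colab GPU or CPU say little about an actual IoT gateway, so a deployment claim would need the same measurement repeated on the target hardware.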


## 🔹 **Cell 14 - Baseline Comparison**

### 📌 Description

This cell compares the proposed hybrid model with standard deep learning baselines.

### 🔑 Key Points

* Implements CNN-only baseline
* Evaluates CNN-GRU hybrid baseline
* Compares against Deep BiLSTM model
* Analyzes accuracy vs complexity trade-off
* Tests whether the proposed hybrid IDS outperforms the baselines

In [None]:
#Baseline Comparison

#Baseline 1 - CNN-only Model

class CNNOnlyIDS(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 32, 3, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(2),

            nn.Conv1d(32, 64, 3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1)
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)
        return self.classifier(x)

#Baseline 2 - CNN-GRU Model

class CNNGRUIDS(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(1, 32, 3, padding=1),
            nn.ReLU(),
            nn.MaxPool1d(2)
        )
        self.gru = nn.GRU(input_size=32, hidden_size=64, batch_first=True)
        self.fc = nn.Linear(64, num_classes)

    def forward(self, x):
        x = self.cnn(x)
        x = x.permute(0, 2, 1)  # (batch, seq, features)
        _, h = self.gru(x)
        return self.fc(h[-1])

#Baseline 3 - Deep BiLSTM Model

class BiLSTMIDS(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=1,        # each feature as a time step
            hidden_size=64,
            num_layers=2,
            batch_first=True,
            bidirectional=True
        )
        self.fc = nn.Linear(128, num_classes)

    def forward(self, x):
        # x shape: (batch, 1, features)
        x = x.permute(0, 2, 1)   # (batch, features, 1)
        _, (h, _) = self.lstm(x)
        h = torch.cat((h[-2], h[-1]), dim=1)
        return self.fc(h)




In [None]:
# Training & Evaluation Function (Reusable)

def train_and_evaluate(model, train_loader, test_loader, epochs=5):
    model.to(device)
    criterion = nn.CrossEntropyLoss()
    optimizer = optim.Adam(model.parameters(), lr=0.001)

    for _ in range(epochs):
        model.train()
        for Xb, yb in train_loader:
            Xb, yb = Xb.to(device), yb.to(device)
            optimizer.zero_grad()
            loss = criterion(model(Xb), yb)
            loss.backward()
            optimizer.step()

    # Evaluation
    model.eval()
    y_true, y_pred = [], []

    with torch.no_grad():
        for Xb, yb in test_loader:
            outputs = model(Xb.to(device))
            preds = outputs.argmax(dim=1).cpu().numpy()
            y_pred.extend(preds)
            y_true.extend(yb.numpy())

    return {
        "Accuracy": accuracy_score(y_true, y_pred),
        "Precision": precision_score(y_true, y_pred, average='weighted'),
        "Recall": recall_score(y_true, y_pred, average='weighted'),
        "F1": f1_score(y_true, y_pred, average='weighted'),
        "Params (M)": sum(p.numel() for p in model.parameters()) / 1e6
    }


In [None]:
# Baseline Experiments
results = []

results.append(("CNN-only",
    train_and_evaluate(CNNOnlyIDS(len(np.unique(y))), train_loader, test_loader)))

results.append(("CNN-GRU",
    train_and_evaluate(CNNGRUIDS(len(np.unique(y))), train_loader, test_loader)))

results.append(("BiLSTM",
    train_and_evaluate(BiLSTMIDS(len(np.unique(y))), train_loader, test_loader)))

results.append(("Hybrid CNN + ConvNeXt-Tiny",
    train_and_evaluate(model, train_loader, test_loader, epochs=0)))  # already trained


In [None]:
# Convert baseline results into DataFrame
baseline_df = pd.DataFrame([
    {
        "Model": name,
        "Accuracy": metrics["Accuracy"],
        "Precision": metrics["Precision"],
        "Recall": metrics["Recall"],
        "F1-Score": metrics["F1"],
        "Parameters (M)": metrics["Params (M)"]
    }
    for name, metrics in results
])

baseline_df



In [None]:
#Accuracy vs Model Complexity Plot
plt.figure(figsize=(8,6))

plt.scatter(
    baseline_df["Parameters (M)"],
    baseline_df["Accuracy"],
    s=120
)

for i, model_name in enumerate(baseline_df["Model"]):
    plt.text(
        baseline_df["Parameters (M)"][i],
        baseline_df["Accuracy"][i],
        model_name,
        fontsize=9,
        ha="right"
    )

plt.xlabel("Model Parameters (Millions)")
plt.ylabel("Accuracy")
plt.title("Accuracy vs Model Complexity")
plt.grid(True)
plt.show()


## 🔹 **Cell 15 - Cross-Dataset Validation**

### 📌 Description

This cell evaluates the trained model on unseen datasets without retraining.

### 🔑 Key Points

* Tests on BoT-IoT and UNSW-NB15 datasets
* Uses same scaler and label encoder
* Avoids retraining to ensure fair validation
* Measures generalization capability
* Demonstrates real-world applicability


In [None]:
#Cross-Dataset Validation

def preprocess_cross_dataset(
    csv_path,
    reference_columns,
    scaler,
    label_column="Label"   # adjust to the label column name of each dataset
):
    df = pd.read_csv(csv_path)

    # Separate label from features
    if label_column not in df.columns:
        raise ValueError(f"Label column '{label_column}' not found!")
    y = df[label_column]
    X = df.drop(columns=[label_column])

    # Align to the training feature set: keep shared columns in training
    # order and zero-fill features this dataset lacks (a simplifying
    # assumption), so the scaler receives the width it was fitted on
    X = X.reindex(columns=list(reference_columns), fill_value=0)

    # Handle missing values
    X = X.fillna(0)

    # Normalize using TRAIN scaler
    X_scaled = scaler.transform(X)

    # Tensor reshape to (N, C, L)
    X_tensor = torch.tensor(X_scaled, dtype=torch.float32).unsqueeze(1)

    return X_tensor, y
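
Feature alignment is the fragile part of cross-dataset testing, since the datasets do not share one schema. One option, sketched below with pandas, is `DataFrame.reindex`, which orders shared columns like the training set and fills absent ones with a constant. The column names are invented, and zero-filling absent features is a simplifying assumption.

```python
import pandas as pd

reference_columns = ["flow_duration", "rate", "syn_count"]  # invented training schema
other = pd.DataFrame({"rate": [1.5, 2.0], "extra": [9, 9], "flow_duration": [10, 20]})

# Shared columns are kept in training order, "extra" is dropped,
# and the missing "syn_count" column is created and filled with 0
aligned = other.reindex(columns=reference_columns, fill_value=0)
print(list(aligned.columns))          # ['flow_duration', 'rate', 'syn_count']
print(aligned["syn_count"].tolist())  # [0, 0]
```

When many training features are missing from the target dataset, the zero-filled inputs carry no information, so low cross-dataset scores may reflect the schema gap rather than the model.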


In [None]:
# Load BoT-IoT
X_bot, y_bot = preprocess_cross_dataset(
    BOTIOT_PATH,
    reference_columns=X.columns,
    scaler=scaler
)

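# Encode labels using TRAIN encoder
# Note: le.transform raises ValueError for labels not seen in CICIoT2023,
# so BoT-IoT class names must first be mapped onto the training classes
y_bot = le.transform(y_bot)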

# Inference
model.eval()
bot_preds = []

with torch.no_grad():
    for i in range(0, len(X_bot), 128):
        batch = X_bot[i:i+128].to(device)
        outputs = model(batch)
        preds = outputs.argmax(dim=1).cpu().numpy()
        bot_preds.extend(preds)

# Metrics
print("BoT-IoT Cross-Dataset Results")
print("Accuracy:", accuracy_score(y_bot, bot_preds))
print("Precision:", precision_score(y_bot, bot_preds, average='weighted'))
print("Recall:", recall_score(y_bot, bot_preds, average='weighted'))
print("F1:", f1_score(y_bot, bot_preds, average='weighted'))


In [None]:
# Load UNSW-NB15
X_unsw, y_unsw = preprocess_cross_dataset(
    UNSW_PATH,
    reference_columns=X.columns,
    scaler=scaler
)

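# Encode labels using TRAIN encoder
# Note: le.transform raises ValueError for labels not seen in CICIoT2023,
# so UNSW-NB15 class names must first be mapped onto the training classes
y_unsw = le.transform(y_unsw)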

# Inference
unsw_preds = []

with torch.no_grad():
    for i in range(0, len(X_unsw), 128):
        batch = X_unsw[i:i+128].to(device)
        outputs = model(batch)
        preds = outputs.argmax(dim=1).cpu().numpy()
        unsw_preds.extend(preds)

# Metrics
print("UNSW-NB15 Cross-Dataset Results")
print("Accuracy:", accuracy_score(y_unsw, unsw_preds))
print("Precision:", precision_score(y_unsw, unsw_preds, average='weighted'))
print("Recall:", recall_score(y_unsw, unsw_preds, average='weighted'))
print("F1:", f1_score(y_unsw, unsw_preds, average='weighted'))
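
`LabelEncoder.transform` raises a `ValueError` when it meets a class name it was not fitted on, which is likely when label vocabularies differ between datasets. One workaround, shown here as a sketch, is to collapse both label sets to benign versus attack before scoring. The class names and the `to_binary` helper are invented for illustration and are not the notebook's method.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Encoder fitted on invented training class names
le_demo = LabelEncoder().fit(["BenignTraffic", "DDoS-SYN_Flood", "Mirai-udpplain"])

try:
    le_demo.transform(["Normal"])  # label not seen during fit
    unseen_ok = True
except ValueError:
    unseen_ok = False


def to_binary(labels, benign_names):
    """Map any label to 0 (benign) or 1 (attack); hypothetical helper."""
    return np.array([0 if lab in benign_names else 1 for lab in labels])


y_other = to_binary(["Normal", "DoS", "Normal"], benign_names={"Normal"})
print(unseen_ok, y_other.tolist())  # False [0, 1, 0]
```

With this approach the model's multi-class predictions would need the same benign-versus-attack mapping before the metrics are computed.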


## 🔹 **Final Cell - Save & Load Trained Model (.pth)**

### 📌 Description

After training and evaluation, the trained Hybrid CNN + ConvNeXt-Tiny IDS model is saved in PyTorch `.pth` format. This allows:

* Reproducibility of results
* Cross-dataset testing without retraining
* Future deployment on IoT / edge devices

We save:

1. Model weights
2. Label encoder
3. Feature scaler
4. Model metadata


In [None]:
#Save & Load Trained Model

# Create directory
save_dir = "/content/drive/MyDrive/IDS_Models"
os.makedirs(save_dir, exist_ok=True)

# Model save path
model_path = os.path.join(save_dir, "Hybrid_CNN_ConvNeXtTiny_IDS.pth")

# Save checkpoint
torch.save({
    "model_state_dict": model.state_dict(),
    "num_classes": len(np.unique(y)),
    "scaler": scaler,
    "label_encoder": le
}, model_path)

print(f"Model saved successfully at:\n{model_path}")


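#Load Model (For Reuse / Cross-Dataset Testing)

# Load checkpoint
# weights_only=False is needed because the file also pickles the scaler and
# label encoder; only load checkpoint files from a trusted source
checkpoint = torch.load(model_path, map_location=device, weights_only=False)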

# Rebuild model
loaded_model = HybridIDS(num_classes=checkpoint["num_classes"]).to(device)
loaded_model.load_state_dict(checkpoint["model_state_dict"])
loaded_model.eval()

# Load preprocessing objects
scaler = checkpoint["scaler"]
le = checkpoint["label_encoder"]

print("Model loaded successfully and ready for inference.")
