# Graph Convolutional Network with All Features Combined

This notebook implements a Graph Convolutional Network (GCN) for syscall-based malware detection using **all features combined**:
- **Syscall**: Categorical feature (embedded)
- **Return Value (Ret)**: Categorical feature (embedded)
- **Parameters**: Text feature (sentence transformer embeddings)

## Graph Construction
Each syscall sequence is modeled as a **temporal graph**:
- **Nodes**: Each syscall in the sequence is a node
- **Edges**: Consecutive syscalls are connected in both directions, so the temporal chain is stored as an undirected graph
- **Node Features**: Concatenation of syscall, return value, and parameter embeddings

The GCN learns to aggregate neighborhood information through message passing, then applies global pooling for graph-level classification.
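
The construction and a single message-passing step can be sketched in plain Python. This is an illustrative toy, not the notebook's implementation: the helper names `temporal_edges` and `gcn_step` are made up here, node features are scalars, and the learned weight matrix is omitted. The normalization shown (self-loops plus symmetric degree scaling) is the standard GCN propagation rule.

```python
import math

def temporal_edges(n):
    """Return (src, dst) lists linking consecutive nodes in both directions."""
    forward = [(i, i + 1) for i in range(n - 1)]
    backward = [(j, i) for i, j in forward]
    src, dst = zip(*(forward + backward)) if n > 1 else ((), ())
    return list(src), list(dst)

def gcn_step(x, src, dst):
    """One propagation step with self-loops and symmetric normalization.

    x holds one scalar feature per node; no learned weights are applied.
    """
    n = len(x)
    # Degree = number of incoming edges plus one for the self-loop.
    deg = [1] * n
    for d in dst:
        deg[d] += 1
    # Self-loop contribution: x_i / sqrt(deg_i * deg_i).
    out = [x[i] / deg[i] for i in range(n)]
    # Neighbor contributions: x_s / sqrt(deg_s * deg_d).
    for s, d in zip(src, dst):
        out[d] += x[s] / math.sqrt(deg[s] * deg[d])
    return out

src, dst = temporal_edges(4)
print(list(zip(src, dst)))
print(gcn_step([1.0, 0.0, 0.0, 0.0], src, dst))
```

After one step, the signal placed on node 0 reaches only node 1; with three stacked layers, each node sees syscalls up to three positions away. A 100-node sequence yields `2 * (100 - 1) = 198` edges.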


In [1]:
import sys
sys.path.insert(0, '../../configs')
from config_loader import get_split_with_labels

import time
import pandas as pd
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import Dataset, DataLoader
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import classification_report, confusion_matrix, f1_score
from sentence_transformers import SentenceTransformer
from tqdm import tqdm
import warnings
warnings.filterwarnings('ignore')

# PyTorch Geometric imports
from torch_geometric.data import Data, Batch
from torch_geometric.nn import GCNConv, global_mean_pool, global_max_pool
from torch_geometric.loader import DataLoader as PyGDataLoader

# Config
SPLIT = '70'
WINDOW_SIZES = [250, 500, 1000, 2000]  # Maximum sequence lengths to test (each run is truncated to its first N syscalls)
# WINDOW_SIZES = [500]  # For quick testing
BATCH_SIZE = 32
EPOCHS = 20
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {DEVICE}")

# Embedding dimensions
SYSCALL_EMBED_DIM = 32
RETVAL_EMBED_DIM = 32

# Load sentence transformer model for parameter embeddings
print("Loading sentence transformer model...")
sentence_model = SentenceTransformer('all-MiniLM-L6-v2', device='cpu')
PARAM_EMBED_DIM = sentence_model.get_sentence_embedding_dimension()
print(f"Parameter embedding dimension: {PARAM_EMBED_DIM}")
print(f"Total combined node feature dimension: {SYSCALL_EMBED_DIM + RETVAL_EMBED_DIM + PARAM_EMBED_DIM}")




Using device: cpu
Loading sentence transformer model...
Parameter embedding dimension: 384
Total combined node feature dimension: 448


In [2]:
# Load data
train_files, test_files = get_split_with_labels(SPLIT)
print(f"Train files: {len(train_files)}, Test files: {len(test_files)}")

def load_runs_all_features(file_path):
    """Load all features (syscall, ret, parameters) grouped by run.
    
    Returns list of runs, where each run is a dict with:
        - 'syscalls': list of syscall names
        - 'retvals': list of return values
        - 'params': list of parameter strings
    """
    df = pd.read_csv(file_path)
    runs = []
    for run_id, group in df.groupby('run'):
        run_data = {
            'syscalls': group['syscall'].tolist(),
            'retvals': group['Ret'].tolist(),
            'params': group['parameters'].tolist()
        }
        runs.append(run_data)
    return runs

# Count runs per label for training and test sets
def count_runs_per_label(file_label_pairs):
    """Count total runs per label."""
    counts = {'benign': 0, 'malicious': 0}
    for path, label in file_label_pairs:
        runs = load_runs_all_features(path)
        counts[label] += len(runs)
    return counts

train_counts = count_runs_per_label(train_files)
test_counts = count_runs_per_label(test_files)

print(f"\nTraining set:")
print(f"  Total runs: {sum(train_counts.values())}")
print(f"  Benign runs: {train_counts['benign']}")
print(f"  Malicious runs: {train_counts['malicious']}")

print(f"\nTest set:")
print(f"  Total runs: {sum(test_counts.values())}")
print(f"  Benign runs: {test_counts['benign']}")
print(f"  Malicious runs: {test_counts['malicious']}")


Train files: 21, Test files: 9

Training set:
  Total runs: 1986
  Benign runs: 1484
  Malicious runs: 502

Test set:
  Total runs: 810
  Benign runs: 489
  Malicious runs: 321


In [3]:
# Build encoders for syscalls and return values
print("Building encoders for syscalls and return values...")

all_syscalls = []
all_retvals = []
all_params = set()

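# Note: values are collected from both train and test files so that
# transform() never encounters an unseen syscall or return value.
for path, _ in train_files + test_files: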
    for run_data in load_runs_all_features(path):
        all_syscalls.extend(run_data['syscalls'])
        all_retvals.extend(run_data['retvals'])
        for param in run_data['params']:
            if pd.isna(param):
                all_params.add('<EMPTY>')
            else:
                all_params.add(str(param))

# Build syscall encoder
syscall_encoder = LabelEncoder()
syscall_encoder.fit(all_syscalls)
syscall_vocab_size = len(syscall_encoder.classes_) + 1  # +1 for PAD token
print(f"Syscall vocabulary size: {syscall_vocab_size} (including PAD)")

# Build return value encoder
retval_encoder = LabelEncoder()
retval_encoder.fit(all_retvals)
retval_vocab_size = len(retval_encoder.classes_) + 1  # +1 for PAD token
print(f"Return value vocabulary size: {retval_vocab_size} (including PAD)")

# PAD index for embeddings
PAD_IDX = 0

print(f"\nUnique parameter strings: {len(all_params)}")


Building encoders for syscalls and return values...
Syscall vocabulary size: 81 (including PAD)
Return value vocabulary size: 42586 (including PAD)

Unique parameter strings: 266019
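
The encoders above see the test files when they are fit. One possible alternative, sketched here under assumptions and not what this notebook does, is to build the vocabulary from training data only and reserve an extra index for unseen values. `UNK_IDX`, `build_vocab`, and `encode` are hypothetical names introduced for illustration.

```python
PAD_IDX = 0   # same padding index as the notebook
UNK_IDX = 1   # assumption: extra index for values unseen during training

def build_vocab(values):
    """Map each distinct training value to an index starting at 2."""
    # Sort on the string form so mixed types order deterministically.
    distinct = sorted(set(values), key=str)
    return {v: i + 2 for i, v in enumerate(distinct)}

def encode(values, vocab):
    """Look up each value, falling back to UNK_IDX for unknown ones."""
    return [vocab.get(v, UNK_IDX) for v in values]

train_syscalls = ["open", "read", "read", "close"]
vocab = build_vocab(train_syscalls)
vocab_size = len(vocab) + 2  # +2 for PAD and UNK

print(vocab)                                     # {'close': 2, 'open': 3, 'read': 4}
print(encode(["open", "mmap", "close"], vocab))  # [3, 1, 2]
```

With this scheme the embedding layers would need `vocab_size` rows, and return values that appear only in the test set would share the single `UNK_IDX` embedding.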


In [4]:
# Pre-compute sentence embeddings for all unique parameter strings
print("Computing sentence embeddings for parameters (this may take a few minutes)...")

all_params_list = list(all_params)
param_embeddings = sentence_model.encode(
    all_params_list,
    show_progress_bar=True,
    batch_size=256,
    convert_to_numpy=True
)

# Create a mapping from parameter string to embedding
param_to_embedding = {param: emb for param, emb in zip(all_params_list, param_embeddings)}
print(f"Parameter embeddings computed. Shape per embedding: {PARAM_EMBED_DIM}")

# Create zero embedding for padding
PAD_PARAM_EMBEDDING = np.zeros(PARAM_EMBED_DIM, dtype=np.float32)


Computing sentence embeddings for parameters (this may take a few minutes)...


Batches: 100%|██████████| 1040/1040 [19:11<00:00,  1.11s/it]


Parameter embeddings computed. Shape per embedding: 384
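
As a rough back-of-the-envelope check on this step, the number of encoder batches and the size of the embedding table follow directly from the counts printed above. The sketch assumes the embeddings are stored as 32-bit floats and ignores Python dictionary overhead.

```python
import math

NUM_UNIQUE_PARAMS = 266_019   # unique parameter strings reported above
EMBED_DIM = 384               # embedding dimension reported above
ENCODE_BATCH_SIZE = 256       # batch_size passed to sentence_model.encode
BYTES_PER_FLOAT32 = 4         # assumption: embeddings are float32

num_batches = math.ceil(NUM_UNIQUE_PARAMS / ENCODE_BATCH_SIZE)
table_bytes = NUM_UNIQUE_PARAMS * EMBED_DIM * BYTES_PER_FLOAT32

print(num_batches)                    # 1040
print(f"{table_bytes / 1e6:.1f} MB")  # 408.6 MB
```

The batch count agrees with the `1040/1040` shown in the progress bar, and the table needs roughly 0.4 GB of memory before any graphs are built.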


In [5]:
def create_temporal_graph(syscalls, retvals, params, syscall_encoder, retval_encoder, 
                          param_to_embedding, window_size, label):
    """
    Create a temporal graph from a syscall sequence.
    
    Graph structure:
    - Each syscall in the sequence becomes a node
    - Consecutive syscalls are connected by edges (temporal ordering)
    - Node features: [syscall_idx, retval_idx, param_embedding]
    
    Returns:
        PyG Data object with node features, edge index, and label
    """
    # Truncate to window_size
    seq_len = min(len(syscalls), window_size)
    
    syscalls = syscalls[:seq_len]
    retvals = retvals[:seq_len]
    params = params[:seq_len]
    
    # Encode syscalls and return values (+1 to reserve 0 for PAD)
    encoded_syscalls = syscall_encoder.transform(syscalls) + 1
    encoded_retvals = retval_encoder.transform(retvals) + 1
    
    # Get parameter embeddings
    param_embs = []
    for param in params:
        if pd.isna(param):
            key = '<EMPTY>'
        else:
            key = str(param)
        param_embs.append(param_to_embedding[key])
    param_embs = np.array(param_embs, dtype=np.float32)
    
    # Create edge index for temporal graph (consecutive connections)
    # Edges: 0->1, 1->2, 2->3, ..., (n-2)->(n-1)
    # For undirected graph, also add reverse edges
    if seq_len > 1:
        # Forward edges
        src = list(range(seq_len - 1))
        dst = list(range(1, seq_len))
        # Backward edges (undirected)
        src_back = list(range(1, seq_len))
        dst_back = list(range(seq_len - 1))
        # Combine
        edge_index = torch.tensor([src + src_back, dst + dst_back], dtype=torch.long)
    else:
        # Single node graph - no edges
        edge_index = torch.zeros((2, 0), dtype=torch.long)
    
    # Create node features
    # We'll store syscall and retval indices separately, then embed in the model
    syscall_tensor = torch.tensor(encoded_syscalls, dtype=torch.long)
    retval_tensor = torch.tensor(encoded_retvals, dtype=torch.long)
    param_tensor = torch.tensor(param_embs, dtype=torch.float32)
    
    # Create PyG Data object
    data = Data(
        syscall_idx=syscall_tensor,
        retval_idx=retval_tensor,
        param_emb=param_tensor,
        edge_index=edge_index,
        y=torch.tensor([label], dtype=torch.long),
        num_nodes=seq_len
    )
    
    return data


def create_graph_dataset(file_label_pairs, syscall_encoder, retval_encoder, 
                         param_to_embedding, window_size):
    """
    Create a list of PyG Data objects from file_label_pairs.
    """
    label_map = {'benign': 0, 'malicious': 1}
    data_list = []
    
    for path, label in file_label_pairs:
        runs = load_runs_all_features(path)
        for run_data in runs:
            graph = create_temporal_graph(
                run_data['syscalls'],
                run_data['retvals'],
                run_data['params'],
                syscall_encoder,
                retval_encoder,
                param_to_embedding,
                window_size,
                label_map[label]
            )
            data_list.append(graph)
    
    return data_list


# Test graph creation
print("Testing graph creation...")
test_graphs = create_graph_dataset(train_files[:1], syscall_encoder, retval_encoder, 
                                   param_to_embedding, window_size=100)
sample_graph = test_graphs[0]
print(f"Sample graph:")
print(f"  Num nodes: {sample_graph.num_nodes}")
print(f"  Num edges: {sample_graph.edge_index.shape[1]}")
print(f"  Syscall indices shape: {sample_graph.syscall_idx.shape}")
print(f"  Retval indices shape: {sample_graph.retval_idx.shape}")
print(f"  Param embeddings shape: {sample_graph.param_emb.shape}")
print(f"  Label: {sample_graph.y.item()}")


Testing graph creation...
Sample graph:
  Num nodes: 100
  Num edges: 198
  Syscall indices shape: torch.Size([100])
  Retval indices shape: torch.Size([100])
  Param embeddings shape: torch.Size([100, 384])
  Label: 0
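
When several of these graphs are placed in one mini-batch, PyTorch Geometric treats them as a single large graph with disconnected components: node indices of later graphs are shifted, and a `batch` vector records which graph each node belongs to. The plain-Python sketch below illustrates the idea only; `collate` is a hypothetical helper, not a PyG function.

```python
def collate(graphs):
    """Merge graphs given as (num_nodes, src, dst) into one disjoint graph."""
    src_all, dst_all, batch = [], [], []
    offset = 0
    for graph_id, (num_nodes, src, dst) in enumerate(graphs):
        # Shift node indices so they stay unique across graphs.
        src_all += [s + offset for s in src]
        dst_all += [d + offset for d in dst]
        # Record the owning graph of every node.
        batch += [graph_id] * num_nodes
        offset += num_nodes
    return src_all, dst_all, batch

g0 = (3, [0, 1, 1, 2], [1, 2, 0, 1])  # 3-node chain, both directions
g1 = (2, [0, 1], [1, 0])              # 2-node chain, both directions
src, dst, batch = collate([g0, g1])

print(src)    # [0, 1, 1, 2, 3, 4]
print(dst)    # [1, 2, 0, 1, 4, 3]
print(batch)  # [0, 0, 0, 1, 1]
```

Because no edge crosses the offset boundary, message passing never mixes information between runs in the same batch.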


In [6]:
class GCNAllFeatures(nn.Module):
    """
    Graph Convolutional Network that combines all features:
    - Syscall embeddings (learned)
    - Return value embeddings (learned)
    - Parameter embeddings (pre-computed sentence embeddings)
    
    Architecture:
    1. Embed categorical features (syscall, retval)
    2. Concatenate with parameter embeddings
    3. Project to hidden dimension
    4. Apply GCN layers for message passing
    5. Global pooling (mean + max) for graph-level representation
    6. Classification head
    """
    def __init__(self, syscall_vocab_size, retval_vocab_size,
                 syscall_embed_dim, retval_embed_dim, param_embed_dim,
                 hidden_dim=128, num_gcn_layers=3, dropout=0.3):
        super().__init__()
        
        self.syscall_embed_dim = syscall_embed_dim
        self.retval_embed_dim = retval_embed_dim
        self.param_embed_dim = param_embed_dim
        
        # Embedding layers for categorical features
        self.syscall_embedding = nn.Embedding(syscall_vocab_size, syscall_embed_dim, padding_idx=PAD_IDX)
        self.retval_embedding = nn.Embedding(retval_vocab_size, retval_embed_dim, padding_idx=PAD_IDX)
        
        # Total input dimension after concatenation
        total_embed_dim = syscall_embed_dim + retval_embed_dim + param_embed_dim
        
        # Project concatenated embeddings to hidden dimension
        self.input_projection = nn.Linear(total_embed_dim, hidden_dim)
        
        # GCN layers (every layer maps hidden_dim -> hidden_dim)
        self.gcn_layers = nn.ModuleList([
            GCNConv(hidden_dim, hidden_dim) for _ in range(num_gcn_layers)
        ])
        
        # Batch normalization for each GCN layer
        self.batch_norms = nn.ModuleList([
            nn.BatchNorm1d(hidden_dim) for _ in range(num_gcn_layers)
        ])
        
        self.dropout = nn.Dropout(dropout)
        
        # Classification head (using both mean and max pooling)
        self.fc = nn.Sequential(
            nn.Linear(hidden_dim * 2, hidden_dim),  # *2 for mean + max pooling
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_dim, 64),
            nn.ReLU(),
            nn.Dropout(dropout),
            nn.Linear(64, 2)
        )
    
    def forward(self, data):
        syscall_idx = data.syscall_idx
        retval_idx = data.retval_idx
        param_emb = data.param_emb
        edge_index = data.edge_index
        batch = data.batch
        
        # Embed categorical features
        syscall_emb = self.syscall_embedding(syscall_idx)  # (num_nodes, syscall_embed_dim)
        retval_emb = self.retval_embedding(retval_idx)      # (num_nodes, retval_embed_dim)
        
        # Concatenate all embeddings
        x = torch.cat([syscall_emb, retval_emb, param_emb], dim=1)  # (num_nodes, total_embed_dim)
        
        # Project to hidden dimension
        x = self.input_projection(x)
        x = F.relu(x)
        x = self.dropout(x)
        
        # Apply GCN layers with residual connections
        for i, (gcn, bn) in enumerate(zip(self.gcn_layers, self.batch_norms)):
            x_prev = x
            x = gcn(x, edge_index)
            x = bn(x)
            x = F.relu(x)
            x = self.dropout(x)
            # Residual connection
            x = x + x_prev
        
        # Global pooling (combine mean and max for richer representation)
        x_mean = global_mean_pool(x, batch)  # (batch_size, hidden_dim)
        x_max = global_max_pool(x, batch)    # (batch_size, hidden_dim)
        x = torch.cat([x_mean, x_max], dim=1)  # (batch_size, hidden_dim * 2)
        
        return self.fc(x)
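
The readout at the end of `forward` can be illustrated without PyTorch. The sketch below is a simplified stand-in for `global_mean_pool` and `global_max_pool` followed by concatenation; `mean_max_readout` is a hypothetical name, and node features are plain lists.

```python
def mean_max_readout(x, batch):
    """Concatenate the per-graph mean and max of node feature vectors."""
    num_graphs = max(batch) + 1
    dim = len(x[0])
    sums = [[0.0] * dim for _ in range(num_graphs)]
    maxes = [[float("-inf")] * dim for _ in range(num_graphs)]
    counts = [0] * num_graphs
    for feats, g in zip(x, batch):
        counts[g] += 1
        for k, v in enumerate(feats):
            sums[g][k] += v
            maxes[g][k] = max(maxes[g][k], v)
    # Each graph gets a vector of length 2 * dim: [mean..., max...].
    return [
        [s / counts[g] for s in sums[g]] + maxes[g]
        for g in range(num_graphs)
    ]

x = [[1.0, 4.0], [3.0, 0.0], [5.0, 5.0]]  # three nodes, two features each
batch = [0, 0, 1]                          # first two nodes belong to graph 0
print(mean_max_readout(x, batch))  # [[2.0, 2.0, 3.0, 4.0], [5.0, 5.0, 5.0, 5.0]]
```

This is why the first linear layer of the classification head takes `hidden_dim * 2` inputs.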


In [7]:
# Run experiments with different window sizes
results = []

for window_size in WINDOW_SIZES:
    print(f"\n{'='*60}")
    print(f"EXPERIMENT: Window Size = {window_size}")
    print(f"{'='*60}")
    
    # Create graph datasets
    print("Creating graph datasets...")
    train_graphs = create_graph_dataset(
        train_files, syscall_encoder, retval_encoder,
        param_to_embedding, window_size
    )
    test_graphs = create_graph_dataset(
        test_files, syscall_encoder, retval_encoder,
        param_to_embedding, window_size
    )
    
    # Create PyG DataLoaders
    train_loader = PyGDataLoader(train_graphs, batch_size=BATCH_SIZE, shuffle=True)
    test_loader = PyGDataLoader(test_graphs, batch_size=BATCH_SIZE)
    
    print(f"Train samples: {len(train_graphs)}, Test samples: {len(test_graphs)}")
    
    # Print graph statistics
    avg_nodes = np.mean([g.num_nodes for g in train_graphs])
    avg_edges = np.mean([g.edge_index.shape[1] for g in train_graphs])
    print(f"\n--- Graph Statistics ---")
    print(f"  Avg nodes per graph: {avg_nodes:.1f}")
    print(f"  Avg edges per graph: {avg_edges:.1f}")
    print(f"  Max nodes (window size): {window_size}")
    
    # Create model
    model = GCNAllFeatures(
        syscall_vocab_size=syscall_vocab_size,
        retval_vocab_size=retval_vocab_size,
        syscall_embed_dim=SYSCALL_EMBED_DIM,
        retval_embed_dim=RETVAL_EMBED_DIM,
        param_embed_dim=PARAM_EMBED_DIM,
        hidden_dim=128,
        num_gcn_layers=3,
        dropout=0.3
    ).to(DEVICE)
    
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    
    # Print model parameter sizes per layer
    print(f"\n--- Model Architecture & Parameters ---")
    total_params = 0
    trainable_params = 0
    for name, param in model.named_parameters():
        param_count = param.numel()
        total_params += param_count
        if param.requires_grad:
            trainable_params += param_count
        print(f"  {name}: {list(param.shape)} = {param_count:,} params")
    print(f"  {'─'*50}")
    print(f"  Total parameters:     {total_params:,}")
    print(f"  Trainable parameters: {trainable_params:,}")
    
    # Training
    print(f"\nTraining...")
    train_start_time = time.time()
    for epoch in range(EPOCHS):
        model.train()
        total_loss = 0
        correct = 0
        total = 0
        
        pbar = tqdm(train_loader, desc=f"Epoch {epoch+1}/{EPOCHS}", leave=True)
        for data in pbar:
            data = data.to(DEVICE)
            
            optimizer.zero_grad()
            outputs = model(data)
            loss = criterion(outputs, data.y)
            loss.backward()
            optimizer.step()
            
            total_loss += loss.item() * data.num_graphs
            _, predicted = outputs.max(1)
            correct += (predicted == data.y).sum().item()
            total += data.num_graphs
            
            pbar.set_postfix({'loss': f'{total_loss/total:.4f}', 'acc': f'{correct/total:.4f}'})
        
        train_acc = correct / total
        avg_loss = total_loss / total
    
    train_time = time.time() - train_start_time
    print(f"Training time: {train_time:.2f}s")
    
    # Testing
    print(f"\nEvaluating...")
    test_start_time = time.time()
    model.eval()
    all_preds = []
    all_labels = []
    
    with torch.no_grad():
        for data in test_loader:
            data = data.to(DEVICE)
            outputs = model(data)
            _, predicted = outputs.max(1)
            all_preds.extend(predicted.cpu().numpy())
            all_labels.extend(data.y.cpu().numpy())
    
    test_time = time.time() - test_start_time
    print(f"Test time: {test_time:.2f}s")
    
    # Calculate metrics
    # Fix the label order so ravel() always yields four values (tn, fp, fn, tp)
    cm = confusion_matrix(all_labels, all_preds, labels=[0, 1])
    tn, fp, fn, tp = cm.ravel()
    detection_rate = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    false_positive_rate = fp / (fp + tn) if (fp + tn) > 0 else 0.0
    f1_weighted = f1_score(all_labels, all_preds, average='weighted')
    
    # Store results
    results.append({
        'Window Size': window_size,
        'Detection Rate': detection_rate,
        'False Positive Rate': false_positive_rate,
        'F1-score (weighted)': f1_weighted,
        'Train Time (s)': train_time,
        'Test Time (s)': test_time
    })
    
    # Print detailed results
    print(f"\nClassification Report:")
    print(classification_report(all_labels, all_preds, target_names=['benign', 'malicious']))
    
    labels_names = ['benign', 'malicious']
    cm_df = pd.DataFrame(cm, index=[f'True: {l}' for l in labels_names], columns=[f'Pred: {l}' for l in labels_names])
    print(f"Confusion Matrix:")
    print(cm_df)
    
    print(f"\nDetection Rate: {detection_rate:.4f}")
    print(f"False Positive Rate: {false_positive_rate:.4f}")
    print(f"F1-score (weighted): {f1_weighted:.4f}")
    
    # Free the model; release cached GPU memory only when CUDA is in use
    del model
    if torch.cuda.is_available():
        torch.cuda.empty_cache()



EXPERIMENT: Window Size = 250
Creating graph datasets...
Train samples: 1986, Test samples: 810

--- Graph Statistics ---
  Avg nodes per graph: 245.8
  Avg edges per graph: 489.6
  Max nodes (window size): 250

--- Model Architecture & Parameters ---
  syscall_embedding.weight: [81, 32] = 2,592 params
  retval_embedding.weight: [42586, 32] = 1,362,752 params
  input_projection.weight: [128, 448] = 57,344 params
  input_projection.bias: [128] = 128 params
  gcn_layers.0.bias: [128] = 128 params
  gcn_layers.0.lin.weight: [128, 128] = 16,384 params
  gcn_layers.1.bias: [128] = 128 params
  gcn_layers.1.lin.weight: [128, 128] = 16,384 params
  gcn_layers.2.bias: [128] = 128 params
  gcn_layers.2.lin.weight: [128, 128] = 16,384 params
  batch_norms.0.weight: [128] = 128 params
  batch_norms.0.bias: [128] = 128 params
  batch_norms.1.weight: [128] = 128 params
  batch_norms.1.bias: [128] = 128 params
  batch_norms.2.weight: [128] = 128 params
  batch_norms.2.bias: [128] = 128 params
  fc.

Epoch 1/20: 100%|██████████| 63/63 [00:06<00:00, 10.42it/s, loss=0.2730, acc=0.9003]
Epoch 2/20: 100%|██████████| 63/63 [00:06<00:00, 10.43it/s, loss=0.0570, acc=0.9869]
Epoch 3/20: 100%|██████████| 63/63 [00:05<00:00, 10.50it/s, loss=0.0261, acc=0.9945]
Epoch 4/20: 100%|██████████| 63/63 [00:05<00:00, 10.55it/s, loss=0.0195, acc=0.9960]
Epoch 5/20: 100%|██████████| 63/63 [00:06<00:00, 10.46it/s, loss=0.0170, acc=0.9955]
Epoch 6/20: 100%|██████████| 63/63 [00:06<00:00, 10.44it/s, loss=0.0161, acc=0.9950]
Epoch 7/20: 100%|██████████| 63/63 [00:06<00:00, 10.24it/s, loss=0.0166, acc=0.9935]
Epoch 8/20: 100%|██████████| 63/63 [00:06<00:00, 10.47it/s, loss=0.0147, acc=0.9919]
Epoch 9/20: 100%|██████████| 63/63 [00:06<00:00, 10.17it/s, loss=0.0162, acc=0.9950]
Epoch 10/20: 100%|██████████| 63/63 [00:06<00:00, 10.43it/s, loss=0.0153, acc=0.9924]
Epoch 11/20: 100%|██████████| 63/63 [00:06<00:00, 10.44it/s, loss=0.0078, acc=0.9950]
Epoch 12/20: 100%|██████████| 63/63 [00:06<00:00, 10.42it/s, lo

Training time: 122.49s

Evaluating...
Test time: 0.79s

Classification Report:
              precision    recall  f1-score   support

      benign       0.99      0.99      0.99       489
   malicious       0.98      0.99      0.98       321

    accuracy                           0.99       810
   macro avg       0.99      0.99      0.99       810
weighted avg       0.99      0.99      0.99       810

Confusion Matrix:
                 Pred: benign  Pred: malicious
True: benign              483                6
True: malicious             4              317

Detection Rate: 0.9875
False Positive Rate: 0.0123
F1-score (weighted): 0.9877

EXPERIMENT: Window Size = 500
Creating graph datasets...
Train samples: 1986, Test samples: 810

--- Graph Statistics ---
  Avg nodes per graph: 377.6
  Avg edges per graph: 753.1
  Max nodes (window size): 500

--- Model Architecture & Parameters ---
  syscall_embedding.weight: [81, 32] = 2,592 params
  retval_embedding.weight: [42586, 32] = 1,362,752

Epoch 1/20: 100%|██████████| 63/63 [00:08<00:00,  7.10it/s, loss=0.3332, acc=0.8681]
Epoch 2/20: 100%|██████████| 63/63 [00:08<00:00,  7.04it/s, loss=0.0222, acc=0.9965]
Epoch 3/20: 100%|██████████| 63/63 [00:08<00:00,  7.07it/s, loss=0.0149, acc=0.9980]
Epoch 4/20: 100%|██████████| 63/63 [00:09<00:00,  6.90it/s, loss=0.0092, acc=0.9985]
Epoch 5/20: 100%|██████████| 63/63 [00:08<00:00,  7.05it/s, loss=0.0143, acc=0.9970]
Epoch 6/20: 100%|██████████| 63/63 [00:09<00:00,  6.84it/s, loss=0.0073, acc=0.9985]
Epoch 7/20: 100%|██████████| 63/63 [00:09<00:00,  6.90it/s, loss=0.0066, acc=0.9980]
Epoch 8/20: 100%|██████████| 63/63 [00:09<00:00,  6.84it/s, loss=0.0093, acc=0.9990]
Epoch 9/20: 100%|██████████| 63/63 [00:09<00:00,  6.95it/s, loss=0.0033, acc=0.9985]
Epoch 10/20: 100%|██████████| 63/63 [00:09<00:00,  6.79it/s, loss=0.0028, acc=0.9995]
Epoch 11/20: 100%|██████████| 63/63 [00:09<00:00,  6.87it/s, loss=0.0040, acc=0.9995]
Epoch 12/20: 100%|██████████| 63/63 [00:09<00:00,  6.66it/s, lo

Training time: 183.01s

Evaluating...
Test time: 1.34s

Classification Report:
              precision    recall  f1-score   support

      benign       1.00      0.82      0.90       489
   malicious       0.78      1.00      0.88       321

    accuracy                           0.89       810
   macro avg       0.89      0.91      0.89       810
weighted avg       0.91      0.89      0.89       810

Confusion Matrix:
                 Pred: benign  Pred: malicious
True: benign              399               90
True: malicious             1              320

Detection Rate: 0.9969
False Positive Rate: 0.1840
F1-score (weighted): 0.8889

EXPERIMENT: Window Size = 1000
Creating graph datasets...
Train samples: 1986, Test samples: 810

--- Graph Statistics ---
  Avg nodes per graph: 475.5
  Avg edges per graph: 949.1
  Max nodes (window size): 1000

--- Model Architecture & Parameters ---
  syscall_embedding.weight: [81, 32] = 2,592 params
  retval_embedding.weight: [42586, 32] = 1,362,7

Epoch 1/20: 100%|██████████| 63/63 [00:11<00:00,  5.41it/s, loss=0.3120, acc=0.8822]
Epoch 2/20: 100%|██████████| 63/63 [00:11<00:00,  5.52it/s, loss=0.0835, acc=0.9713]
Epoch 3/20: 100%|██████████| 63/63 [00:11<00:00,  5.34it/s, loss=0.0319, acc=0.9940]
Epoch 4/20: 100%|██████████| 63/63 [00:11<00:00,  5.37it/s, loss=0.0097, acc=0.9985]
Epoch 5/20: 100%|██████████| 63/63 [00:12<00:00,  5.18it/s, loss=0.0064, acc=0.9990]
Epoch 6/20: 100%|██████████| 63/63 [00:11<00:00,  5.28it/s, loss=0.0043, acc=0.9985]
Epoch 7/20: 100%|██████████| 63/63 [00:12<00:00,  5.12it/s, loss=0.0029, acc=0.9995]
Epoch 8/20: 100%|██████████| 63/63 [00:11<00:00,  5.28it/s, loss=0.0027, acc=0.9995]
Epoch 9/20: 100%|██████████| 63/63 [00:12<00:00,  5.16it/s, loss=0.0003, acc=1.0000]
Epoch 10/20: 100%|██████████| 63/63 [00:12<00:00,  5.14it/s, loss=0.0009, acc=0.9995]
Epoch 11/20: 100%|██████████| 63/63 [00:12<00:00,  5.15it/s, loss=0.0058, acc=0.9985]
Epoch 12/20: 100%|██████████| 63/63 [00:11<00:00,  5.25it/s, lo

Training time: 239.97s

Evaluating...
Test time: 1.86s

Classification Report:
              precision    recall  f1-score   support

      benign       1.00      0.98      0.99       489
   malicious       0.97      1.00      0.99       321

    accuracy                           0.99       810
   macro avg       0.99      0.99      0.99       810
weighted avg       0.99      0.99      0.99       810

Confusion Matrix:
                 Pred: benign  Pred: malicious
True: benign              480                9
True: malicious             0              321

Detection Rate: 1.0000
False Positive Rate: 0.0184
F1-score (weighted): 0.9889

EXPERIMENT: Window Size = 2000
Creating graph datasets...
Train samples: 1986, Test samples: 810

--- Graph Statistics ---
  Avg nodes per graph: 570.7
  Avg edges per graph: 1139.5
  Max nodes (window size): 2000

--- Model Architecture & Parameters ---
  syscall_embedding.weight: [81, 32] = 2,592 params
  retval_embedding.weight: [42586, 32] = 1,362,

Epoch 1/20: 100%|██████████| 63/63 [00:14<00:00,  4.31it/s, loss=0.2584, acc=0.9104]
Epoch 2/20: 100%|██████████| 63/63 [00:14<00:00,  4.26it/s, loss=0.0241, acc=0.9945]
Epoch 3/20: 100%|██████████| 63/63 [00:14<00:00,  4.25it/s, loss=0.0184, acc=0.9965]
Epoch 4/20: 100%|██████████| 63/63 [00:14<00:00,  4.21it/s, loss=0.0139, acc=0.9980]
Epoch 5/20: 100%|██████████| 63/63 [00:14<00:00,  4.22it/s, loss=0.0052, acc=0.9985]
Epoch 6/20: 100%|██████████| 63/63 [00:15<00:00,  4.15it/s, loss=0.0014, acc=0.9995]
Epoch 7/20: 100%|██████████| 63/63 [00:15<00:00,  4.15it/s, loss=0.0030, acc=0.9995]
Epoch 8/20: 100%|██████████| 63/63 [00:15<00:00,  4.14it/s, loss=0.0033, acc=0.9995]
Epoch 9/20: 100%|██████████| 63/63 [00:15<00:00,  4.13it/s, loss=0.0183, acc=0.9940]
Epoch 10/20: 100%|██████████| 63/63 [00:15<00:00,  4.15it/s, loss=0.0057, acc=0.9985]
Epoch 11/20: 100%|██████████| 63/63 [00:15<00:00,  4.09it/s, loss=0.0061, acc=0.9990]
Epoch 12/20: 100%|██████████| 63/63 [00:15<00:00,  4.13it/s, lo

Training time: 304.64s

Evaluating...
Test time: 2.35s

Classification Report:
              precision    recall  f1-score   support

      benign       1.00      1.00      1.00       489
   malicious       0.99      1.00      1.00       321

    accuracy                           1.00       810
   macro avg       1.00      1.00      1.00       810
weighted avg       1.00      1.00      1.00       810

Confusion Matrix:
                 Pred: benign  Pred: malicious
True: benign              487                2
True: malicious             0              321

Detection Rate: 1.0000
False Positive Rate: 0.0041
F1-score (weighted): 0.9975
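
The three reported metrics can be recomputed by hand from any of the confusion matrices above. The sketch below uses the window-size-500 matrix as a worked example; `summarize` is a hypothetical helper that mirrors the formulas in the experiment cell and the definition of support-weighted F1.

```python
def summarize(tn, fp, fn, tp):
    """Return (detection rate, false positive rate, weighted F1)."""
    detection_rate = tp / (tp + fn)        # recall on the malicious class
    false_positive_rate = fp / (fp + tn)   # benign runs flagged as malicious
    # Per-class F1 = 2*TP / (2*TP + FP + FN), taking each class as positive in turn.
    f1_benign = 2 * tn / (2 * tn + fn + fp)
    f1_malicious = 2 * tp / (2 * tp + fp + fn)
    n_benign, n_malicious = tn + fp, tp + fn
    f1_weighted = (n_benign * f1_benign + n_malicious * f1_malicious) / (
        n_benign + n_malicious
    )
    return detection_rate, false_positive_rate, f1_weighted

dr, fpr, f1w = summarize(tn=399, fp=90, fn=1, tp=320)
print(f"{dr:.4f} {fpr:.4f} {f1w:.4f}")  # 0.9969 0.1840 0.8889
```

The example makes the window-size-500 result easy to read: only one malicious run is missed, but 90 of 489 benign runs are flagged, which is what pulls the weighted F1 down to 0.8889.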


In [8]:
# Summary Results Table
print("\n" + "="*80)
print("SUMMARY OF RESULTS (GCN - All Features Combined)")
print("="*80)

results_df = pd.DataFrame(results)
results_df['Detection Rate'] = results_df['Detection Rate'].apply(lambda x: f"{x:.4f}")
results_df['False Positive Rate'] = results_df['False Positive Rate'].apply(lambda x: f"{x:.4f}")
results_df['F1-score (weighted)'] = results_df['F1-score (weighted)'].apply(lambda x: f"{x:.4f}")
results_df['Train Time (s)'] = results_df['Train Time (s)'].apply(lambda x: f"{x:.2f}")
results_df['Test Time (s)'] = results_df['Test Time (s)'].apply(lambda x: f"{x:.2f}")

print(results_df.to_string(index=False))

print(f"\nGraph Construction:")
print(f"  - Node type: Syscall event")
print(f"  - Edge type: Temporal (consecutive syscalls)")
print(f"  - Graph type: Undirected")

print(f"\nNode Features combined:")
print(f"  - Syscall embedding: {SYSCALL_EMBED_DIM} dims")
print(f"  - Return value embedding: {RETVAL_EMBED_DIM} dims")
print(f"  - Parameter embedding (sentence transformer): {PARAM_EMBED_DIM} dims")
print(f"  - Total node features: {SYSCALL_EMBED_DIM + RETVAL_EMBED_DIM + PARAM_EMBED_DIM} dims")
print(f"  - Projected to hidden dim: 128 dims")

print(f"\nGCN Architecture:")
print(f"  - Number of GCN layers: 3")
print(f"  - Hidden dimension: 128")
print(f"  - Pooling: Mean + Max")
print(f"  - Dropout: 0.3")
print(f"  - Residual connections: Yes")



SUMMARY OF RESULTS (GCN - All Features Combined)
 Window Size Detection Rate False Positive Rate F1-score (weighted) Train Time (s) Test Time (s)
         250         0.9875              0.0123              0.9877         122.49          0.79
         500         0.9969              0.1840              0.8889         183.01          1.34
        1000         1.0000              0.0184              0.9889         239.97          1.86
        2000         1.0000              0.0041              0.9975         304.64          2.35

Graph Construction:
  - Node type: Syscall event
  - Edge type: Temporal (consecutive syscalls)
  - Graph type: Undirected

Node Features combined:
  - Syscall embedding: 32 dims
  - Return value embedding: 32 dims
  - Parameter embedding (sentence transformer): 384 dims
  - Total node features: 448 dims
  - Projected to hidden dim: 128 dims

GCN Architecture:
  - Number of GCN layers: 3
  - Hidden dimension: 128
  - Pooling: Mean + Max
  - Dropout: 0.3
  - Re