# CortexAI: Inference & Model Explainability

This notebook demonstrates how **CortexAI** performs behavioral ransomware detection
and provides **instance-level explanations** using SHAP-based feature attribution.

The focus is on **interpreting inference outcomes**, not exposing training logic,
model internals, or deployment code.

## Context

CortexAI uses a **trained XGBoost behavioral classifier**
to analyze aggregated network traffic features and classify activity as:

- **0 → Benign**
- **1 → Malicious**

Key characteristics of the inference process:

- Accepts **multiple preprocessed CSV inputs**
- Performs **feature alignment** with the trained model
- Generates **probability-based predictions**
- Applies **SHAP-based local explanations**
- Produces a **single consolidated inference output**

This notebook reflects the **exact inference behavior** of the CortexAI system.


## Inference Decision Logic (High-Level)

At inference time, CortexAI performs the following steps:

1. Merge multiple feature datasets into a unified input
2. Separate **metadata** (host, time window) from behavioral features
3. Align incoming features with the model's trained feature space
4. Generate class probabilities using a trained XGBoost model
5. Apply a **probability threshold (≥ 0.85)** for malicious classification
6. Compute **SHAP values** for local explainability
7. Return top contributing features per prediction

All steps are automated and containerized within the inference engine.
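
Steps 1 to 3 can be sketched with pandas. This is a minimal illustration, not CortexAI's actual code: the metadata column names, the trained feature list, and the `prepare_inputs` helper are hypothetical, and "merge" is assumed to mean row-wise concatenation of the input files.

```python
import pandas as pd

# Hypothetical schema for illustration only (not CortexAI's real columns).
METADATA_COLS = ["host", "minute"]
TRAINED_FEATURES = ["dhcp_req_count", "conn_avg_duration", "conn_state_S0_ratio"]


def prepare_inputs(frames, trained_features):
    """Merge feature frames, split off metadata, and align to the trained columns."""
    # Step 1: stack the per-file frames into one unified input.
    merged = pd.concat(frames, ignore_index=True)

    # Step 2: set metadata aside so it never reaches the classifier.
    metadata = merged[METADATA_COLS].copy()
    features = merged.drop(columns=METADATA_COLS)

    # Step 3: reindex drops columns the model never saw, adds columns that are
    # absent from every input (filled with 0), and enforces the trained order.
    # Cells missing from only some files stay NaN.
    aligned = features.reindex(columns=trained_features, fill_value=0)
    return metadata, aligned


frame_a = pd.DataFrame({"host": ["192.168.1.3"], "minute": [12],
                        "dhcp_req_count": [40], "extra_col": [1.0]})
frame_b = pd.DataFrame({"host": ["192.168.1.4"], "minute": [12],
                        "dhcp_req_count": [35], "conn_avg_duration": [0.4]})

metadata, aligned = prepare_inputs([frame_a, frame_b], TRAINED_FEATURES)
print(list(aligned.columns))
```

Filling absent columns with 0 is one possible policy. Leaving them as NaN is another, since XGBoost treats NaN as a missing value by default.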

## 🔍 Model Explainability (SHAP-Based Local Explanations)

CortexAI integrates **SHAP (SHapley Additive exPlanations)** at inference time
to explain *why* a given network instance was classified as malicious or benign.

For each prediction, the system outputs:

- Predicted label (0 = Benign, 1 = Malicious)
- Prediction probability
- **Top 5 contributing behavioral features**
- Direction and magnitude of each feature's contribution

This enables **transparent, behavior-driven detection**
without reliance on signature-based rules.
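
Selecting the top contributors from one row of SHAP values might look like the sketch below. The `top_contributions` helper and the `hypothetical_minor_feature` entry are assumptions for illustration. The other names and values are taken from the local explanation example in this notebook, and features are ranked by absolute contribution.

```python
def top_contributions(feature_names, shap_row, k=5):
    """Return the k features with the largest absolute contribution."""
    ranked = sorted(zip(feature_names, shap_row),
                    key=lambda item: abs(item[1]), reverse=True)
    return [
        {
            "feature": name,
            # Up arrow: pushes toward malicious; down arrow: pushes toward benign.
            "direction": "↑" if value > 0 else "↓",
            "magnitude": abs(value),
        }
        for name, value in ranked[:k]
    ]


feature_names = ["dhcp_req_count", "conn_flow_pkt_size_avg", "conn_state_SF_ratio",
                 "conn_state_S0_ratio", "conn_avg_duration",
                 "hypothetical_minor_feature"]
shap_row = [1.64, 1.12, 1.05, 0.90, -0.73, 0.02]

top5 = top_contributions(feature_names, shap_row)
for entry in top5:
    print(entry["feature"], entry["direction"], entry["magnitude"])
```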

## 📄 Sample Inference Output (Unseen Ransomware Family)

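The table below is a **sanitized sample**
from the consolidated inference output generated by CortexAI.
Each row lists three contributing features, and ↑ marks a contribution that pushes the prediction toward malicious.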

The evaluated ransomware family was **not included in the training data**.

```python
import pandas as pd

data = {
    "host": ["0.0.0.0", "192.168.1.3", "192.168.1.4"],
    "minute": [12, 12, 12],
    "predicted_label": [1, 1, 1],
    "predicted_probability": [0.978, 0.997, 0.999],
    "top_contributing_features": [
        "dhcp_req_count ↑, packet_size_avg ↑, conn_state_SF ↑",
        "packet_size_avg ↑, dhcp_activity ↑, conn_state_S0 ↑",
        "packet_size_avg ↑, dhcp_activity ↑, external_dns_ratio ↑"
    ]
}

df = pd.DataFrame(data)
df
```

|   | host | minute | predicted_label | predicted_probability | top_contributing_features |
|---|------|--------|-----------------|-----------------------|---------------------------|
| 0 | 0.0.0.0 | 12 | 1 | 0.978 | dhcp_req_count ↑, packet_size_avg ↑, conn_state_SF ↑ |
| 1 | 192.168.1.3 | 12 | 1 | 0.997 | packet_size_avg ↑, dhcp_activity ↑, conn_state_S0 ↑ |
| 2 | 192.168.1.4 | 12 | 1 | 0.999 | packet_size_avg ↑, dhcp_activity ↑, external_dns_ratio ↑ |


## Confidence Threshold Policy

CortexAI applies a **probability threshold of 0.85** to determine malicious behavior:

- **≥ 0.85** → Classified as *Malicious*
- **< 0.85** → Classified as *Benign*

Setting the threshold well above the conventional 0.5 trades some **detection sensitivity**
for tighter **false-positive control** in behavioral analysis.
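
The policy reduces to a single comparison on the malicious-class probability. The sketch below is illustrative: the `classify` helper and the constant name are hypothetical, and the boundary value 0.85 is included to show that the comparison is inclusive.

```python
MALICIOUS_THRESHOLD = 0.85  # value stated in this notebook


def classify(probability, threshold=MALICIOUS_THRESHOLD):
    """Return 1 (malicious) if the probability meets the threshold, else 0 (benign)."""
    return 1 if probability >= threshold else 0


probabilities = [0.978, 0.85, 0.849]
labels = [classify(p) for p in probabilities]
print(labels)  # [1, 1, 0]
```

With a scikit-learn style XGBoost classifier, a custom cutoff like this would be applied to the probability output rather than to the default `predict` labels, which correspond to a 0.5 cutoff.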

## 🧠 Local Explanation: Feature Contributions

Below is an example of SHAP-based feature contributions
for a single malicious prediction.

```python
{
  "dhcp_req_count": +1.64,
  "conn_flow_pkt_size_avg": +1.12,
  "conn_state_SF_ratio": +1.05,
  "conn_state_S0_ratio": +0.90,
  "conn_avg_duration": -0.73
}
```

### Interpretation

- **Positive values** increase the probability of malicious classification
- **Negative values** suppress malicious confidence
- Contributions are **local and instance-specific**
- Only the **top 5 features** are exposed per prediction
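
To make the sign convention concrete, the sketch below assumes the contributions are expressed in log-odds, which is the default output of SHAP's tree explainer for an XGBoost binary classifier. The base value of `0.0` is a hypothetical placeholder, and only the five exposed features are summed, so the result illustrates the mechanics rather than reproducing CortexAI's reported probabilities.

```python
import math


def probability_from_contributions(base_value, contributions):
    """Sum log-odds contributions onto a base value and apply the sigmoid."""
    margin = base_value + sum(contributions.values())
    return 1.0 / (1.0 + math.exp(-margin))


contributions = {
    "dhcp_req_count": 1.64,
    "conn_flow_pkt_size_avg": 1.12,
    "conn_state_SF_ratio": 1.05,
    "conn_state_S0_ratio": 0.90,
    "conn_avg_duration": -0.73,
}
BASE_VALUE = 0.0  # hypothetical; the real expected value is not published

p_all = probability_from_contributions(BASE_VALUE, contributions)

# Dropping the negative contribution raises the resulting probability,
# which is what "suppresses malicious confidence" means in practice.
without_negative = {k: v for k, v in contributions.items() if v > 0}
p_without_negative = probability_from_contributions(BASE_VALUE, without_negative)

print(round(p_all, 3), round(p_without_negative, 3))
```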

## Analyst Observations

- ✔ High-confidence detections across multiple hosts
- ✔ Consistent behavioral indicators across time windows
- ✔ No dependency on ransomware family labels
- ✔ Explainability supports analyst validation

## 🎯 Why This Matters

- The model learns **behavioral network patterns**, not malware signatures
- Detection generalizes to **previously unseen ransomware families**
- SHAP explanations expose the **decision rationale** behind each prediction
- Each prediction is **auditable, traceable, and defensible**

## 🔒 Intellectual Property Protection

To preserve research integrity and prevent misuse:

- Only **top contributing features** are exposed
- Full feature vectors remain internal
- Model internals and thresholds are abstracted
- Explanations focus on *why*, not *how*

This notebook is for **analytical demonstration only**.

## Final Assessment

This notebook demonstrates that CortexAI can:

- Perform **probability-driven behavioral inference**
- Detect **unseen ransomware families**
- Provide **transparent, SHAP-based explanations**
- Support real-world analyst workflows

CortexAI represents a **research-grade, explainable behavioral detection system**
designed for modern cyber threat analysis.