# ⚓ Cadet Cyber Mission: Threat Detector - Real Data Edition + Easy Testing
Welcome, cadet! You are now working with **real network traffic data** to build a threat detector.

Once your model is trained, you'll be able to test new traffic easily using sliders.

Let’s begin!


In [None]:
# ✅ STEP 1: Import Tools
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report


In [None]:
# ✅ STEP 2: Load Dataset
file_path = '/content/cicids2017_sample_trimmed.csv'  # Adjust this path if needed
df = pd.read_csv(file_path)
df.head()
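
Before picking features, it can help to confirm that the columns you expect are present and that no rows carry missing or infinite values, which flow exports sometimes contain. The sketch below is only an illustration: `clean_flows` and the `toy` table are hypothetical names invented here, and the toy values are made up rather than taken from the real CSV.

```python
import numpy as np
import pandas as pd

def clean_flows(frame, feature_cols, label_col):
    """Return a copy with infinite values turned into NaN and incomplete rows dropped."""
    needed = list(feature_cols) + [label_col]
    missing = [c for c in needed if c not in frame.columns]
    if missing:
        raise KeyError(f"Missing expected columns: {missing}")
    # Treat +/- infinity as missing so dropna() can remove those rows too
    cleaned = frame[needed].replace([np.inf, -np.inf], np.nan)
    return cleaned.dropna().reset_index(drop=True)

# Hypothetical toy data, NOT taken from the real dataset
toy = pd.DataFrame({
    'Flow_Duration': [1000.0, np.inf, 5000.0, 2500.0],
    'Tot_Fwd_Pkts': [3, 4, np.nan, 10],
    'Malicious': [0, 1, 0, 1],
})

toy_clean = clean_flows(toy, ['Flow_Duration', 'Tot_Fwd_Pkts'], 'Malicious')
print(toy_clean)
# Class balance: how many benign (0) and malicious (1) rows survived cleaning
print(toy_clean['Malicious'].value_counts())
```

On the real data, the same helper could be called with `df`, the six feature names from Step 3, and `'Malicious'`, assuming those columns exist in your copy of the file.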


In [None]:
# ✅ STEP 3: Preprocessing
X = df[['Flow_Duration', 'Tot_Fwd_Pkts', 'Tot_Bwd_Pkts', 'TotLen_Fwd_Pkts', 'TotLen_Bwd_Pkts', 'Pkt_Len_Var']]
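y = df['Malicious']  # Target label: 1 = malicious, 0 = benign

# Hold out 30% of the rows for testing; stratify=y keeps the benign/malicious ratio the same in both splits
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)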


In [None]:
# ✅ STEP 4: Train the Model
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)


In [None]:
# ✅ STEP 5: Evaluate the Model
print("🎯 Accuracy:", accuracy_score(y_test, y_pred))
print("\n🧾 Confusion Matrix:\n", confusion_matrix(y_test, y_pred))
print("\n📋 Classification Report:\n", classification_report(y_test, y_pred))
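
Accuracy alone can hide the mistakes that matter most in threat detection: missed attacks and false alarms. As a small sketch of how to read the confusion matrix, the example below unpacks its four cells by name. The labels `y_true_demo` and `y_pred_demo` are made-up values for illustration, not results from the model above.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical labels for illustration: 1 = malicious, 0 = benign
y_true_demo = [0, 0, 0, 0, 1, 1, 1, 0]
y_pred_demo = [0, 0, 1, 0, 1, 0, 1, 0]

# With labels ordered [0, 1], ravel() returns tn, fp, fn, tp in that order
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo, labels=[0, 1]).ravel()

print("True negatives  (benign, correctly passed):", tn)
print("False positives (benign, flagged as attack):", fp)
print("False negatives (attack, missed):", fn)
print("True positives  (attack, caught):", tp)

# Precision: of everything flagged, how much was really malicious?
precision = tp / (tp + fp)
# Recall: of all real attacks, how many did we catch?
recall = tp / (tp + fn)
print(f"Precision: {precision:.2f}  Recall: {recall:.2f}")
```

To apply this to the notebook's own results, pass `y_test` and `y_pred` in place of the demo lists.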


In [None]:
# ✅ STEP 6: Visualize Results
plt.figure(figsize=(6,4))
plt.bar(['Benign', 'Malicious'], [(y_pred == 0).sum(), (y_pred == 1).sum()])
plt.title('Predicted Traffic Classification')
plt.ylabel('Number of Connections')
plt.tight_layout()
plt.show()


In [None]:
# ✅ STEP 7: Test It Yourself (Cadet-Friendly)
# Use sliders to simulate a new network connection and see what the model thinks

import ipywidgets as widgets
from IPython.display import display

def predict_traffic(flow_duration, tot_fwd, tot_bwd, len_fwd, len_bwd, var_pkt):
    test_input = pd.DataFrame([{
        'Flow_Duration': flow_duration,
        'Tot_Fwd_Pkts': tot_fwd,
        'Tot_Bwd_Pkts': tot_bwd,
        'TotLen_Fwd_Pkts': len_fwd,
        'TotLen_Bwd_Pkts': len_bwd,
        'Pkt_Len_Var': var_pkt
    }])
    result = clf.predict(test_input)[0]
    label = "🚨 MALICIOUS" if result == 1 else "✅ BENIGN"
    print(f"Prediction: {label}")

widgets.interact(
    predict_traffic,
    flow_duration=widgets.IntSlider(min=1000, max=100000, step=1000, value=50000),
    tot_fwd=widgets.IntSlider(min=1, max=100, step=1, value=50),
    tot_bwd=widgets.IntSlider(min=1, max=100, step=1, value=50),
    len_fwd=widgets.IntSlider(min=40, max=1500, step=10, value=600),
    len_bwd=widgets.IntSlider(min=40, max=1500, step=10, value=600),
    var_pkt=widgets.FloatSlider(min=0.1, max=10.0, step=0.1, value=5.0)
)
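
The sliders print only a label, but a random forest can also report how confident it is through `predict_proba`. The sketch below shows the idea on a small synthetic model. Everything in it (`demo_X`, `demo_y`, `demo_clf`, `malicious_probability`, and the made-up rule that generates the labels) is a stand-in invented for illustration, not the notebook's real data or model.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

# Synthetic stand-in data (NOT the real dataset)
demo_X = pd.DataFrame({
    'Tot_Fwd_Pkts': rng.integers(1, 100, size=300),
    'Tot_Bwd_Pkts': rng.integers(1, 100, size=300),
})
# Made-up rule: very lopsided traffic is labeled malicious (1)
demo_y = (demo_X['Tot_Fwd_Pkts'] > 3 * demo_X['Tot_Bwd_Pkts']).astype(int)

demo_clf = RandomForestClassifier(n_estimators=50, random_state=42)
demo_clf.fit(demo_X, demo_y)

def malicious_probability(model, row):
    """Return the model's estimated probability that one connection is malicious."""
    frame = pd.DataFrame([row])
    proba = model.predict_proba(frame)[0]
    # Look up which column of predict_proba corresponds to class 1
    malicious_idx = list(model.classes_).index(1)
    return float(proba[malicious_idx])

p_lopsided = malicious_probability(demo_clf, {'Tot_Fwd_Pkts': 95, 'Tot_Bwd_Pkts': 2})
p_balanced = malicious_probability(demo_clf, {'Tot_Fwd_Pkts': 50, 'Tot_Bwd_Pkts': 50})
print(f"Lopsided connection: {p_lopsided:.2f} chance of malicious")
print(f"Balanced connection: {p_balanced:.2f} chance of malicious")
```

In the notebook itself, the same call would be `clf.predict_proba(test_input)` inside `predict_traffic`, which lets you watch the confidence change as you move a slider.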


## 🧠 MISSION DEBRIEF
Great work, cadet! You trained a model on real traffic, measured how well it handles connections it had never seen, and simulated your own inputs.

Think about:
- What inputs made it say "Malicious"?
- Can you trick it?
- How would this help in a real Navy cybersecurity center?
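
One way to start on the first question is to ask the forest which inputs it relies on most, using its `feature_importances_` attribute. The sketch below uses synthetic stand-in data in which only one feature decides the label, so you can check that the ranking makes sense. The names `demo_X`, `demo_y`, and `demo_clf`, and the rule behind the labels, are assumptions made for this illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-in data (NOT the real dataset)
demo_X = pd.DataFrame({
    'Flow_Duration': rng.integers(1000, 100000, size=n),
    'Tot_Fwd_Pkts': rng.integers(1, 100, size=n),
    'Pkt_Len_Var': rng.uniform(0.1, 10.0, size=n),
})
# Made-up rule: the label depends only on Tot_Fwd_Pkts
demo_y = (demo_X['Tot_Fwd_Pkts'] > 60).astype(int)

demo_clf = RandomForestClassifier(n_estimators=50, random_state=42)
demo_clf.fit(demo_X, demo_y)

# One importance score per input column; higher means the forest leaned on it more
importances = pd.Series(demo_clf.feature_importances_, index=demo_X.columns)
print(importances.sort_values(ascending=False))
```

For the model in this notebook, the equivalent would be `pd.Series(clf.feature_importances_, index=X.columns)`. Importance scores show which inputs the model uses, not whether high or low values point toward "Malicious", so the sliders are still the best way to explore direction.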

Dismissed with honors.
