# Lab 6

**Topic:** AI in Cybersecurity

This lab demonstrates several AI techniques applied to cybersecurity tasks: phishing detection, anomaly detection, autoencoder-based detection of manipulated data (such as deepfakes), an AI-vs-AI simulation, and automated incident response. Each section includes a short explanation suitable for a report.

## Section 1 — AI-Powered Phishing Detection

Objective: Use machine learning to classify phishing vs legitimate messages and identify key features.

In [ ]:
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, accuracy_score, ConfusionMatrixDisplay

# Small hand-written dataset (label 1 = phishing, 0 = legitimate)
data = {
    'message': [
        "Your account is locked, click here to verify.",
        "Meeting at 10 AM, see you soon.",
        "Update your password immediately to avoid suspension.",
        "Lunch tomorrow?",
        "Claim your free reward now!",
        "Quarterly report attached.",
        "Verify your PayPal details urgently.",
        "Let's schedule a call next week.",
        "You have won a gift card!",
        "Please review the document and confirm."
    ],
    'label': [1,0,1,0,1,0,1,0,1,0]
}

df = pd.DataFrame(data)

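# Vectorize (bag-of-words counts) and split.
# Note: the vocabulary is built from all ten messages before the split.
# That is acceptable for a demo, but it leaks test-set words in a real evaluation.
X = CountVectorizer().fit_transform(df['message'])
y = df['label']
# stratify=y keeps both classes represented in the 3-message test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42, stratify=y)

model = LogisticRegression()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

acc = accuracy_score(y_test, y_pred)
# labels=[0, 1] guarantees a 2x2 matrix even if one class is absent
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print("Model accuracy:", acc)

disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=["Legit", "Phishing"])
disp.plot()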


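The model classifies short messages as phishing or legitimate. With only ten messages, three of which are held out for testing, the reported accuracy is illustrative rather than reliable. Even so, a simple logistic regression over word counts can pick up typical phishing vocabulary such as *verify*, *urgently*, or *free*. Below are the top influencing words, ranked by absolute coefficient, so a listed word may point toward either class.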

In [ ]:
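# Refit a vectorizer to recover the vocabulary; the column order matches X
# because the same messages always produce the same sorted vocabulary.
feature_names = CountVectorizer().fit(df['message']).get_feature_names_out()
# Retrain on all ten messages (no held-out split) so every example informs the ranking
model = LogisticRegression().fit(X, y)
# Absolute value ranks by strength only; the sign (positive = phishing) is dropped
importance = abs(model.coef_[0])
top_words = sorted(zip(feature_names, importance), key=lambda x: x[1], reverse=True)[:3]
print("Top influencing words:", top_words)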
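
To make the word features in this section concrete, here is a minimal standard-library sketch of the bag-of-words step. It approximates `CountVectorizer`'s default behaviour (lowercasing, tokens of two or more word characters) and is not the library's implementation. The helper name `bag_of_words` is hypothetical, and the two messages are taken from the dataset above.

```python
import re
from collections import Counter

# Approximates CountVectorizer's defaults: lowercase the text, then keep
# tokens made of two or more word characters.
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")

def bag_of_words(message):
    """Return a word -> count mapping for one message (hypothetical helper)."""
    return Counter(TOKEN_PATTERN.findall(message.lower()))

messages = [
    "Claim your free reward now!",
    "Verify your PayPal details urgently.",
]

# The vocabulary is the sorted set of all tokens; each token becomes one column.
vocabulary = sorted(set().union(*(bag_of_words(m) for m in messages)))
# One row of counts per message, aligned with the vocabulary columns.
rows = [[bag_of_words(m)[word] for word in vocabulary] for m in messages]

print(vocabulary)
for row in rows:
    print(row)
```

Note that *free* and *reward* end up as separate columns, and *urgently* is not the same feature as *urgent*. If phrase-level features are wanted, `CountVectorizer` accepts `ngram_range=(1, 2)` to add two-word sequences.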


## Section 2 — Behavioral Analytics / Anomaly Detection

Objective: Detect suspicious login behavior using unsupervised learning (IsolationForest).

In [ ]:
import numpy as np
from sklearn.ensemble import IsolationForest
import matplotlib.pyplot as plt

np.random.seed(42)
data = pd.DataFrame({
    'hour': np.random.randint(8, 20, 100),
    'failed_attempts': np.random.randint(0, 3, 100)
})
# Add suspicious logins
suspicious = pd.DataFrame({'hour': [3, 2, 23], 'failed_attempts': [5, 4, 6]})
data = pd.concat([data, suspicious], ignore_index=True)

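iso = IsolationForest(contamination=0.05, random_state=42)  # fixed seed for reproducibility
features = data[['hour', 'failed_attempts']]
# fit_predict() returns labels (-1 = anomaly, 1 = normal), not scores
data['anomaly'] = iso.fit_predict(features)
# decision_function() gives the continuous score; lower means more anomalous
data['anomaly_score'] = iso.decision_function(features)

plt.hist(data['anomaly_score'], bins=20)
plt.title('Anomaly Scores Distribution')
plt.show()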


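Logins at unusual hours or with many failed attempts are flagged as anomalies. IsolationForest finds outliers without labeled data, which suits security analytics where labeled attacks are scarce. Note that `contamination=0.05` fixes the flagged share at roughly 5% of rows: with 103 logins, about five are flagged, which is more than the three injected anomalies, so some ordinary logins may be flagged too.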
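
The intuition behind isolation-based detection is that outliers can be separated from the rest of the data with fewer random splits. The sketch below illustrates that idea in plain Python. It is a deliberate simplification, not scikit-learn's algorithm, which subsamples the data, builds each tree once, and normalizes path lengths into a score. The function names and the grid of login data are assumptions made for illustration.

```python
import random

def isolation_depth(point, rows, rng, depth=0, max_depth=12):
    """Number of random splits needed to separate `point` from `rows`."""
    if len(rows) <= 1 or depth >= max_depth:
        return depth
    # Only features that still vary within this subset can be split
    splittable = [
        f for f in range(len(point))
        if min(r[f] for r in rows) < max(r[f] for r in rows)
    ]
    if not splittable:
        return depth  # all remaining rows are identical
    feature = rng.choice(splittable)
    low = min(r[feature] for r in rows)
    high = max(r[feature] for r in rows)
    split = rng.uniform(low, high)
    # Keep only the rows on the same side of the split as the point
    if point[feature] < split:
        same_side = [r for r in rows if r[feature] < split]
    else:
        same_side = [r for r in rows if r[feature] >= split]
    return isolation_depth(point, same_side, rng, depth + 1, max_depth)

def mean_depth(point, rows, n_trees=200, seed=0):
    """Average isolation depth over many random split sequences."""
    rng = random.Random(seed)
    return sum(isolation_depth(point, rows, rng) for _ in range(n_trees)) / n_trees

# Hypothetical logins as (hour, failed_attempts): office-hours grid, three copies each
normal = [(hour, fails) for hour in range(8, 20) for fails in range(3)] * 3
outlier = (3, 5)  # 3 AM with five failed attempts
logins = normal + [outlier]

print("typical login:", mean_depth((12, 1), logins))
print("3 AM, 5 failures:", mean_depth(outlier, logins))
```

The 3 AM login should show a clearly smaller average depth than the midday login, which is the signal an isolation forest turns into an anomaly score.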

## Section 3 — Deepfakes and Autoencoders

Objective: Understand autoencoders for detecting tampered or manipulated data.

In [ ]:
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense
import matplotlib.pyplot as plt

(x_train, _), (x_test, _) = mnist.load_data()
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
x_train = x_train.reshape((len(x_train), -1))
x_test = x_test.reshape((len(x_test), -1))

# Simple autoencoder: 784 inputs -> 64-unit bottleneck -> 784 outputs
input_img = Input(shape=(784,))
encoded = Dense(64, activation='relu')(input_img)
decoded = Dense(784, activation='sigmoid')(encoded)
autoencoder = Model(input_img, decoded)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.fit(x_train, x_train, epochs=3, batch_size=256, shuffle=True, validation_data=(x_test, x_test))

decoded_imgs = autoencoder.predict(x_test[:10])
plt.figure(figsize=(10, 2))
for i in range(10):
    # Top row: original test digits
    ax = plt.subplot(2, 10, i + 1)
    plt.imshow(x_test[i].reshape(28, 28), cmap='gray')
    ax.axis('off')
    # Bottom row: autoencoder reconstructions
    ax = plt.subplot(2, 10, i + 11)
    plt.imshow(decoded_imgs[i].reshape(28, 28), cmap='gray')
    ax.axis('off')
plt.show()


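Autoencoders learn to reconstruct the normal patterns in their training data. In cybersecurity, the same idea can flag deepfakes or altered inputs: a sample that the model reconstructs poorly, with a reconstruction error above a threshold calibrated on clean data, is treated as suspicious. The cell above only visualizes reconstructions of MNIST digits; it does not compute that error or apply a threshold.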
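
The thresholding step described above can be sketched in plain Python. This is a minimal illustration under stated assumptions: the error metric is mean squared error, the threshold rule (mean plus `k` standard deviations of the errors on clean data) is one common choice rather than the lab's method, and the 4-pixel "images" and their reconstructions are made-up toy values, not autoencoder output.

```python
import statistics

def reconstruction_error(original, reconstruction):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstruction)) / len(original)

def calibrate_threshold(clean_errors, k=3.0):
    """Assumed rule: mean + k standard deviations of errors on clean data."""
    return statistics.mean(clean_errors) + k * statistics.pstdev(clean_errors)

def is_suspicious(original, reconstruction, threshold):
    """Flag a sample whose reconstruction error exceeds the threshold."""
    return reconstruction_error(original, reconstruction) > threshold

# Toy 4-pixel inputs paired with hypothetical reconstructions (illustrative only)
clean_pairs = [
    ([0.0, 1.0, 1.0, 0.0], [0.1, 0.9, 0.9, 0.1]),
    ([1.0, 0.0, 0.0, 1.0], [0.9, 0.1, 0.2, 0.9]),
    ([1.0, 1.0, 0.0, 0.0], [0.8, 0.9, 0.1, 0.1]),
]
clean_errors = [reconstruction_error(x, x_hat) for x, x_hat in clean_pairs]
threshold = calibrate_threshold(clean_errors)

# A sample the model reconstructs poorly, as might happen with tampered input
tampered = ([0.0, 1.0, 1.0, 0.0], [0.9, 0.2, 0.3, 0.8])
print("tampered flagged:", is_suspicious(*tampered, threshold))
print("clean flagged:", is_suspicious(*clean_pairs[0], threshold))
```

With the lab's arrays, the per-image error could be computed along the same lines, for example `np.mean((x_test - autoencoder.predict(x_test)) ** 2, axis=1)` after importing NumPy as `np`.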

## Section 4 — AI vs AI in Cyber Defense

Objective: Simulate how an attacker AI adapts and how a defender retrains to counter new phishing patterns.

In [ ]:
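import random

random.seed(42)  # fixed seed so the plot is reproducible

attacker_phrases = ["urgent", "verify", "password", "bank", "free"]
defender_accuracy = []

for round_ in range(5):
    # Attacker adds a new lure word each round
    new_word = random.choice(["bonus", "alert", "security", "gift", "lottery"])
    attacker_phrases.append(new_word)
    # Scripted stand-in for a real evaluation: accuracy drops 0.05 per round
    # plus small noise; no model is trained or retrained here
    accuracy = 0.9 - (round_ * 0.05) + random.uniform(-0.02, 0.02)
    defender_accuracy.append(accuracy)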

plt.plot(defender_accuracy, marker='o')
plt.title('Defender AI Retraining Performance')
plt.xlabel('Round')
plt.ylabel('Accuracy')
plt.show()


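In this scripted simulation, defender accuracy falls by about five percentage points per round (with small random noise) as the attacker adds new lure words; the defender never actually retrains. The pattern illustrates the arms race behind AI in security: the same techniques are available to both sides, which raises both security and ethical concerns.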
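
A variant in which the defender actually adapts can be sketched as follows. This is a toy model under loud assumptions: each message is reduced to a single lure word, the defender is a keyword set standing in for a trained classifier, and "retraining" means adding every word observed in the latest batch. The function `simulate` and its parameters are hypothetical.

```python
import random

def simulate(rounds=5, messages_per_round=20, seed=42):
    """Return (round, detection_before, detection_after) for each round."""
    rng = random.Random(seed)
    attacker_words = ["urgent", "verify", "password", "bank", "free"]
    defender_words = set(attacker_words)  # defender starts fully informed
    new_word_pool = ["bonus", "alert", "security", "gift", "lottery"]
    history = []
    for round_ in range(rounds):
        # Attacker adapts: adds a lure word, then draws messages from its vocabulary
        attacker_words.append(rng.choice(new_word_pool))
        batch = [rng.choice(attacker_words) for _ in range(messages_per_round)]
        # Detection rate before the defender has learned from this batch
        before = sum(word in defender_words for word in batch) / len(batch)
        # Defender "retrains": learns every word observed in this batch
        defender_words.update(batch)
        after = sum(word in defender_words for word in batch) / len(batch)
        history.append((round_, before, after))
    return history

for round_, before, after in simulate():
    print(f"round {round_}: detected {before:.0%} before retraining, {after:.0%} after")
```

Detection dips whenever a batch contains a word the defender has not seen and recovers once that word is learned, which is the retraining loop the section objective describes.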

## Section 5 — AI-Driven Security Operations

Objective: Automate incident response using simple rule-based AI logic.

In [ ]:
import csv
import datetime

incidents = [
    {"user": "Alice", "failed_logins": 1},
    {"user": "Bob", "failed_logins": 3},
    {"user": "Charlie", "failed_logins": 4}
]

with open("alerts.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["User", "Status", "Timestamp"])
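    for incident in incidents:
        # Rule: three or more failed logins marks the account as LOCKED
        status = "LOCKED" if incident["failed_logins"] >= 3 else "OK"
        writer.writerow([incident["user"], status, datetime.datetime.now().isoformat(timespec="seconds")])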

print("Alerts logged in alerts.csv")


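This simple automation marks accounts as LOCKED in a CSV log after three or more failed logins, mimicking basic SOAR (Security Orchestration, Automation, and Response) logic. It only records the decision; a real playbook would typically also call the identity system to enforce the lock.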
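
One way to make the rule easier to test is to separate the decision from the file output. The sketch below is an illustrative refactoring, not part of the original lab: the `triage` function, the `LOCK_THRESHOLD` constant, and the use of UTC timestamps are assumptions.

```python
from datetime import datetime, timezone

LOCK_THRESHOLD = 3  # assumed policy: same cutoff as the rule in this section

def triage(incident, threshold=LOCK_THRESHOLD, now=None):
    """Return an alert row [user, status, ISO timestamp] for one incident."""
    now = now or datetime.now(timezone.utc)
    status = "LOCKED" if incident["failed_logins"] >= threshold else "OK"
    return [incident["user"], status, now.isoformat(timespec="seconds")]

incidents = [
    {"user": "Alice", "failed_logins": 1},
    {"user": "Bob", "failed_logins": 3},
    {"user": "Charlie", "failed_logins": 4},
]

alerts = [triage(incident) for incident in incidents]
locked_users = [user for user, status, _ in alerts if status == "LOCKED"]
print("Locked:", locked_users)
```

Because `triage` returns a plain row, the same rows can be passed to `csv.writer.writerow`, and the threshold can be changed or unit-tested without touching the file-handling code.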

**Conclusion:** This lab covered several AI applications in cybersecurity, from phishing detection and anomaly detection to automated response. The experiments are small-scale demonstrations of how AI can support accuracy, adaptability, and response speed in modern security systems.