# Building an AI Security System (Hybrid Deep Learning)

## The Mission
You are the Chief Security Officer (CSO) of a major bank.
Fraudsters are getting smarter. They don't just steal cards; they steal *identities*.
Your job is to build a **Hybrid AI System** that can:
1.  **Detect** strange behavior patterns (Unsupervised Learning: SOM).
2.  **Learn** to recognize future threats (Supervised Learning: ANN).

## The Pipeline
We will combine two powerful models:
1.  **Stage 1 (The Watchdog)**: A **Self-Organizing Map (SOM)** monitors all customers. It groups them by behavior. It spots "Outliers" (people who don't fit in).
2.  **Stage 2 (The Judge)**: We assume the SOM is right. We label the outliers as "High Risk".
3.  **Stage 3 (The Police)**: An **Artificial Neural Network (ANN)** trains on these findings. It learns to predict the **Probability of Fraud** for any new application in milliseconds.

---

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from sklearn.neural_network import MLPClassifier
from ipywidgets import interact, FloatSlider, IntSlider, Button, Output, VBox, Label, Dropdown, FloatProgress, HTML
from IPython.display import display  # explicit import; Jupyter also provides it by default
import warnings

# Suppress annoying warnings for cleaner output
warnings.filterwarnings('ignore')
np.random.seed(42) # Ensure the "Watchdog" learns the same patterns every time

## Part 1: The Watchdog (SOM Training)
We start by feeding raw credit card data into the Self-Organizing Map.

In [2]:
# 1. Download and Clean Data
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/credit-screening/crx.data"
dataset = pd.read_csv(url, header=None)

# Handle Missing Values & Encode (The messy reality of data)
dataset = dataset.replace('?', np.nan)
for col in dataset.columns:
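    # Try numeric conversion; keep the column as-is if it holds non-numeric values
    # (pd.to_numeric's errors='ignore' option is deprecated in recent pandas)
    try:
        dataset[col] = pd.to_numeric(dataset[col])
    except (ValueError, TypeError):
        pass
    if not pd.api.types.is_numeric_dtype(dataset[col]):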
        # Fill missing categorical with mode
        dataset[col] = dataset[col].fillna(dataset[col].mode()[0])
        # Simple label encoding
        dataset[col] = dataset[col].astype('category').cat.codes
    else:
        # Fill missing numeric with mean
        dataset[col] = dataset[col].fillna(dataset[col].mean())

# Extract Features (X) and Labels (y)
X = dataset.iloc[:, :-1].values
y = dataset.iloc[:, -1].values

# Scale to 0-1 (Neural Networks love small numbers)
sc = MinMaxScaler(feature_range=(0, 1))
X_scaled = sc.fit_transform(X)

print(f"✅ Data Secured. Processed {len(X)} applications.")

✅ Data Secured. Processed 690 applications.


In [3]:
# The Brain of the Watchdog (SimpleSOM)
class SimpleSOM:
    def __init__(self, x, y, input_len):
        self.x = x
        self.y = y
        # Initialize weights randomly
        self.weights = np.random.random((x, y, input_len))
        self.learning_rate = 0.5
        self.radius = max(x, y) / 2
        self.time_constant = 1.0

    def find_winner(self, sample):
        # Find the neuron closest to the input sample
        diff = self.weights - sample
        sq_dist = np.sum(diff**2, axis=2)
        return np.unravel_index(np.argmin(sq_dist), (self.x, self.y))

    def update_weights(self, sample, winner, iteration):
        # Pull the winner and its neighbors closer to the sample
        rad = self.radius * np.exp(-iteration / self.time_constant)
        lr = self.learning_rate * np.exp(-iteration / self.time_constant)
        if rad < 1e-10: rad = 1e-10
        
        for i in range(self.x):
            for j in range(self.y):
                dist = np.sqrt((i - winner[0])**2 + (j - winner[1])**2)
                if dist <= rad:
                    influence = np.exp(-(dist**2) / (2 * (rad**2)))
                    self.weights[i, j] += lr * influence * (sample - self.weights[i, j])

    def train(self, data, num_epochs):
        total_steps = len(data) * num_epochs
        self.time_constant = total_steps / np.log(self.radius)
        step = 0
        for epoch in range(num_epochs):
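            # Visit samples in random order without shuffling the caller's array in place,
            # so X_scaled stays row-aligned with `dataset` and `y`
            for idx in np.random.permutation(len(data)):
                sample = data[idx]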
                winner = self.find_winner(sample)
                self.update_weights(sample, winner, step)
                step += 1

    def distance_map(self):
        # Calculate how different each neuron is from its neighbors
        # High distance = likely outlier (bright cells on the map, circled in red when flagged)
        dmap = np.zeros((self.x, self.y))
        for i in range(self.x):
            for j in range(self.y):
                neighbors = []
                if i > 0: neighbors.append(self.weights[i-1, j])
                if i < self.x-1: neighbors.append(self.weights[i+1, j])
                if j > 0: neighbors.append(self.weights[i, j-1])
                if j < self.y-1: neighbors.append(self.weights[i, j+1])
                
                if len(neighbors) > 0:
                    dists = [np.linalg.norm(self.weights[i, j] - n) for n in neighbors]
                    dmap[i, j] = np.mean(dists)
        # Normalize to 0-1
        return (dmap - dmap.min()) / (dmap.max() - dmap.min())

# Train the Watchdog
som = SimpleSOM(10, 10, X_scaled.shape[1])
som.train(X_scaled, num_epochs=100)
print("✅ Watchdog Trained. The Map has formed.")

✅ Watchdog Trained. The Map has formed.
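
The decay schedule inside `SimpleSOM.train` is easier to follow with numbers. The sketch below recomputes the neighborhood radius and learning rate at the start, midpoint, and end of training, assuming the same settings as the cell above (10x10 grid, 690 samples, 100 epochs, initial learning rate 0.5). The helper name `decay_schedule` is illustrative and not part of the notebook.

```python
import numpy as np

def decay_schedule(step, total_steps, radius0=5.0, lr0=0.5):
    """Return (radius, learning_rate) at a given step, mirroring SimpleSOM."""
    # Same time constant as SimpleSOM.train: total_steps / ln(initial radius)
    time_constant = total_steps / np.log(radius0)
    decay = np.exp(-step / time_constant)
    return radius0 * decay, lr0 * decay

total_steps = 690 * 100  # samples x epochs, as in the training cell

for step in (0, total_steps // 2, total_steps):
    radius, lr = decay_schedule(step, total_steps)
    print(f"step {step:>6}: radius = {radius:.3f}, learning rate = {lr:.3f}")
```

With this time constant the radius shrinks from 5 grid cells to about 1 by the end of training, and the learning rate falls by the same factor of 5, from 0.5 to about 0.1.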


## Part 2: Interactive Security Radar

This is where you make the call.
The Map shows the **topology** of customers.
-   **Dark Areas**: Safe clusters of normal customers.
-   **Bright Areas (White)**: Anomalous behavior. Potential fraud.

### Your Control: The Security Threshold
-   **Low Threshold**: You are paranoid. You flag everyone near a bright spot. (High False Positives).
-   **High Threshold**: You are relaxed. You only flag the absolute worst outliers. (High False Negatives).
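
To see this trade-off in numbers, here is a minimal sketch on a made-up 4x4 distance map. The values in `toy_dmap` are invented for illustration and do not come from the trained SOM; `flagged_cells` is a hypothetical helper that applies the same `dmap > threshold` rule as the radar.

```python
import numpy as np

# Invented, already-normalized distance map (0 = typical, 1 = most unusual)
toy_dmap = np.array([
    [0.10, 0.15, 0.20, 0.28],
    [0.12, 0.25, 0.55, 0.58],
    [0.18, 0.50, 0.90, 0.70],
    [0.22, 0.35, 0.65, 1.00],
])

def flagged_cells(dmap, threshold):
    """Count map cells whose normalized distance exceeds the threshold."""
    return int(np.sum(dmap > threshold))

for threshold in (0.3, 0.6, 0.85):
    n = flagged_cells(toy_dmap, threshold)
    print(f"threshold {threshold:.2f}: {n} of {toy_dmap.size} cells flagged")
```

Lowering the threshold flags more cells, and with them every customer mapped to those cells; raising it leaves only the brightest cells.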

Adjust the slider below to set the bank's security policy.

In [None]:
def plot_interactive_radar(threshold=0.8):
    dmap = som.distance_map()
    outlier_map = dmap > threshold
    
    plt.figure(figsize=(12, 6))
    
    # 1. The Topology Map
    plt.subplot(1, 2, 1)
    plt.title(f"Security Radar (Threshold: {threshold})")
    plt.imshow(dmap, cmap='bone', interpolation='none')
    plt.colorbar(label='Mean Distance (Anomaly Score)')
    
    # Plot Outliers
    count_outliers = 0
    for i, x in enumerate(X_scaled):
        w = som.find_winner(x)
        is_outlier = outlier_map[w[0], w[1]]
        
        if is_outlier:
            count_outliers += 1
            plt.plot(w[1], w[0], 'o', markeredgecolor='red', markerfacecolor='None', markersize=10, markeredgewidth=2)
    
    # 2. Stats Panel
    plt.subplot(1, 2, 2)
    plt.axis('off')
    total = len(X_scaled)
    percent = (count_outliers / total) * 100
    
    text_color = 'green' if percent < 10 else 'orange' if percent < 20 else 'red'
    
    plt.text(0.1, 0.8, f"Total Customers: {total}", fontsize=14)
    plt.text(0.1, 0.6, f"Flagged Risky: {count_outliers}", fontsize=14, fontweight='bold', color=text_color)
    plt.text(0.1, 0.4, f"Risk Percentage: {percent:.1f}%", fontsize=14)
    
    if percent > 25:
        plt.text(0.1, 0.2, "⚠️ SYSTEM OVERLOAD!", fontsize=12, color='red')
        plt.text(0.1, 0.1, "Too many false positives.", fontsize=10)
    elif percent < 1:
        plt.text(0.1, 0.2, "⚠️ SECURITY LAX!", fontsize=12, color='orange')
        plt.text(0.1, 0.1, "You might miss fraud.", fontsize=10)
    else:
        plt.text(0.1, 0.2, "✅ OPTIMAL RANGE", fontsize=12, color='green')

    plt.tight_layout()
    plt.show()

interact(plot_interactive_radar, 
         threshold=FloatSlider(min=0.1, max=0.99, step=0.01, value=0.85, description='Sensitivity'));

interactive(children=(FloatSlider(value=0.85, description='Sensitivity', max=0.99, min=0.1, step=0.01), Output…

## Part 3: The Police (Supervised Training)

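We will now **freeze** your decision.
The cell below uses a fixed threshold of 0.85 (the slider's default), labels every customer mapped to an outlier cell as "High Risk", and trains a Neural Network on those labels.
Once trained, the network scores a new application in a single forward pass and returns a probability instead of a yes/no flag.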

In [5]:
# 1. Generate Labels based on "Standard" threshold of 0.85
dmap = som.distance_map()
outlier_map = dmap > 0.85

y_hybrid = np.zeros(len(X_scaled))
for i, x in enumerate(X_scaled):
    w = som.find_winner(x)
    if outlier_map[w[0], w[1]]:
        y_hybrid[i] = 1 # Mark as Fraud

# 2. Train the ANN
ann = MLPClassifier(hidden_layer_sizes=(16, 16), 
                    activation='relu', 
                    solver='adam', 
                    max_iter=200, 
                    random_state=42)

ann.fit(X_scaled, y_hybrid)
print(f"✅ Neural Police Model Trained. Ready for deployment.")
print(f"Training Data: {len(X_scaled)} records. Fraud Rate in training set: {sum(y_hybrid)/len(y_hybrid)*100:.1f}%")

✅ Neural Police Model Trained. Ready for deployment.
Training Data: 690 records. Fraud Rate in training set: 0.1%
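
A fraud rate of 0.1% on 690 records works out to a single flagged application (1/690 ≈ 0.14%), so the network sees almost no positive examples at this threshold. Before trusting the classifier, it may help to check the label balance. The sketch below uses a hypothetical helper, `label_summary`, on a stand-in label vector built to match the printed numbers; in the notebook you would pass `y_hybrid` instead.

```python
import numpy as np

def label_summary(labels):
    """Return (n_negative, n_positive, positive_share) for 0/1 labels."""
    labels = np.asarray(labels).astype(int)
    counts = np.bincount(labels, minlength=2)
    return int(counts[0]), int(counts[1]), counts[1] / labels.size

# Stand-in for y_hybrid: 690 records with one flagged as high risk
toy_labels = np.zeros(690)
toy_labels[0] = 1

n_neg, n_pos, share = label_summary(toy_labels)
print(f"negatives: {n_neg}, positives: {n_pos}, positive share: {share:.1%}")

# A model that always answers "not fraud" would already reach this accuracy
print(f"all-negative baseline accuracy: {n_neg / (n_neg + n_pos):.1%}")
```

If the positive count is this small, lowering the threshold before generating labels may give the network more examples to learn from, at the cost of more false positives.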


## Part 4: Deployment (The Neural Detective)

The system is live.
We have upgraded the dashboard to **"Neural Detective 2.0"**.
Instead of random IDs, you can now simulate specific scenarios based on customer profiles.

In [6]:
# --- Neural Detective 2.0 ---

# 1. Feature Flavoring (Making data readable)
# The dataset's column names (A1-A16) are anonymized, so the banking terms below are
# illustrative display labels only, not the attributes' true meanings.
feature_names = {
    1: "Credit Score",      # Originally A2
    2: "Debt Load (k$)",    # Originally A3
    7: "Years Employed",    # Originally A8
    14: "Annual Income (k$)" # Originally A15
}

# 2. Scenario Database (Dynamic Selection)
# We auto-detect interesting cases based on the model's actual predictions.
# This ensures 'The Model Citizen' is actually safe, even if the model retrains randomly.
print("🔍 Analyzing database for scenarios...")
probs = ann.predict_proba(X_scaled)[:, 1]

# Find best examples
safe_id = int(np.argmin(probs)) # Lowest probability
risky_id = int(np.argmax(probs)) # Highest probability
borderline_id = int((np.abs(probs - 0.5)).argmin()) # Closest to 0.5

scenarios = {
    'The Model Citizen (Lowest Risk)': safe_id,
    'The Identity Thief (Highest Risk)': risky_id,
    'The Borderline Case (~50%)': borderline_id
}

# Widgets
scenario_dropdown = Dropdown(
    options=scenarios,
    value=safe_id,
    description='Scenario:',
    style={'description_width': 'initial'}
)

btn_scan = Button(description="Run Security Scan", button_style='primary', icon='search')
risk_meter = FloatProgress(
    value=0.0,
    min=0.0,
    max=100.0,
    description='Fraud Probability:',
    bar_style='success', # 'success', 'info', 'warning', 'danger'
    style={'bar_color': 'green', 'description_width': 'initial'}
)

output_kiosk = Output()

def run_scan(b):
    with output_kiosk:
        output_kiosk.clear_output()
        
        cust_id = scenario_dropdown.value
        customer_data = X_scaled[cust_id].reshape(1, -1)
        
        # 1. Get Prediction
        prob = ann.predict_proba(customer_data)[0][1]
        percent = prob * 100
        
        # 2. Update Risk Meter Visuals
        risk_meter.value = percent
        if percent < 20:
            risk_meter.bar_style = 'success'
        elif percent < 60:
            risk_meter.bar_style = 'warning'
        else:
            risk_meter.bar_style = 'danger'
            
        # 3. Print "Dossier"
        print(f"--- SUBJECT DOSSIER: #{cust_id} ---")
        
        # Show Key Features
        raw_row = dataset.iloc[cust_id]
        print("Key Indicators:")
        for idx, name in feature_names.items():
            val = raw_row[idx]
            print(f"  > {name}: {val}")
            
        print("\n--- AI ANALYSIS ---")
        if percent > 85:
            print(f"🚨 FINAL VERDICT: HIGH RISK ({percent:.1f}%)")
            print("RECOMMENDATION: IMMEDIATE FREEZE.")
        elif percent > 50:
            print(f"⚠️ FINAL VERDICT: ELEVATED RISK ({percent:.1f}%)")
            print("RECOMMENDATION: MANUAL REVIEW REQUIRED.")
        else:
            print(f"✅ FINAL VERDICT: SAFE ({percent:.1f}%)")
            print("RECOMMENDATION: APPROVE.")

btn_scan.on_click(run_scan)

display(VBox([
    HTML("<h3>👮‍♂️ Neural Detective Dashboard 2.0</h3>"),
    HTML("<i>Select a profile to analyze. The system will retrieve their financial history and calculate risk.</i>"),
    scenario_dropdown,
    btn_scan,
    risk_meter,
    output_kiosk
]))

🔍 Analyzing database for scenarios...


VBox(children=(HTML(value='<h3>👮\u200d♂️ Neural Detective Dashboard 2.0</h3>'), HTML(value='<i>Select a profil…
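
The verdict bands inside `run_scan` are plain threshold checks, so they can be exercised without any widgets. The sketch below restates them as a standalone function; `verdict_for` is a hypothetical name, and the cut-offs (85 and 50) are copied from the cell above.

```python
def verdict_for(percent):
    """Map a fraud probability in percent to (verdict, recommendation)."""
    if percent > 85:
        return "HIGH RISK", "IMMEDIATE FREEZE."
    if percent > 50:
        return "ELEVATED RISK", "MANUAL REVIEW REQUIRED."
    return "SAFE", "APPROVE."

for p in (3.0, 50.0, 72.5, 91.0):
    verdict, action = verdict_for(p)
    print(f"{p:5.1f}% -> {verdict}: {action}")
```

Because both comparisons are strict, a score of exactly 50% lands in the SAFE band and a score of exactly 85% lands in the ELEVATED RISK band.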