# DeepKT + Wide&Deep IRT Visualizations

This notebook generates static visualizations for the key components of the system:
1. **Attention Heatmap**: Visualizing the self-attention mechanism in SAKT.
2. **WD-IRT Parameters**: Item Difficulty vs. Discrimination.
3. **Student Mastery Trajectory**: Evolution of skill mastery over time.
4. **Gaming Detection**: Identifying rapid guessing behavior.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import json
from pathlib import Path

# Set style
sns.set_theme(style="whitegrid")
plt.rcParams['figure.figsize'] = [10, 6]
plt.rcParams['figure.dpi'] = 100

# Paths
DATA_DIR = Path('../docs/data')
REPORTS_DIR = Path('../reports')


## 1. Attention Heatmap
Visualizing how the model attends to past interactions (Keys) when predicting the current one (Query).
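
As a rough illustration of what each heatmap row represents, the sketch below computes causally masked, scaled dot-product attention weights for a toy sequence. The function name `causal_attention_weights`, the random embeddings, and the dimensions are illustrative assumptions; this is not the SAKT model's actual code, and it assumes each position may attend to itself and to earlier positions only.

```python
import numpy as np

def causal_attention_weights(queries, keys):
    """Row-wise softmax of scaled dot products; position i only sees positions <= i."""
    d = queries.shape[-1]
    scores = queries @ keys.T / np.sqrt(d)
    # Mask future positions (strict upper triangle) so they get zero weight
    future = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    # Numerically stable softmax over each row
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

# Toy inputs (hypothetical): 5 interactions with 8-dimensional embeddings
rng = np.random.default_rng(0)
q = rng.normal(size=(5, 8))
k = rng.normal(size=(5, 8))
attn = causal_attention_weights(q, k)
```

Each row of `attn` sums to 1, and entries above the diagonal are zero, which is why a heatmap of such weights is lower-triangular.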

In [3]:
def plot_attention_heatmap():
    # Load data. Indexing a top-level JSON list with ['data'] raises
    # "TypeError: list indices must be integers or slices, not str",
    # so accept both a {"data": ...} wrapper and a bare payload.
    with open(DATA_DIR / 'attention_sample.json') as f:
        payload = json.load(f)
    data = payload.get('data', payload) if isinstance(payload, dict) else payload

    # Assumption: if the exporter wrote a list of samples, plot the first one.
    if isinstance(data, list) and data and isinstance(data[0], dict):
        data = data[0]

    # Expect graph data from the exporter: a dict with 'nodes' and 'links'
    if not (isinstance(data, dict) and 'nodes' in data and 'links' in data):
        print("Data format not recognized for heatmap.")
        return

    nodes = data['nodes']
    links = data['links']
    n = len(nodes)
    matrix = np.zeros((n, n))

    # Map node IDs to matrix indices
    id_map = {node['id']: i for i, node in enumerate(nodes)}
    labels = [node.get('label', node['id']) for node in nodes]

    for link in links:
        src = id_map.get(link['source'])
        tgt = id_map.get(link['target'])
        if src is not None and tgt is not None:
            # Target (query, row) attends to source (key, column)
            matrix[tgt, src] = link['weight']

    plt.figure(figsize=(10, 8))
    sns.heatmap(matrix, xticklabels=labels, yticklabels=labels, cmap="viridis", annot=True, fmt=".2f")
    plt.title("Self-Attention Weights (Query vs Key)")
    plt.xlabel("Key (Past Interaction)")
    plt.ylabel("Query (Current Prediction)")
    plt.show()

plot_attention_heatmap()

## 2. WD-IRT Parameters
Scatter plot of Item Difficulty vs. Discrimination, color-coded by item health.
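
To make the two axes concrete, here is a minimal sketch of the standard two-parameter logistic (2PL) item response function, in which difficulty shifts the curve and discrimination controls its slope. The WD-IRT model in this project may use a different parameterization; the function name `two_pl_probability` and the example values are assumptions for illustration only.

```python
import numpy as np

def two_pl_probability(theta, difficulty, discrimination):
    """P(correct) under the standard 2PL model: sigmoid(a * (theta - b))."""
    theta = np.asarray(theta, dtype=float)
    return 1.0 / (1.0 + np.exp(-discrimination * (theta - difficulty)))

# Hypothetical ability grid and two hypothetical items
abilities = np.linspace(-3, 3, 7)
easy_flat = two_pl_probability(abilities, difficulty=-1.0, discrimination=0.5)
hard_sharp = two_pl_probability(abilities, difficulty=1.0, discrimination=2.0)
```

Under this model a student whose ability equals the item difficulty has a 50% chance of answering correctly, and a larger discrimination makes the probability change more sharply around that point.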

In [None]:
def plot_irt_params():
    # Load data
    with open(DATA_DIR / 'wd_irt_params.json') as f:
        raw_data = json.load(f)['data']
    
    df = pd.DataFrame(raw_data)
    
    plt.figure(figsize=(10, 6))
    sns.scatterplot(
        data=df, 
        x='difficulty', 
        y='discrimination', 
        hue='health_status',
        palette={'good': 'green', 'warning': 'orange', 'critical': 'red'},
        alpha=0.7
    )
    
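    plt.title("Item Response Theory Parameters")
    plt.xlabel("Difficulty")
    plt.ylabel("Discrimination")
    # Reference lines at difficulty = 0 and discrimination = 1
    plt.axvline(0, color='gray', linestyle='--', alpha=0.5)
    plt.axhline(1, color='gray', linestyle='--', alpha=0.5)
    plt.show()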

plot_irt_params()


## 3. Student Mastery Trajectory
Tracking how a student's mastery of a skill evolves over time.
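
When a student has practiced many skills, the legend can become crowded. One option is to keep only the most-practiced skills before plotting, as sketched here. The helper `filter_top_skills`, the cutoff `k`, and the toy rows are hypothetical; only the column names are taken from the plotting code in this notebook.

```python
import pandas as pd

def filter_top_skills(student_df, k=3):
    """Keep only rows for the k skills with the most interactions."""
    top = student_df['skill_id'].value_counts().nlargest(k).index
    return student_df[student_df['skill_id'].isin(top)]

# Toy data (hypothetical) using the same columns as sakt_mastery.json
toy = pd.DataFrame({
    'user_id': ['u1'] * 6,
    'skill_id': ['A', 'A', 'A', 'B', 'B', 'C'],
    'sequence_position': [1, 2, 3, 4, 5, 6],
    'mastery_score': [0.30, 0.45, 0.60, 0.40, 0.55, 0.50],
})
filtered = filter_top_skills(toy, k=2)
```

The filtered frame can be passed to `sns.lineplot` in place of the full per-student frame.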

In [None]:
def plot_mastery_trajectory():
    # Load data
    with open(DATA_DIR / 'sakt_mastery.json') as f:
        raw_data = json.load(f)['data']
    
    df = pd.DataFrame(raw_data)
    
    # Filter for a single student (the first one in the data); all of their skills are plotted
    if not df.empty:
        student_id = df['user_id'].iloc[0]
        student_df = df[df['user_id'] == student_id]
        
        plt.figure(figsize=(12, 6))
        sns.lineplot(
            data=student_df, 
            x='sequence_position', 
            y='mastery_score', 
            hue='skill_id',
            marker='o'
        )
        
        plt.title(f"Mastery Trajectory for Student {student_id}")
        plt.ylim(0, 1)
        plt.ylabel("Mastery Probability")
        plt.xlabel("Interaction Sequence")
        plt.show()
    else:
        print("No mastery data available.")

plot_mastery_trajectory()


## 4. Gaming Detection
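Identifying rapid guessing behavior. The exported file holds alert-level summaries rather than raw events, so the plot shows Rapid Guess Percentage vs. Severity instead of Response Time vs. Correctness.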
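
For context, a per-student rapid-guess percentage could be derived from event-level data roughly as follows. This is a minimal sketch on simulated events: the column `latency_s`, the 3-second threshold, and the function `summarize_rapid_guessing` are assumptions, and the actual detector behind `gaming_alerts.json` may work differently (including whether the percentage is stored on a 0 to 100 or a 0 to 1 scale).

```python
import pandas as pd

def summarize_rapid_guessing(events, threshold_s=3.0):
    """Per-user share of responses answered faster than threshold_s seconds."""
    flagged = events.assign(rapid=events['latency_s'] < threshold_s)
    summary = flagged.groupby('user_id').agg(
        total_events=('rapid', 'size'),
        rapid_guess_pct=('rapid', 'mean'),
    )
    summary['rapid_guess_pct'] *= 100  # express as a percentage
    return summary.reset_index()

# Simulated (latency, correct) events; values are made up for illustration
events = pd.DataFrame({
    'user_id': ['u1', 'u1', 'u1', 'u1', 'u2', 'u2'],
    'latency_s': [1.2, 0.8, 15.0, 2.5, 12.0, 20.0],
    'correct': [0, 0, 1, 0, 1, 1],
})
summary = summarize_rapid_guessing(events)
```

In this toy data, `u1` answers three of four items in under three seconds, while `u2` has no rapid responses.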

In [None]:
def plot_gaming_detection():
    # Load data
    with open(DATA_DIR / 'gaming_alerts.json') as f:
        raw_data = json.load(f)['data']
        
    # This file contains alert-level summaries, not the underlying events.
    # A response-time scatter plot would need (latency, correct) pairs,
    # so this cell visualizes the alerts themselves.
    
    df = pd.DataFrame(raw_data)
    
    if not df.empty:
        plt.figure(figsize=(10, 6))
        
        # Plot alerts
        sns.scatterplot(
            data=df,
            x='rapid_guess_pct',
            y='severity',
            hue='severity',
            size='total_events',
            palette='viridis'
        )
        
        plt.title("Gaming Detection Alerts")
        plt.xlabel("Rapid Guess Percentage")
        plt.ylabel("Severity Score")
        plt.show()
    else:
        print("No gaming alerts found.")

plot_gaming_detection()
