# Neural Network Visualization

- 📺 **Video:** [https://youtu.be/rdohzaGa8aE](https://youtu.be/rdohzaGa8aE)

## Overview
- Interpret hidden representations by probing activations and decision boundaries.
- Use visualization to debug saturated neurons or dead ReLU units.
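
One way to probe a decision boundary in two dimensions is to evaluate the classifier on a regular grid and look for the cells where the predicted class flips. The sketch below assumes a hypothetical `predict` callable that maps an `(n, 2)` array to integer labels; the linear rule in `toy_predict` is only a stand-in for a trained network, and `decision_grid` is an illustrative helper, not part of any library.

```python
import numpy as np

def decision_grid(predict, x_range, y_range, steps=50):
    """Evaluate `predict` on a steps x steps grid and return (xx, yy, labels)."""
    xs = np.linspace(x_range[0], x_range[1], steps)
    ys = np.linspace(y_range[0], y_range[1], steps)
    xx, yy = np.meshgrid(xs, ys)
    points = np.column_stack([xx.ravel(), yy.ravel()])
    labels = predict(points).reshape(xx.shape)
    return xx, yy, labels

# Stand-in classifier (an assumption for illustration): class 1 when x0 + x1 > 0.
def toy_predict(points):
    return (points.sum(axis=1) > 0).astype(int)

xx, yy, labels = decision_grid(toy_predict, (-2, 2), (-2, 2), steps=40)

# A cell whose right-hand neighbour has a different label sits on the boundary.
boundary = labels[:, 1:] != labels[:, :-1]
print("grid shape:", labels.shape)
print("boundary crossings per row (first 5):", boundary.sum(axis=1)[:5])
```

The returned `xx`, `yy`, and `labels` arrays have the shapes that contour-plotting functions typically expect, so the same grid can be drawn once a plotting library is available.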

## Key ideas
- **Activation patterns:** inspect hidden-layer outputs to understand feature learning.
- **Saliency:** the gradient of a prediction with respect to the input highlights the input dimensions that drive that prediction.
- **Dimensionality reduction:** project hidden states with PCA or t-SNE for qualitative analysis.
- **Diagnostics:** spotting dead neurons or exploding activations informs architecture tweaks.
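
As a rough illustration of the diagnostics idea, the sketch below counts how often each ReLU unit fires on a batch and flags units that never fire. The helper name `relu_unit_stats` and the synthetic pre-activations are assumptions made for this example; with a trained network, the hidden layer's pre-activation matrix would be passed in instead.

```python
import numpy as np

def relu_unit_stats(pre_activations, dead_tol=0.0):
    """Per-unit activity statistics for a ReLU layer.

    pre_activations: array of shape (n_samples, n_units), values before ReLU.
    Returns the fraction of samples on which each unit fires and a boolean
    mask of units whose firing fraction is at most `dead_tol` on this batch.
    """
    active = pre_activations > 0
    active_fraction = active.mean(axis=0)
    dead = active_fraction <= dead_tol
    return active_fraction, dead

# Synthetic pre-activations (assumed data): unit 2 is pushed negative
# so that it never fires, which mimics a dead ReLU.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 4))
z[:, 2] = -np.abs(z[:, 2]) - 1.0

active_fraction, dead = relu_unit_stats(z)
print("active fraction per unit:", np.round(active_fraction, 2))
print("dead units:", np.flatnonzero(dead))
```

A unit that is silent on one batch is not necessarily dead for good, so it is worth checking the same statistic on several batches before changing the architecture.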

## Demo
Train a small MLP on the two-moons dataset, collect its hidden activations, project them with PCA, and estimate a simple saliency score by central finite differences, mirroring the lecture (https://youtu.be/v44sQ0IpVDs) on visualization strategies.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.decomposition import PCA

# Two-moons toy data with one-hot labels.
rng = np.random.default_rng(5)
X, y = make_moons(n_samples=400, noise=0.25, random_state=5)
y_onehot = np.eye(2)[y]
X_train, X_test, y_train, y_test = train_test_split(X, y_onehot, test_size=0.3, random_state=5)

# One hidden layer of 10 ReLU units, softmax output over 2 classes.
input_dim, hidden_dim, output_dim = 2, 10, 2
W1 = rng.normal(scale=0.4, size=(input_dim, hidden_dim))
B1 = np.zeros(hidden_dim)
W2 = rng.normal(scale=0.4, size=(hidden_dim, output_dim))
B2 = np.zeros(output_dim)

lr = 0.2

def relu(x):
    return np.maximum(0, x)

# Full-batch gradient descent on the cross-entropy loss.
for epoch in range(1, 301):
    # Forward pass with a numerically stable softmax.
    z1 = X_train @ W1 + B1
    a1 = relu(z1)
    logits = a1 @ W2 + B2
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)
    loss = -np.mean(np.sum(y_train * np.log(probs + 1e-8), axis=1))

    # Backward pass.
    grad_logits = (probs - y_train) / len(X_train)
    grad_W2 = a1.T @ grad_logits
    grad_B2 = grad_logits.sum(axis=0)
    grad_a1 = grad_logits @ W2.T
    grad_z1 = grad_a1 * (z1 > 0)
    grad_W1 = X_train.T @ grad_z1
    grad_B1 = grad_z1.sum(axis=0)

    W2 -= lr * grad_W2
    B2 -= lr * grad_B2
    W1 -= lr * grad_W1
    B1 -= lr * grad_B1

    # Loss and accuracy come from this epoch's forward pass (before the update).
    if epoch % 100 == 0:
        preds = np.argmax(probs, axis=1)
        acc = accuracy_score(np.argmax(y_train, axis=1), preds)
        print(f"epoch {epoch:3d} | loss {loss:.4f} | train acc {acc:.3f}")

# Project the hidden activations of the training set onto two PCA components.
hidden_train = relu(X_train @ W1 + B1)
pca = PCA(n_components=2)
proj = pca.fit_transform(hidden_train)
print()
print('First two PCA components of hidden states (first 5 rows):')
print(proj[:5])

# Saliency for one test point: central finite difference of the
# true-class probability with respect to each input dimension.
sample = X_test[0]
sample_label = np.argmax(y_test[0])
saliency = np.zeros_like(sample)
eps = 1e-3

for i in range(len(sample)):
    # Probability after nudging input dimension i upward.
    pert_pos = sample.copy()
    pert_pos[i] += eps
    logits_pos = relu(pert_pos @ W1 + B1) @ W2 + B2
    prob_pos = np.exp(logits_pos - logits_pos.max())
    prob_pos /= prob_pos.sum()

    # Probability after nudging input dimension i downward.
    pert_neg = sample.copy()
    pert_neg[i] -= eps
    logits_neg = relu(pert_neg @ W1 + B1) @ W2 + B2
    prob_neg = np.exp(logits_neg - logits_neg.max())
    prob_neg /= prob_neg.sum()

    saliency[i] = (prob_pos[sample_label] - prob_neg[sample_label]) / (2 * eps)

print()
print('Saliency scores for first test point:', saliency)
```


```
epoch 100 | loss 0.3270 | train acc 0.854
epoch 200 | loss 0.3057 | train acc 0.850
epoch 300 | loss 0.2969 | train acc 0.861

First two PCA components of hidden states (first 5 rows):
[[ 1.53146837 -0.839845  ]
 [-1.26497468  0.39839427]
 [-1.11865191  0.23210566]
 [ 1.05886699  0.46724442]
 [ 1.70560633  0.13048269]]

Saliency scores for first test point: [-0.10720718  0.11882685]
```
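
The saliency scores above come from finite differences. For a one-hidden-layer ReLU network with a softmax output, the same quantity can also be written in closed form with the chain rule, which gives a useful cross-check. The sketch below is self-contained and uses random weights as a stand-in for the trained parameters (an assumption); `analytic_saliency` and `finite_difference_saliency` are illustrative names, not functions from the demo.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def forward(x, W1, B1, W2, B2):
    """Return hidden pre-activations and class probabilities for one input."""
    z1 = x @ W1 + B1
    a1 = np.maximum(0, z1)
    return z1, softmax(a1 @ W2 + B2)

def analytic_saliency(x, label, W1, B1, W2, B2):
    """Gradient of the probability of `label` with respect to the input x."""
    z1, probs = forward(x, W1, B1, W2, B2)
    onehot = np.eye(len(probs))[label]
    grad_logits = probs[label] * (onehot - probs)  # row of the softmax Jacobian
    grad_z1 = (W2 @ grad_logits) * (z1 > 0)        # back through the ReLU
    return W1 @ grad_z1                            # back to the input

def finite_difference_saliency(x, label, W1, B1, W2, B2, eps=1e-5):
    """Central-difference estimate of the same gradient."""
    grad = np.zeros_like(x)
    for i in range(len(x)):
        step = np.zeros_like(x)
        step[i] = eps
        _, p_pos = forward(x + step, W1, B1, W2, B2)
        _, p_neg = forward(x - step, W1, B1, W2, B2)
        grad[i] = (p_pos[label] - p_neg[label]) / (2 * eps)
    return grad

# Random weights stand in for the trained parameters (an assumption).
rng = np.random.default_rng(0)
W1, B1 = rng.normal(scale=0.4, size=(2, 10)), np.zeros(10)
W2, B2 = rng.normal(scale=0.4, size=(10, 2)), np.zeros(2)
x = np.array([0.5, -0.3])

g_analytic = analytic_saliency(x, 1, W1, B1, W2, B2)
g_numeric = finite_difference_saliency(x, 1, W1, B1, W2, B2)
print("analytic:", g_analytic)
print("numeric: ", g_numeric)
```

The two estimates should agree closely unless the input lies almost exactly on a ReLU kink, where the gradient is not defined and the finite difference averages across the two sides.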


## Try it
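- Modify the demo: change `hidden_dim`, `lr`, or the `noise` level and compare the loss, the PCA projection, and the saliency scores.
- Add a tiny dataset or counter-example, such as a test point near the class boundary, and check whether the saliency scores change in sign or magnitude.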


## References
- [Eisenstein 4.2](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Multiclass lecture note](https://www.cs.utexas.edu/~gdurrett/courses/online-course/multiclass.pdf)
- [A large annotated corpus for learning natural language inference](https://www.aclweb.org/anthology/D15-1075/)
- [Authorship Attribution of Micro-Messages](https://www.aclweb.org/anthology/D13-1193/)
- [50 Years of Test (Un)fairness: Lessons for Machine Learning](https://arxiv.org/pdf/1811.10104.pdf)
- [[Article] Amazon scraps secret AI recruiting tool that showed bias against women](https://www.reuters.com/article/us-amazon-com-jobs-automation-insight/amazon-scraps-secret-ai-recruiting-tool-that-showed-bias-against-women-idUSKCN1MK08G)
- [[Blog] Neural Networks, Manifolds, and Topology](http://colah.github.io/posts/2014-03-NN-Manifolds-Topology/)
- [Eisenstein Chapter 3.1-3.3](https://github.com/jacobeisenstein/gt-nlp-class/blob/master/notes/eisenstein-nlp-notes.pdf)
- [Dropout: a simple way to prevent neural networks from overfitting](https://dl.acm.org/doi/10.5555/2627435.2670313)
- [Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift](https://arxiv.org/abs/1502.03167)
- [Adam: A Method for Stochastic Optimization](https://arxiv.org/abs/1412.6980)
- [The Marginal Value of Adaptive Gradient Methods in Machine Learning](https://papers.nips.cc/paper/2017/hash/81b3833e2504647f9d794f7d7b9bf341-Abstract.html)


*Links only; we do not redistribute slides or papers.*