Interactive visualization of Transformer attention over real text, built with React + Vite + TypeScript and styled with Tailwind. It demonstrates how attention probabilities are formed, how sparsity patterns change compute, and how choices like positional encodings impact behavior.
The app takes user input text, tokenizes it, computes simple content embeddings, projects them into Q/K, builds an attention matrix under a chosen structural pattern (full, sliding window, dilated, Longformer/BigBird‑style, LogSparse), and renders an interactive heatmap. A sidebar reports approximate compute/memory metrics including KV‑cache size and prefill/decode pair counts. Controls allow you to toggle causal masks, positional strategies, simple prompt compression, and parameters of the sparsity pattern.
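The sidebar's compute/memory estimates follow standard attention accounting. A minimal sketch of that arithmetic (hypothetical helper names, not the app's actual code; assumes fp16 keys/values):

```typescript
// Rough compute/memory estimates for attention (illustrative only; the
// field names and helpers here are assumptions, not the app's real API).
interface ModelDims {
  nLayers: number;      // transformer layers
  nKvHeads: number;     // KV heads (== query heads unless GQA/MQA)
  headDim: number;      // per-head dimension
  bytesPerElem: number; // 2 for fp16
}

// KV-cache size: one K and one V vector per token, per layer, per KV head.
function kvCacheBytes(d: ModelDims, seqLen: number): number {
  return 2 * d.nLayers * d.nKvHeads * d.headDim * d.bytesPerElem * seqLen;
}

// Dense attention pair counts: prefill scores all n^2 pairs at once
// (or n(n+1)/2 under a causal mask); each decode step scores one new
// query against every cached key plus itself.
function prefillPairs(n: number, causal: boolean): number {
  return causal ? (n * (n + 1)) / 2 : n * n;
}
function decodePairsPerStep(cachedLen: number): number {
  return cachedLen + 1;
}
```

Sparse patterns reduce these pair counts; the sidebar reports the pattern‑adjusted figures.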
High‑level files under `src/`:

- `App.tsx` — Main UI. Wires text input, attention computation, heatmap rendering, token tape highlighting, and the metrics panel. Contains controls for model dimensions, positional encodings (sin, ALiBi, RoPE, none), attention sparsity (full, sliding, dilated, longformer, bigbird, logsparse), causal masking, and KV‑cache presets.
- `main.tsx` — React bootstrap that mounts `<App />` into `#root` and imports global styles.
- `index.css` — Global Tailwind styles and app‑level tweaks.
Library modules under `src/lib/`:

- `tokenizer.ts` — Small, deterministic tokenizer returning `{ raw, norm, isWord }` tokens; not a BPE, optimized for clarity. Includes a compact `STOP` set used to down‑weight frequent function words when desired.
- `embed.ts` — Untrained embedding utilities:
  - `charTrigrams()` builds character trigrams with boundary markers.
  - `embedToken()` produces hashed embeddings (subword + identity features) with L2 normalization, enough for meaningful similarity without training.
  - `sinusoidalPE()` classic sin/cos position encodings.
  - `randomLinear()` seeded random projection matrix.
  - `matmulRows()` row‑wise matrix multiply helper.
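The hashed‑embedding idea can be sketched as follows (a simplified, hypothetical re‑implementation; the real module differs in details such as identity features):

```typescript
// Character trigrams with boundary markers, e.g. "cat" -> ["^ca","cat","at$"].
function charTrigrams(token: string): string[] {
  const s = `^${token}$`;
  const grams: string[] = [];
  for (let i = 0; i + 3 <= s.length; i++) grams.push(s.slice(i, i + 3));
  return grams;
}

// FNV-1a-style 32-bit string hash for bucketing features.
function hash32(s: string): number {
  let h = 2166136261;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return h >>> 0;
}

// Hashed embedding: each trigram bumps one bucket; L2-normalize at the end
// so dot products between tokens behave like cosine similarity.
function embedToken(token: string, dim = 64): number[] {
  const v = new Array<number>(dim).fill(0);
  for (const g of charTrigrams(token)) v[hash32(g) % dim] += 1;
  const norm = Math.hypot(...v) || 1;
  return v.map((x) => x / norm);
}
```

Because similar strings share trigrams, related tokens land in overlapping buckets and get meaningfully high similarity with no training at all.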
- `attention.ts` — Core attention implementation:
  - Supports multiple structural patterns: `full`, `sliding`, `dilated` (LongNet‑style), `longformer` (window + globals), `bigbird` (window + globals + random), `logsparse` (powers‑of‑two distances; O(n log n)).
  - Options for causal masking, locality bias, ALiBi bias, RoPE rotation of Q/K, key penalties, and prompt compression (key mask).
  - Returns the probability matrix `A` after row‑wise softmax.
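As one concrete example, a causal sliding‑window pattern followed by row‑wise softmax can be sketched like this (an illustrative stand‑in, not the module's actual code):

```typescript
// Turn an n×n logit matrix into attention probabilities under a causal
// sliding-window mask: query i may only attend to keys j with
// i - window < j <= i. Masked entries get -Infinity before softmax.
function slidingWindowSoftmax(logits: number[][], window: number): number[][] {
  return logits.map((row, i) => {
    const masked = row.map((z, j) =>
      j <= i && j > i - window ? z : -Infinity
    );
    const m = Math.max(...masked);          // subtract max for stability
    const exps = masked.map((z) => Math.exp(z - m));
    const sum = exps.reduce((a, b) => a + b, 0);
    return exps.map((e) => e / sum);        // each row sums to 1
  });
}
```

With uniform logits and `window = 2`, row 0 puts all mass on token 0 and later rows split mass evenly over their two visible keys, which is exactly the banded structure the heatmap shows.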
- `draw.ts` — Canvas heatmap rendering utilities:
  - `drawHeatmap()` maps probabilities to HSL colors and draws an optional cell highlight.
  - `drawTiles()` overlays a Q×K tiling grid (useful when discussing tiled implementations like FlashAttention).
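A row‑contrast color mapping of the kind the heatmap uses can be sketched as follows (hypothetical helper with an arbitrary hue ramp; the gamma default of 0.6 matches the note below, but the rest is illustrative):

```typescript
// Map one row of attention probabilities to HSL color strings.
// Row-contrast mode rescales each row by its own max so within-row
// structure stays visible; gamma < 1 lifts small values.
function rowToColors(row: number[], gamma = 0.6): string[] {
  const max = Math.max(...row) || 1;        // avoid divide-by-zero on empty rows
  return row.map((p) => {
    const t = Math.pow(p / max, gamma);     // 0..1 display intensity
    const hue = 220 - 180 * t;              // cool (low) -> warm (high); arbitrary choice
    const light = 95 - 60 * t;              // darker cells = stronger attention
    return `hsl(${hue}, 80%, ${light}%)`;
  });
}
```

Absolute scaling would instead divide by a fixed 1.0, making magnitudes comparable across rows at the cost of washing out low‑entropy rows.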
- `utils.ts` — Small helpers: `clamp`, `range`, a 32‑bit string hash, a seeded PRNG, `randn`, and number/byte formatters.
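A seeded PRNG plus a Gaussian sampler of the kind `utils.ts` describes can be sketched with the standard mulberry32 and Box–Muller algorithms (shown for illustration; the module's actual implementations may differ):

```typescript
// mulberry32: tiny seeded PRNG returning floats in [0, 1).
// Same seed -> same sequence, which keeps projections reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// randn via Box–Muller: two uniforms -> one standard normal sample.
function randn(rand: () => number): number {
  const u = 1 - rand(); // shift into (0, 1] so log(u) is defined
  const v = rand();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}
```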
Components under `src/components/`:

- `Info.tsx` — Small “i” button with a floating tooltip/popover rendered via portal.
- `Select.tsx` — Minimal selectable list with a popover listbox.
- `GuidedTour.tsx` — Lightweight overlay that highlights key UI regions step‑by‑step.
- `ConfigPresets.tsx` — Dropdown for applying mode presets (dense baseline, sliding window, sparse + compression, modern LLM approximation).
Rendering pipeline:

1. Tokenize the prompt (`tokenizer.ts`).
2. For each token, compute a hashed content embedding (`embedToken`).
3. Build sinusoidal positional encodings and seed random projection matrices (`sinusoidalPE`, `randomLinear`).
4. Form Q/K via row‑wise matmul; optionally inject positional bias/transform (ALiBi, RoPE).
5. Compute attention logits with structural masks and biases; apply softmax → probabilities (`attention.ts`).
6. Draw the heatmap (`drawHeatmap`) and token tape overlays; update the metrics panel with pair counts, ops, and memory estimates (computed in `App.tsx`).
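Step 3's classic sin/cos encodings, for instance, can be sketched as (a simplified stand‑in; the real `sinusoidalPE` may differ in frequency bookkeeping):

```typescript
// Classic sinusoidal position encoding: pe[2k] = sin(pos / 10000^(2k/d)),
// pe[2k+1] = cos(pos / 10000^(2k/d)).
function sinusoidalPE(pos: number, dim: number): number[] {
  const pe = new Array<number>(dim).fill(0);
  for (let i = 0; i < dim; i += 2) {
    const freq = Math.pow(10000, -i / dim);
    pe[i] = Math.sin(pos * freq);
    if (i + 1 < dim) pe[i + 1] = Math.cos(pos * freq);
  }
  return pe;
}

// Inject position by adding the encoding to each token's content embedding
// before the Q/K projections.
function addPE(embeds: number[][]): number[][] {
  return embeds.map((e, pos) => {
    const pe = sinusoidalPE(pos, e.length);
    return e.map((x, i) => x + pe[i]);
  });
}
```

ALiBi and RoPE take different routes: ALiBi adds a distance‑proportional bias to the logits instead of the embeddings, while RoPE rotates Q/K pairs by a position‑dependent angle.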
Install and start the dev server:

```sh
npm install
npm run dev
```

Open the printed local URL to interact with the UI.
```sh
npm run build    # type‑check and bundle
npm run preview  # static preview server
```
This repo is configured to deploy via GitHub Actions:

- Vite `base` is set to `./` so assets resolve correctly when hosted under a subpath (e.g., `https://<user>.github.io/<repo>/`).
- Workflow file: `.github/workflows/deploy.yml` builds the app and publishes the `dist/` folder to GitHub Pages.
Steps:

1. Push to `main` (or `master`) to trigger the deploy workflow.
2. In the repository Settings → Pages, set Source to “GitHub Actions”.
3. After deployment, your site will be available at `https://<user>.github.io/<repo>/`.
Run the test suite:

```sh
npm run test
```
Notes:
- Tests cover the attention math (including sparsity patterns), positional strategies/penalties, and metrics math. UI rendering is not unit‑tested.
- If your environment shows a tinypool shutdown warning after all tests pass, it’s benign; the suite completes and reports results correctly.
- Heatmap color scale defaults to row‑contrast with gamma 0.6 to emphasize within‑row differences. Switch to absolute scaling to compare magnitudes across rows.
- Toggle the causal mask to see autoregressive behavior (upper‑triangle masked out).
- Use the presets to load sample texts that surface repeated tokens, ambiguity, and long‑range references.
- Try “Start Guided Tour” for a quick walkthrough of the text, heatmap, and compute views, including a demo of quadratic scaling and sliding‑window optimization.
- Use “Mode Presets” to quickly switch between Baseline (dense), Efficient (sliding window), Sparse & Focused (LogSparse+compression), and Modern LLM (RoPE+GQA+FlashAttention) configurations.