Context Viz — Attention & Context Window Visualizer

Interactive visualization of Transformer attention over real text, built with React + Vite + TypeScript and styled with Tailwind. It demonstrates how attention probabilities are formed, how sparsity patterns change compute, and how choices like positional encodings impact behavior.

Overview

The app takes user‑input text, tokenizes it, computes simple content embeddings, projects them into Q/K, builds an attention matrix under a chosen structural pattern (full, sliding window, dilated, Longformer/BigBird‑style, LogSparse), and renders an interactive heatmap. A sidebar reports approximate compute/memory metrics, including KV‑cache size and prefill/decode pair counts. Controls let you toggle the causal mask, switch positional strategies, enable simple prompt compression, and adjust the parameters of the sparsity pattern.
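
For intuition, the sketch below shows one standard way such a probability matrix can be formed: each query is dotted with every allowed key, scaled by the square root of the head dimension, and the masked logits go through a row‑wise softmax. It is an illustrative simplification, not the code in src/lib/attention.ts, and every name in it is hypothetical.

// Illustrative only: scaledDotAttention and its signature are hypothetical and
// do not correspond to the actual exports of src/lib/attention.ts.
type Matrix = number[][];

function scaledDotAttention(Q: Matrix, K: Matrix, mask: boolean[][]): Matrix {
  const d = Q[0].length; // head dimension
  return Q.map((q, i) => {
    // Logits: dot(q, k) / sqrt(d); disallowed (i, j) pairs get -Infinity.
    const logits = K.map((k, j) => {
      if (!mask[i][j]) return -Infinity;
      let dot = 0;
      for (let t = 0; t < d; t++) dot += q[t] * k[t];
      return dot / Math.sqrt(d);
    });
    // Row-wise softmax turns the logits into probabilities that sum to 1.
    // (At least one key must be allowed per row, e.g. the token itself.)
    const max = Math.max(...logits);
    const exps = logits.map((x) => Math.exp(x - max));
    const sum = exps.reduce((a, b) => a + b, 0);
    return exps.map((e) => e / sum);
  });
}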

Project Structure

High‑level files under src/:

  • App.tsx — Main UI. Wires together the text input, attention computation, heatmap rendering, token‑tape highlighting, and the metrics panel. Contains controls for model dimensions, positional encodings (sinusoidal, ALiBi, RoPE, none), attention sparsity (full, sliding, dilated, longformer, bigbird, logsparse), causal masking, and KV‑cache presets.
  • main.tsx — React bootstrap that mounts <App /> into #root and imports global styles.
  • index.css — Global Tailwind styles and app‑level tweaks.

Library modules under src/lib/:

  • tokenizer.ts — Small, deterministic tokenizer returning { raw, norm, isWord } tokens; it is not a BPE tokenizer and is optimized for clarity rather than fidelity. Includes a compact STOP set used to down‑weight frequent function words when desired.
  • embed.ts — Untrained embedding utilities (a hashed‑embedding sketch follows this list):
    • charTrigrams() builds character trigrams with boundary markers.
    • embedToken() produces hashed embeddings (subword + identity features) with L2 normalization, enough for meaningful similarity without training.
    • sinusoidalPE() generates classic sin/cos positional encodings.
    • randomLinear() builds a seeded random projection matrix.
    • matmulRows() is a row‑wise matrix multiply helper.
  • attention.ts — Core attention implementation (a simplified mask sketch follows this list):
    • Supports multiple structural patterns: full, sliding, dilated (LongNet‑style), longformer (window + globals), bigbird (window + globals + random), logsparse (powers‑of‑two distances; O(n log n)).
    • Options for causal masking, locality bias, ALiBi bias, RoPE rotation of Q/K, key penalties, and prompt compression (key mask).
    • Returns the probability matrix A after row‑wise softmax.
  • draw.ts — Canvas heatmap rendering utilities:
    • drawHeatmap() maps probabilities to HSL colors and draws optional cell highlight.
    • drawTiles() overlays a Q×K tiling grid (useful when discussing tiled implementations like FlashAttention).
  • utils.ts — Small helpers: clamp, range, a 32‑bit string hash, PRNG, randn, and number/byte formatters.
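
As a rough illustration of the hashed embeddings described for embed.ts, the sketch below hashes boundary‑marked character trigrams into a fixed‑dimension vector with signed buckets and L2 normalization. The Token shape matches the tokenizer description above; everything else (function names, hash choice, dimensions) is an assumption rather than the repository's implementation.

// Sketch of hashed trigram embeddings; the real embed.ts differs in details.
interface Token { raw: string; norm: string; isWord: boolean }

// FNV-1a style 32-bit string hash (an illustrative choice of hash function).
function hash32(s: string): number {
  let h = 2166136261;
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 16777619);
  }
  return h >>> 0;
}

// Character trigrams with boundary markers, e.g. "cat" -> "#ca", "cat", "at#".
function charTrigramsSketch(norm: string): string[] {
  const padded = `#${norm}#`;
  const grams: string[] = [];
  for (let i = 0; i + 3 <= padded.length; i++) grams.push(padded.slice(i, i + 3));
  return grams;
}

// Signed hashing ("hashing trick") into a small vector, then L2 normalization.
function embedTokenSketch(token: Token, dim = 64): number[] {
  const v = new Array<number>(dim).fill(0);
  for (const g of charTrigramsSketch(token.norm)) {
    const h = hash32(g);
    v[h % dim] += (h & 1) === 0 ? 1 : -1;
  }
  const norm = Math.hypot(...v) || 1;
  return v.map((x) => x / norm);
}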
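
Conceptually, each structural pattern in attention.ts determines which query/key pairs are allowed to interact. The sketch below gives plausible rules for two of them (sliding window and LogSparse) plus causal masking; it omits the global and random blocks of the Longformer/BigBird patterns, and its names and defaults are hypothetical.

// Which key j may query i attend to? Illustrative rules only; the parameter
// names and defaults do not mirror attention.ts.
type PatternSketch = "full" | "sliding" | "logsparse";

function allowedPair(i: number, j: number, pattern: PatternSketch, window = 8, causal = true): boolean {
  if (causal && j > i) return false;                 // no attending to future tokens
  if (pattern === "full") return true;               // every remaining pair: O(n^2)
  const dist = Math.abs(i - j);
  if (pattern === "sliding") return dist <= window;  // local band around the diagonal
  // logsparse: self, immediate neighbors, and powers-of-two distances: O(n log n) pairs
  return dist <= 1 || (dist & (dist - 1)) === 0;
}

function buildMaskSketch(n: number, pattern: PatternSketch): boolean[][] {
  return Array.from({ length: n }, (_, i) =>
    Array.from({ length: n }, (_, j) => allowedPair(i, j, pattern))
  );
}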

Components under src/components/:

  • Info.tsx — Small “i” button with a floating tooltip/popover rendered via portal.
  • Select.tsx — Minimal selectable list with a popover listbox.
  • GuidedTour.tsx — Lightweight overlay that highlights key UI regions step‑by‑step.
  • ConfigPresets.tsx — Dropdown for applying mode presets (dense baseline, sliding window, sparse+compression, modern LLM approx.).

Data Flow

  1. Tokenize the prompt (tokenizer.ts).
  2. For each token, compute a hashed content embedding (embedToken).
  3. Build sinusoidal positional encodings and seed random projection matrices (sinusoidalPE, randomLinear).
  4. Form Q/K via row‑wise matmul; optionally inject positional bias/transform (ALiBi, RoPE).
  5. Compute attention logits with structural masks and biases; apply softmax → probabilities (attention.ts).
  6. Draw the heatmap (drawHeatmap) and token tape overlays. Update metrics panel with pair counts, ops, and memory estimates (computed in App.tsx).
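
To make step 6 concrete, the sketch below shows one plausible way to estimate the kinds of quantities the metrics panel reports. These are standard back‑of‑the‑envelope formulas, not the exact expressions used in App.tsx, and the field names and defaults are assumptions.

// Rough estimates in the spirit of the metrics panel; App.tsx may compute
// these differently. All names and defaults here are assumptions.
interface ModelShapeSketch {
  layers: number;
  kvHeads: number;      // heads that keep K/V (fewer than query heads under GQA)
  headDim: number;
  bytesPerElem: number; // e.g. 2 for fp16
}

// KV cache holds one K and one V vector per token, per layer, per KV head.
function kvCacheBytes(seqLen: number, m: ModelShapeSketch): number {
  return 2 * m.layers * m.kvHeads * m.headDim * m.bytesPerElem * seqLen;
}

// Prefill scores every allowed (query, key) pair at once; under a full causal
// mask that is n * (n + 1) / 2 pairs, i.e. quadratic in sequence length.
function prefillPairs(n: number): number {
  return (n * (n + 1)) / 2;
}

// Each decode step attends from one new query to the cached keys (capped by
// the window size under sliding-window attention).
function decodePairsPerStep(cachedTokens: number, window = Infinity): number {
  return Math.min(cachedTokens + 1, window);
}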

Running Locally

Install and start the dev server:

npm install
npm run dev

Open the printed local URL to interact with the UI.

Building & Preview

npm run build   # type‑check and bundle
npm run preview # static preview server

Deploying to GitHub Pages

This repo is configured to deploy via GitHub Actions:

  • Vite base is set to ./ so assets resolve correctly when hosted under a subpath (e.g., https://<user>.github.io/<repo>/); a minimal config sketch follows this list.
  • Workflow file: .github/workflows/deploy.yml builds the app and publishes the dist/ folder to GitHub Pages.
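
For reference, a minimal vite.config.ts with a relative base looks like the sketch below; the repository's actual config may include more options, and the React plugin import is assumed from the React + Vite setup rather than quoted from the repo.

// vite.config.ts (illustrative; the repo's actual config may differ)
import { defineConfig } from "vite";
import react from "@vitejs/plugin-react";

export default defineConfig({
  base: "./",          // relative base so assets resolve under /<repo>/ on Pages
  plugins: [react()],
});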

Steps:

  • Push to main (or master) to trigger the deploy workflow.
  • In the repository Settings → Pages, set Source to “GitHub Actions”.
  • After deployment, your site will be available at https://<user>.github.io/<repo>/.

Tests (Vitest)

npm run test

Notes:

  • Tests cover the attention math (including sparsity patterns), positional strategies/penalties, and metrics math; UI rendering is not unit‑tested. A small illustrative example follows these notes.
  • If your environment shows a tinypool shutdown warning after all tests pass, it’s benign; the suite completes and reports results correctly.
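
The example below is only meant to show the flavor of such tests: each row of a softmaxed probability matrix should sum to 1. It is self‑contained and does not import the repository's attention.ts, whose export names are not documented here.

// Illustrative Vitest test; it checks a row-stochastic invariant on a local
// softmax helper rather than on the repository's actual attention exports.
import { describe, expect, it } from "vitest";

function rowSoftmax(logits: number[][]): number[][] {
  return logits.map((row) => {
    const max = Math.max(...row);
    const exps = row.map((x) => Math.exp(x - max));
    const sum = exps.reduce((a, b) => a + b, 0);
    return exps.map((e) => e / sum);
  });
}

describe("attention probabilities", () => {
  it("sums to 1 across each row after softmax", () => {
    const A = rowSoftmax([
      [0.2, -1.3, 0.7],
      [1.0, 0.0, -2.0],
    ]);
    for (const row of A) {
      expect(row.reduce((a, b) => a + b, 0)).toBeCloseTo(1, 6);
    }
  });
});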

Usage Tips

  • Heatmap color scale defaults to row‑contrast with gamma 0.6 to emphasize within‑row differences; switch to absolute scaling to compare magnitudes across rows (a color‑mapping sketch follows these tips).
  • Toggle the causal mask to see autoregressive behavior (upper‑triangle masked out).
  • Use the presets to load sample texts that surface repeated tokens, ambiguity, and long‑range references.
  • Try “Start Guided Tour” for a quick walkthrough of the text, heatmap, and compute views, including a demo of quadratic scaling and sliding‑window optimization.
  • Use “Mode Presets” to quickly switch between Baseline (dense), Efficient (sliding window), Sparse & Focused (LogSparse+compression), and Modern LLM (RoPE+GQA+FlashAttention) configurations.
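
The row‑contrast scaling mentioned in the first tip can be read as "divide each cell by its row maximum, apply a gamma, then map to a color". The sketch below illustrates that idea; the concrete hue/lightness ramp and the function name are made up and do not reproduce drawHeatmap.

// Illustrative row-contrast color mapping (not the actual drawHeatmap code).
// Normalizing by the row maximum and gamma-compressing keeps small within-row
// differences visible even when absolute probabilities are tiny.
function rowContrastHslSketch(p: number, rowMax: number, gamma = 0.6): string {
  const t = rowMax > 0 ? Math.pow(p / rowMax, gamma) : 0; // 0..1 intensity
  const hue = 220 - 180 * t;                              // cool -> warm
  const lightness = 15 + 60 * t;                          // dark -> bright
  return `hsl(${hue.toFixed(0)}, 80%, ${lightness.toFixed(0)}%)`;
}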
