A visual, interactive, stage-by-stage explainer of exactly what happens when you send a prompt to a large language model. Built for learners who want to see under the hood.
- 🚀 Real Ollama Integration — Connects directly to your local `ollama serve` to visualize actual tokenization and generation.
- 🔥 Native PyTorch Backend — An optional Python FastAPI backend that runs small Hugging Face models (such as GPT-2 or SmolLM) to extract real mathematical data like attention weights and per-token embeddings.
- 🎨 Beautiful Dark Theme — A sleek, modern UI designed for clarity, built with CSS variables.
- 🔄 Smart Mock Mode — Automatically falls back to high-quality mock data if no backends are running, perfect for quick demos or learning on the go.
- 📱 Responsive & Accessible — Navigate through stages using your keyboard arrow keys.
- 📚 Built-in Learning — "Why does this matter?" explainers on every single stage.
- Raw Input — See how your text is broken down into UTF-8 bytes with hex codes.
- Tokenizer — Watch the text split into tokens with IDs, featuring color-coded BPE classifications.
- Embeddings — View high-dimensional vectors projected to 2D using PCA. Inspect per-token dimensions and cosine similarities!
- Self-Attention — The core of the Transformer. Visualize attention weights in a heatmap or arc view. Analyze head entropy and layer-by-layer attention stats.
- Feed-Forward Network — Watch token data flow through the MLP layers with animated node activations.
- Layer Stack — Trace a token's journey through all 32 layers of the deep neural network.
- Softmax & Temperature — Interactively tweak Temperature, Top-K, and Top-P to see how the probability distribution bends and warps in real-time.
- Autoregressive Generation — Watch the model generate text token-by-token with live streaming stats (tokens/sec, latency sparklines).
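The knobs in the Softmax & Temperature stage can be sketched in a few lines of NumPy. This is an illustrative sketch of the standard technique, not the app's actual code, and the logit values are made up:

```python
import numpy as np

def sample_distribution(logits, temperature=1.0, top_k=0, top_p=1.0):
    """Apply temperature, then top-k and top-p (nucleus) filtering to logits."""
    logits = np.asarray(logits, dtype=float) / temperature  # <1 sharpens, >1 flattens
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()                                    # softmax

    order = np.argsort(probs)[::-1]                         # most likely first
    keep = np.ones_like(probs, dtype=bool)
    if top_k > 0:
        keep[order[top_k:]] = False                         # drop all but the top-k tokens
    if top_p < 1.0:
        cum = np.cumsum(probs[order])
        # keep the smallest prefix whose cumulative mass reaches top_p
        cutoff = np.searchsorted(cum, top_p) + 1
        keep[order[cutoff:]] = False
    probs = np.where(keep, probs, 0.0)
    return probs / probs.sum()                              # renormalize survivors

logits = [2.0, 1.0, 0.5, -1.0]        # toy next-token logits
p = sample_distribution(logits, temperature=0.7, top_k=3)
```

Lowering the temperature concentrates mass on the top token, while top-k/top-p zero out the tail before renormalizing, which is exactly the bending and warping the stage animates.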
You can run this project in Standard Mode (Node.js + Ollama) or Advanced Mode (adding the Python PyTorch backend for deep internal data extraction).
- Node.js v18+ — Download
- Ollama — Download (pull at least one model: `ollama pull llama3`)
- (Optional) Python 3.9+ — For the PyTorch Native backend.
```bash
# 1. Clone or extract the project
git clone https://github.com/yourusername/llm-internals-explorer.git
cd llm-internals-explorer

# 2. Install dependencies (installs root, server, and client packages)
npm install
npm --prefix client install
npm --prefix server install

# 3. Start Ollama in a separate terminal
ollama serve

# 4. Start the development server
npm run dev
```

The app will be available at http://localhost:5173. The Node proxy server runs on http://localhost:3001.
To visualize real attention matrices and continuous vector embeddings, you can spin up the PyTorch server directly from the main app interface.
```bash
# 1. First time only: Install Python dependencies
cd pytorch_server
pip install -r requirements.txt
```

Once dependencies are installed, just run `npm run dev` and click the + Start PyTorch button in the app header to launch the companion backend automatically in a new window!
The project uses a dual-backend architecture to provide both conversational generation and deep internal tensor inspection.
```
llm-internals-explorer/
├── client/                 # React + Vite frontend
│   ├── src/stages/         # The 8 interactive stage components
│   ├── src/components/     # Shared UI (Nav, Toggles, Tour)
│   ├── src/workers/        # Web Workers (e.g., PCA calculations)
│   └── src/context/        # Global AppContext state
├── server/                 # Node.js / Express proxy
│   ├── index.js            # Routes to Ollama & Python backend
│   └── mock/               # Fallback data when offline
└── pytorch_server/         # Native PyTorch API
    ├── main.py             # FastAPI server extracting tensors
    └── requirements.txt    # Python dependencies
```
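The 2D projection used in the Embeddings stage (computed in a web worker in the app itself) boils down to classic PCA. A minimal NumPy sketch of the same math, using made-up toy vectors:

```python
import numpy as np

def pca_2d(vectors):
    """Project high-dimensional token embeddings onto their top 2 principal axes."""
    X = np.asarray(vectors, dtype=float)
    X = X - X.mean(axis=0)                 # center each dimension
    # SVD of the centered data; rows of Vt are the principal directions
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:2].T                    # coordinates in the top-2 subspace

# Toy "embeddings": 5 tokens in a 4-dimensional space
emb = np.array([[1.0, 0.2, 0.0, 0.1],
                [0.9, 0.1, 0.1, 0.0],
                [0.0, 1.0, 0.9, 0.2],
                [0.1, 0.9, 1.0, 0.1],
                [0.5, 0.5, 0.5, 0.5]])
coords = pca_2d(emb)                       # shape (5, 2), ready to scatter-plot
```

Real embeddings have hundreds or thousands of dimensions per token, which is why the app offloads this to a worker thread.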
The Node server at localhost:3001 intelligently proxies your frontend requests:
- `/api/generate`, `/api/model-info`, `/api/tags` ➡️ Local Ollama Server
- `/api/pytorch/attention`, `/api/pytorch/embeddings` ➡️ Local PyTorch Python Server
- Ollama's API Limits: Ollama does not natively expose internal layer attentions or per-token continuous embeddings. Our solution is the `pytorch_server` backend. If you only use Ollama, Stages 3 and 4 will gracefully fall back to realistic mock data.
- Logprobs Requirement: Visualizing alternate token paths in Stages 7 and 8 requires Ollama v0.1.33+.
- Heavy Tensors: The PyTorch backend extracts massive tensors (`output_attentions=True`). On older machines, extracting data for long prompts might take a few seconds.
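For intuition about what those tensors contain, here is a pure-NumPy sketch of single-head scaled dot-product attention plus the per-head entropy statistic shown in the Self-Attention stage. The shapes are toy-sized and the code is illustrative, not the backend's actual implementation:

```python
import numpy as np

def attention_weights(Q, K):
    """Row-stochastic attention matrix: softmax(Q K^T / sqrt(d))."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=-1, keepdims=True)

def head_entropy(weights):
    """Mean Shannon entropy (nats) of each query's attention distribution."""
    eps = 1e-12                                    # avoid log(0)
    return float(-(weights * np.log(weights + eps)).sum(axis=-1).mean())

rng = np.random.default_rng(0)
seq_len, d = 6, 8                                  # 6 tokens, 8-dim head
Q = rng.standard_normal((seq_len, d))
K = rng.standard_normal((seq_len, d))
W = attention_weights(Q, K)                        # shape (seq_len, seq_len)
H = head_entropy(W)                                # higher = more diffuse attention
```

A real model stores one such matrix per head per layer per prompt, so for long prompts the full `output_attentions` payload grows quadratically in sequence length, which is why extraction can be slow.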
We welcome contributions! Feel free to improve visualizations, add new stages, or optimize performance. Some open ideas:
- Add KV-cache visualization stage.
- Show gradient flow during autoregressive generation.
- Support for multiple PyTorch models side-by-side.
- Export raw mathematically accurate attention matrices as CSV.
This project is licensed under the GNU GPLv3 License.
Don't forget to take the guided tour in the app to get familiar with the interface!