An interactive visualization tool for understanding the Transformer architecture and the principles behind GPT. Based on the groundbreaking paper "Attention is All You Need" (Vaswani et al., 2017), this educational web application provides step-by-step explanations with mathematical formulas, animations, and hands-on examples.
- Interactive Animations: Visualize the differences between RNN, CNN, and Transformer architectures
- Step-by-Step Attention Mechanism: detailed five-step walkthrough of the self-attention calculation (see the code sketch after this list)
- Mathematical Foundation: LaTeX-rendered formulas with clear explanations
- Multi-Head Attention Visualization: Dynamic visualization of parallel attention heads
- Positional Encoding Heatmap: Real-time adjustable visualization
- GPT Generation Demo: Interactive text generation with temperature control
- Model Parameter Calculator: Estimate model size with different configurations
- Complete PyTorch Implementation: Production-ready code examples
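For readers who want the formula in code first, here is a minimal PyTorch sketch of scaled dot-product attention, the computation the five-step walkthrough animates. The tensor shapes and random weights below are illustrative only, not the tutorial's actual code:

```python
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V,
    as defined in "Attention is All You Need"."""
    d_k = Q.size(-1)
    scores = Q @ K.transpose(-2, -1) / d_k ** 0.5  # pairwise similarity scores
    weights = F.softmax(scores, dim=-1)            # each row sums to 1
    return weights @ V                             # weighted sum of values

# Toy input: 3 tokens with 4-dimensional embeddings (same sizes as the tutorial's example)
x = torch.randn(3, 4)
W_q, W_k, W_v = (torch.randn(4, 4) for _ in range(3))  # illustrative random projections
out = scaled_dot_product_attention(x @ W_q, x @ W_k, x @ W_v)
print(out.shape)  # torch.Size([3, 4])
```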
- Self-Attention mechanism (Query, Key, Value)
- Multi-Head Attention
- Positional Encoding
- Encoder-Decoder architecture
- GPT and autoregressive language modeling
- Temperature in text generation (illustrated in the sketch after this list)
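To make the temperature concept concrete, the following sketch shows how dividing logits by a temperature reshapes the sampling distribution: values below 1 sharpen it toward greedy decoding, values above 1 flatten it toward randomness. The logit values here are made up for illustration:

```python
import torch
import torch.nn.functional as F

def sample_with_temperature(logits, temperature=1.0):
    """Scale logits by 1/temperature before softmax, then sample one token."""
    probs = F.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1)

logits = torch.tensor([2.0, 1.0, 0.5, 0.1])  # hypothetical next-token scores
for t in (0.5, 1.0, 2.0):
    print(t, F.softmax(logits / t, dim=-1))  # sharper at 0.5, flatter at 2.0
```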
- Clone the repository: `git clone https://github.com/yourusername/transformer-visual-tutorial.git`
- Open `index.html` in your browser (an internet connection is required to load MathJax)
The tutorial uses the Chinese phrase "我爱你" (I love you) as a simple running example to demonstrate the following steps; a small code sketch reproducing them follows the list:
- Word embeddings (4-dimensional vectors)
- Q, K, V matrix calculations
- Attention score computation
- Softmax normalization
- Weighted sum output
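A minimal numeric sketch of those five steps, using hypothetical embeddings and identity projection matrices so the arithmetic is easy to check by hand. The values are illustrative and may differ from the numbers shown on the tutorial pages:

```python
import torch
import torch.nn.functional as F

# Three tokens ("我", "爱", "你") with hypothetical 4-dimensional embeddings (step 1)
X = torch.tensor([[1.0, 0.0, 1.0, 0.0],
                  [0.0, 1.0, 0.0, 1.0],
                  [1.0, 1.0, 0.0, 0.0]])

# Identity projections keep the math traceable; real models learn W_q, W_k, W_v (step 2)
W = torch.eye(4)
Q, K, V = X @ W, X @ W, X @ W

scores = Q @ K.T / 4 ** 0.5          # step 3: attention scores, scaled by sqrt(d_k)
weights = F.softmax(scores, dim=-1)  # step 4: softmax normalization
output = weights @ V                 # step 5: weighted sum of values
print(weights)
print(output)
```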
- HTML5: Structure and content
- CSS3: Styling and animations
- JavaScript: Interactive functionality
- MathJax 3: LaTeX formula rendering
- Canvas API: Visualizations and animations
- Why Transformer? - Comparison with RNN/CNN
- Self-Attention - Core mechanism explained
- Multi-Head Attention - Parallel processing
- Positional Encoding - Injecting sequence order (see the sketch after this outline)
- Complete Architecture - Encoder and Decoder
- GPT Principles - Autoregressive generation
- Hands-on Examples - Complete calculations
- Code Implementation - PyTorch examples
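For reference, here is a compact implementation of the sinusoidal positional encoding that the heatmap visualizes, following the formulas PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) and PE(pos, 2i+1) = cos(pos / 10000^(2i/d_model)) from the paper. The `max_len` and `d_model` values below are arbitrary:

```python
import math
import torch

def sinusoidal_positional_encoding(max_len, d_model):
    """Build the (max_len, d_model) matrix of sine/cosine position encodings."""
    pe = torch.zeros(max_len, d_model)
    position = torch.arange(max_len, dtype=torch.float).unsqueeze(1)
    div_term = torch.exp(torch.arange(0, d_model, 2).float()
                         * (-math.log(10000.0) / d_model))
    pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
    pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
    return pe

pe = sinusoidal_positional_encoding(max_len=50, d_model=16)
print(pe.shape)  # torch.Size([50, 16]) -- the matrix rendered in the heatmap
```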
Contributions are welcome! Please feel free to submit a Pull Request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Based on "Attention is All You Need" by Vaswani et al.
- Inspired by various Transformer visualization projects
- Thanks to the open-source community