Skip to content

bugparty/gemm_visualizations

Folders and files

NameName
Last commit message
Last commit date

Latest commit

ย 

History

13 Commits
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 
ย 

Repository files navigation

๐Ÿ”ฌ GEMM Memory Access Pattern Visualizer

An interactive educational tool for visualizing and analyzing memory access patterns in General Matrix Multiply (GEMM) operations. Compare different loop orderings, understand cache behavior, and explore the impact of blocking/tiling optimizations.

โœจ Features

๐ŸŽฎ Interactive Web Interface

  • Real-time animation of memory access patterns
  • Multiple loop orderings: ijk, ikj, jik, jki, kij, kji
  • Blocking comparison: blocked vs unblocked implementations
  • Play/pause/step controls for detailed analysis

๐Ÿ“Š Comprehensive Visualization

  • Live memory access animation showing A, B, C matrix accesses
  • Cache hit rate tracking with real-time performance graphs
  • Access frequency heatmaps for pattern analysis
  • Detailed statistics on cache performance

๐ŸŽฅ Video Generation (Legacy)

  • Generate MP4 animations for presentations
  • Cross-platform video encoding support
  • Customizable animation speed and quality

๐Ÿš€ Quick Start

Installation

# Clone the repository
git clone https://github.com/bugparty/gemm_visualizations.git
cd gemm_visualizations

# Install dependencies
pip install -r requirements.txt

Run Interactive Visualizer

python interactive_viz.py

Then open your browser to: http://127.0.0.1:8050

Generate Videos (Original Method)

python gen.py

This will generate MP4 animations for different loop orderings.

๐Ÿ“ Project Structure

gemm_visualizations/
โ”œโ”€โ”€ interactive_viz.py      # ๐ŸŽฎ Main Dash web application
โ”œโ”€โ”€ gemm_simulator.py        # ๐Ÿงฎ Core GEMM simulation engine
โ”œโ”€โ”€ cache_simulator.py       # ๐Ÿ’พ Cache behavior simulator
โ”œโ”€โ”€ gen.py                   # ๐ŸŽฅ Video generation script (legacy)
โ”œโ”€โ”€ requirements.txt         # ๐Ÿ“ฆ Python dependencies
โ”œโ”€โ”€ README.md               # ๐Ÿ“– This file
โ”œโ”€โ”€ drawGemmForLoop.ipynb   # ๐Ÿ““ Jupyter notebook examples
โ””โ”€โ”€ cacheSimu.ipynb         # ๐Ÿ““ Cache simulation examples

๐ŸŽฏ Understanding Loop Orderings

Matrix multiplication: C[i][j] += A[i][k] * B[k][j]

Different loop orderings affect memory access patterns:

Loop Order Description Cache Behavior
IJK Row-Column-Depth Good for C, poor for B
IKJ Row-Depth-Column Good spatial locality
JIK Column-Row-Depth Poor for row-major matrices
JKI Column-Depth-Row Poor spatial locality
KIJ Depth-Row-Column Moderate performance
KJI Depth-Column-Row Good for blocked algorithms

๐ŸŽ›๏ธ Interactive Controls

Configuration Panel

  • Matrix Size: 4ร—4 to 32ร—32
  • Block Size: 2 to 16 (for blocked algorithms)
  • Loop Order: Select from 6 different orderings
  • Blocking: Toggle between blocked/unblocked

Animation Controls

  • โ–ถ Play: Start automatic animation
  • โธ Pause: Pause animation
  • ๐Ÿ”„ Reset: Reset to frame 0
  • Speed Slider: Control animation speed (1-100 fps)
  • Frame Slider: Jump to specific frame

๐Ÿ“Š Interpreting Results

Cache Hit Rate

  • > 90%: Excellent cache utilization
  • 70-90%: Good performance
  • < 70%: Poor cache behavior, consider optimization

Access Frequency Heatmap

  • Bright spots: Frequently accessed elements
  • Uniform color: Good spatial locality
  • Scattered patterns: Cache-unfriendly access

๐Ÿงช Examples

Compare Blocked vs Unblocked

  1. Set matrix size to 16ร—16, block size to 4
  2. Select "KJI" loop order, set "Blocked" to ON
  3. Note the cache hit rate
  4. Switch "Blocked" to OFF
  5. Compare performance!

Best vs Worst Loop Order

For blocked algorithms (16ร—16, block=4):

  • Best: KJI or IKJ (~85%+ hit rate)
  • Worst: JKI or JIK (~60-70% hit rate)

๐Ÿ’ก Educational Use

This tool is designed for:

  • Computer Architecture courses: Teaching cache hierarchies
  • Performance optimization: Understanding memory access patterns
  • Algorithm analysis: Comparing loop transformations
  • Research: Experimenting with blocking strategies

๐Ÿ”ง Advanced Usage

Customize Cache Configuration

Edit cache_simulator.py:

cache = CacheSimulator(
    cache_size=32768,      # 32KB L1 cache
    line_size=64,          # 64-byte cache lines
    associativity=8,       # 8-way set associative
    element_size=8         # 8 bytes per double
)

Add New Loop Orders

Edit gemm_simulator.py to add custom loop transformations.

Export Data

Modify interactive_viz.py to add CSV/JSON export functionality.

๐Ÿ› Troubleshooting

Port Already in Use

# Change port in interactive_viz.py
app.run_server(debug=True, host='0.0.0.0', port=8051)

Video Encoding Fails

# Install FFmpeg
# Ubuntu/Debian:
sudo apt-get install ffmpeg

# macOS:
brew install ffmpeg

# Windows:
# Download from https://ffmpeg.org/

Module Not Found

# Ensure all dependencies are installed
pip install -r requirements.txt --upgrade

๐Ÿ“š Technical Details

Memory Layout

  • Row-major order: C[i][j] stored at base + (i*n + j)*sizeof(element)
  • Cache line: 64 bytes (8 doubles)
  • Spatial locality: Sequential elements benefit from cache prefetch

Cache Simulation

  • LRU replacement: Least Recently Used eviction policy
  • Set-associative: Configurable N-way associativity
  • Address mapping: Tag-Index-Offset breakdown

๐Ÿค Contributing

Contributions welcome! Areas for improvement:

  • Additional cache replacement policies (FIFO, Random)
  • 3D visualization of memory access
  • Performance prediction models
  • Multi-level cache hierarchy
  • Non-square matrix support

๐Ÿ“„ License

MIT License - feel free to use for educational purposes!

๐Ÿ™ Acknowledgments

Based on classical GEMM optimization techniques from:

  • Computer Architecture: A Quantitative Approach (Hennessy & Patterson)
  • Optimizing Matrix Multiply (Goto & Geijn)

๐Ÿ“ž Contact

For questions or feedback, please open an issue on GitHub.


Happy Learning! ๐Ÿš€

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors