LEGO: An LLM-Enabled Hierarchical Optimizer for Tensor Computation Graphs

Artifact for reproducing Table 1 of the paper.

Structure

workloads/
  resnet/          ResNet-18 (CNN)
  qwen/            Qwen3-8B (LLM Decoder)
  sd3_mmdit/       SD3-MMDiT (Diffusion Transformer)
  mamba/           Mamba-2 SSD (State Space Model)
  ds_mhc_moe/      mHC-MoE (Mixture of Experts)

Each workload contains:
  model_ref.py     PyTorch Eager baseline
  model_new.py     LEGO-optimized (Triton kernels + system-level opts)

ab.py              A/B benchmark tool (eager + torch.compile modes)
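
Before benchmarking, it can help to confirm that a checkout matches the layout above. The following is a minimal sketch using only the Python standard library; the helper name missing_files is hypothetical and not part of this artifact, and the file names are copied from the listing above.

```python
from pathlib import Path

# Workload directories and per-workload files, copied from the Structure listing.
WORKLOADS = ["resnet", "qwen", "sd3_mmdit", "mamba", "ds_mhc_moe"]
REQUIRED_FILES = ["model_ref.py", "model_new.py"]


def missing_files(repo_root):
    """Return the expected paths that are absent under repo_root.

    Hypothetical helper for illustration; it is not shipped with the artifact.
    """
    root = Path(repo_root)
    expected = [root / "ab.py"]
    for name in WORKLOADS:
        for filename in REQUIRED_FILES:
            expected.append(root / "workloads" / name / filename)
    return [path for path in expected if not path.is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the repository root.
    missing = missing_files(".")
    for path in missing:
        print(f"missing: {path}")
    print("layout OK" if not missing else f"{len(missing)} file(s) missing")
```

An empty result means ab.py and both model files for all five workloads are present.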

Quick Start

Requirements

GPU: NVIDIA RTX 6000 Ada (48 GB) or comparable
CUDA 12.9+, PyTorch 2.9+, Triton (bundled with PyTorch)
einops, installed with: pip install einops
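
To see which of the Python packages listed above are installed, a small standard-library check may be enough. This is a sketch only: the distribution names torch, triton, and einops are assumed to match how the packages were installed, and some PyTorch builds may ship Triton under a different distribution name, in which case it would be reported as absent here.

```python
from importlib import metadata

# Distribution names are an assumption; adjust them if your environment differs.
PACKAGES = ["torch", "triton", "einops"]


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Print one line per package so that missing ones stand out.
for name, version in installed_versions().items():
    print(f"{name:8s} {version or 'NOT INSTALLED'}")
```

The check does not cover the GPU or the CUDA toolkit. Once PyTorch is importable, torch.version.cuda reports the CUDA version that build was compiled against.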

Benchmark a single workload

python ab.py \
  --ref workloads/qwen/model_ref.py \
  --test workloads/qwen/model_new.py \
  --bench-runs 30 --compile-run-times 1

Benchmark all workloads

for w in resnet qwen sd3_mmdit mamba ds_mhc_moe; do
  echo "=== $w ==="
  python ab.py \
    --ref workloads/$w/model_ref.py \
    --test workloads/$w/model_new.py \
    --bench-runs 30 --compile-run-times 1
done
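
The same loop can be driven from Python, which may be more convenient on systems without a POSIX shell. This is a sketch under stated assumptions: build_command and run_all are hypothetical helpers, the flags and their values are copied verbatim from the commands above, and nothing is assumed about ab.py beyond that command line.

```python
import subprocess
import sys

# Workload directory names, copied from the shell loop above.
WORKLOADS = ["resnet", "qwen", "sd3_mmdit", "mamba", "ds_mhc_moe"]


def build_command(workload, bench_runs=30, compile_run_times=1):
    """Build the ab.py command line for one workload, mirroring the shell loop."""
    return [
        sys.executable, "ab.py",
        "--ref", f"workloads/{workload}/model_ref.py",
        "--test", f"workloads/{workload}/model_new.py",
        "--bench-runs", str(bench_runs),
        "--compile-run-times", str(compile_run_times),
    ]


def run_all(workloads=WORKLOADS):
    """Run each workload in turn and return {workload: exit code}.

    Like the shell loop, this continues past a failing workload.
    """
    results = {}
    for name in workloads:
        print(f"=== {name} ===", flush=True)
        results[name] = subprocess.run(build_command(name)).returncode
    return results
```

Calling run_all() from the repository root runs all five benchmarks; a nonzero value in the returned dictionary marks a workload whose ab.py process exited with an error.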

License

Apache 2.0
