Docs Revamp #181

Open · 21 of 44 tasks · Tracked by #252
msaroufim opened this issue Apr 26, 2024 · 0 comments
Labels: documentation (Improvements or additions to documentation)


Just listing out all the issues I'm seeing with our docs; feel free to pick something up and fix it. As a first step, add your documentation to a relevant subfolder in the repo and tag me for review.

For API docstrings and end-user usage instructions that won't change much, please put them in https://github.com/pytorch/ao/tree/main/docs so they get rendered on pytorch.org/docs.
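
For reference, a sketch of a docstring shape that renders well under Sphinx there; `quantize_weights` is a hypothetical helper, shown only to illustrate the structure:

```python
def quantize_weights(model, bits=8):
    """Quantize the Linear weights of ``model`` in place.

    (Hypothetical helper; only the docstring layout matters here.)

    Args:
        model (torch.nn.Module): model whose Linear weights get quantized.
        bits (int): target bit width, e.g. ``8`` or ``4``.

    Returns:
        torch.nn.Module: the same model, mutated in place.

    Example::

        >>> model = quantize_weights(model, bits=8)
    """
    return model  # body elided; this sketch is about the docstring
```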

Numbers

The repo is primarily about performance, so we should share performance tables directly in the README until we figure out a dashboard-like solution.

For each sparsity or quantization technique you're working on, feel free to add another subsection. For the usage-instruction items, see the sketch after the list below.

  • Autotuner
  • Quantization
    • Usage instructions
    • Performance benchmarks on llama2 or llama3 @HDCharles
    • Accuracy
  • Sparsity
    • Usage instructions
    • Performance benchmarks on llama2 or llama3
    • Accuracy
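
As a starting point for the quantization usage instructions, a minimal sketch assuming torchao's `quantize_` API with the `int8_weight_only` config (adapt per technique):

```python
import torch
from torchao.quantization import quantize_, int8_weight_only

# Any nn.Module works; a single Linear keeps the sketch small.
model = torch.nn.Sequential(torch.nn.Linear(1024, 1024)).cuda().to(torch.bfloat16)

# Swap each Linear's weight for an int8 weight-only quantized tensor, in place.
quantize_(model, int8_weight_only())

# Compile afterwards so inductor can fuse dequantization into the matmul.
model = torch.compile(model, mode="max-autotune")
```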

End-to-end tutorials

  • Revamp the main README.md to highlight the features we most want to advertise
  • How to configure compile for consumer GPUs (see sketch after this list)
  • End to end tutorial with llama3
  • Run an evaluation with eleuther eval @andrewor14
  • torch.ao.pruning accuracy benchmarks on llama2 or llama3
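
For the consumer-GPU compile item above, a sketch of the kind of settings the tutorial could cover; which flags to actually recommend is an open question for the tutorial:

```python
import torch

# Search more kernel configurations during autotuning; compilation is
# slower, but the resulting kernels are often faster, which matters most
# on consumer GPUs with limited memory bandwidth.
torch._inductor.config.coordinate_descent_tuning = True

# Cache compiled graphs on disk so repeat runs skip the cold start.
torch._inductor.config.fx_graph_cache = True

model = torch.nn.Linear(512, 512)
compiled = torch.compile(model, mode="max-autotune")
```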

Core concepts

  • Sparsity patterns and how they work (see sketch after this list)
  • What are the different kinds of quantization algorithms
  • How to make quantization/sparsity kernels faster
  • Sparsity for LLMs overview @jcaip
  • docs for AffineQuantizedTensor in quantization.md @jerryzh168
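
For the sparsity-patterns doc above, a minimal sketch of 2:4 semi-structured sparsity using PyTorch's built-in support. It needs an Ampere-or-newer GPU, and the keep-top-2-per-group pruning rule is a toy illustration, not a recommended algorithm:

```python
import torch
from torch.sparse import to_sparse_semi_structured

# 2:4 semi-structured sparsity: in every contiguous group of 4 weights,
# at most 2 are nonzero. Toy rule: keep the 2 largest magnitudes per group.
w = torch.randn(128, 128, device="cuda", dtype=torch.float16)
groups = w.view(-1, 4)
keep = groups.abs().topk(2, dim=-1).indices
mask = torch.zeros_like(groups, dtype=torch.bool).scatter_(-1, keep, True)
w_pruned = (groups * mask).view_as(w)

# Compress into the layout that sparse tensor cores consume.
w_sparse = to_sparse_semi_structured(w_pruned)

x = torch.randn(64, 128, device="cuda", dtype=torch.float16)
y = torch.nn.functional.linear(x, w_sparse)  # x @ w_pruned.T with ~2x less weight traffic
```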

Contributing

  • How to test
  • Version guards (see sketch after this list)
  • How to benchmark
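
For the version-guards doc above, a minimal sketch of the pattern, assuming plain version parsing rather than any torchao-specific helper:

```python
import unittest

import torch
from packaging.version import parse

# Guard tests (and imports) on the PyTorch version they require.
TORCH_AT_LEAST_2_3 = parse(torch.__version__) >= parse("2.3.0")

class TestNewKernel(unittest.TestCase):
    @unittest.skipIf(not TORCH_AT_LEAST_2_3, "needs a PyTorch 2.3+ API")
    def test_kernel(self):
        ...
```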

Features

  • AOT inductor and no python overhead tutorial @jerryzh168
  • Update the autoquant tutorial to work out of the box with llama2/3; it should be a copy-pastable snippet using our llama model in torchao (see sketch after this list)
  • The SmoothQuant tutorial is placeholder code; it needs an actual runnable snippet or should be moved to prototype
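
For the autoquant item above, the copy-pastable snippet could look like this sketch; it uses a stand-in module, with wiring in our llama model left to the tutorial:

```python
import torch
import torchao

model = torch.nn.Sequential(torch.nn.Linear(1024, 1024)).cuda().to(torch.bfloat16)

# autoquant wraps the compiled model, benchmarks several quantization
# techniques per layer on the observed inputs, and keeps the fastest.
model = torchao.autoquant(torch.compile(model, mode="max-autotune"))

# The first real input triggers benchmarking and finalizes the choices.
x = torch.randn(16, 1024, device="cuda", dtype=torch.bfloat16)
model(x)
```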

Composability

  • We don't have an NF4 tutorial @drisspg
    • How it works
    • Benchmarks
    • FSDP 2 composition
  • How to support new smaller dtypes
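
For the smaller-dtypes item, the doc will presumably center on tensor subclasses; below is a heavily simplified `__torch_dispatch__` skeleton (names are illustrative, not torchao APIs):

```python
import torch
from torch.utils._pytree import tree_map

class MyLowBitTensor(torch.Tensor):
    """Illustrative skeleton: packed payload plus per-row scales,
    dequantized on the fly inside __torch_dispatch__."""

    @staticmethod
    def __new__(cls, packed, scales, shape):
        # Wrapper subclass: advertises shape/dtype but owns no storage.
        return torch.Tensor._make_wrapper_subclass(cls, shape, dtype=torch.bfloat16)

    def __init__(self, packed, scales, shape):
        self.packed = packed    # e.g. a uint8 tensor holding packed values
        self.scales = scales    # one scale per row in this toy scheme

    def dequantize(self):
        return self.packed.to(torch.bfloat16) * self.scales[:, None]

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        # Fallback path: dequantize our tensors and run the op densely.
        # A real dtype intercepts hot ops (mm, linear) with fused kernels.
        def unwrap(t):
            return t.dequantize() if isinstance(t, cls) else t
        args = tree_map(unwrap, args)
        kwargs = tree_map(unwrap, kwargs or {})
        return func(*args, **kwargs)
```

A dequantize-everything fallback like this is slow but correct; the doc can then show how overriding the hot ops recovers performance.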

Docstrings

e.g. https://pytorch.org/ao/stable/generated/torchao.sparsity.WandaSparsifier.html#torchao.sparsity.WandaSparsifier

Confirm they're visible on pytorch.org

Completed

  • We don't have a wanda tutorial @jcaip
  • For sparsity, we mention tons of algorithms but should suggest a simple one people can start with @jcaip
  • Our main goals are performance, composability with torch.compile and FSDP, and easy packaging for wide reach @msaroufim
  • In the main README when we talk about features we should link to usage instructions and code not papers @msaroufim
  • Mention HQQ, GaLore and prototype folder somewhere in main docs @msaroufim
  • A doc for how to register a new custom OP for both C++ and Triton @msaroufim
  • Mention tinygemm @msaroufim