JeevanBhoot/llm-decoding

LLM Decoding

Installation

Install uv:

curl -LsSf https://astral.sh/uv/install.sh | sh

Install dependencies:

uv sync

Add your HuggingFace token to .env:

HF_TOKEN=hf_...
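A script launched with `uv run --env-file .env` sees `HF_TOKEN` as an ordinary environment variable. The helper below is a minimal sketch of reading and sanity-checking it; `get_hf_token` is a hypothetical name and is not taken from this repository, and the `hf_` prefix check is an assumption based on the placeholder shown above.

```python
import os
from typing import Mapping


def get_hf_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the Hugging Face token from the environment, or raise if it is absent."""
    token = env.get("HF_TOKEN", "")
    # Assumption: tokens start with "hf_", as in the .env placeholder.
    if not token.startswith("hf_"):
        raise RuntimeError("HF_TOKEN is missing or does not look like a Hugging Face token")
    return token
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later during a model download.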

Decoding

Minimal Greedy Decoding

uv run --env-file .env greedy_decode.py

Results on 1x NVIDIA RTX 4090:

  • Time taken: 0.75s
  • Output throughput: 33.44 tok/s
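For readers new to the technique, greedy decoding can be sketched as a loop that runs one forward pass per new token and always appends the highest-scoring token. The code below is an illustration under stated assumptions, not the contents of `greedy_decode.py`: `greedy_decode` and `toy_logits` are hypothetical names, and `toy_logits` is a stand-in for a real model's forward pass.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    next_token_logits: Callable[[Sequence[int]], List[float]],
    prompt_ids: Sequence[int],
    max_new_tokens: int,
    eos_id: int,
) -> List[int]:
    """Generate tokens by repeatedly taking the argmax of the next-token scores."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)  # one forward pass per generated token
        next_id = max(range(len(logits)), key=logits.__getitem__)  # argmax
        ids.append(next_id)
        if next_id == eos_id:  # stop once the end-of-sequence token is produced
            break
    return ids[len(prompt_ids):]


def toy_logits(ids: Sequence[int]) -> List[float]:
    """Toy model over a 5-token vocabulary: always prefers (last id + 1) % 5."""
    target = (ids[-1] + 1) % 5
    return [1.0 if i == target else 0.0 for i in range(5)]


generated = greedy_decode(toy_logits, prompt_ids=[0], max_new_tokens=10, eos_id=4)
```

With the toy model, `generated` is `[1, 2, 3, 4]`: the loop stops as soon as the end-of-sequence id `4` is produced.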

Prompt Lookup Speculative Decoding

uv run --env-file .env speculative_decode.py

Results on 1x NVIDIA RTX 4090:

  • Time taken: 0.62s
  • Output throughput: 40.16 tok/s
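The idea behind prompt lookup can be sketched as follows: take the last few tokens of the sequence, look for the same n-gram earlier in the sequence, and propose the tokens that followed it as a draft. The model then verifies the draft and keeps the longest prefix that matches its own greedy choices. This is a minimal illustration, not the contents of `speculative_decode.py`; the function names, `ngram_size`, `num_draft`, and the toy model are all assumptions, and end-of-sequence handling is omitted.

```python
from typing import Callable, List, Sequence


def find_candidate_tokens(ids: Sequence[int], ngram_size: int = 2, num_draft: int = 3) -> List[int]:
    """Return the tokens that followed the most recent earlier match of the trailing n-gram."""
    if len(ids) < ngram_size:
        return []
    tail = list(ids[-ngram_size:])
    # Scan right to left, skipping the trailing n-gram itself.
    for start in range(len(ids) - ngram_size - 1, -1, -1):
        if list(ids[start:start + ngram_size]) == tail:
            follow = start + ngram_size
            return list(ids[follow:follow + num_draft])
    return []


def argmax(xs: List[float]) -> int:
    return max(range(len(xs)), key=xs.__getitem__)


def prompt_lookup_decode(
    next_token_logits: Callable[[Sequence[int]], List[float]],
    prompt_ids: Sequence[int],
    max_new_tokens: int,
    ngram_size: int = 2,
    num_draft: int = 3,
) -> List[int]:
    """Greedy decoding that accepts drafted tokens when they match the model's own choice."""
    ids = list(prompt_ids)
    target_len = len(ids) + max_new_tokens
    while len(ids) < target_len:
        draft = find_candidate_tokens(ids, ngram_size, num_draft)
        # A real implementation would score every draft position in one batched
        # forward pass; this sketch queries the toy model per position for clarity.
        for tok in draft:
            if len(ids) >= target_len or tok != argmax(next_token_logits(ids)):
                break
            ids.append(tok)
        # After the accepted prefix, take one token from the model itself.
        if len(ids) < target_len:
            ids.append(argmax(next_token_logits(ids)))
    return ids[len(prompt_ids):]


def toy_logits(ids: Sequence[int]) -> List[float]:
    """Toy model over a 5-token vocabulary: always prefers (last id + 1) % 5."""
    target = (ids[-1] + 1) % 5
    return [1.0 if i == target else 0.0 for i in range(5)]


output = prompt_lookup_decode(toy_logits, prompt_ids=[0, 1, 2, 3, 4, 0, 1], max_new_tokens=6)
```

Because a drafted token is kept only when it equals the model's greedy choice, the output is identical to plain greedy decoding. Any speedup comes from verifying several drafted tokens per forward pass, so it depends on how often the continuation repeats text already present in the prompt.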

About

Prompt Lookup Speculative Decoding vs Standard Decoding for LLMs
