draft-model

Here are 3 public repositories matching this topic...

adityakamat24 / Verifier-Guided-Speculative-Decoding

multiple tokens, and a verifier filters them using the main model’s confidence. Focuses on speed–accuracy tradeoffs, visualization, and modular design for easy benchmarking and research.

visualization benchmarking acceleration research rejection-sampling modular-design llm-inference speculative-decoding token-verification verifier-guided-decoding draft-model efficient-generation speed-accuracy-tradeoff

Updated Nov 9, 2025
Jupyter Notebook
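
The description above mentions a verifier that filters drafted tokens using the main model's confidence. A minimal sketch of that idea follows. It is not taken from the repository: the function `verify_draft`, the `threshold` parameter, and the toy probability table are all hypothetical, and the rule shown (accept the longest drafted prefix whose tokens each reach a confidence threshold) is an assumed simplification, not the standard rejection-sampling acceptance rule.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical interface: maps a token context to the main model's
# next-token probabilities. A real system would call an LLM here.
ProbFn = Callable[[Sequence[str]], Dict[str, float]]


def verify_draft(
    prefix: Sequence[str],
    draft_tokens: Sequence[str],
    main_probs: ProbFn,
    threshold: float = 0.3,
) -> List[str]:
    """Return the longest prefix of draft_tokens that the main model accepts.

    A drafted token is accepted when the main model assigns it a probability
    of at least `threshold` given everything accepted so far. Verification
    stops at the first rejected token (assumed rule, for illustration only).
    """
    accepted: List[str] = []
    for token in draft_tokens:
        context = list(prefix) + accepted
        confidence = main_probs(context).get(token, 0.0)
        if confidence < threshold:
            break  # first rejection ends the accepted run
        accepted.append(token)
    return accepted


def toy_main_probs(context: Sequence[str]) -> Dict[str, float]:
    """Toy stand-in for the main model: a fixed lookup table."""
    table = {
        (): {"the": 0.9, "a": 0.1},
        ("the",): {"cat": 0.6, "dog": 0.4},
        ("the", "cat"): {"sat": 0.2, "ran": 0.8},
    }
    return table.get(tuple(context), {})


if __name__ == "__main__":
    print(verify_draft([], ["the", "cat", "sat"], toy_main_probs, threshold=0.3))
```

Raising `threshold` in this sketch rejects more drafted tokens (fewer tokens accepted per step, closer agreement with the main model), while lowering it accepts more, which is one way to picture the speed-accuracy tradeoff the description refers to.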

DAWNCR0W / dflasher

CLI for building and testing DFlash-style speculative decoding draft models.

cuda transformers mlx huggingface apple-silicon vllm llm-inference speculative-decoding draft-model dflash

Updated May 31, 2026
Python

hinanohart / speclattice

Cross-vocabulary speculative decoding: a CPU-verifiable reference implementation and acceptance-length (tau) measurement harness.

python machine-learning transformers llm-inference speculative-decoding draft-model cross-vocabulary acceptance-length

Updated May 30, 2026
Python
