Cross-vocabulary speculative decoding: a CPU-verifiable reference implementation and acceptance-length (tau) measurement harness.
python machine-learning transformers llm-inference speculative-decoding draft-model cross-vocabulary acceptance-length
Updated May 30, 2026 - Python