SHA-4096/EntropyInfer


EntropyInfer

An efficient, training-free system that accelerates LLM inference by exploiting entropy information from the context.

Experiments on the Llama, Qwen, and Pangu model series show that our method achieves high end-to-end speedup while maintaining generation quality. We currently release an implementation that supports the openPangu model series, including openPangu-Embedded-1B-v1.1 and openPangu-Embedded-7B-v1.1.
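This README does not describe how the entropy signal is computed or consumed. As a rough illustration only, and not the repository's implementation, the sketch below computes the Shannon entropy of a next-token distribution from raw logits. The function name `token_entropy` and the example logits are hypothetical.

```python
import math


def token_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits).

    Hypothetical helper for illustration; not part of EntropyInfer.
    """
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Skip zero-probability entries, since p * log(p) -> 0 as p -> 0.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# A peaked distribution (model is confident) has low entropy;
# a flat distribution (model is uncertain) has high entropy.
confident = token_entropy([10.0, 0.0, 0.0, 0.0])
uncertain = token_entropy([1.0, 1.0, 1.0, 1.0])
```

Low entropy indicates a confident prediction and high entropy an uncertain one. An adaptive inference scheme could, in principle, use such a signal to decide where to spend compute, but how EntropyInfer actually does so is not stated here.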

[Figure: framework]

Environment Preparation

pip install -r requirements.txt

Conducting Experiments

To run experiments on the LongBench dataset, run the following scripts:

# Run experiment
bash experiment/LongBench/evaluate_longbench.sh
# Run evaluation
python experiment/LongBench/eval.py

About

Code implementation for "From Rigid to Dynamic: Entropy-Guided Adaptive Inference for Long-Context LLMs"
