EntropyInfer

An efficient, training-free system that accelerates LLM inference by using entropy information from the context.

Experiments on the Llama, Qwen and Pangu model series show that our method achieves high end-to-end speedup while maintaining generation quality. We currently release an implementation that supports the openPangu model series, including openPangu-Embedded-1B-v1.1 and openPangu-Embedded-7B-v1.1.
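
For readers unfamiliar with the term, the snippet below is a minimal sketch of what "entropy information" over a context can mean: the Shannon entropy of a model's predicted token distribution at each position. It is an illustration only and does not describe how EntropyInfer itself uses entropy. The function names (`softmax`, `shannon_entropy`, `token_entropies`) and the toy logits are assumptions made for this example and are not part of the released code.

```python
import math


def softmax(logits):
    """Convert a list of raw logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def token_entropies(logits_per_position):
    """Entropy of the predicted distribution at each context position."""
    return [shannon_entropy(softmax(row)) for row in logits_per_position]


if __name__ == "__main__":
    # Hypothetical logits over a 4-token vocabulary at two positions:
    # the first is maximally uncertain, the second is sharply peaked.
    toy_logits = [
        [0.0, 0.0, 0.0, 0.0],
        [10.0, 0.0, 0.0, 0.0],
    ]
    for position, h in enumerate(token_entropies(toy_logits)):
        print(f"position {position}: entropy = {h:.4f} nats")
```

A uniform distribution gives the highest entropy (the model is uncertain), while a sharply peaked one gives entropy close to zero (the model is confident).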

Environment Preparation

pip install -r requirements.txt

Conducting Experiments

To run the experiment on the LongBench dataset and then evaluate the results, run the following commands:

# Run experiment
bash experiment/LongBench/evaluate_longbench.sh
# Run evaluation
python experiment/LongBench/eval.py

