This repo is the official implementation of the KV cache eviction algorithm **SustainableKV**. We guarantee that all experiment results are reproducible.
- Set up the environment and install SustainableKV:

  ```shell
  conda create --name sustainablekv python=3.11
  conda activate sustainablekv
  git clone https://github.com/YUECHE77/SustainableKV.git
  cd SustainableKV
  pip install -e .
  pip install -r requirements.txt
  ```
- Install PyTorch (CUDA 11.8):

  ```shell
  pip uninstall torch torchvision torchaudio -y
  pip install torch==2.3.0 torchvision==0.18.0 torchaudio==2.3.0 --index-url https://download.pytorch.org/whl/cu118
  ```
- Install FlashAttention:

  FlashAttention is sensitive to version mismatches. You can find all official wheels here. A configuration that is guaranteed to work: CUDA 11.8, Python 3.11, torch==2.3.0, flash_attn==2.5.8.

  Check your ABI setting:

  ```shell
  python -c "import torch; print(torch._C._GLIBCXX_USE_CXX11_ABI)"  # True -> pick an abiTRUE wheel, False -> abiFALSE
  ```

  Then install the matching wheel:

  ```shell
  pip install https://github.com/Dao-AILab/flash-attention/releases/download/v2.5.8/flash_attn-2.5.8+cu118torch2.3cxx11abiFALSE-cp311-cp311-linux_x86_64.whl
  ```
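  The wheel filename encodes every one of these choices. As a small sketch (assuming the release naming scheme stays stable across versions), you can assemble the name that matches your setup before searching the release page:

  ```python
  # Sketch: build a FlashAttention release-wheel filename from your setup.
  # Assumption: the naming scheme matches the v2.5.8 release wheels.
  def flash_attn_wheel(version, cuda, torch_ver, cxx11abi, py_tag):
      abi = "TRUE" if cxx11abi else "FALSE"
      return (f"flash_attn-{version}+cu{cuda}torch{torch_ver}"
              f"cxx11abi{abi}-{py_tag}-{py_tag}-linux_x86_64.whl")

  # The configuration recommended above (CUDA 11.8, Python 3.11, torch 2.3):
  print(flash_attn_wheel("2.5.8", "118", "2.3", False, "cp311"))
  # flash_attn-2.5.8+cu118torch2.3cxx11abiFALSE-cp311-cp311-linux_x86_64.whl
  ```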
- Download the models from HuggingFace (please refer to model). We currently support the Mistral / Mixtral / LLaMA families. Then replace your model path here.
- We provide a demo. You can modify `method`, `model_to_use`, and `model2path` to test our methods. Please also modify the path to SnapKV's paper on line 54, which is the input document.
The detailed algorithm of SustainableKV is in the file `sustainablekv_utils.py`.

You can easily integrate SustainableKV with other models: just follow the same pattern as the existing ones. We currently support the Llama family, Mistral, and Mixtral.
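The integration pattern for methods of this kind is usually a monkey patch: save the model's original attention `forward`, then swap in a wrapper that compresses the KV cache before delegating. The sketch below shows only that wiring with stand-in names (`DummyAttention`, `sustainable_forward`, `patch_attention` are illustrative, not the repo's actual API):

```python
# Hypothetical sketch of the monkey-patching pattern; not the repo's real API.
class DummyAttention:
    """Stand-in for a transformer attention module."""
    def forward(self, hidden_states):
        return hidden_states  # placeholder for the real attention computation

def sustainable_forward(self, hidden_states):
    # In a real integration, the KV cache would be scored and pruned here
    # before calling the original attention forward.
    return self._orig_forward(hidden_states)

def patch_attention(attn_cls):
    # Keep a handle to the original forward, then replace it with the wrapper.
    attn_cls._orig_forward = attn_cls.forward
    attn_cls.forward = sustainable_forward

patch_attention(DummyAttention)
print(DummyAttention().forward("tokens"))  # the wrapper delegates to the original
```

Supporting a new model family then amounts to repeating this patch for that family's attention class.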
- LongBench:

  ```shell
  cd experiments/LongBench
  bash longbench.sh
  ```
- Needle In A Haystack:

  ```shell
  cd experiments/NeedleInHaystack
  python pred_sustainable.py \
      --model-name lwm-text-chat-1m \
      -s 1000 \
      -e 30000 \
      --num-intervals 15 \
      --compress \
      --save-folder /the/folder/path
  ```
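  One plausible reading of `-s` / `-e` / `--num-intervals` is 15 evenly spaced context lengths from 1,000 to 30,000 tokens; check `pred_sustainable.py` for the exact scheme the script uses:

  ```python
  # Assumed interpretation of -s / -e / --num-intervals: evenly spaced
  # context lengths. Verify against pred_sustainable.py before relying on it.
  start, end, num = 1000, 30000, 15
  lengths = [round(start + i * (end - start) / (num - 1)) for i in range(num)]
  print(lengths[0], lengths[-1], len(lengths))  # 1000 30000 15
  ```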
Many thanks to SnapKV for their great work!