R-KV: Redundancy-aware KV Cache Compression for Reasoning Models
Updated Jun 13, 2025 - Python
(ACL 2025 main) SCOPE: Optimizing KV Cache Compression in Long-context Generation
PiKV: MoE KV Cache Management System [Efficient ML System]
Span Queries: What if we had a way to plan and optimize GenAI like we do for SQL?
[SIGMOD 2025] PQCache: Product Quantization-based KVCache for Long Context LLM Inference
This project implements an Emotion-Aware Music Generator (EAMG) that turns natural-language prompts into emotion-aligned music in real time. It uses a LoRA-tuned DistilBERT to classify emotions, maps them to musical parameters using music theory, and generates MIDI via a transformer model with KV caching for low-latency output.