SqueezeAILab
SqueezeAI is part of the Berkeley AI Research Lab at UC Berkeley and focuses on AI systems research.
Popular repositories
- LLMCompiler (Public): [ICML 2024] LLMCompiler: An LLM Compiler for Parallel Function Calling
- SqueezedAttention (Public): [ACL 2025] Squeezed Attention: Accelerating Long Prompt LLM Inference
Repositories
Showing 10 of 14 repositories
- KVQuant (Public): [NeurIPS 2024] KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization