MIT HAN Lab
Repositories
- llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
- torchquantum
A PyTorch-based framework for quantum-classical simulation, quantum machine learning, quantum neural networks, and parameterized quantum circuits, with support for easy deployment on real quantum computers.
- vila-u
[ICLR 2025] VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation
- efficientvit
Efficient vision foundation models for high-resolution generation and perception.
- omniserve
[MLSys'25] QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving; [MLSys'25] LServe: Efficient Long-sequence LLM Serving with Unified Sparse Attention
- torchsparse
[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.