- Microsoft Research
- Beijing
- https://baotonglu.github.io/
Stars
BS::thread_pool: a fast, lightweight, modern, and easy-to-use C++17 / C++20 / C++23 thread pool library
FlashInfer: Kernel Library for LLM Serving
This repo contains the source code for RULER: What’s the Real Context Size of Your Long-Context Language Models?
Low-Latency Transaction Scheduling via Userspace Interrupts: Why Wait or Yield When You Can Preempt? (SIGMOD 2025)
📰 Must-read papers on KV Cache Compression (constantly updating 🤗).
[NeurIPS'24 Spotlight, ICLR'25] Speeds up long-context LLM inference by computing attention with approximate, dynamic sparsity, which reduces inference latency by up to 10x for pre-filling on an …
[EMNLP'23, ACL'24] Speeds up LLM inference and enhances LLMs' perception of key information by compressing the prompt and KV cache, achieving up to 20x compression with minimal performance loss.
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
Header-only C++/Python library for fast approximate nearest neighbors
VLDB 2024 paper repo. RoarGraph: A Projected Bipartite Graph for Efficient Cross-Modal Approximate Nearest Neighbor Search
A high-throughput and memory-efficient inference and serving engine for LLMs
A library for efficient similarity search and clustering of dense vectors.
《Machine Learning Systems: Design and Implementation》- Chinese Version
Learning material for CMU10-714: Deep Learning System
12 weeks, 26 lessons, 52 quizzes, classic Machine Learning for all
DINOMO: An Elastic, Scalable, High-Performance Key-Value Store for Disaggregated Persistent Memory (PVLDB 2022, VLDB 2023)
The Art of Latency Hiding in Modern Database Engines (VLDB 2024)
CLHT is a very fast and scalable (lock-based and lock-free) concurrent hash table with cache-line sized buckets.
MICA: A Fast In-memory Key-Value Store (see isca2015 branch for the ISCA2015 version)
Easy flamegraphs for Rust projects and everything else, without Perl or pipes <3
Source code for the book Exploring BeagleBone, by Derek Molloy (see www.exploringbeaglebone.com)