mlsys
Here are 35 public repositories matching this topic...
🚀 Awesome System for Machine Learning ⚡️ AI System Papers and Industry Practice. ⚡️ System for Machine Learning, LLM (Large Language Model), GenAI (Generative AI). 🍻 OSDI, NSDI, SIGCOMM, SoCC, MLSys, etc. 🗃️ Llama3, Mistral, etc. 🧑‍💻 Video Tutorials.
Updated Aug 14, 2024
Quantized Attention that achieves speedups of 2.1-3.1x and 2.7-5.1x compared to FlashAttention2 and xformers, respectively, without losing end-to-end metrics across various models.
Updated Mar 21, 2025 - Cuda
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
Updated Mar 25, 2025 - Cuda
FedScale is a scalable and extensible open-source federated learning (FL) platform.
Updated Dec 18, 2023 - Python
SpargeAttention: A training-free sparse attention that can accelerate any model inference.
Updated Mar 14, 2025 - Cuda
Measure and optimize the energy consumption of your AI applications!
Updated Mar 26, 2025 - Python
Machine Learning Framework for Operating Systems - Brings ML to Linux kernel
Updated Dec 13, 2021 - C
An acceleration library that supports arbitrary bit-width combinatorial quantization operations
Updated Sep 30, 2024 - C++
A scalable & efficient active learning/data selection system for everyone.
Updated Jul 8, 2024 - Python
A curated collection of noteworthy MLSys bloggers (Algorithms/Systems)
Updated Jan 5, 2025 - HTML
Distributed RL System for LLM Reasoning
Updated Mar 11, 2025 - Python
📚FFPA(Split-D): Yet another Faster Flash Prefill Attention with O(1) GPU SRAM complexity for headdim > 256, ~2x↑🎉vs SDPA EA.
Updated Mar 25, 2025 - Cuda
A ChatGPT(GPT-3.5) & GPT-4 Workload Trace to Optimize LLM Serving Systems
Updated Oct 15, 2024 - Python
Optimal Sparse Decision Trees
Updated Apr 27, 2023 - Python
Federated Learning Systems Paper List
Updated Feb 7, 2024
NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference
Updated Dec 9, 2024 - Python
sensAI: ConvNets Decomposition via Class Parallelism for Fast Inference on Live Data
Updated Jul 25, 2024 - Python