[TBD] "m4: A Learned Flow-level Network Simulator" by Chenning Li, Anton A. Zabreyko, Arash Nasr-Esfahany, Kevin Zhao, Prateesh Goyal, Mohammad Alizadeh, Thomas Anderson.

C++ 6 Updated Mar 9, 2025

inferx-net / inferx

InferX is an Inference Function as a Service Platform

Rust 3 Updated Mar 11, 2025

yuyangJin / PerFlow-AI

PerFlow-AI is a programmable performance analysis, modeling, and prediction tool for AI systems.

Python 18 2 Updated Mar 20, 2025

TELOS-syslab / Pivot

C++ 2 Updated Dec 8, 2024

Mooncake-Labs / pg_mooncake

Postgres-Native Data Warehouse

C++ 1,208 32 Updated Mar 28, 2025

XiangpengHao / seen

Knowledge management for the impatient

Rust 23 3 Updated Mar 12, 2025

vllm-project / production-stack

vLLM’s reference system for K8S-native cluster-wide deployment with community-driven performance optimization

Python 924 121 Updated Mar 28, 2025

PeterGriffinJin / Search-R1

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 1,454 103 Updated Mar 27, 2025

hiyouga / EasyR1

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 1,761 114 Updated Mar 27, 2025

svissicchio / Repetita

REPETITA: Repeatable Experiments for Performance Evaluation of Traffic-Engineering Algorithms

Scala 32 17 Updated Sep 12, 2023

deepseek-ai / smallpond

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,434 388 Updated Mar 5, 2025

tamarin-prover / tamarin-prover

Main source code repository of the Tamarin prover for security protocol verification.

Haskell 452 137 Updated Mar 26, 2025

deepseek-ai / 3FS

A high-performance distributed file system designed to address the challenges of AI training and inference workloads.

C++ 8,389 814 Updated Mar 27, 2025

TUM-DSE / TNIC-main

2 Updated Feb 10, 2025

deepseek-ai / profile-data

Analyze computation-communication overlap in V3/R1.

970 130 Updated Mar 21, 2025

deepseek-ai / EPLB

Expert Parallelism Load Balancer

Python 1,108 177 Updated Mar 24, 2025

deepseek-ai / DualPipe

A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.

Python 2,675 281 Updated Mar 10, 2025

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 5,107 536 Updated Mar 28, 2025

ZiruiOu EnanaAwa
