NVIDIA Inference Xfer Library (NIXL)

C++ 209 20 Updated Mar 26, 2025

NATS Streaming System Server

Go 2,522 286 Updated Apr 1, 2024

A Datacenter Scale Distributed Inference Serving Framework

Rust 3,367 232 Updated Mar 28, 2025

VS Code in the browser

TypeScript 70,531 5,825 Updated Mar 15, 2025

Cost-efficient and pluggable Infrastructure components for GenAI inference

Jupyter Notebook 3,340 310 Updated Mar 29, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,329 684 Updated Mar 28, 2025

FlashMLA: Efficient MLA decoding kernels

C++ 11,386 811 Updated Mar 1, 2025

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

C++ 678 56 Updated Jan 21, 2025

Stateful LLM Serving

Python 50 10 Updated Mar 11, 2025

A highly optimized LLM inference acceleration engine for Llama and its variants.

C++ 881 103 Updated Mar 24, 2025

Open Fabric Interfaces

C 641 405 Updated Mar 26, 2025

Collective communications library with various primitives for multi-machine training.

C++ 1,282 321 Updated Mar 28, 2025

Optimized primitives for collective multi-GPU communication

C++ 3,607 886 Updated Mar 24, 2025

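Currently covers 203 large models, including commercial models such as chatgpt, gpt-4o, o3-mini, Google gemini, Claude3.5, Zhipu GLM-Zero, ERNIE Bot (文心一言), qwen-max, Baichuan, iFlytek Spark, SenseTime senseChat, and minimax, as well as DeepSeek-R1, qwq-32b, deepseek-v3, qwen2.5, llama3.3, phi-4, glm4, gemma3, mistral, Shusheng (书生) in…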

3,895 166 Updated Mar 29, 2025

Unified Communication X (mailing list - https://elist.ornl.gov/mailman/listinfo/ucx-group)

C 1,276 450 Updated Mar 29, 2025

Infiniband Verbs Performance Tests

C 719 318 Updated Mar 25, 2025

DRAM/SSD hybrid caching system

C++ 12 1 Updated Mar 13, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 2,944 188 Updated Mar 29, 2025

A programming framework for agentic AI 🤖 PyPi: autogen-agentchat Discord: https://aka.ms/autogen-discord Office Hour: https://aka.ms/autogen-officehour

Python 42,359 6,331 Updated Mar 29, 2025

Distributed Task Queue (development branch)

Python 25,938 4,759 Updated Mar 27, 2025

Simple, reliable, and efficient distributed task queue in Go

Go 10,860 777 Updated Mar 11, 2025

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting yo…

TypeScript 87,596 12,965 Updated Mar 29, 2025

A cluster solution for the Janus WebRTC server, using an API proxy approach

Python 209 50 Updated Sep 2, 2023

Janus WebRTC Server

C 8,523 2,521 Updated Mar 24, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 43,028 6,535 Updated Mar 29, 2025

Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚

Python 27,240 1,669 Updated Mar 21, 2025

Kolors Team

Python 4,304 322 Updated Nov 13, 2024

A generative speech model for daily dialogue.

Python 35,454 3,838 Updated Mar 14, 2025

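Uses selenium to crawl resources posted on forums built with Discuz, auto-comments to reveal hidden content, simulates dragging slider CAPTCHAs, and saves the results to Feimaoyun (飞猫云)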

Python 5 6 Updated Feb 22, 2022

A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

C++ 5,336 630 Updated Mar 25, 2025