Wind0121
  • Huazhong University of Science and Technology



10 Lessons to Get Started Building AI Agents

Jupyter Notebook 8,335 2,066 Updated Mar 27, 2025

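A curated collection of open-source Chinese large language models, focusing on smaller models that can be privately deployed and are inexpensive to train, covering base models, domain-specific fine-tuning and applications, datasets, and tutorials.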

19,191 1,846 Updated Sep 19, 2024

Tensor library for machine learning

C++ 12,196 1,189 Updated Mar 27, 2025

Puzzles for learning Triton

Jupyter Notebook 1,540 122 Updated Nov 18, 2024

A repository to unravel the language of GPUs, making their kernel conversations easy to understand

Python 169 6 Updated Mar 21, 2025

Utilities intended for use with Llama models.

Python 5,952 1,013 Updated Mar 1, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 12,555 1,374 Updated Mar 27, 2025

High-speed Large Language Model Serving for Local Deployment

C++ 8,165 426 Updated Feb 19, 2025

Unified KV Cache Compression Methods for Auto-Regressive Models

Python 962 126 Updated Jan 4, 2025

Running large language models on a single GPU for throughput-oriented scenarios.

Python 9,287 569 Updated Oct 28, 2024

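A pure C++, cross-platform LLM acceleration library that can be called from Python; ChatGLM-6B-class models can reach 10,000+ tokens/s on a single GPU; supports GLM, LLaMA, and MOSS base models; runs smoothly on mobile devices.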

C++ 3,446 351 Updated Mar 19, 2025

A self-learning tutorial for CUDA High Performance Programming.

JavaScript 498 54 Updated Mar 6, 2025

📚200+ Tensor/CUDA Cores Kernels, ⚡️flash-attn-mma, ⚡️hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2 🎉🎉).

Cuda 3,032 318 Updated Mar 27, 2025

How to optimize some algorithms in CUDA.

Cuda 2,046 183 Updated Mar 26, 2025

Transformer: PyTorch Implementation of "Attention Is All You Need"

Python 3,521 502 Updated Aug 6, 2024

An AI Hedge Fund Team

Python 19,468 3,546 Updated Mar 25, 2025

A PyTorch native library for large model training

Python 3,501 322 Updated Mar 27, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,943 229 Updated Mar 4, 2025

CPU inference for the DeepSeek family of large language models in pure C++

C++ 281 29 Updated Feb 11, 2025

🚀🚀 "Large model": train a 26M-parameter GPT completely from scratch in just 2 hours!

Python 17,259 1,917 Updated Feb 23, 2025

The C++ Core Guidelines are a set of tried-and-true guidelines, rules, and best practices about coding in C++

CSS 43,449 5,467 Updated Jan 16, 2025

A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

Python 13,138 897 Updated Mar 25, 2025

Clean, minimal, accessible reproduction of DeepSeek R1-Zero

Python 11,377 1,439 Updated Mar 10, 2025

s1: Simple test-time scaling

Python 6,076 709 Updated Mar 6, 2025

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 5,938 516 Updated Mar 27, 2025

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 43,134 5,933 Updated Mar 27, 2025

A collection of LLM papers, blogs, and projects, with a focus on OpenAI o1 🍓 and reasoning techniques.

6,607 365 Updated Mar 26, 2025

Sky-T1: Train your own O1 preview model within $450

Python 3,160 320 Updated Mar 25, 2025

My learning notes/codes for ML SYS.

Python 1,580 89 Updated Mar 27, 2025

Material for gpu-mode lectures

Jupyter Notebook 4,133 416 Updated Feb 9, 2025