
Starred repositories

[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training

Python 169 9 Updated Mar 27, 2025

Applied AI experiments and examples for PyTorch

Python 250 24 Updated Mar 21, 2025

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 809 50 Updated Mar 19, 2025

DeepEP: an efficient expert-parallel communication library

Cuda 7,320 679 Updated Mar 27, 2025

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

6,944 230 Updated Mar 4, 2025

KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems

Python 238 21 Updated Mar 27, 2025

Get your documents ready for gen AI

Python 25,498 1,522 Updated Mar 26, 2025

An Easy-to-understand TensorOp Matmul Tutorial

C++ 332 38 Updated Sep 21, 2024

Ongoing research training transformer models at scale

Python 11,913 2,671 Updated Mar 27, 2025

Fully open reproduction of DeepSeek-R1

Python 23,398 2,127 Updated Mar 27, 2025

Material for gpu-mode lectures

Jupyter Notebook 4,133 417 Updated Feb 9, 2025

Code release for book "Efficient Training in PyTorch"

Python 53 8 Updated Oct 13, 2024

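A project for training a large language model from scratch, covering pretraining, fine-tuning, and direct preference optimization (DPO). The model has 1B parameters and supports Chinese and English.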

Python 313 45 Updated Feb 18, 2025
Jupyter Notebook 22 5 Updated Nov 7, 2024

Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.

Python 6,343 572 Updated Mar 21, 2025

[NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't Know'"

Python 109 10 Updated Jul 10, 2024

Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models

733 48 Updated Feb 28, 2025

LLaVA-Mini is a unified large multimodal model (LMM) that can support the understanding of images, high-resolution images, and videos in an efficient manner.

Python 423 18 Updated Jan 13, 2025

Stretching GPU performance for GEMMs and tensor contractions.

Python 234 158 Updated Mar 17, 2025

Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks

Python 2,085 305 Updated Mar 25, 2025

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

1,369 47 Updated Mar 27, 2025

This sample shows how to deploy the glm-edge model series using OpenVINO

Python 2 1 Updated Jan 13, 2025

Implemented using only torch: "bi-mamba2", "vision-mamba2-torch". Supports 1d/2d/3d/nd and export via jit.script/onnx.

Python 287 12 Updated Dec 11, 2024

GLM Series Edge Models

Python 131 6 Updated Feb 19, 2025

Run Generative AI models with simple C++/Python API and using OpenVINO Runtime

C++ 248 226 Updated Mar 26, 2025

A Python package that extends the official PyTorch to easily obtain better performance on Intel platforms

Python 1,800 263 Updated Mar 24, 2025
C++ 494 77 Updated Mar 25, 2025

LLM implementation one matrix multiplication at a time

Jupyter Notebook 9 4 Updated Aug 8, 2024

Library for modelling performance costs of different Neural Network workloads on NPU devices

C++ 32 3 Updated Mar 27, 2025