
Starred repositories
[ICLR 2025] COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training
Applied AI experiments and examples for PyTorch
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
DeepEP: an efficient expert-parallel communication library
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
KernelBench: Can LLMs Write GPU Kernels? - Benchmark with Torch -> CUDA problems
Get your documents ready for gen AI
An easy-to-understand TensorOp Matmul Tutorial
Ongoing research training transformer models at scale
Fully open reproduction of DeepSeek-R1
Code release for the book "Efficient Training in PyTorch"
A project for training a large language model from scratch, covering pretraining, fine-tuning, and direct preference optimization; the model has 1B parameters and supports Chinese and English.
Agent framework and applications built upon Qwen>=2.0, featuring Function Calling, Code Interpreter, RAG, and Chrome extension.
[NAACL 2024 Outstanding Paper] Source code for the NAACL 2024 paper entitled "R-Tuning: Instructing Large Language Models to Say 'I Don't Know'"
Awesome-LLM-Robustness: a curated list of Uncertainty, Reliability and Robustness in Large Language Models
LLaVA-Mini is a unified large multimodal model (LMM) that efficiently supports understanding of images, high-resolution images, and videos.
Stretching GPU performance for GEMMs and tensor contractions.
Open-source evaluation toolkit for large multi-modality models (LMMs), supporting 220+ LMMs and 80+ benchmarks
📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥
This sample shows how to deploy the glm-edge model series using OpenVINO
Implemented purely in torch: "bi-mamba2" and "vision-mamba2-torch"; supports 1D/2D/3D/nD and export via jit.script/ONNX
Run Generative AI models with simple C++/Python API and using OpenVINO Runtime
A Python package that extends official PyTorch to easily obtain performance gains on Intel platforms
LLM implementation one matrix multiplication at a time
Library for modelling performance costs of different Neural Network workloads on NPU devices