autra-weiliu

🏀

I may be slow to respond.

Liu Wei autra-weiliu

Deep dive into deep learning algorithms & infra

15 followers · 0 following

autra tech
Beijing
UTC +08:00

Pinned

Megatron-LM Public

Forked from NVIDIA/Megatron-LM

Ongoing research training transformer models at scale

Python

DeepSpeed Public

Forked from microsoft/DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python

how-to-optim-algorithm-in-cuda Public

Forked from BBuf/how-to-optim-algorithm-in-cuda

How to optimize some algorithms in CUDA.

Cuda

TensorRT-LLM Public

Forked from NVIDIA/TensorRT-LLM

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++

flash-attention Public

Forked from Dao-AILab/flash-attention

Fast and memory-efficient exact attention

Python

vllm Public

Forked from vllm-project/vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Python