Jinghan Yao (YJHMITWEB)

Flover Public

Temporal Fusion Framework for Efficient Autoregressive Model Parallel Inference

C++, 8 stars

NVIDIA/TensorRT-LLM Public

TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…

C++, 8.5k stars, 969 forks

NVIDIA/nccl Public

Optimized primitives for collective multi-GPU communication

C++, 3.2k stars, 810 forks

fudan-zvg/SOFT Public

[NeurIPS 2021 Spotlight] & [IJCV 2024] SOFT: Softmax-free Transformer with Linear Complexity

Python, 304 stars, 25 forks

microsoft/DeepSpeed Public

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Python, 35.3k stars, 4.1k forks