SpeedUp

  1. LightSeq: A High Performance Inference Library for Transformers [NAACL 2021] Xiaohui Wang, Ying Xiong, Yang Wei, Mingxuan Wang, Lei Li.
    1. Uses rewritten custom kernels together with cuBLAS GEMM; this accounts for most of the time saved.
    2. Proposes a hierarchical beam search method that uses a two-stage retrieve-and-rerank strategy to reduce the softmax computation.
    3. Dynamic GPU memory reuse.
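
The retrieve-and-rerank idea in item 2 can be illustrated with a small CPU-side sketch. This is only a minimal illustration of two-stage top-k selection, not LightSeq's actual GPU kernel: the function name `two_stage_top_k` and the strided grouping scheme are assumptions made for this example. Stage 1 splits the logits into `k` groups and uses the smallest group maximum as a rough threshold; stage 2 runs an exact top-k over the surviving candidates only.

```python
import heapq


def two_stage_top_k(logits, k):
    """Return the indices of the k largest logits, largest first.

    Hypothetical helper for illustration; not part of LightSeq's API.
    """
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and len(logits)")

    # Stage 1 (retrieve): split the logits into k strided groups
    # and take the maximum of each group.
    group_maxes = [max(logits[g::k]) for g in range(k)]

    # At least k logits (the group maxima) are >= the smallest group
    # maximum, so every true top-k logit must also pass this threshold.
    threshold = min(group_maxes)
    candidates = [(v, i) for i, v in enumerate(logits) if v >= threshold]

    # Stage 2 (rerank): exact top-k over the candidate set only.
    top = heapq.nlargest(k, candidates, key=lambda c: c[0])
    return [i for _, i in top]
```

For example, `two_stage_top_k([0.1, 2.5, -1.0, 3.2, 0.7, 1.9, -0.3, 2.8], 3)` returns `[3, 7, 1]`. The candidate set is usually much smaller than the vocabulary, so the exact ranking step touches far fewer entries than a full sort would.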