Transformer related optimization, including BERT, GPT
Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
This repository mainly collects reading notes on top-conference papers relevant to NLP algorithm engineers.
Code repository for the research paper "Space Efficient Transformer Neural Network"
A tutorial of building tensorflow serving service from scratch
A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.
LightSeq: A High Performance Library for Sequence Processing and Generation
Running BERT without Padding
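The padding-free idea above can be sketched in a few lines: instead of padding every sequence in a batch to the longest length, concatenate the real tokens into one flat buffer and keep cumulative offsets so downstream kernels can locate sequence boundaries. This is an illustrative sketch of the general technique, not code from the repository; the function names are made up.

```python
def pack_sequences(batch):
    """Flatten a batch of variable-length token-id lists.

    Returns (flat_tokens, offsets) where offsets[i]:offsets[i+1] spans
    sequence i, so no pad tokens are stored or computed on.
    """
    flat, offsets = [], [0]
    for seq in batch:
        flat.extend(seq)
        offsets.append(len(flat))
    return flat, offsets


def unpack_sequences(flat, offsets):
    """Recover the original batch from the packed representation."""
    return [flat[offsets[i]:offsets[i + 1]] for i in range(len(offsets) - 1)]


# Two sequences of lengths 3 and 5: packed storage holds 8 real tokens
# instead of a 2 x 5 padded matrix with 10 slots.
batch = [[101, 7592, 102], [101, 2088, 2003, 2307, 102]]
flat, offsets = pack_sequences(batch)
assert unpack_sequences(flat, offsets) == batch
```

The saving grows with length variance: attention and feed-forward work scale with the number of stored tokens, so skipping pad positions directly cuts compute.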
Dutch/Indonesian BERT-NER setup.
Method for searching relevant podcast segments from transcripts using transformer models
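The segment-search approach above typically embeds each transcript segment with a transformer encoder and ranks segments by similarity to an embedded query. A minimal sketch of the ranking step, using made-up 2-d vectors in place of real model embeddings (function names are hypothetical, not from the repository):

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def top_segments(query_vec, segment_vecs, k=2):
    """Return indices of the k segments most similar to the query."""
    ranked = sorted(range(len(segment_vecs)),
                    key=lambda i: cosine(query_vec, segment_vecs[i]),
                    reverse=True)
    return ranked[:k]


# Toy embeddings: segment 0 matches the query exactly, segment 2 nearly.
segments = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]
print(top_segments([1.0, 0.0], segments))  # -> [0, 2]
```

In practice the vectors would come from a BERT-style sentence encoder, and an approximate-nearest-neighbor index replaces the full sort once the segment count gets large.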
Fast implementation of BERT inference directly on NVIDIA (CUDA, CUBLAS) and Intel MKL