Transformer related optimization, including BERT, GPT
Updated Mar 27, 2024 - C++
This repository mainly collects reading notes on top-conference papers relevant to NLP algorithm engineers.
LightSeq: A High Performance Library for Sequence Processing and Generation
A fast and user-friendly runtime for transformer inference (BERT, ALBERT, GPT-2, decoders, etc.) on CPU and GPU.
Fast implementation of BERT inference directly on NVIDIA GPUs (CUDA, cuBLAS) and Intel MKL.
Running BERT without Padding
Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
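The "without padding" idea above can be sketched in a few lines: instead of padding every sequence in a batch to the longest one, concatenate the real tokens into one flat buffer and keep cumulative sequence lengths so each sequence can still be located. This is a minimal illustrative sketch of the packing step only, not the library's actual API; the function names are assumptions.

```python
# Sketch of padding-free batching: pack variable-length token sequences
# into one flat buffer plus cumulative lengths (names are illustrative).

def pack_sequences(seqs):
    """Concatenate sequences; cu[i]:cu[i+1] delimits sequence i in flat."""
    flat, cu = [], [0]
    for s in seqs:
        flat.extend(s)
        cu.append(cu[-1] + len(s))
    return flat, cu

def unpack_sequences(flat, cu):
    """Recover the original sequences from the packed representation."""
    return [flat[cu[i]:cu[i + 1]] for i in range(len(cu) - 1)]

batch = [[101, 7, 8, 102], [101, 9, 102], [101, 1, 2, 3, 4, 102]]
flat, cu = pack_sequences(batch)
# A padded batch would occupy 3 * 6 = 18 token slots; the packed
# buffer holds only the 13 real tokens.
assert len(flat) == 13 and cu == [0, 4, 7, 13]
assert unpack_sequences(flat, cu) == batch
```

The kernels then operate on the flat buffer and use the offsets to keep attention within each sequence, so no compute is wasted on pad tokens.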
A tutorial of building tensorflow serving service from scratch
A method for searching relevant podcast segments in transcripts using transformer models.
Dutch/Indonesian BERT-NER setup.
Code repository for the research paper "Space Efficient Transformer Neural Network"