Fast inference engine for Transformer models
deep-neural-networks
deep-learning
cpp
neon
machine-translation
openmp
parallel-computing
cuda
inference
avx
intrinsics
avx2
neural-machine-translation
opennmt
quantization
gemm
mkl
thrust
transformer-models
onednn
Updated Jul 11, 2024 - C++