gemm

Wrapper around intgemm (x86_64) and ruy (ARM) that selects between them based on architecture, providing a fast matrix multiplication backend for Mozilla Firefox's translation feature.

wrapper arm x86-64 gemm

Updated Apr 20, 2022
C++

scocoyash / Convolution-To-Gemm

My experiments with convolution

matrix-multiplication convolution openmpi gemm gemm-optimization

Updated Jun 21, 2020
C++

xylcbd / gemm_base

gemm baseline code.

gemm mkl openblas gemm-optimization

Updated Oct 22, 2017
C++

KaiserKlayton / lpa_cnn

Low Precision Arithmetic for Convolutional Neural Network Inference

benchmarking caffe deep-learning image-recognition convolutional-neural-networks 8-bit gemm

Updated Oct 29, 2017
C++

yester31 / OpenCL_EX

Development of deep learning inference code using OpenCL kernel functions.

opencl parallel-computing convolution deeplearning gemm

Updated Jun 1, 2022
C++

yester31 / GEMM_Conv2d_CUDA

CUDA Gemm Convolution implementation

cuda cublas convolution cuda-kernels gemm cuda-programming

Updated Feb 4, 2022
C++

blackccpie / fastconv

fast 2D convolution implementation benchmark

cpp avx simd convolution gemm toeplitz im2col

Updated Nov 21, 2017
C++

zixuanweeei / gemm-opt

Manually optimize the GEMM (GEneral Matrix Multiply) operation. There is a long way to go.

cpu cpp gemm gemm-optimization

Updated Aug 22, 2021
C++

CambriconECO / BANGC_Gemm_Tutorial

algorithm gemm cambricon bangc

Updated Apr 7, 2021
C++

XiaoSong9905 / dgemm-knl

DGEMM on KNL, achieving 75% of MKL performance

hpc high-performance linear-algebra x86 gemm dgemm

Updated May 19, 2022
C++

eth-cscs / spla

Specialized Parallel Linear Algebra, providing distributed GEMM functionality for specific matrix distributions with optional GPU acceleration.

linear-algebra mpi cuda gemm rocm

Updated Jun 26, 2024
C++

CoffeeBeforeArch / mmul

Serial and parallel implementations of matrix multiplication

serial parallel matrix-multiplication benchmarks gemm mmul

Updated Feb 19, 2021
C++

CNugteren / CLBlast

Tuned OpenCL BLAS

gpu opencl matrix-multiplication blas gemm blas-libraries clblas

Updated Jun 13, 2024
C++

OpenNMT / CTranslate2

Fast inference engine for Transformer models

Updated Jun 28, 2024
C++

gemm

Here are 19 public repositories matching this topic...

PhuNH / hpc-aa

riskybacon / mnist_arma_blas

BenQuickDeNN / CUDA-GEMM

LRZ-BADW / OMMOP

a-sidorova / gpu_opencl_cource

jerinphilip / MozIntGemm

scocoyash / Convolution-To-Gemm

xylcbd / gemm_base

KaiserKlayton / lpa_cnn

yester31 / OpenCL_EX

yester31 / GEMM_Conv2d_CUDA

blackccpie / fastconv

zixuanweeei / gemm-opt

CambriconECO / BANGC_Gemm_Tutorial

XiaoSong9905 / dgemm-knl

eth-cscs / spla

CoffeeBeforeArch / mmul

CNugteren / CLBlast

OpenNMT / CTranslate2
