gemm

Here are 64 public repositories matching this topic...

PhuNH / hpc-aa

High Performance Computing - Algorithms and Applications Course in WS18-19 at TUM

cuda fermi gemm

Updated Feb 3, 2019
C++

KarhouTam / cuda-kernels

Some common CUDA kernel implementations (Not the fastest).

cuda-kernels gemm softmax relu cuda-programming layernorm cuda-learning

Updated Jun 13, 2024
Cuda

jhson989 / fast-conv

Fast Convolution Implementation via CUDA

cuda convolution gemm

Updated Apr 26, 2022
Cuda

a-sidorova / gpu_opencl_cource

Course Programming on new Architecture-1 (GPU), autumn 2021

gpu opencl jacobi gemm heterogeneous-computing

Updated Dec 5, 2021
C++

enp1s0 / cuMpSGEMM

Fast SGEMM emulation on Tensor Cores

gpu cuda gemm half-precision mixed-precision tensorcore tensorcores fp32

Updated Nov 20, 2023
Cuda

dev0x13 / gemm-benchmark-2023

Benchmarks for modern (2023) high-performance floating-point GEMM implementations.

benchmark mojo gemm

Updated Dec 25, 2023
C++

rollingbug / LinMatrix

A lightweight matrix computation library aimed at MCUs and embedded systems

microcontroller linear-algebra embedded-systems matrix-multiplication eigenvectors numerical-methods mcu eigenvalues gemm language-c lu-decomposition matrix-library qr-decomposition matrix-computations axpy

Updated Feb 24, 2022
C

junyoung1992 / OpenCL-GEMM

GEMM Optimization

acceleration opencl parallelism gpgpu gemm

Updated Jan 12, 2022
C

yester31 / OpenCL_EX

Development of deep learning inference code using OpenCL kernel functions.

opencl parallel-computing convolution deeplearning gemm

Updated Jun 1, 2022
C++

KaiserKlayton / lpa_cnn

Low Precision Arithmetic for Convolutional Neural Network Inference

benchmarking caffe deep-learning image-recognition convolutional-neural-networks 8-bit gemm

Updated Oct 29, 2017
C++

andreytkachenko / yarblas

Yet another Rust BLAS

rust machine-learning math rust-lang blas gemm

Updated Feb 13, 2020
Rust

ZhangGe6 / how-to-optimize-playground

High-performance computing (HPC) demos since I was a freshman.

cuda x86 gemm

Updated Jun 15, 2022
C

JoeruCodes / CUDA-GEMM-kernel

My attempt at making a GEMM kernel...

parallel-computing cuda cuda-kernels gemm gemm-optimization cuda-programming gemms

Updated Jun 16, 2023
Cuda

pminhtam / xnor_conv_pytorch_extension

XNOR-Net with binary conv2d kernels built on an XNOR GEMM op; supports both CPU and GPU.

cpp cuda pytorch xnor-net gemm binary-convolutions xnor-convolutions binary-neural-networks binary-op pytorch-extension

Updated Oct 25, 2022
C

digital-nomad-cheng / matmul_cuda_kernel_tvm

Generate optimized MatMul CUDA kernels automatically using the TVM auto-scheduler.

hpc gpu cuda gemm tvm gemm-optimization matmul

Updated Feb 25, 2023
Jupyter Notebook

zixuanweeei / gemm-opt

Manually optimize the GEMM (GEneral Matrix Multiply) operation. There is a long way to go.

cpu cpp gemm gemm-optimization

Updated Aug 22, 2021
C++

cyrusmsk / gemm_apple

GEMM on Apple Silicon

benchmark deep-learning gemm applesilicon m1-mac

Updated Dec 25, 2023
Python

xylcbd / gemm_base

GEMM baseline code.

gemm mkl openblas gemm-optimization

Updated Oct 22, 2017
C++

riskybacon / mnist_arma_blas

machine-learning matrix-multiplication blas gemm

Updated Feb 4, 2018
C++

BoooC / Implementation-of-a-Flexible-and-Energy-Efficient-Accelerator-For-Sparse-Convolution-Neural-Network

A Flexible and Energy Efficient Accelerator For Sparse Convolution Neural Network

deep-neural-networks accelerator rtl verilog convolutional-neural-networks sparse-matrix gemm dla lenet-5 im2col eyeriss hardware-accelerator eyeriss-v2 hm-noc

Updated Jun 7, 2024
Verilog
