Bruce-Lee-LY

Pinned

  1. cuda_hgemm (Public)

    Several optimization methods for half-precision general matrix multiplication (HGEMM) using Tensor Cores with the WMMA API and MMA PTX instructions.

    Cuda, 203 stars, 45 forks

  2. cuda_hook (Public)

    Hooks CUDA-related dynamic libraries using automated code generation tools.

    C, 100 stars, 28 forks

  3. cuda_hgemv (Public)

    Several optimization methods for half-precision general matrix-vector multiplication (HGEMV) using CUDA cores.

    Cuda, 20 stars, 4 forks

  4. matrix_multiply (Public)

    Several common matrix multiplication methods implemented on the CPU and on NVIDIA GPUs using C++11 and CUDA.

    C++, 12 stars, 2 forks

  5. cuda_back2back_hgemm (Public)

    Uses Tensor Cores to compute back-to-back HGEMM (half-precision general matrix multiplication) with MMA PTX instructions.

    Cuda, 10 stars, 2 forks

  6. memory_pool (Public)

    A simple and efficient memory pool implemented in C++11.

    C++, 4 stars, 4 forks