Support CUDA without cuBLAS #82

Merged
merged 12 commits into from
Dec 12, 2023

Commits on Dec 10, 2023

  1. wip naive cublasGemmStridedBatchedEx

    Numbers match cublas, but using this code leads to LLaVA outputting
    nothing but white squares.
    mrdomino committed Dec 10, 2023
    7118b15
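
For readers unfamiliar with what a "naive" `cublasGemmStridedBatchedEx` has to compute, the following is a minimal CPU reference sketch, not the code from this PR. It assumes single-precision inputs, column-major storage, and no transposes. The function name `gemm_strided_batched_ref` and its parameter list are hypothetical and only loosely mirror the cuBLAS argument order. Comparing a GPU kernel against a reference like this is one way to check that the "numbers match".

```cpp
#include <cstddef>

// Hypothetical CPU reference (not the PR's kernel). For every batch index z:
//   C_z = alpha * A_z * B_z + beta * C_z
// A_z is m x k, B_z is k x n, C_z is m x n, all column-major, no transposes.
// Consecutive matrices in a batch are strideA / strideB / strideC elements apart.
void gemm_strided_batched_ref(int m, int n, int k, float alpha,
                              const float *A, int lda, long long strideA,
                              const float *B, int ldb, long long strideB,
                              float beta,
                              float *C, int ldc, long long strideC,
                              int batchCount) {
    for (int z = 0; z < batchCount; ++z) {
        const float *a = A + z * strideA;
        const float *b = B + z * strideB;
        float *c = C + z * strideC;
        for (int j = 0; j < n; ++j) {          // column of C
            for (int i = 0; i < m; ++i) {      // row of C
                float sum = 0.0f;
                for (int l = 0; l < k; ++l) {  // dot product along k
                    sum += a[i + static_cast<std::size_t>(l) * lda] *
                           b[l + static_cast<std::size_t>(j) * ldb];
                }
                std::size_t idx = i + static_cast<std::size_t>(j) * ldc;
                c[idx] = alpha * sum + beta * c[idx];
            }
        }
    }
}
```

A GPU version would typically assign the `z`, `j`, and `i` loops to blocks and threads. The indexing and the alpha/beta update stay the same.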

Commits on Dec 11, 2023

  1. set stream (code now works)

    mrdomino committed Dec 11, 2023
    3d47d97
  2. c2a7a5e
  3. Implement cublasGemmEx

    mrdomino committed Dec 11, 2023
    4fb0813

Commits on Dec 12, 2023

  1. Implement cublasSgemm

    mrdomino committed Dec 12, 2023
    ffb039d
  2. 43132ff
  3. Remove remaining cublas library calls in naive mode

    Uses some fairly disgusting preprocessor macros to get the job done
    while preserving behavior when building with `-DGGML_USE_CUBLAS`. With
    a bit of investigation into `ggml_cuda_mul_mat_mat_batched_cublas`,
    these can probably be removed or simplified.
    mrdomino committed Dec 12, 2023
    877c736
  4. add header file, remove cublas_v2.h from naive

    N.B. we include the source file rather than the header file in
    `ggml-cuda.cu` because `llamafile/cuda.c` assumes that everything lives
    in a single compilation unit.
    mrdomino committed Dec 12, 2023
    89f721d
  5. 16c9276
  6. rename naive -> tinyblas

    mrdomino committed Dec 12, 2023
    211be30
  7. 862bce5
  8. 881ebfd