
fix: Metal backend #150

Draft: wants to merge 2 commits into main
Conversation

@PABannier (Owner)

This PR allows users to use the Metal (macOS) and cuBLAS backends by:

  • Exposing the n_gpu_layers parameter in the CLI
  • Using the Metal backend in the forward pass
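For illustration, exposing such a flag in a CLI might look like the sketch below. This is a hypothetical example, not the actual bark.cpp code; the struct and function names are made up, though the `-ngl` flag itself appears in the crash log later in this thread.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Hypothetical sketch of parsing an -ngl / --n-gpu-layers flag.
// Names are illustrative, not the repository's actual code.
struct cli_params {
    int n_gpu_layers = 0;  // 0 = CPU only; > 0 offloads layers to Metal/cuBLAS
};

static bool parse_args(int argc, const char **argv, cli_params &params) {
    for (int i = 1; i < argc; ++i) {
        if (std::strcmp(argv[i], "-ngl") == 0 ||
            std::strcmp(argv[i], "--n-gpu-layers") == 0) {
            if (i + 1 >= argc) return false;  // the flag requires a value
            params.n_gpu_layers = std::atoi(argv[++i]);
        }
    }
    return true;
}
```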

@siraben commented Apr 19, 2024

After it creates the tokens and runs ggml_metal_init, I get this:

ggml_metal_init: GPU name:   Apple M1 Pro
ggml_metal_init: GPU family: MTLGPUFamilyApple7 (1007)
ggml_metal_init: hasUnifiedMemory              = true
ggml_metal_init: recommendedMaxWorkingSetSize  = 21845.34 MB
ggml_metal_init: maxTransferRate               = built-in GPU
ggml_metal_add_buffer: allocated 'backend         ' buffer, size =    54.36 MB, (   54.98 / 21845.34)
encodec_load_model_weights: model size =    44.36 MB
encodec_load_model: n_q = 32
ggml_metal_add_buffer: allocated 'backend         ' buffer, size =   314.06 MB, (  369.05 / 21845.34)
encodec_eval: compute buffer size: 314.05 MB

ggml_metal_graph_compute_block_invoke: error: node   0, op =   REPEAT not implemented
GGML_ASSERT: /Users/siraben/Git/bark.cpp/encodec.cpp/ggml/src/ggml-metal.m:1428: false
ggml_metal_graph_compute_block_invoke: error: node 4677, op = MAP_CUSTOM2_F32 not implemented
[1]    9701 abort      ./examples/main/main -ngl 100 -t 8 -m ./ggml_weights/ggml_weights.bin -em  -p

@PABannier (Owner, Author)

Hello @siraben !
Indeed, it seems that some operations (e.g., repeat, which is used to broadcast computations) do not have a corresponding Metal kernel implemented in ggml. I'll open a PR to implement them.

@normatovjj
When I try to run cmake -DGGML_CUBLAS=ON .. I get:

CMake Warning at encodec.cpp/ggml/src/CMakeLists.txt:219 (message):
  cuBLAS not found

I also tried CMAKE_ARGS='-DLLAMA_CUBLAS=on' cmake .. and applied all the changes proposed in this pull request, but without success.
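A common cause of a "cuBLAS not found" CMake warning is that CMake cannot locate the CUDA toolkit at configure time. The sketch below shows a typical troubleshooting sequence; the paths are examples for a default Linux CUDA install, not known-good values for this repository.

```shell
# Assumption: CUDA toolkit installed under /usr/local/cuda (adjust to taste).
# Make nvcc discoverable before re-running CMake from a clean build dir.
export PATH=/usr/local/cuda/bin:$PATH
export CUDACXX=/usr/local/cuda/bin/nvcc
cmake -DGGML_CUBLAS=ON ..
```

If nvcc is not installed at all, the warning is expected and the build falls back to CPU.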
