Yup, it's built on top of ggml, and although the Rust implementation deviates slightly from llama.cpp, it's likely portable. We could also use Rust macros to deliver dedicated aarch64 GPU inferencing.
Since rustformers/llm@47a41c9, Metal support for llama-based architectures is in the main branch. It can be enabled by compiling with the `metal` feature flag (`--features metal`) and setting `use-gpu` to true in the model config.
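Concretely, enabling this might look something like the following sketch. The `--features metal` flag and `use-gpu` setting come from the comment above; the exact binary name and config file layout are illustrative and should be checked against the rustformers/llm docs:

```shell
# Build the crate with the Metal feature enabled
cargo build --release --features metal

# Illustrative model config fragment (key name from the comment above):
#   use-gpu = true
```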
The latest version of llama.cpp has seen huge performance improvements on Mac hardware.
Could this be implemented here as well?
ggerganov/llama.cpp#1642