
Support Metal inference #4

Closed
jonfairbanks opened this issue Jun 6, 2023 · 2 comments · Fixed by #56
Labels
enhancement (New feature or request) · help wanted (Extra attention is needed)

Comments

@jonfairbanks

The latest version of llama.cpp brings huge performance improvements on Mac hardware.

Could this be implemented here as well?

ggerganov/llama.cpp#1642

@louisgv
Owner

louisgv commented Jun 6, 2023

Yup, it's based on top of ggml, and although the Rust implementation deviates slightly from llama.cpp, it's likely portable. We can use Rust macros to deliver dedicated aarch64 GPU inference as well.
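
A minimal sketch of that idea using conditional compilation (`#[cfg]` attributes); the function names here are hypothetical placeholders, not this project's actual API:

```rust
// Sketch: select an Apple-Silicon (Metal) inference path at compile time.
// `run_inference` is a hypothetical placeholder, not the project's API.

#[cfg(all(target_arch = "aarch64", target_os = "macos"))]
fn run_inference(prompt: &str) {
    // On Apple Silicon macOS: dispatch to a Metal-accelerated backend.
    println!("Metal path: {prompt}");
}

#[cfg(not(all(target_arch = "aarch64", target_os = "macos")))]
fn run_inference(prompt: &str) {
    // Everywhere else: fall back to the CPU (ggml) implementation.
    println!("CPU path: {prompt}");
}

fn main() {
    run_inference("Hello, llama!");
}
```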

louisgv added the enhancement and help wanted labels Jun 6, 2023
louisgv pinned this issue Jun 23, 2023
@LLukas22
Collaborator

Since rustformers/llm@47a41c9, Metal support for llama-based architectures is in the main branch. It can be enabled by compiling with the `metal` feature flag (i.e. `--features metal`) and by setting `use-gpu` to `true` in the model config.
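
For reference, a minimal sketch of what that setting maps to in the underlying `llm` crate, assuming its `ModelParameters` struct exposes the `use_gpu` field described above and the binary was built with `cargo build --features metal`:

```rust
// Hedged sketch: enabling Metal inference through the `llm` crate.
// Assumes `ModelParameters` exposes a `use_gpu` field (per the comment above)
// and that the crate was compiled with `--features metal`.
fn main() {
    let params = llm::ModelParameters {
        use_gpu: true, // offload llama-architecture inference to the GPU via Metal
        ..Default::default()
    };
    println!("use_gpu = {}", params.use_gpu);
}
```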
