Closed
Labels
documentation, research
Description
Tip from https://mas.to/@goranmoomin/110820724235904467
From my last time trying out llama.cpp & Llama 2, I don't think the text generation should be taking ~20s if you're using the Metal-accelerated implementation.
Sorry if you’ve already tried this, but have you tried setting the env vars
CMAKE_ARGS='-DLLAMA_METAL=on' FORCE_CMAKE=1
when installing llama-cpp-python? (ref https://github.com/abetlen/llama-cpp-python )
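
For reference, the tip above might be applied roughly as follows. This is a sketch, not a verified fix for the ~20s generation time: the two env vars come straight from the tip, while the pip flags (`--upgrade`, `--force-reinstall`, `--no-cache-dir`) are an assumption added here so that an existing CPU-only build is actually recompiled rather than reused from cache. Newer llama-cpp-python releases may use a different CMake flag name, so check the linked README for the version you are installing.

```shell
# Env vars taken from the tip: ask CMake to build llama.cpp with Metal support.
export CMAKE_ARGS='-DLLAMA_METAL=on'
export FORCE_CMAKE=1

# Assumption: force a rebuild from source so a previously installed
# non-Metal wheel is not silently reused.
pip install --upgrade --force-reinstall --no-cache-dir llama-cpp-python
```

If the rebuild picked up Metal, the startup log printed by llama.cpp when a model is loaded should mention Metal initialization. That is a quick way to confirm the flag took effect before re-measuring generation time.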