CUDA support
Woolverine94 edited this page Apr 14, 2024 · 7 revisions
biniou was designed as a CPU-only, no-GPU-required application, but it is easy to make it use your NVIDIA GPU to accelerate inference. As of commit [630c975] (11/27/23), biniou supports the following features:
- Autodetection of the CUDA device and automatic configuration of biniou to use it
- If CUDA is enabled, use of the fp16 torch_dtype, which forces you to re-download the models but halves their size (when supported).
- If CUDA is enabled, use of cpu_offload to save as much VRAM as possible (when supported).
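The features above can be sketched in a few lines of PyTorch. This is a minimal illustration of the autodetection and dtype choice, not biniou's actual code; the function name `pick_device_and_dtype` is made up for the example:

```python
import torch

def pick_device_and_dtype():
    """Autodetect a CUDA device and choose a matching torch_dtype.

    On GPU, fp16 halves model size and VRAM usage (when the model
    supports it) but requires re-downloading the fp16 weights.
    """
    if torch.cuda.is_available():
        return "cuda", torch.float16
    return "cpu", torch.float32

device, dtype = pick_device_and_dtype()
print(device, dtype)
```

When loading a Diffusers pipeline with a device/dtype chosen this way, you would pass `torch_dtype=dtype` to `from_pretrained()` and, on CUDA, call `pipe.enable_model_cpu_offload()` instead of `pipe.to("cuda")` to keep VRAM usage low.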
Complementary prerequisites are an NVIDIA GPU with 4 GB+ of VRAM, a working CUDA 12.1 environment, and an already functional standard biniou installation.
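One quick way to check these prerequisites from a Python shell is the snippet below (a sketch, assuming the same PyTorch install that biniou uses; the 4 GiB threshold mirrors the requirement stated above):

```python
import torch

# Report whether a CUDA device is visible to PyTorch at all.
print("CUDA available:", torch.cuda.is_available())

if torch.cuda.is_available():
    # CUDA toolkit version PyTorch was built against; expect 12.1 here.
    print("CUDA version:", torch.version.cuda)
    props = torch.cuda.get_device_properties(0)
    vram_gib = props.total_memory / 1024**3
    # The GPU should report at least 4 GiB of VRAM.
    print(f"GPU: {props.name}, VRAM: {vram_gib:.1f} GiB")
```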
You can easily activate CUDA support by selecting the type of optimization to activate (CPU, CUDA or ROCm for Linux) in the WebUI control module.
Note: the CUDA support option for the Chatbot and Llava modules is in a separate settings panel, also in the WebUI control module.