GPU speed-up on Raspberry Pi 5 #226
Ubuntu doesn't even support the Vulkan Mesa driver you linked yet, so I doubt Tencent and beatmup are using the GPU on RPI5. Vulkan Mesa is for graphics processing; you can't use it with OpenCL to multiply matrices. Even if we rewrote GGML in a shader language, libraries like OpenGL, GLFW, GLEW, etc. all depend on X Windows and can't run headlessly for general computation tasks like linear algebra. Broadcom claims their GPU is capable of general-purpose computation:
The community project that lets Linux users write programs for Broadcom's GPU was abandoned three years ago and no longer builds. https://github.com/wimrijnders/V3DLib If you can show me how to multiply a matrix on this GPU without depending on frameworks, then I'll reopen this issue and strongly consider supporting it. |
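For context, the core workload being debated here is just dense matrix multiplication. A minimal sketch in plain Python (no frameworks, purely illustrative) of the computation any GPU path would have to reproduce:

```python
# Naive dense matrix multiply: C[i][j] = sum over k of A[i][k] * B[k][j].
# This triple loop is exactly what a GPU compute shader would parallelize.
def matmul(a, b):
    rows, inner, cols = len(a), len(b), len(b[0])
    assert len(a[0]) == inner, "inner dimensions must match"
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]
```

On a GPU, each output cell `C[i][j]` can be computed by an independent thread, which is why the hardware question above matters so much for inference speed.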
Thanks for the enlightening explanation. That is both good and bad news. Great that you're also enthusiastic about Raspberry Pi optimization, but sad to hear (and read) that there is so little support for VideoCore hardware. |
Looks like someone actually did rewrite GGML in a shader language. Yesterday ggerganov/llama.cpp#2059 just got merged in llama.cpp which adds Vulkan support and a whole bunch of shaders. This gives me new hope that Raspberry Pi 5 GPU support will be possible. Unfortunately it doesn't appear possible today. If I build llama.cpp at head with
I'm going to leave this open until we can circle back in possibly several months to a year, until the distro driver situation improves, or someone else leaves a comment here helping us figure out how to do this. In the meantime, please do try this yourself. It's possible I broke my Ubuntu install by using a PPA earlier. |
Awesome! It seems someone else in that thread also ran into an issue. I'll attempt building Llamafile from source on the Pi 5 and let you know how it goes. |
It compiles and runs.
This is on a Pi 5 with 8 GB of RAM running the latest Raspberry Pi OS Lite, fully updated/upgraded, with the Mesa Vulkan drivers installed.
Whether it's actually GPU-accelerated, though, is another question. I noticed this in the output:
The full log is below:
|
Seems they are speedily fixing bugs in llama.cpp. Issue: interactive mode is broken on Vulkan. Pull request: |
You can offload layers to the GPU with the |
Thanks @Mar2ck ! It worked fine on the first try with that option. The speed difference doesn't seem noticeable. Oddly, the base version itself seems to run much faster today compared to last time I tried; back then it generated one word per second. Not sure why it's different now.

BEFORE:

AFTER:

I added the logs. Technically speaking the GPU version is actually a little slower, which is strange.

Non-GPU version:
GPU version:
Funny how both runs decided the prompt was programming-related. |
Wait a tick:
|
https://www.phoronix.com/news/Raspberry-Pi-OS-Default-V3DV |
@chuangtc That's great news, thanks for sharing. Which model are you running though? I got a lot more tokens than that running small models (tinyllama-1.1b-1t-openorca.Q4_K_M.gguf) on the CPU. On that topic, I look forward to seeing what the new mathematical functions created by @jart will do to improve running on the Pi further, as those are said to speed up context ingestion. |
Here is what I am asking for help with on Reddit:
jason@raspberrypi5:~ $ vulkaninfo --summary
WARNING: [Loader Message] Code 0 : terminator_CreateInstance: Failed to CreateInstance in ICD 0. Skipping ICD.
==========
VULKANINFO
==========
Vulkan Instance Version: 1.3.239
|
Instance Extensions: count = 22
-------------------------------
VK_EXT_acquire_drm_display : extension revision 1
VK_EXT_acquire_xlib_display : extension revision 1
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_direct_mode_display : extension revision 1
VK_EXT_display_surface_counter : extension revision 1
VK_EXT_surface_maintenance1 : extension revision 1
VK_EXT_swapchain_colorspace : extension revision 4
VK_KHR_device_group_creation : extension revision 1
VK_KHR_display : extension revision 23
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_display_properties2 : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_surface_protected_capabilities : extension revision 1
VK_KHR_wayland_surface : extension revision 6
VK_KHR_xcb_surface : extension revision 6
VK_KHR_xlib_surface : extension revision 6
Instance Layers: count = 2
--------------------------
VK_LAYER_MESA_device_select Linux device selection layer 1.3.211 version 1
VK_LAYER_MESA_overlay Mesa Overlay layer 1.3.211 version 1
Devices:
========
GPU0:
apiVersion = 1.2.255
driverVersion = 23.2.1
vendorID = 0x14e4
deviceID = 0x55701c33
deviceType = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
deviceName = V3D 7.1.7
driverID = DRIVER_ID_MESA_V3DV
driverName = V3DV Mesa
driverInfo = Mesa 23.2.1-1~bpo12+rpt3
conformanceVersion = 1.3.6.1
deviceUUID = 5fd8106e-741a-cafa-e080-fdb16cf11a80
driverUUID = 1698c6ef-161f-3213-5159-557202953ee9
GPU1:
apiVersion = 1.3.255
driverVersion = 0.0.1
vendorID = 0x10005
deviceID = 0x0000
deviceType = PHYSICAL_DEVICE_TYPE_CPU
deviceName = llvmpipe (LLVM 15.0.6, 128 bits)
driverID = DRIVER_ID_MESA_LLVMPIPE
driverName = llvmpipe
driverInfo = Mesa 23.2.1-1~bpo12+rpt3 (LLVM 15.0.6)
conformanceVersion = 1.3.1.1
deviceUUID = 6d657361-3233-2e32-2e31-2d317e627000
driverUUID = 6c6c766d-7069-7065-5555-494400000000
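Note that the summary lists two devices: the real V3DV GPU (GPU0) and llvmpipe (GPU1), a Mesa software rasterizer that reports itself as a CPU device. As an illustrative sketch (the parsing logic is an assumption; the field names are taken from the dump above), output like this can be filtered to skip CPU fallback devices:

```python
def pick_gpu_devices(summary: str) -> list:
    """Parse `vulkaninfo --summary` text and return the names of
    devices that are not CPU fallbacks (e.g. skip llvmpipe)."""
    devices = []
    current = None
    for line in summary.splitlines():
        line = line.strip()
        # Device records start with headers like "GPU0:".
        if line.startswith("GPU") and line.endswith(":"):
            current = {}
            devices.append(current)
        # Inside a record, fields look like "deviceName = V3D 7.1.7".
        elif current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return [d["deviceName"] for d in devices
            if d.get("deviceType") != "PHYSICAL_DEVICE_TYPE_CPU"]
```

Fed the dump above, this would keep only "V3D 7.1.7", which is the device the Vulkan backend would need to select for any real speed-up.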
|
Raspberry Pi 5 doesn't have all the Vulkan 1.3 capabilities: https://gitlab.freedesktop.org/mesa/mesa/-/issues/10896 |
Update: the missing extensions are now marked as supported. |
I'm experimenting with Llamafile on a Raspberry Pi 5 with 8 GB of RAM, in order to integrate it with an existing privacy-protecting smart home voice control setup. This is working great so far, as long as very small models are used.
I was wondering: would it be possible to speed up inference on the Raspberry Pi 5 by using the GPU?
Through this Stack Overflow post I've found some frameworks that already do this, such as:
The Raspberry Pi 5's VideoCore GPU has Vulkan drivers:
https://www.phoronix.com/news/Mesa-RPi-5-VideoCore-7.1.x
Curious to hear your thoughts.
Related:
#40