
Question #363

Closed · fakerybakery opened this issue Apr 25, 2024 · 3 comments

@fakerybakery

Hi, thank you for releasing llamafile! The speedups are very impressive. Are there any plans to merge the improvements made here upstream into the llama.cpp repo?
Thanks!

@twitchyliquid64

This was asked on Twitter; Justine's answer was yes, it's in flight: https://twitter.com/JustineTunney/status/1783332508505194674

@fakerybakery
Author

Thanks! That was me on Twitter :) But llamafile is still much faster than llama.cpp. Does that mean not all of the improvements have been upstreamed?

@jart
Collaborator

jart commented Apr 26, 2024

Yes, we're happy to share our optimizations with llama.cpp. Here are two PRs I sent them, which got merged:

The reason llamafile continues to be faster than llama.cpp is that I've discovered even more performance opportunities in the few weeks since I published my blog post at https://justine.lol/matmul/. My latest tricks will be upstreamed too, but they're still awaiting approval.
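
For readers curious what the blog post's matmul optimizations look like, here is a minimal sketch of one of the core ideas it describes: tiling the output matrix so a small block of accumulators stays in registers, increasing data reuse per memory load. This is an illustrative example only, not the actual llamafile or llama.cpp kernel; the function name `gemm_2x2`, the 2x2 tile size, and the even-dimension assumption are all hypothetical simplifications.

```cpp
// Minimal sketch of output tiling for matmul, in the spirit of
// https://justine.lol/matmul/. NOT the actual llamafile kernel:
// names, tile size, and assumptions here are hypothetical.
#include <cstddef>

// Computes C[m x n] += A[m x k] * B[k x n], all row-major.
// Assumes m and n are even for brevity (a real kernel handles edges).
void gemm_2x2(const float *A, const float *B, float *C,
              size_t m, size_t n, size_t k) {
    for (size_t i = 0; i < m; i += 2) {
        for (size_t j = 0; j < n; j += 2) {
            // Accumulate a 2x2 tile of C in registers, so every value
            // loaded from A and B is reused twice instead of once.
            float c00 = 0, c01 = 0, c10 = 0, c11 = 0;
            for (size_t l = 0; l < k; ++l) {
                float a0 = A[(i + 0) * k + l];
                float a1 = A[(i + 1) * k + l];
                float b0 = B[l * n + (j + 0)];
                float b1 = B[l * n + (j + 1)];
                c00 += a0 * b0; c01 += a0 * b1;
                c10 += a1 * b0; c11 += a1 * b1;
            }
            C[(i + 0) * n + (j + 0)] += c00;
            C[(i + 0) * n + (j + 1)] += c01;
            C[(i + 1) * n + (j + 0)] += c10;
            C[(i + 1) * n + (j + 1)] += c11;
        }
    }
}
```

Larger tiles, SIMD intrinsics, and multithreading build on the same reuse principle; the blog post walks through how far this can be pushed on modern CPUs.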

@jart jart closed this as completed Apr 26, 2024
@jart jart added the question label Apr 26, 2024