
Question #363

Closed · fakerybakery opened this issue Apr 25, 2024 · 3 comments

@fakerybakery

Hi, thank you for releasing llamafile! The speedups are very impressive. Are there any plans to merge the improvements made here upstream into the llama.cpp repo?
Thanks!

@twitchyliquid64

This was asked on Twitter; Justine's answer was yes, it's in flight: https://twitter.com/JustineTunney/status/1783332508505194674

@fakerybakery
Author

Thanks! That was me on Twitter :) But llamafile is still much faster than llama.cpp. Does that mean not all of the improvements have been upstreamed?

@jart
Collaborator

jart commented Apr 26, 2024

Yes, we're happy to share our optimizations with llama.cpp. Here are two PRs I sent them, which got merged:

The reason llamafile continues to be faster than llama.cpp is that I've discovered even more performance opportunities in the few weeks since I published my blog post at https://justine.lol/matmul/. My latest tricks will be upstreamed too, but they're still awaiting approval.
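
For readers curious what the blog post's matmul optimizations look like, here is a minimal sketch of one of the core ideas it describes: tiling the output matrix so a small block of accumulators stays in registers, increasing data reuse per memory load. This is an illustrative example only, not the actual llamafile or llama.cpp kernel; the function name `gemm_2x2`, the 2x2 tile size, and the even-dimension assumption are all hypothetical simplifications.

```cpp
// Minimal sketch of output tiling for matmul, in the spirit of
// https://justine.lol/matmul/. NOT the actual llamafile kernel:
// names, tile size, and assumptions here are hypothetical.
#include <cstddef>

// Computes C[m x n] += A[m x k] * B[k x n], all row-major.
// Assumes m and n are even for brevity (a real kernel handles edges).
void gemm_2x2(const float *A, const float *B, float *C,
              size_t m, size_t n, size_t k) {
    for (size_t i = 0; i < m; i += 2) {
        for (size_t j = 0; j < n; j += 2) {
            // Accumulate a 2x2 tile of C in registers, so every value
            // loaded from A and B is reused twice instead of once.
            float c00 = 0, c01 = 0, c10 = 0, c11 = 0;
            for (size_t l = 0; l < k; ++l) {
                float a0 = A[(i + 0) * k + l];
                float a1 = A[(i + 1) * k + l];
                float b0 = B[l * n + (j + 0)];
                float b1 = B[l * n + (j + 1)];
                c00 += a0 * b0; c01 += a0 * b1;
                c10 += a1 * b0; c11 += a1 * b1;
            }
            C[(i + 0) * n + (j + 0)] += c00;
            C[(i + 0) * n + (j + 1)] += c01;
            C[(i + 1) * n + (j + 0)] += c10;
            C[(i + 1) * n + (j + 1)] += c11;
        }
    }
}
```

Larger tiles, SIMD intrinsics, and multithreading build on the same reuse principle; the blog post walks through how far this can be pushed on modern CPUs.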

@jart jart closed this as completed Apr 26, 2024
@jart jart added the question label Apr 26, 2024