Releases: ggerganov/llama.cpp

b2885

15 May 10:10
e8a7fd4
metal : support FA without mask + add asserts (#7278)

* ggml : fa without mask + add asserts

ggml-ci

* metal : support non-contiguous KV

ggml-ci

b2884

15 May 09:34
sync : ggml

ggml-ci

b2879

15 May 03:33
4f02636
server: free sampling contexts on exit (#7264)

* server: free sampling contexts on exit

This cleans up the last leak found by the address sanitizer.

* fix whitespace

* fix whitespace

b2878

14 May 20:01
1265c67
Revert "move ndk code to a new library (#6951)" (#7282)

This reverts commit efc8f767c8c8c749a245dd96ad4e2f37c164b54c.

b2877

14 May 13:10
5e31828
ggml : add RPC backend (#6829)

* ggml : add RPC backend

The RPC backend proxies all operations to a remote server which runs a
regular backend (CPU, CUDA, Metal, etc).

* set TCP_NODELAY

* add CI workflows

* Address review comments

* fix warning

* implement llama_max_devices() for RPC

* Address review comments

* Address review comments

* wrap sockfd into a struct

* implement get_alignment and get_max_size

* add get_device_memory

* fix warning

* win32 support

* add README

* readme : trim trailing whitespace

* Address review comments

* win32 fix

* Address review comments

* fix compile warnings on macos
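
The "set TCP_NODELAY" item above matters for a backend that proxies each operation over the network: many small request/response messages are delayed when Nagle's algorithm batches them. The snippet below is a minimal sketch of setting that option on a client socket. It is not the llama.cpp RPC code; the helper name `make_low_latency_socket` is hypothetical and only the standard `socket` module is used.

```python
import socket


def make_low_latency_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # TCP_NODELAY sends small writes immediately instead of coalescing them,
    # which suits chatty request/response traffic such as proxied operations.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


if __name__ == "__main__":
    s = make_low_latency_socket()
    # A non-zero value means the option is enabled on this socket.
    print(s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0)
    s.close()
```

The trade-off is more, smaller packets on the wire in exchange for lower per-message latency.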

b2876

14 May 08:54
5416002
llama : disable pipeline parallelism with nkvo (#7265)

b2875

14 May 08:29
efc8f76
move ndk code to a new library (#6951)

b2874

14 May 09:39
e0f5561
Add left recursion check: quit early instead of going into an infinite loop (#7083)

* Add left recursion check: quit early instead of going into an infinite loop

* Remove custom enum, rename left recursion check and move to "grammar internal" section, add handling for edge case where a leftmost nonterminal may be empty

* Remove unnecessary declaration
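
The check described above, including the edge case where a leftmost nonterminal may be empty, can be illustrated with a small sketch. This is not the llama.cpp implementation: the grammar representation (a dict mapping each rule name to a list of alternatives, each a list of symbols) and all function names are assumptions made for illustration. A symbol counts as a nonterminal only if it is a key of the dict.

```python
from typing import Dict, List, Set

# Hypothetical grammar shape: rule name -> alternatives -> symbols.
Grammar = Dict[str, List[List[str]]]


def nullable_rules(grammar: Grammar) -> Set[str]:
    """Return the rules that can derive the empty string (fixpoint iteration)."""
    nullable: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for name, alternatives in grammar.items():
            if name in nullable:
                continue
            # An alternative is empty-able if every symbol in it is nullable;
            # an empty alternative qualifies trivially.
            if any(all(sym in nullable for sym in alt) for alt in alternatives):
                nullable.add(name)
                changed = True
    return nullable


def leftmost_edges(grammar: Grammar, nullable: Set[str]) -> Dict[str, Set[str]]:
    """Map each rule to the rules it can reach before consuming any input."""
    edges: Dict[str, Set[str]] = {name: set() for name in grammar}
    for name, alternatives in grammar.items():
        for alt in alternatives:
            for sym in alt:
                if sym not in grammar:
                    break  # a terminal consumes input, so stop here
                edges[name].add(sym)
                if sym not in nullable:
                    break  # a non-empty rule hides the symbols after it
    return edges


def has_left_recursion(grammar: Grammar) -> bool:
    """Detect a cycle in the leftmost-edge graph with a depth-first search."""
    edges = leftmost_edges(grammar, nullable_rules(grammar))
    unvisited, in_progress, done = 0, 1, 2
    state = {name: unvisited for name in grammar}

    def visit(name: str) -> bool:
        state[name] = in_progress
        for nxt in edges[name]:
            if state[nxt] == in_progress:
                return True  # reached a rule that is still being expanded
            if state[nxt] == unvisited and visit(nxt):
                return True
        state[name] = done
        return False

    return any(state[name] == unvisited and visit(name) for name in grammar)
```

For example, `{"expr": [["expr", "+", "term"], ["term"]], "term": [["num"]]}` is reported as left-recursive, while the right-recursive form `[["term", "+", "expr"], ["term"]]` is not. Running such a check once at load time lets a parser reject the grammar up front instead of looping at parse time.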

b2871

14 May 00:43
614d3b9
llama : less KV padding when FA is off (#7257)

ggml-ci

b2870

13 May 23:55
30e7033
llava-cli: fix base64 prompt (#7248)