Releases: ggerganov/llama.cpp
b2885
b2884
sync : ggml ggml-ci
b2879
server: free sampling contexts on exit (#7264)
This cleans up the last leak found by the address sanitizer. (Plus whitespace fixes.)
b2878
Revert "move ndk code to a new library (#6951)" (#7282)
This reverts commit efc8f767c8c8c749a245dd96ad4e2f37c164b54c.
b2877
ggml : add RPC backend (#6829)
The RPC backend proxies all operations to a remote server which runs a regular backend (CPU, CUDA, Metal, etc.).
* set TCP_NODELAY
* add CI workflows
* implement llama_max_devices() for RPC
* wrap sockfd into a struct
* implement get_alignment and get_max_size
* add get_device_memory
* win32 support
* add README
(Plus review-comment fixes, warning fixes, and whitespace cleanup.)
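The proxying idea can be sketched in a toy, self-contained form: a client-side "backend" that serializes each operation over TCP to a server, which executes it with a local backend and sends the result back. All names here (`RpcBackend`, `serve_once`, `LOCAL_OPS`) are illustrative assumptions, not ggml's actual RPC API; the only detail borrowed from the PR is setting TCP_NODELAY on the connection.

```python
import json
import socket
import threading

# Toy stand-in for a "regular backend" that executes ops locally on the server.
LOCAL_OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}

def serve_once(port, ready):
    """Accept one connection, execute one op with the local backend, reply."""
    srv = socket.socket()
    srv.bind(("127.0.0.1", port))
    srv.listen(1)
    ready.set()  # signal that the server is accepting connections
    conn, _ = srv.accept()
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)  # as in the PR
    req = json.loads(conn.makefile().readline())
    reply = {"result": LOCAL_OPS[req["op"]](*req["args"])}
    conn.sendall((json.dumps(reply) + "\n").encode())
    conn.close()
    srv.close()

class RpcBackend:
    """Client-side proxy: every op is shipped to the remote endpoint."""
    def __init__(self, endpoint):
        host, port = endpoint.split(":")
        self.addr = (host, int(port))

    def run(self, op, *args):
        with socket.create_connection(self.addr) as c:
            c.sendall((json.dumps({"op": op, "args": args}) + "\n").encode())
            return json.loads(c.makefile().readline())["result"]
```

With `serve_once` running in another thread or process, a client would call e.g. `RpcBackend("127.0.0.1:50199").run("add", 2, 3)`; the real backend exposes many more operations (alignment, max size, device memory), but the transport shape is the same.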
b2876
llama : disable pipeline parallelism with nkvo (#7265)
b2875
move ndk code to a new library (#6951)
b2874
Add left recursion check: quit early instead of going into an infinite loop (#7083)
* Remove custom enum; rename the left recursion check and move it to the "grammar internal" section; add handling for the edge case where a leftmost nonterminal may be empty
* Remove unnecessary declaration
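The check this entry describes can be sketched independently of llama.cpp's grammar code: first compute which nonterminals can derive the empty string, then walk each production's leftmost symbols, skipping past nullable ones (the empty-nonterminal edge case the PR handles), and flag any cycle. This is a standalone illustration under those assumptions, not the actual implementation.

```python
def has_left_recursion(grammar):
    """grammar maps nonterminal -> list of productions; each production is a
    list of symbols. Symbols that are keys of `grammar` are nonterminals,
    everything else is a terminal."""
    # Fixed-point pass: which nonterminals can derive the empty string?
    nullable = set()
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            if nt not in nullable and any(
                all(s in nullable for s in prod) for prod in prods
            ):
                nullable.add(nt)
                changed = True

    def visit(nt, stack):
        if nt in stack:  # reached a nonterminal already on this leftmost path
            return True
        stack = stack | {nt}
        for prod in grammar.get(nt, ()):
            for sym in prod:
                if sym not in grammar:   # a terminal blocks left recursion
                    break
                if visit(sym, stack):
                    return True
                if sym not in nullable:  # cannot skip a non-nullable symbol
                    break
        return False

    return any(visit(nt, frozenset()) for nt in grammar)
```

Direct (`A -> A x`), indirect (`A -> B x`, `B -> A`), and nullable-mediated (`A -> B A` with `B -> ""`) left recursion are all caught; right recursion (`A -> x A`) is not flagged.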
b2871
llama : less KV padding when FA is off (#7257) ggml-ci
b2870
llava-cli: fix base64 prompt (#7248)
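For context on what a base64 prompt carries, here is a minimal sketch of inlining an image into a prompt as a base64 data URI. The `<img src="...">` tag form is an assumption modeled on llava-cli's base64 prompt support; the helper name `embed_image` is hypothetical.

```python
import base64

def embed_image(image_bytes, question, mime="image/jpeg"):
    """Build a llava-style prompt with the image inlined as a base64 data URI.
    NOTE: the exact <img src="..."> syntax is an assumption; consult the
    llava-cli docs for the real prompt format."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f'<img src="data:{mime};base64,{b64}">{question}'
```

The payload round-trips: decoding the `base64,` segment of the resulting prompt recovers the original image bytes.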