LLM inference server implementation based on llama.cpp.
Updated Jul 25, 2024 - C++
Workbench for learning and practising AI tech in real scenarios on Android devices, powered by GGML (Georgi Gerganov Machine Learning), NCNN (Tencent NCNN), and FFmpeg.