llamacpp
Here are 24 public repositories matching this topic...
LLM InferenceNet is a C++ project designed to facilitate fast and efficient inference from Large Language Models (LLMs) using a client-server architecture. It enables optimized interactions with pre-trained language models, making deployment on edge devices easier.
Updated Jul 28, 2023 - C++
Getting an LLM to work with Godot.
Updated Oct 11, 2023 - C++
A Llama causal LM fully recreated in LibTorch, designed for use in Unreal Engine 5.
Updated Jan 5, 2024 - C++
This project accelerates local deployment of ChatGLM and vector inference using PyTorch compiled to C++, and includes an OpenAI API mock script for quickly setting up a local speed-testing service. This setup improves performance and efficiency, making it well suited to high-performance applications and development testing.
Updated Jan 20, 2024 - C++
Lightweight terminal chat interface for the llama.cpp server, compilable for Windows and Linux.
Updated Mar 1, 2024 - C++
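The llama.cpp server that such chat interfaces talk to exposes an OpenAI-compatible /v1/chat/completions endpoint. A minimal client sketch, assuming a server running locally on port 8080 (the default) — the helper names and the temperature value here are illustrative, not part of any of the listed projects:

```python
import json
import urllib.request

def build_payload(prompt, temperature=0.7):
    """Build an OpenAI-style chat-completion request body.

    llama.cpp serves whichever model it was started with, so the
    "model" field is largely ignored by the server.
    """
    return {
        "model": "default",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def chat(prompt, base_url="http://localhost:8080"):
    """Send one chat turn to a llama.cpp server and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

A terminal interface like the one above is essentially this request in a loop, appending each user and assistant turn to the "messages" list to preserve conversation history.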
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
Updated Mar 15, 2024 - C++
Multi-model, multi-tasking Llama Discord bot - Mirror of: https://gitlab.com/niansa/discord_llama
Updated Mar 27, 2024 - C++
Inference Vision Transformer (ViT) in plain C/C++ with ggml
Updated Apr 11, 2024 - C++
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation
Updated Apr 15, 2024 - C++
Local LLM Inference
Updated Jun 4, 2024 - C++
LLM in Godot
Updated Jun 6, 2024 - C++