llamacpp
Here are 24 public repositories matching this topic...
Multi-model, multi-tasking LLaMA Discord bot - Mirror of: https://gitlab.com/niansa/discord_llama
Updated Mar 27, 2024 - C++
PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation
Updated Apr 15, 2024 - C++
This project accelerates local deployment of ChatGLM and vector inference using PyTorch compiled in C++, and includes an OpenAI API mock script for quickly setting up a local speed-testing service. This setup improves performance and efficiency, making it well suited to high-performance applications and development testing.
Updated Jan 20, 2024 - C++
LLM InferenceNet is a C++ project designed to facilitate fast and efficient inference from Large Language Models (LLMs) using a client-server architecture. It enables optimized interactions with pre-trained language models, making deployment on edge devices easier.
Updated Jul 28, 2023 - C++
Llama causal LM fully recreated in LibTorch, designed to be used in Unreal Engine 5.
Updated Jan 5, 2024 - C++
Lightweight terminal chat interface for the llama.cpp server, compilable for Windows and Linux.
Updated Mar 1, 2024 - C++
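Several of the projects above act as clients for llama.cpp's built-in HTTP server. As a minimal sketch of that interaction, a client POSTs a JSON body to the server's /completion endpoint; the field names below (prompt, n_predict) follow the llama.cpp server API, but the base URL assumes the server's default port of 8080, and paths and defaults can vary between llama.cpp versions.

```python
import json
import urllib.request

def build_completion_request(prompt, n_predict=64, base_url="http://localhost:8080"):
    """Build an HTTP request for llama.cpp's /completion endpoint.

    `prompt` is the text to continue; `n_predict` caps the number of
    generated tokens. `base_url` assumes the server's default port.
    """
    payload = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/completion",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    # Requires a llama.cpp server already running locally, e.g.:
    #   ./llama-server -m model.gguf
    req = build_completion_request("Building a website can be done in 10 steps:")
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["content"])  # generated continuation
```

The same request shape works from a terminal UI, a WebAssembly binding, or a game-engine plugin, since the server speaks plain HTTP and JSON.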
Local LLM Inference
Updated Jun 4, 2024 - C++
Getting an LLM to work with Godot.
Updated Oct 11, 2023 - C++
LLM in Godot
Updated Jun 6, 2024 - C++
Inference Vision Transformer (ViT) in plain C/C++ with ggml
Updated Apr 11, 2024 - C++
WebAssembly binding for llama.cpp - Enabling in-browser LLM inference
Updated Jun 6, 2024 - C++