llm-inference
Here are 15 public repositories matching this topic...
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs (C++, updated Apr 29, 2024)
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference (C++, updated May 14, 2024)
LLMs as Copilots for Theorem Proving in Lean (C++, updated May 8, 2024)
Run LLMs on weak devices, or make powerful devices even more powerful, by distributing the workload and splitting RAM usage across machines (C++, updated May 13, 2024)
A high-performance inference system for large language models, designed for production environments (C++, updated May 14, 2024)
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs) (C++, updated Mar 15, 2024)
Local LLM inference library (C++, updated May 6, 2024)
DashInfer is a native inference engine for pre-trained large language models (LLMs), developed by Tongyi Laboratory (C++, updated May 13, 2024)
An easy-to-use library for LLaMA/GPT-J inference (mirror of https://gitlab.com/niansa/libjustlm) (C++, updated Mar 25, 2024)
Multi-model, multi-tasking LLaMA Discord bot (mirror of https://gitlab.com/niansa/discord_llama) (C++, updated Mar 27, 2024)
Leverages tensor-parallelism techniques to run large language models in the CPU memory of edge devices (C++, updated Apr 27, 2024)