Super easy to use library for doing LLaMA/GPT-J stuff! - Mirror of: https://gitlab.com/niansa/libjustlm
Updated Mar 25, 2024 - C++
Multi-model, multi-tasking LLaMA Discord bot - Mirror of: https://gitlab.com/niansa/discord_llama
LLM in Godot
Local LLM Inference
Leverage tensor parallelism techniques to run large language models in the CPU memory of edge devices.
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including x86 and ARMv9.
Local LLM inference library
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs).
A high-performance inference system for large language models, designed for production environments.
Tensor parallelism is all you need. Run LLMs on weak devices or make powerful devices even more powerful by distributing the workload and dividing the RAM usage.
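The core idea behind this kind of distributed inference is that a layer's weight matrix can be sharded across devices, so each device stores and computes only its own slice. None of the listed projects' actual APIs are shown here; the following is a minimal, hypothetical sketch of output-row sharding for a single linear layer, using plain Python lists for clarity.

```python
# Hypothetical sketch: shard a linear layer's weight matrix across two
# "devices" so each holds only half the rows (half the RAM), then
# concatenate the partial outputs to recover the full result.

def matvec(W, x):
    # Plain matrix-vector product: one output element per weight row.
    return [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in W]

# Full weight matrix: 4 output features, 3 input features.
W = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9],
     [10, 11, 12]]
x = [1, 0, -1]

# Shard the rows (output features) across two devices.
W_dev0, W_dev1 = W[:2], W[2:]

# Each device computes its slice independently; in a real system these
# two calls would run in parallel on separate machines.
y_dev0 = matvec(W_dev0, x)
y_dev1 = matvec(W_dev1, x)

# Concatenating the partial outputs reproduces the single-device result.
assert y_dev0 + y_dev1 == matvec(W, x)
```

Because each shard's computation needs no communication until the outputs are gathered, memory use and compute both scale down roughly linearly with the number of devices, which is what lets weak devices jointly host a model none of them could fit alone.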
LLMs as Copilots for Theorem Proving in Lean
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference