llm-inference
Here are 15 public repositories matching this topic...
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs (C++, updated Apr 29, 2024)
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference (C++, updated May 14, 2024)
LLMs as Copilots for Theorem Proving in Lean (C++, updated May 8, 2024)
Run LLMs on weak devices, or make powerful devices even more powerful, by distributing the workload and splitting RAM usage across machines (C++, updated May 13, 2024)
A high-performance inference system for large language models, designed for production environments (C++, updated May 14, 2024)
Inferflow is an efficient and highly configurable inference engine for large language models (LLMs) (C++, updated Mar 15, 2024)
Local LLM inference library (C++, updated May 6, 2024)
DashInfer is a native inference engine for pre-trained large language models (LLMs), developed by Tongyi Laboratory (C++, updated May 13, 2024)
An easy-to-use library for LLaMA/GPT-J inference (mirror of https://gitlab.com/niansa/libjustlm) (C++, updated Mar 25, 2024)
Multi-model, multi-tasking LLaMA Discord bot (mirror of https://gitlab.com/niansa/discord_llama) (C++, updated Mar 27, 2024)
Leverages tensor-parallelism techniques to run large language models in the CPU memory of edge devices (C++, updated Apr 27, 2024)