InferenceNexus
Popular repositories

- text-generation-inference (Python, forked from huggingface/text-generation-inference): Large Language Model Text Generation Inference. A request sketch appears after this list.
- ipex-llm (Python, forked from intel/ipex-llm): Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc. A loading sketch appears after this list.
- litellm (Python, forked from BerriAI/litellm): Python SDK and proxy server (LLM gateway) to call 100+ LLM APIs in the OpenAI format (Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, SageMaker, HuggingFace, Replicate, Groq). A call sketch appears after this list.
- litgpt (Python, forked from Lightning-AI/litgpt): 20+ high-performance LLMs with recipes to pretrain, finetune, and deploy at scale.
- inference-benchmarker (Rust, forked from huggingface/inference-benchmarker): Inference server benchmarking tool.
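As a minimal sketch of how a client talks to a running text-generation-inference server, the snippet below posts to TGI's documented /generate endpoint. It assumes a server was already launched locally on port 8080 with a model loaded; the host, port, prompt, and sampling parameters are illustrative.

```python
# Minimal sketch: query a locally running text-generation-inference server.
# Assumes the server is already up and listening on localhost:8080 (an
# assumption); the payload shape follows TGI's /generate API.
import requests

resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "What is the capital of France?",
        "parameters": {"max_new_tokens": 32, "temperature": 0.7},
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["generated_text"])  # TGI responds with {"generated_text": "..."}
```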
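For ipex-llm, a rough sketch of its drop-in, transformers-style loading path: weights are quantized to INT4 at load time for low-memory local inference. The model id is illustrative, and the snippet assumes ipex-llm and transformers are installed; on an Intel GPU you would additionally move the model and inputs to the "xpu" device.

```python
# Minimal sketch of ipex-llm's transformers-compatible loading path.
# The model id is illustrative and assumed to be available locally
# or downloadable from the Hugging Face Hub.
from ipex_llm.transformers import AutoModelForCausalLM
from transformers import AutoTokenizer

model_id = "meta-llama/Llama-2-7b-chat-hf"
# load_in_4bit quantizes weights to INT4 for low-memory local inference
model = AutoModelForCausalLM.from_pretrained(model_id, load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer("Explain XPU offload in one sentence.", return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```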
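For litellm, a minimal sketch of the unified completion call that returns responses in the OpenAI format. It assumes an OPENAI_API_KEY is set in the environment; swapping the model string (e.g., to an Anthropic or Bedrock model) routes the same call to a different provider.

```python
# Minimal sketch of litellm's unified completion call.
# Assumes OPENAI_API_KEY is set in the environment; the model
# string is illustrative and selects the provider.
from litellm import completion

response = completion(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Say hello in one word."}],
)
print(response.choices[0].message.content)  # OpenAI-format response object
```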
Repositories

- triton-server (forked from triton-inference-server/server): The Triton Inference Server provides an optimized cloud and edge inferencing solution.
- litgpt (forked from Lightning-AI/litgpt): 20+ high-performance LLMs with recipes to pretrain, finetune, and deploy at scale. A generation sketch appears after this list.
- litellm (forked from BerriAI/litellm): Python SDK and proxy server (LLM gateway) to call 100+ LLM APIs in the OpenAI format (Bedrock, Azure, OpenAI, VertexAI, Cohere, Anthropic, SageMaker, HuggingFace, Replicate, Groq).
- FastChat (forked from lm-sys/FastChat): An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
- ncnn (forked from Tencent/ncnn): A high-performance neural network inference framework optimized for mobile platforms.
- onnxruntime (forked from microsoft/onnxruntime): ONNX Runtime, a cross-platform, high-performance ML inference and training accelerator. A session sketch appears after this list.
- optimum-intel (forked from huggingface/optimum-intel): 🤗 Optimum Intel: Accelerate inference with Intel optimization tools.
- ipex-llm (forked from intel/ipex-llm): Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Baichuan, Mixtral, Gemma, Phi, MiniCPM, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, GraphRAG, DeepSpeed, vLLM, FastChat, Axolotl, etc.
- llama-box (forked from gpustack/llama-box): LM inference server implementation based on llama.cpp.
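For litgpt, a short sketch using its high-level Python API, assuming litgpt is installed and can fetch the (illustrative) microsoft/phi-2 checkpoint on first use.

```python
# Minimal sketch of litgpt's high-level Python API. The checkpoint
# name is illustrative and is downloaded/converted on first use.
from litgpt import LLM

llm = LLM.load("microsoft/phi-2")
text = llm.generate("What do Llamas eat?", max_new_tokens=30)
print(text)
```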
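For onnxruntime, a minimal inference-session sketch. The model path "model.onnx", the input shape, and the CPU execution provider are assumptions standing in for a real exported model.

```python
# Minimal sketch of an ONNX Runtime inference session. "model.onnx"
# and the dummy input shape are placeholders for a real exported model.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
input_name = session.get_inputs()[0].name          # discover the graph's input name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {input_name: dummy})   # None = fetch all outputs
print(outputs[0].shape)
```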
People

This organization has no public members.