Useful information about LLMs and their surrounding ecosystem is collected here
- vLLM: a fast and easy-to-use library for LLM inference and serving (blog); see the offline-inference sketch after this list
- RouteLLM: a framework for serving and evaluating LLM routers (paper)
- DeepSpeed: a deep learning optimization library that makes distributed training easy, efficient, and effective; see the initialization sketch after this list
- TVM: a compiler stack for deep learning systems, designed to close the gap between productivity-focused deep learning frameworks and performance- and efficiency-focused hardware backends
- FireOptimizer: customizing latency and quality for your production inference workload
- GGML: a tensor library for machine learning
- Medusa: a simple framework that democratizes acceleration techniques for LLM generation with multiple decoding heads
- Optimal Brain Compression (OBC): a framework for accurate post-training quantization (PTQ) and pruning (paper); a toy PTQ illustration follows this list
- LangChain: a framework for developing applications powered by large language models (LLMs); see the chain sketch after this list
- GenAIComps: a service-based tool that includes microservice components such as LLM, embedding, and reranking
- GenAIInfra: part of the OPEA containerization and cloud-native suite; it enables quick and efficient deployment of GenAIExamples in the cloud
- GenAIEval: measures service performance metrics such as throughput, latency, and accuracy for GenAIExamples, making it easy to compare performance across hardware configurations
- Speculative decoding: an acceleration technique in which a small draft model proposes several tokens and the large target model verifies them in a single pass (toy sketch after this list)
- OpenAI API
- Using logprobs from the OpenAI API (example request after this list)
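
A minimal vLLM offline-inference sketch; the model id here is only an example, and vLLM also exposes an OpenAI-compatible HTTP server:

```python
# Offline batch inference with vLLM (model id is an example).
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")                 # any HF model id
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["What is speculative decoding?"], params)
for out in outputs:
    print(out.outputs[0].text)
```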
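
A sketch of wrapping a PyTorch model with DeepSpeed; the config values are illustrative rather than tuned recommendations, and running it requires a CUDA environment plus the `deepspeed` launcher:

```python
# Wrap a PyTorch model in a DeepSpeed engine (illustrative config).
import torch
import deepspeed

model = torch.nn.Linear(512, 512)       # stand-in for a real network
ds_config = {
    "train_batch_size": 8,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "fp16": {"enabled": True},
    "zero_optimization": {"stage": 1},   # partition optimizer states
}

# The engine handles ZeRO partitioning, fp16 loss scaling, and gradient
# accumulation inside engine.backward() / engine.step().
engine, optimizer, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)
```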
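
To make the PTQ part of the OBC entry concrete, here is plain round-to-nearest int8 quantization. This is only the baseline quantize/dequantize step that methods like OBC improve on, not the OBC algorithm itself:

```python
# Toy post-training quantization: per-tensor, symmetric, round-to-nearest
# int8. NOT the OBC algorithm, just the baseline it refines.
import numpy as np

def quantize_int8(w: np.ndarray):
    scale = np.abs(w).max() / 127.0              # widest weight maps to 127
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)
q, s = quantize_int8(w)
print("mean abs error:", np.abs(w - dequantize(q, s)).mean())
```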
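
A minimal LangChain sketch using the LCEL `|` composition syntax; the model name is an example, and an `OPENAI_API_KEY` is assumed in the environment:

```python
# Prompt -> model -> parser pipeline composed with LCEL's "|" operator.
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Summarize in one line: {text}")
chain = prompt | ChatOpenAI(model="gpt-4o-mini") | StrOutputParser()

print(chain.invoke({"text": "vLLM serves LLMs with PagedAttention."}))
```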
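
A toy sketch of greedy speculative decoding with stand-in "models". Real implementations verify all drafted positions in a single target forward pass and use rejection sampling so the output distribution matches the target model exactly; here sequential calls and exact-match acceptance stand in for both:

```python
# Greedy speculative decoding, reduced to its control flow.
import random

VOCAB = ["a", "b", "c", "d"]

def draft_next(ctx):                 # cheap, less accurate proposer
    return random.choice(VOCAB)

def target_next(ctx):                # expensive, authoritative model
    return VOCAB[hash(tuple(ctx)) % len(VOCAB)]

def speculative_step(ctx, k=4):
    proposal = []
    for _ in range(k):               # draft k tokens autoregressively
        proposal.append(draft_next(ctx + proposal))
    accepted = []
    for tok in proposal:             # target checks each drafted token
        if target_next(ctx + accepted) == tok:
            accepted.append(tok)     # draft agreed: keep it for free
        else:
            break                    # first disagreement ends the run
    # Always make progress: append the target's own next token.
    accepted.append(target_next(ctx + accepted))
    return accepted

ctx = []
for _ in range(5):
    ctx += speculative_step(ctx)
print("generated:", "".join(ctx))
```

The payoff is that every accepted draft token costs a draft-model call instead of a target-model call, while the output stays what the target model would have produced.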
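
An example request for token log-probabilities from the OpenAI Chat Completions API; the model name is an example and an `OPENAI_API_KEY` is assumed:

```python
# Ask for per-token logprobs plus the top alternatives at each position.
from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Is the sky blue? Yes or no."}],
    logprobs=True,        # return a logprob for each generated token
    top_logprobs=3,       # and the 3 most likely alternatives
    max_tokens=1,
)
for tok in resp.choices[0].logprobs.content:
    print(tok.token, tok.logprob)
    for alt in tok.top_logprobs:
        print("  alt:", alt.token, alt.logprob)
```

Logprobs are handy for turning a single-token answer into a calibrated confidence score, e.g. comparing the probabilities of "Yes" and "No".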
A raw list of papers, sorted by general topic, is collected as well; a brief description and analysis of the current state of the field, based on those papers, can also be found there.