rsharath/llm-garden

LLM Garden

This is not another LLM benchmark or leaderboard. It is a single place to find which models are available, the benchmarks that compare them, and the leaderboards that rank them, to help practitioners decide which models to use. If you find this useful, please consider leaving a star.

Introduction

This repository collects links and notes on models, inference providers, and benchmarks. I built it mostly to keep track of the models I use and the benchmarks I find useful for each of them. It also gives me a single page to refer to for all the models and leaderboards.

Provider Benchmarks

Inference Providers: This comparison is primarily based on cost, performance, latency, and other factors. It is a good place to start if you are looking for a provider to serve your model.

General LLM Leaderboards

Type Leaderboard & Benchmarks Notes
OpenLLM Leaderboard Hugging Face OpenLLM
HELM Leaderboard Leaderboard
Chat Models LMSYS Chatbot Arena Leaderboard

Text Embedding Models

Type Leaderboard & Benchmarks Notes
Text Embedding Models MTEB Leaderboard

Coding Models

Type Leaderboard & Benchmarks Notes
Eval Plus Leaderboard
HumanEval+ Python Benchmark Leaderboard
Code Security CyberSecEval for Code
Code Effectiveness BigCode AI Benchmark
Code Tasks Can AI Code Leaderboard

Image Models

Type Leaderboard & Benchmarks Notes
T2I Comp Bench Benchmark

Video Models

Type Leaderboard & Benchmarks Notes
Text to Video Models Leaderboard

Speech Models

Type Leaderboard & Benchmarks Notes
Automatic Speech Recognition (ASR) Models for Speech to Text Open ASR Leaderboard
Text to Speech Synthesis
Text to Speech TTS Arena

Language Translation Models

Type Leaderboard & Benchmarks Notes
