Embedding playground

Introducing embeds.ai: an embedding playground and battleground

Compare how embedding models perform on a real-world use case: retrieval-augmented generation (RAG) over Wikipedia articles.

A few weeks ago, we were looking for an embedding model to use for RAG. We eventually came across the MTEB leaderboard, but we struggled to understand the benchmark scores.

We wanted a tool to test various embedding models with example queries on real-world datasets. After unsuccessfully looking for such a “playground”, we decided to just build one ourselves!

We embedded HuggingFace’s Simple Wikipedia dataset using @OpenAI, @Cohere, and two open-source models served via @Baseten. We then stored the embeddings in @Supabase using pgvector. Finally, we built a web app using Next.js and deployed it on @Vercel.
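To make the pipeline above concrete, here is a minimal sketch of the comparison idea: embed a small corpus with several models, index the vectors, and look at the top results each model returns for the same query. Everything named here is an assumption for illustration and not taken from the repository: the two embedders are toy word-count functions standing in for the hosted models, and the in-memory list stands in for the pgvector table (pgvector exposes cosine distance through its `<=>` operator, which this sketch mirrors with a plain cosine similarity).

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Embedder = Callable[[str], Vector]

# Toy vocabulary shared by the stand-in embedders (hypothetical, for illustration only).
VOCAB = ["paris", "france", "capital", "python", "language",
         "programming", "river", "water"]


def bag_of_words_embedder(text: str) -> Vector:
    """Hypothetical stand-in for a hosted model: word counts over VOCAB."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in VOCAB]


def binary_embedder(text: str) -> Vector:
    """Second hypothetical stand-in: word presence (0 or 1) over VOCAB."""
    words = set(text.lower().split())
    return [1.0 if word in words else 0.0 for word in VOCAB]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_index(docs: List[str], embed: Embedder) -> List[Tuple[str, Vector]]:
    """Embed every document once; this list plays the role of the vector table."""
    return [(doc, embed(doc)) for doc in docs]


def search(index: List[Tuple[str, Vector]], embed: Embedder,
           query: str, k: int = 2) -> List[str]:
    """Return the k documents most similar to the query."""
    query_vec = embed(query)
    scored = [(cosine_similarity(query_vec, vec), doc) for doc, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]


def compare(docs: List[str], embedders: Dict[str, Embedder],
            query: str, k: int = 2) -> Dict[str, List[str]]:
    """Run the same query against one index per embedder, side by side."""
    return {name: search(build_index(docs, embed), embed, query, k)
            for name, embed in embedders.items()}


DOCS = [
    "paris is the capital of france",
    "python is a programming language",
    "a river carries water",
]
EMBEDDERS = {"bag_of_words": bag_of_words_embedder, "binary": binary_embedder}

if __name__ == "__main__":
    for name, hits in compare(DOCS, EMBEDDERS, "capital of france", k=1).items():
        print(name, "->", hits)
```

To try a real model, one would replace a stand-in with a function that calls that provider's embedding endpoint and returns a list of floats; the rest of the sketch does not depend on where the vectors come from.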

Now we’re hosting the playground for anyone to use for free, as well as open-sourcing our work so people can try evaluating other models, datasets, or indexes.

Learn more in our full blog post on Substack.

If you have suggestions or pain points from working with embedding models, vector DBs, or RAG, or if you would like to collaborate on any of the above or on unrelated projects, please reach out!

Built by: Shreyan Jain, David Song, and Elad Gil

