Minish

Solving big problems with small models

192 followers
https://minish.ai/
company/105203845
https://huggingface.co/minishlab

README.md

Hello, we're Minish!

We're a two-person (@pringled and @stephantul) open-source lab, with a focus on Natural Language Processing.

We believe that if you make models fast enough, you unlock new possibilities.

Using our software, you can:

Embed the entire English Wikipedia in 5 minutes
Classify tens of thousands of documents per second on a CPU
Approximately deduplicate extremely large datasets in minutes
Build the fastest RAG application in the world
Easily evaluate which ANN algorithm works best for your data

Our projects:

model2vec: tiny static embedding models with state-of-the-art performance.
potion: the best small models in the world. 100-500x faster than a sentence-transformer, and almost as good.
vicinity: consistent interfaces to many approximate nearest neighbor algorithms.
semhash: lightning-fast, super accurate semantic deduplication and filtering for your text datasets.
model2vec-rs: a Rust port of model2vec.

You can also find us on:

🤗 huggingface
👽 LinkedIn
💬 Discord

Pinned

model2vec Public

Fast State-of-the-Art Static Embeddings

Python 1.7k 92

semhash Public

Fast Semantic Text Deduplication & Filtering

Python 737 42

vicinity Public

Lightweight Nearest Neighbors with Flexible Backends

Python 290 8

tokenlearn Public

Pre-train Static Word Embeddings

Python 79 8

model2vec-rs Public

Official Rust Implementation of Model2Vec

Rust 118 5

Repositories

Showing 10 of 10 repositories

.github Public
Readme
0 0 0 0 Updated Jun 20, 2025

docs Public
MDX 0 0 0 0 Updated Jun 11, 2025

model2vec Public
Fast State-of-the-Art Static Embeddings
Python 1,740 MIT 92 4 0 Updated Jun 6, 2025

tokenlearn Public
Pre-train Static Word Embeddings
Python 79 MIT 8 4 2 Updated Jun 2, 2025

minishlab.github.io Public
SCSS 0 MIT 1 0 0 Updated Jun 1, 2025

vicinity Public
Lightweight Nearest Neighbors with Flexible Backends
Python 290 MIT 8 2 0 Updated May 31, 2025

model2vec-rs Public
Official Rust Implementation of Model2Vec
Rust 118 MIT 5 0 0 Updated May 30, 2025

semhash Public
Fast Semantic Text Deduplication & Filtering
Python 737 MIT 42 1 0 Updated May 27, 2025

watertemplate Public template
Template
Makefile 4 MIT 2 0 0 Updated Dec 9, 2024

evaluation Public
Code to evaluate performance for embeddings
Python 12 MIT 0 0 0 Updated Sep 25, 2024
