mtybadger/mmteb-toolkit

mmteb-toolkit

Code and datasets for MMTEB (Multilingual MTEB), used in our blog post "Multilingual Representations in Embeddings Models" for MIT 6.S898. The repo includes code to run the tests on a Transformers model.

To run MMTEB, use mmteb.py. The tasks are defined in tasks.py. All datasets are hosted on the Hugging Face Hub under the names linked below, except for SciFact, which is stored locally in the repo.

Huge thanks to the MTEB authors: Niklas Muennighoff, Nouamane Tazi, Loïc Magne and Nils Reimers.

Further thanks to Conviction for supplying API credits.

Dataset Link
Content Cell Content Cell
Content Cell Content Cell
