This repository will contain code for reproducing different embedding and retrieval models, such as a dense retriever (on MS MARCO), Splade (sparse retriever), UnifieR (hybrid retriever), and Udever (universal embedder for multiple natural and programming languages).
Language Models are Universal Embedders [Arxiv]
udever embedders are fine-tuned from bloom models via BitFit on MS MARCO Passage Ranking, SNLI, and MultiNLI data. udever is a universal embedding model across tasks and across natural and programming languages. (From a technical point of view, udever is essentially sgpt-bloom with some minor improvements.)
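For readers unfamiliar with BitFit, the idea is to update only the bias terms of a model and keep every other weight frozen. The snippet below is a minimal sketch of that selection step, not the training code used for these checkpoints. The helper name `apply_bitfit` and the toy parameter names are hypothetical, and the sketch assumes bias parameters can be recognized by a name ending in `.bias`. It is written against any iterable of `(name, parameter)` pairs whose parameters expose a `requires_grad` attribute, which is the shape that PyTorch's `model.named_parameters()` yields.

```python
from types import SimpleNamespace


def apply_bitfit(named_parameters):
    """Freeze every parameter except bias terms (the BitFit idea).

    `named_parameters` is any iterable of (name, parameter) pairs where
    each parameter has a `requires_grad` attribute. Returns the names of
    the parameters left trainable.
    """
    trainable = []
    for name, param in named_parameters:
        # Assumption: bias terms are identified by a trailing ".bias".
        is_bias = name.split(".")[-1] == "bias"
        param.requires_grad = is_bias
        if is_bias:
            trainable.append(name)
    return trainable


# Hypothetical stand-in for a model's parameters, so the sketch runs
# without downloading a checkpoint. The names are illustrative only.
toy_parameters = [
    ("h.0.self_attention.query_key_value.weight", SimpleNamespace(requires_grad=True)),
    ("h.0.self_attention.query_key_value.bias", SimpleNamespace(requires_grad=True)),
    ("h.0.mlp.dense_h_to_4h.weight", SimpleNamespace(requires_grad=True)),
    ("h.0.input_layernorm.bias", SimpleNamespace(requires_grad=True)),
]

trainable_names = apply_bitfit(toy_parameters)
print(trainable_names)
```

With a real PyTorch model, the same helper could be called as `apply_bitfit(model.named_parameters())` before building the optimizer, so that only the bias terms receive gradient updates.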
The code used to train these checkpoints can be downloaded at this Google Drive link.
On HuggingFace:
On ModelScope / 魔搭社区: