Hi, I was wondering whether it would make sense to support models which, in addition to dense vectors, also support sparse and ColBERT embeddings. For example, BGE-M3 works well under infinity for dense vector retrieval. However, it would require some changes to the inference process to additionally obtain sparse vectors, such as shown here: https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/BGE_M3/modeling.py#L352-L355
I wonder whether, for such a case, it's feasible to add extra config parameters in the CLI, or whether that would require too many changes to the core logic of the model during startup?
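For context, the linked BGE-M3 code derives sparse (lexical) weights from the same hidden states used for dense embeddings: a linear projection per token followed by ReLU, keeping the max weight per token id. A minimal sketch of that idea, assuming a hypothetical projection vector `w` standing in for the model's learned sparse head:

```python
import numpy as np

def sparse_token_weights(hidden_states, token_ids, w, b=0.0):
    """Sketch of BGE-M3-style sparse weights: ReLU(W h + b) per token,
    aggregated by max over repeated token ids.

    hidden_states: (seq_len, hidden_dim) array of token hidden states
    token_ids:     length-seq_len list of vocabulary ids
    w:             (hidden_dim,) projection vector (hypothetical stand-in
                   for the model's learned sparse_linear weights)
    """
    scores = np.maximum(hidden_states @ w + b, 0.0)  # ReLU of linear projection
    weights = {}
    for tid, s in zip(token_ids, scores):
        # keep the maximum weight for each unique token id
        weights[tid] = max(weights.get(tid, 0.0), float(s))
    return weights

# Toy usage with random states (real usage would take the encoder's output)
rng = np.random.default_rng(0)
h = rng.normal(size=(4, 8))
w = rng.normal(size=8)
lexical = sparse_token_weights(h, [101, 2009, 2009, 102], w)
```

The result is a small dict mapping token ids to non-negative weights, which is what would need to be returned alongside the dense vector.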
Matheus-Garbelini changed the title from "Question: Support for sparse or colbert embeddings?" to "Question: Support for sparse embeddings?" on Mar 16, 2024
If you end up getting it done - I would love to feature it here!
Also if you have further questions, let me know!
I personally think the results from the BGE-M3 paper are a bit too hasty - the performance is not good enough for a paradigm change; it's more of an experiment. Perhaps it's time for a BGE-M3-V2.
Hi @michaelfeil, sorry for the late reply. I actually ended up implementing a very basic, manual version of sparse embeddings for BGE-M3, but it was so slow and used so much GPU VRAM that I just switched to the simple BM25 in Elasticsearch for lexical search instead haha.
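For anyone comparing the two approaches: the scoring Elasticsearch applies by default is classic BM25, which can be sketched in a few lines (a toy illustration, not how Elasticsearch is implemented internally):

```python
import math
from collections import Counter

def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    """Classic BM25 score of `doc` for `query_terms`.

    doc:    list of tokens in the document being scored
    corpus: list of token lists (all documents), used for IDF and
            average document length
    k1, b:  standard BM25 parameters (same defaults as Lucene)
    """
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N
    tf = Counter(doc)
    score = 0.0
    for t in query_terms:
        df = sum(1 for d in corpus if t in d)  # document frequency of term
        idf = math.log(1 + (N - df + 0.5) / (df + 0.5))
        denom = tf[t] + k1 * (1 - b + b * len(doc) / avgdl)
        score += idf * tf[t] * (k1 + 1) / denom
    return score
```

Unlike learned sparse embeddings, this needs no GPU at all, which is presumably why it was the pragmatic choice here.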