
Question: Support for sparse embeddings? #146

Open
Matheus-Garbelini opened this issue Mar 16, 2024 · 4 comments
Labels
new model (Make a model compatible) · question (Further information is requested)

Comments

@Matheus-Garbelini

Matheus-Garbelini commented Mar 16, 2024

Hi, I was wondering whether it would make sense to support models that, in addition to dense vectors, also produce sparse and ColBERT embeddings. For example, BGE-M3 works well under infinity for dense-vector retrieval. However, it would require some changes to the inference process to additionally obtain sparse vectors, as shown here:
https://github.com/FlagOpen/FlagEmbedding/blob/master/FlagEmbedding/BGE_M3/modeling.py#L352-L355

I wonder whether, for such a case, it's feasible to add extra config parameters to the CLI, or whether that would require too many changes to the core logic of the model during startup.

@Matheus-Garbelini Matheus-Garbelini changed the title Question: Support for sparse or colbert embeddings? Question: Support for sparse embeddings? Mar 16, 2024
@michaelfeil
Owner

The most straightforward way to do this at the moment would be to:

  • fork BGE/m3
  • Add a `trust_remote_code=True` option and ship the code above as remote code
  • See if the postprocessing (e.g. normalization) influences the new embedding model.
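For reference, the sparse path in the linked BGE-M3 code boils down to a per-token linear projection followed by a ReLU. Here is a minimal sketch of that step (the `SparseHead` module name and dimensions are illustrative, not part of infinity's or FlagEmbedding's API):

```python
import torch


class SparseHead(torch.nn.Module):
    """Illustrative per-token sparse head in the style of BGE-M3's
    sparse_embedding step: a linear projection to a scalar, then ReLU,
    so each token receives a non-negative lexical weight."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.linear = torch.nn.Linear(hidden_size, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size)
        # returns token weights of shape (batch, seq_len), all >= 0
        return torch.relu(self.linear(hidden_states)).squeeze(-1)


# Toy usage with random hidden states (hidden_size=8 is arbitrary here).
head = SparseHead(hidden_size=8)
weights = head(torch.randn(2, 5, 8))
```

In the real model these per-token weights are then scattered into a vocabulary-sized sparse vector (keeping the max weight per token id), which is the part that would need hooks in the inference/postprocessing path.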

@michaelfeil
Owner

michaelfeil commented Mar 16, 2024

If you end up getting it done - I would love to feature it here!
Also if you have further questions, let me know!

I personally think the results from the BGE-M3 paper are a bit too hasty - the performance is not good enough for a paradigm change; it's more of an experiment. Perhaps it's time for a BGE-M3-V2.

@michaelfeil michaelfeil added the question (Further information is requested) and new model (Make a model compatible) labels Mar 16, 2024
@seetimee

same question now

@Matheus-Garbelini
Author

Hi @michaelfeil , sorry for the late reply. I actually ended up implementing a very basic, manual version of sparse embeddings for BGE-M3, but it is so slow and occupies so much GPU VRAM that I just switched to simple BM25 in Elasticsearch for lexical search instead haha.
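For anyone landing here: the BM25 ranking that Elasticsearch applies by default can be sketched in a few lines of plain Python. This is a toy scorer with the standard `k1`/`b` defaults, not Elasticsearch's actual implementation:

```python
import math


def bm25_score(query_terms, doc, corpus, k1=1.2, b=0.75):
    """Toy BM25: score one tokenized doc against query terms,
    using `corpus` (a list of tokenized docs) for IDF and avgdl."""
    n_docs = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n_docs
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)           # document frequency
        idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1)
        tf = doc.count(term)                               # term frequency in doc
        norm = tf + k1 * (1 - b + b * len(doc) / avgdl)    # length normalization
        score += idf * tf * (k1 + 1) / norm
    return score


# Toy corpus: only the third document matches the lexical query.
corpus = [["sparse", "vectors"], ["dense", "vectors"], ["bm25", "lexical", "search"]]
scores = [bm25_score(["lexical", "search"], d, corpus) for d in corpus]
```

The appeal over learned sparse weights is exactly what's described above: no GPU needed at all, since scoring is pure term statistics over an inverted index.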
