Description
I'm working on a question-answering chatbot over my personal document store, using LangChain's LlamaCppEmbeddings, the LlamaCpp LLM, and the Chroma vectorstore.
When I use LlamaCppEmbeddings as the embedding_function (with Vicuna v1 13B 4-bit quantized weights), the performance of similarity_search is extremely poor: often the best result is ranked last or second to last. This happens even when the text query is an exact substring of one of the input texts and no other input text contains that string.
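To make the expectation concrete, here is a minimal, stdlib-only sanity check of what a sane similarity ranking should do in the exact-substring case. The bag-of-words "embedding" and the sample documents are illustrative stand-ins, not LangChain or Chroma APIs: with any reasonable embedding, a query that appears verbatim in exactly one document should rank that document first under cosine similarity.

```python
import math
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding": token counts stand in for a real model.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)  # Counter returns 0 for missing tokens
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query, docs):
    # Sort documents by similarity to the query, best first.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)

docs = [
    "the quick brown fox jumps over the lazy dog",
    "langchain wires embeddings into a chroma vectorstore",
    "vicuna is a fine-tuned llama model",
]
# The query is an exact substring of docs[1] only, so a sane
# embedding should rank that document first, not last.
print(rank("chroma vectorstore", docs)[0])
```

With LlamaCppEmbeddings I see the opposite of this: the matching document lands at or near the bottom of the ranking.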
Switching to the default embedding function, SentenceTransformerEmbeddingFunction, with no other changes vastly improves my chatbot's behavior by returning a much saner ranking of input documents. This workaround is acceptable for me, but LlamaCppEmbeddings should be able to far outperform this much smaller transformer model, so I suspect a not-so-subtle bug is lurking.
The consistency of these poor results makes me wonder whether a sign or a less-than/greater-than comparison is flipped somewhere. I looked at the implementation of LlamaCppEmbeddings, and it is so simple that I don't see how the error could originate there rather than in llama-cpp-python (or llama.cpp).
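The "best result comes out last" symptom is exactly what a flipped comparison produces, since confusing a distance (lower is better) with a similarity (higher is better) reverses the sort order. A minimal illustration with hypothetical scores (not taken from any of the libraries involved):

```python
# Hypothetical similarity scores for one query; higher = more similar.
scores = {"doc_a": 0.91, "doc_b": 0.40, "doc_c": 0.12}

# Correct: rank by similarity, descending -> best match first.
by_similarity = sorted(scores, key=scores.get, reverse=True)

# Bug pattern: treating a similarity as if it were a distance
# (or vice versa) sorts ascending, so the best match lands last.
as_if_distance = sorted(scores, key=scores.get)

print(by_similarity[0])    # best match first
print(as_if_distance[-1])  # same document, now ranked last
```

A single sign flip or inverted comparator anywhere along the embedding-to-ranking path would produce consistently inverted results like the ones I'm seeing.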
I'd be happy to help troubleshoot this, and I have a self-contained, simple, and easily redistributable example that demonstrates the issue consistently.