Vectara #5069
the two notebooks in chains/index_examples are a bit out of place - we can't really have that for all integrations. if you want to keep them around, i would move them into the integrations directory and link to them from the integrations page
langchain/vectorstores/vectara.py
Outdated
from langchain.embeddings.base import Embeddings
from langchain.vectorstores.base import VectorStore, VectorStoreRetriever

"""Implementation of Vector Store using Vectara (https://vectara.com)"""
docstring is a bit out of place (if it stays here, it should be a comment)
the two notebooks in chains/index_examples are a bit out of place - we can't really have that for all integrations. if you want to keep them around, i would move them into the integrations directory and link to them from the integrations page
Moved the notebooks to integrations. Thanks!
docstring is a bit out of place (if it stays here, it should be a comment)
Yep. fixed!
from langchain.schema import BaseRetriever, Document
from langchain.vectorstores.vectara import Vectara


class VectaraRetriever(BaseRetriever):
we probably want to make Vectara.as_retriever() return this - so i would move it into that file (similar to how we do with redis)
so just move VectaraRetriever to vectara.py and get rid of vectara_retriever.py altogether?
yep!
Perfect. Done!
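The pattern discussed in this thread, where a vector store's as_retriever() returns a dedicated retriever class defined in the same module, can be sketched roughly as follows. This is a self-contained illustration only: ToyStore, ToyRetriever, and Doc are hypothetical stand-ins rather than the LangChain or Vectara classes, and the word-overlap ranking is a placeholder for a real similarity search.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Doc:
    """Hypothetical stand-in for a document object."""
    page_content: str


@dataclass
class ToyStore:
    """Hypothetical in-memory store, standing in for a real vector store."""
    texts: List[str] = field(default_factory=list)

    def similarity_search(self, query: str, k: int = 4) -> List[Doc]:
        # Placeholder ranking: count words shared with the query.
        # sorted() is stable, so ties keep insertion order.
        q = set(query.lower().split())
        ranked = sorted(self.texts, key=lambda t: -len(q & set(t.lower().split())))
        return [Doc(t) for t in ranked[:k]]

    def as_retriever(self, k: int = 4) -> "ToyRetriever":
        # The store hands out its own retriever type, defined in the same module.
        return ToyRetriever(store=self, k=k)


@dataclass
class ToyRetriever:
    """Hypothetical retriever that delegates every query to its store."""
    store: ToyStore
    k: int = 4

    def get_relevant_documents(self, query: str) -> List[Doc]:
        return self.store.similarity_search(query, k=self.k)
```

Keeping both classes in one module avoids a circular import between the store and its retriever, which is presumably part of why the reviewer suggested dropping the separate file.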
Moved vectara_retriever code inside vectara.py per suggestion
Moved IPYNB files to docs/integration (and changed name of one of them)
Updated formatting to be compatible with "black"
two quick comments
langchain/vectorstores/vectara.py
Outdated
"""Initialize with Vectara API."""
self._vectara_customer_id = vectara_customer_id
self._vectara_corpus_id = vectara_corpus_id
self._vectara_api_key = vectara_api_key
we usually make it possible to set credentials (like the api key and other IDs) via environment variables. if we wanted to support that here, we could do something like:
def __init__(self, vectara_api_key: Optional[str] = None, ...):
    self.vectara_api_key = vectara_api_key or get_from_env("vectara_api_key", "VECTARA_API_KEY")
    ...
Good suggestion - updated!
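For readers following along, the environment-variable fallback suggested above can be sketched without any LangChain dependency. This is only an illustration under stated assumptions: resolve_credential and VectaraSettings are hypothetical names, and only VECTARA_API_KEY appears in the thread; the VECTARA_CUSTOMER_ID and VECTARA_CORPUS_ID variable names are assumed here by analogy.

```python
import os
from typing import Optional


def resolve_credential(value: Optional[str], env_key: str) -> str:
    """Return the explicit value if given, otherwise read env_key from the environment."""
    if value:
        return value
    env_value = os.environ.get(env_key)
    if env_value:
        return env_value
    raise ValueError(
        f"Did not find {env_key}; pass the value explicitly or set the environment variable."
    )


class VectaraSettings:
    """Hypothetical credential holder, not the class from the PR."""

    def __init__(
        self,
        vectara_customer_id: Optional[str] = None,
        vectara_corpus_id: Optional[str] = None,
        vectara_api_key: Optional[str] = None,
    ) -> None:
        # Explicit arguments win; environment variables are the fallback.
        # The first two variable names are assumptions for this sketch.
        self.customer_id = resolve_credential(vectara_customer_id, "VECTARA_CUSTOMER_ID")
        self.corpus_id = resolve_credential(vectara_corpus_id, "VECTARA_CORPUS_ID")
        self.api_key = resolve_credential(vectara_api_key, "VECTARA_API_KEY")
```

Raising early when neither source provides a value tends to give a clearer error than failing later on the first API call.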
new_texts = ["foobar", "foobaz"]
docsearch.add_documents([Document(page_content=content) for content in new_texts])
output = docsearch.similarity_search("foobar", k=1)
assert output == [Document(page_content="foobar")] or output == [ |
is this not deterministic?
Good point. I updated the test to not only be deterministic but also more comprehensive and test metadatas.
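The updated test itself is not shown in this thread. As a rough sketch of the general idea, one way to make such a check deterministic is to index clearly distinct texts with metadata and break ranking ties explicitly, so a single exact assertion suffices. The search function, corpus texts, and metadata keys below are invented for illustration and do not come from the PR.

```python
from typing import Dict, List, Tuple

Entry = Tuple[str, Dict[str, str]]


def search(corpus: List[Entry], query: str, k: int = 1) -> List[Entry]:
    """Toy word-overlap search; ties are broken by text so the order is reproducible."""
    q = set(query.lower().split())
    ranked = sorted(
        corpus,
        key=lambda item: (-len(q & set(item[0].lower().split())), item[0]),
    )
    return ranked[:k]


# Distinct texts, each with its own metadata, so only one result can be correct.
corpus: List[Entry] = [
    ("large language model", {"abbr": "llm"}),
    ("information retrieval", {"abbr": "ir"}),
    ("question answering", {"abbr": "qa"}),
]

result = search(corpus, "language model", k=1)
# A single exact expectation replaces the earlier "A or B" style assertion.
assert result == [("large language model", {"abbr": "llm"})]
```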
Vectara Integration
This PR provides integration with Vectara. Implemented here are:
And two IPYNB notebooks to do more testing: