Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

supabase vectorstore - first cut #3100

Merged
merged 12 commits into from Apr 20, 2023

Conversation

danielchalef
Copy link
Contributor

@danielchalef danielchalef commented Apr 18, 2023

First cut of a supabase vectorstore loosely patterned on the langchainjs equivalent. Doesn't support async operations which is a limitation of the supabase python client.

@danielchalef
Copy link
Contributor Author

I'm overriding from_texts in an incompatible manner having added additional arguments. How do you suggest I modify to pass the linter?

cls: Type["SupabaseVectorStore"],
texts: List[str],
embedding: Embeddings,
metadatas: Optional[List[dict[Any, Any]]],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think adding new explicit kwargs shouldn't be issue for linter, but changing kwarg to arg might. so metadatas should have default val None


def __init__(
self,
client: Any,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can support type checking on optional dependencies with optional imports. langchain/vectostors/chroma.py is one example. basically do something like

if TYPE_CHECKING:
  import supabase

class SupabaseVectorStore:
  def __init__(self, client: supabase.client.Client, ...):

@danielchalef
Copy link
Contributor Author

@dev2049 Modified per your suggestions. mypy is still unhappy with the method signature overrides:

langchain/vectorstores/supabase.py:95: error: Signature of "from_documents" incompatible with supertype "VectorStore"  [override]
langchain/vectorstores/supabase.py:95: note:      Superclass:
langchain/vectorstores/supabase.py:95: note:          @classmethod
langchain/vectorstores/supabase.py:95: note:          def from_documents(cls, documents: List[Document], embedding: Embeddings, **kwargs: Any) -> SupabaseVectorStore

Note that the signature for from_documents hasn't changed other than 3 additional explicit args.

@dev2049
Copy link
Contributor

dev2049 commented Apr 18, 2023

Note that the signature for from_documents hasn't changed other than 3 additional explicit args.

guessing you have to include **kwargs at the end. but also note that base class VectorStore has default from_documents implementation that probably works for you use case

    @classmethod
    def from_documents(
        cls: Type[VST],
        documents: List[Document],
        embedding: Embeddings,
        **kwargs: Any,
    ) -> VST:
        """Return VectorStore initialized from documents and embeddings."""
        texts = [d.page_content for d in documents]
        metadatas = [d.metadata for d in documents]
        return cls.from_texts(texts, embedding, metadatas=metadatas, **kwargs)

@hwchase17
Copy link
Contributor

@danielchalef excited to see this! has been in typescript for a while, excited to getting python to feature parity ;) thanks for adding

@danielchalef
Copy link
Contributor Author

note that base class VectorStore has default from_documents implementation that probably works for you use case

Good catch. Thanks!

mypy doesn't like my previous use of a keyword-only argument marker. Have made the additional arguments Optional and checking for presence at runtime. Not ideal but can't think of an alternative.

Copy link
Contributor

@hwchase17 hwchase17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is great! one last small ask - is it possible to get an example notebook showing how to use this?

@danielchalef
Copy link
Contributor Author

@hwchase17 Done!

Copy link
Contributor

@hwchase17 hwchase17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect! thanks!

@hwchase17 hwchase17 merged commit 27cdf8d into langchain-ai:master Apr 20, 2023
9 checks passed
@liamcharmer
Copy link

I've followed this: https://python.langchain.com/en/latest/modules/indexes/vectorstores/examples/supabase.html

but when i do similarity_search it responds with postgrest.exceptions.APIError: {'code': '42P01', 'details': None, 'hint': None, 'message': 'relation "docstore" does not exist'} from Supabase. Is there something in the SQL script via the tutorial that isn't correct?

@danielchalef
Copy link
Contributor Author

danielchalef commented Apr 21, 2023

Create a table called "docstore" or modify the Postgres function to use the table name reflecting your schema.

I'll cut a PR later to correct the example function.

@liamcharmer
Copy link

Create a table called "docstore" or modify the Postgres function to use the table name reflecting your schema.

I'll cut a PR later to correct the example function.

Ohhh my bad! Thank you so much apologies for me being silly

dev2049 pushed a commit that referenced this pull request Apr 21, 2023
ref
#3100 (comment)

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
vowelparrot pushed a commit that referenced this pull request Apr 21, 2023
ref
#3100 (comment)

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
wertycn pushed a commit to wertycn/langchain-zh that referenced this pull request Apr 26, 2023
ref
langchain-ai/langchain#3100 (comment)

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
vowelparrot pushed a commit that referenced this pull request Apr 26, 2023
ref
#3100 (comment)

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
vowelparrot pushed a commit that referenced this pull request Apr 28, 2023
ref
#3100 (comment)

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
@rapcal
Copy link

rapcal commented Apr 30, 2023

Any way to query from_existing_index, like in Pinecone and in the JS implementation?

@danielchalef danielchalef deleted the supabase-vectorstore branch May 1, 2023 01:09
@danielchalef
Copy link
Contributor Author

Sounds like a great feature, though assumptions would need to be made around embedding model. I don't have bandwidth to work on this, but am sure the langchain team would appreciate the contribution.

samching pushed a commit to samching/langchain that referenced this pull request May 1, 2023
First cut of a supabase vectorstore loosely patterned on the langchainjs
equivalent. Doesn't support async operations which is a limitation of
the supabase python client.

---------

Co-authored-by: Daniel Chalef <daniel.chalef@private.org>
samching pushed a commit to samching/langchain that referenced this pull request May 1, 2023
yanghua pushed a commit to yanghua/langchain that referenced this pull request May 9, 2023
@kasem-sm
Copy link

Any way to query from_existing_index, like in Pinecone and in the JS implementation?

did you found any way?

@rapcal
Copy link

rapcal commented Jun 15, 2023

Any way to query from_existing_index, like in Pinecone and in the JS implementation?

did you found any way?

Nope. Went back to Pinecone.

@ShantanuNair
Copy link
Contributor

ShantanuNair commented Jun 27, 2023

@kasem-sm @rapcal I believe it's just this?

supabase: Client = create_client(supabase_url, supabase_key)
embeddings = OpenAIEmbeddings()
vector_store = SupabaseVectorStoreWithMetadataFiltering(table_name='documents', embedding=embeddings, client=supabase, query_name='match_documents')

That said the supabase integration is lacking functionality such as filtering, but I think this should work for just loading it as an index. Maybe I'll push a pr for the docs.

@ShantanuNair
Copy link
Contributor

@rapcal @kasem-sm If you have a few set fields you want to filter with, and not too much dynamic filtering, I got it working quite nicely by implementing my own Vector Store class. I explained a bit more here #5379 (comment), and if needed I can share some minimal reproducible code if someone wants it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants