Add SpacyEmbeddings class #6967

Merged
merged 13 commits into from Jul 3, 2023

@rjarun8 commented Jun 30, 2023

  • Description: Added a new SpacyEmbeddings class for generating embeddings using the spaCy library.
  • Issue: Sentencebert/Bert/Spacy/Doc2vec embedding support #6952
  • Dependencies: This change requires the spaCy library and the 'en_core_web_sm' spaCy model.
  • Tag maintainer: @dev2049
  • Twitter handle: N/A

This change includes a new SpacyEmbeddings class, but does not include a test or an example notebook.
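
For readers following along, here is a minimal sketch of what an embeddings class backed by spaCy might look like. It is not the code merged in this PR: the class name `SpacyEmbeddingsSketch` and the optional `nlp` constructor argument are hypothetical and exist only for illustration. The sketch assumes the usual spaCy pattern of loading a pipeline with `spacy.load` and reading the document vector from `Doc.vector`.

```python
from typing import Any, Callable, List, Optional


class SpacyEmbeddingsSketch:
    """Hypothetical stand-in for an embeddings class backed by spaCy."""

    def __init__(
        self,
        nlp: Optional[Callable[[str], Any]] = None,
        model_name: str = "en_core_web_sm",
    ) -> None:
        if nlp is None:
            # Imported lazily so the sketch can be read and tested
            # without spaCy or the model being installed.
            import spacy

            nlp = spacy.load(model_name)
        # Any callable that maps text to an object with a `.vector`
        # attribute works here (an assumption made for testability).
        self.nlp = nlp

    def embed_query(self, text: str) -> List[float]:
        """Return one embedding for a single piece of text."""
        return [float(x) for x in self.nlp(text).vector]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        """Return a list of embeddings, one for each document."""
        return [self.embed_query(text) for text in texts]
```

With spaCy and the model installed (`pip install spacy`, then `python -m spacy download en_core_web_sm`), `SpacyEmbeddingsSketch().embed_query("hello world")` would return the document vector as a list of floats.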

@dosubot (bot) added the 🤖:enhancement label (A large net-new component, integration, or chain. Use sparingly. The largest features) Jun 30, 2023
Returns:
A list of embeddings, one for each document.
"""
return self.embed_documents(texts)
Collaborator

we shouldn't use the synchronous methods here. can just raise a NotImplementedError if we don't want to implement async right now

Author

I need some more explanation to understand this comment. Could you please share a reference design pattern or interface template to follow when implementing embedding integrations? @baskaryan

Author

> we shouldn't use the synchronous methods here. can just raise a NotImplementedError if we don't want to implement async right now

@baskaryan I have made the changes and committed.

52b0cc4
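
The review suggestion in this thread can be sketched as follows. This is an illustration, not the committed change: the class name `SyncOnlyEmbeddings` and its toy synchronous methods are hypothetical, and the async method names `aembed_documents` and `aembed_query` are assumed. The point is that the async entry points raise `NotImplementedError` instead of silently delegating to the blocking methods.

```python
import asyncio
from typing import List


class SyncOnlyEmbeddings:
    """Hypothetical embeddings class that only supports synchronous calls."""

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        # Toy embedding (text length) so the sketch runs without a model.
        return [[float(len(text))] for text in texts]

    def embed_query(self, text: str) -> List[float]:
        return [float(len(text))]

    async def aembed_documents(self, texts: List[str]) -> List[List[float]]:
        # Do not fall back to self.embed_documents(texts): that would
        # block the event loop while pretending to be asynchronous.
        raise NotImplementedError("Asynchronous embedding is not implemented.")

    async def aembed_query(self, text: str) -> List[float]:
        raise NotImplementedError("Asynchronous embedding is not implemented.")


if __name__ == "__main__":
    emb = SyncOnlyEmbeddings()
    print(emb.embed_query("hello"))
    try:
        asyncio.run(emb.aembed_query("hello"))
    except NotImplementedError as exc:
        print(f"async path rejected: {exc}")
```

Raising makes the missing async support visible to callers, whereas wrapping the synchronous call would hide blocking work inside a coroutine.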

@baskaryan
Collaborator

nice! could we add an example notebook showing how to use this? probably would live in docs/extras/modules/data_connection/text_embedding/integrations

@baskaryan added the needs documentation (PR needs to be updated with documentation) and Ɑ: embeddings (Related to text embedding models module) labels Jun 30, 2023
@rjarun8
Author

rjarun8 commented Jun 30, 2023

> nice! could we add an example notebook showing how to use this? probably would live in docs/extras/modules/data_connection/text_embedding/integrations

Sure, I will get one created.

Contributor

@hwchase17 left a comment

this is great - thanks!

"metadata": {},
"outputs": [],
"source": [
"# Import the necessary classes\n",
Contributor

notebooks should have a title of # Spacy and then all other cells should NOT have # but should be more like ## Import the ....

Author

> notebooks should have a title of # Spacy and then all other cells should NOT have # but should be more like ## Import the ....

done. Notebook updated
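
One reading of the heading convention requested in this thread is sketched below: a single `# Spacy` title cell, with every later section using `##`. The helper names (`markdown_cell`, `code_cell`, `build_notebook`) are hypothetical, the section heading reuses the text from the notebook line quoted in the review, and the code cell is a placeholder rather than the notebook's real content. The dictionaries follow the nbformat 4 JSON layout.

```python
import json
from typing import Any, Dict, List


def markdown_cell(text: str) -> Dict[str, Any]:
    """Build a markdown cell in the nbformat 4 JSON layout."""
    return {"cell_type": "markdown", "metadata": {}, "source": [text]}


def code_cell(lines: List[str]) -> Dict[str, Any]:
    """Build an unexecuted code cell in the nbformat 4 JSON layout."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": lines,
    }


def build_notebook() -> Dict[str, Any]:
    cells = [
        # Exactly one top-level title for the whole notebook.
        markdown_cell("# Spacy"),
        # Every later section uses a second-level heading.
        markdown_cell("## Import the necessary classes"),
        # Placeholder body; the real notebook's code is not shown in this thread.
        code_cell(["# (imports for the example would go here)\n", "pass"]),
    ]
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


if __name__ == "__main__":
    print(json.dumps(build_notebook(), indent=1))
```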

langchain/embeddings/spacy_embeddings.py (resolved)
@baskaryan
Collaborator

thanks @rjarun8!

@baskaryan merged commit e2d61ab into langchain-ai:master Jul 3, 2023
14 checks passed
bdonkey added a commit to bdonkey/langchain that referenced this pull request Jul 3, 2023
* master: (212 commits)
  Add SpacyEmbeddings class (langchain-ai#6967)
  docs: commented out `editUrl` option (langchain-ai#6440)
  Remove duplicate mongodb integration doc (langchain-ai#7006)
  Update get_started.mdx (langchain-ai#7005)
  openapi chain nit (langchain-ai#7012)
  Fix sample in FAISS section (langchain-ai#7050)
  Fix typo in google_places_api.py (langchain-ai#7055)
  move base prompt to schema (langchain-ai#6995)
  added `Brave Search` document_loader (langchain-ai#6989)
  Add JSON Lines support to JSONLoader (langchain-ai#6913)
  Vectara upd2 (langchain-ai#6506)
  docstrings `document_loaders` 2 (langchain-ai#6890)
  docstrings `document_loaders` 1 (langchain-ai#6847)
  Added filter and delete all option to delete function in Pinecone integration, updated base VectorStore's delete function (langchain-ai#6876)
  bump 221 (langchain-ai#7047)
  Rm retriever kwargs (langchain-ai#7013)
  Polish reference docs (langchain-ai#7045)
  Support params on GoogleSearchApiWrapper (langchain-ai#6810) (langchain-ai#7014)
  Fix typo (langchain-ai#7023)
  Fix openai multi functions agent docs (langchain-ai#7028)
  ...
vowelparrot pushed a commit that referenced this pull request Jul 4, 2023
- Description: Added a new SpacyEmbeddings class for generating
embeddings using the Spacy library.
- Issue: Sentencebert/Bert/Spacy/Doc2vec embedding support #6952
- Dependencies: This change requires the Spacy library and the
'en_core_web_sm' Spacy model.
- Tag maintainer: @dev2049
- Twitter handle: N/A

This change includes a new SpacyEmbeddings class, but does not include a
test or an example notebook.

---------

Co-authored-by: Bagatur <baskaryan@gmail.com>
aerrober pushed a commit to aerrober/langchain-fork that referenced this pull request Jul 24, 2023
4 participants