feat: Add Apify integration #998
@@ -0,0 +1,93 @@
import { Document } from "../document.js";
Tools have a specific meaning in LangChain - this is in the wrong place. Can we bundle it into the document loader and run `callActor` when `load`ing documents?
Thanks for the review.
In the Python version, it's in the utilities. https://github.com/hwchase17/langchain/blob/master/langchain/utilities/apify.py
Not sure if there's an equivalent of it in the JS version?
Please note that an Actor can potentially run for quite a long time: hours, or even days for large sites. The scenario where we run the Actor, wait for it to finish, and then feed the data to the vector index is there to demonstrate how it works. For large-scale production use cases, running Actors will be separate from loading the vector index; often the loading will be invoked via a webhook once the Actor finishes. Hence, linking the `callActor` action with `load`ing documents might not be ideal.
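To make the decoupling described above concrete, here is a minimal sketch of a loader step that is triggered after an Actor run has already finished, rather than starting the run itself. Everything named here is an assumption for illustration: `Doc` stands in for LangChain's `Document`, `FetchDatasetItems` is a hypothetical wrapper around whatever client fetches dataset items, and the `RunFinishedEvent` shape (including the `"SUCCEEDED"` status string) is an assumed webhook payload, not the actual Apify or LangChain API.

```typescript
// Stand-in for LangChain's Document type so the sketch is self-contained.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

type DatasetItem = Record<string, unknown>;

// Hypothetical abstraction: "give me all items stored in this dataset".
// A real implementation would wrap an Apify API client here.
type FetchDatasetItems = (datasetId: string) => Promise<DatasetItem[]>;

// Hypothetical webhook payload: we assume it carries the run status and
// the ID of the dataset the finished run wrote to.
interface RunFinishedEvent {
  status: string;
  datasetId: string;
}

// Pure mapping step: turn raw dataset items into documents.
function toDocuments(
  items: DatasetItem[],
  mapItem: (item: DatasetItem) => Doc
): Doc[] {
  return items.map((item) => mapItem(item));
}

// Called by the webhook handler once the run has finished. It never starts
// or waits for an Actor run; it only loads the results of a completed one.
async function handleRunFinished(
  event: RunFinishedEvent,
  fetchItems: FetchDatasetItems,
  mapItem: (item: DatasetItem) => Doc
): Promise<Doc[]> {
  if (event.status !== "SUCCEEDED") {
    return []; // Nothing to index for failed or aborted runs.
  }
  const items = await fetchItems(event.datasetId);
  return toDocuments(items, mapItem);
}
```

The returned documents would then be passed to whatever builds the vector index. Because the function only needs a dataset ID, a run that takes hours or days never blocks the loading code.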
In that case I'd prefer to put this as a `callActor` method in the `document_loader` class. In a broad sense, you're still preparing documents to be loaded and interacted with, right?
I can make the change today and polish everything up if that's ok with you?
The other reason is so that we can give everything that requires Apify one entrypoint.
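A rough sketch of what that single entrypoint could look like: a loader class that can either be built directly from an existing dataset ID or via a static `callActor` method that runs the Actor first. The names `ActorClientLike`, `runActorAndWait`, `listItems`, and `ActorDatasetLoader` are hypothetical and chosen for illustration only; they are not the API that was eventually merged, and `Doc` again stands in for LangChain's `Document`.

```typescript
// Stand-in for LangChain's Document type so the sketch is self-contained.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

type DatasetItem = Record<string, unknown>;
type ItemMapper = (item: DatasetItem) => Doc;

// Hypothetical, minimal client surface. A real implementation would
// delegate these two operations to an Apify API client.
interface ActorClientLike {
  runActorAndWait(actorId: string, input: unknown): Promise<{ datasetId: string }>;
  listItems(datasetId: string): Promise<DatasetItem[]>;
}

class ActorDatasetLoader {
  private client: ActorClientLike;
  private datasetId: string;
  private mapItem: ItemMapper;

  // Build a loader for a dataset that already exists, for example one
  // produced by a run that finished earlier.
  constructor(client: ActorClientLike, datasetId: string, mapItem: ItemMapper) {
    this.client = client;
    this.datasetId = datasetId;
    this.mapItem = mapItem;
  }

  // Convenience entrypoint: run the Actor, wait for it to finish, and
  // return a loader pointed at the dataset the run produced.
  static async callActor(
    client: ActorClientLike,
    actorId: string,
    input: unknown,
    mapItem: ItemMapper
  ): Promise<ActorDatasetLoader> {
    const run = await client.runActorAndWait(actorId, input);
    return new ActorDatasetLoader(client, run.datasetId, mapItem);
  }

  // Fetch the dataset items and map each one to a document.
  async load(): Promise<Doc[]> {
    const items = await this.client.listItems(this.datasetId);
    return items.map((item) => this.mapItem(item));
  }
}
```

With this shape, the demo flow is `(await ActorDatasetLoader.callActor(...)).load()`, while a production setup with long-running Actors can skip `callActor` and construct the loader from a known dataset ID.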
Thank you @jacoblee93. If you prefer it that way, feel free to update it.
# Conflicts:
#	docs/docs/modules/agents/tools/integrations/index.mdx
#	langchain/package.json
#	yarn.lock
Extended and merged here: #1271 Thanks for your patience!
JS version of langchain-ai/langchain#2201
If you have any suggestions, feel free to comment on this PR.
Also, if there are any issues concerning the Apify integration in LangChain, feel free to contact me.