core: Move document loader interfaces to core #17723

Merged
eyurtsev merged 3 commits into langchain-ai:master from the core-doc-loaders branch on Mar 6, 2024

cbornet
Collaborator

@cbornet cbornet commented Feb 19, 2024

This is needed to be able to move document loaders to partner packages.

@dosubot dosubot bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label Feb 19, 2024
@dosubot dosubot bot added Ɑ: doc loader Related to document loader module (not documentation) 🤖:refactor A large refactor of a feature(s) or restructuring of many files labels Feb 19, 2024
@cbornet cbornet force-pushed the core-doc-loaders branch 3 times, most recently from 2e45ee1 to 3513ec1 Compare February 19, 2024 10:19
@cbornet
Collaborator Author

cbornet commented Feb 19, 2024

There's a circular dependency, since BaseLoader uses TextSplitter, which is in the langchain package. How should it be handled? Maybe move TextSplitter and RecursiveCharacterTextSplitter to core?
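For readers following along, one common way to break this kind of cycle is to defer the import until the method that needs it is called. The following is a minimal sketch of that general pattern, not the code from this PR: `SketchLoader`, `SplitterLike`, `InMemoryLoader`, `WhitespaceSplitter`, and the module name `hypothetical_text_splitters` are all made up for illustration.

```python
from __future__ import annotations

import importlib
from abc import ABC, abstractmethod
from typing import Iterable, Iterator, List, Optional, Protocol


class SplitterLike(Protocol):
    """Structural type: anything with split_text() works, no import needed."""

    def split_text(self, text: str) -> List[str]: ...


class SketchLoader(ABC):
    """Hypothetical stand-in for a base loader living in a 'core' package."""

    @abstractmethod
    def lazy_load(self) -> Iterator[str]:
        """Yield raw texts one at a time."""

    def load(self) -> List[str]:
        return list(self.lazy_load())

    def load_and_split(self, splitter: Optional[SplitterLike] = None) -> List[str]:
        if splitter is None:
            # Deferred import: the optional package is only needed here,
            # so the 'core' module never imports it at module load time.
            try:
                module = importlib.import_module("hypothetical_text_splitters")
            except ImportError as exc:
                raise ImportError(
                    "No splitter given and the optional splitter package "
                    "is not installed."
                ) from exc
            splitter = module.DefaultSplitter()  # assumed name, illustration only
        chunks: List[str] = []
        for text in self.lazy_load():
            chunks.extend(splitter.split_text(text))
        return chunks


class InMemoryLoader(SketchLoader):
    """Toy loader that serves texts from a list."""

    def __init__(self, texts: Iterable[str]) -> None:
        self._texts = list(texts)

    def lazy_load(self) -> Iterator[str]:
        yield from self._texts


class WhitespaceSplitter:
    """Toy splitter: one chunk per whitespace-separated token."""

    def split_text(self, text: str) -> List[str]:
        return text.split()
```

With this shape, callers who pass their own splitter never trigger the optional import, and callers who rely on the default get a clear error if the optional package is missing.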

@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. and removed size:XL This PR changes 500-999 lines, ignoring generated files. labels Feb 21, 2024
@cbornet cbornet force-pushed the core-doc-loaders branch 3 times, most recently from 80a930e to 4f3a297 Compare February 21, 2024 13:37
@cbornet
Collaborator Author

cbornet commented Feb 21, 2024

I opted to move TextSplitter and RecursiveCharacterTextSplitter to core.
Is that OK?

@cbornet cbornet force-pushed the core-doc-loaders branch 7 times, most recently from 7b6959d to deb6241 Compare February 24, 2024 11:37
@cbornet
Collaborator Author

cbornet commented Mar 1, 2024

#18346 moved the text splitters to their own package.
I'll update the PR accordingly.

@dosubot dosubot bot added size:XL This PR changes 500-999 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Mar 1, 2024
@cbornet
Collaborator Author

cbornet commented Mar 2, 2024

Rebased to use langchain-text-splitters

@hemidactylus
Contributor

LGTM, thank you for identifying the problem and coming up with an elegant solution!

@eyurtsev eyurtsev self-assigned this Mar 6, 2024
@eyurtsev
Collaborator

eyurtsev commented Mar 6, 2024

Fantastic, I've been wanting to do this for a while!

@eyurtsev
Collaborator

eyurtsev commented Mar 6, 2024

@baskaryan this looks good to me. We need your input on the namespaces. Do we match document_loader for minimal changes, or move things into documents to have fewer namespaces?

@eyurtsev
Collaborator

eyurtsev commented Mar 6, 2024

OK, document loaders is good, according to @baskaryan.

@eyurtsev
Collaborator

eyurtsev commented Mar 6, 2024

Commandeering to resolve conflicts.

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Mar 6, 2024
@eyurtsev eyurtsev merged commit ea14151 into langchain-ai:master Mar 6, 2024
95 checks passed
@cbornet cbornet deleted the core-doc-loaders branch March 6, 2024 21:44
bechbd pushed a commit to bechbd/langchain that referenced this pull request Mar 29, 2024
This is needed to be able to move document loaders to partner packages.

---------

Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
gkorland pushed a commit to FalkorDB/langchain that referenced this pull request Mar 30, 2024
Labels
Ɑ: doc loader Related to document loader module (not documentation) lgtm PR looks good. Use to confirm that a PR is ready for merging. 🤖:refactor A large refactor of a feature(s) or restructuring of many files size:XL This PR changes 500-999 lines, ignoring generated files.
3 participants