Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

changed TextSplitter class to align with langchain #668

Merged
merged 3 commits into from
May 18, 2023

Conversation

zacharypfiz
Copy link
Contributor

I have been playing around with langchain and langchainjs and noticed that the textsplitters were producing different results. My goal is to align the textsplitters in langchainjs to be more similar to the langchain textsplitters.

I changed one line of conditional logic that was different between the two libraries. In the mergeSplits function:
langchain: if (total + _len + (separator_len if len(current_doc) > 0 else 0) > self._chunk_size):
langchainjs: if (total + _len >= this.chunkSize)
new langchainjs: if (total + _len + (currentDoc.length > 0 ? separator.length : 0) > this.chunkSize)

@vercel
Copy link

vercel bot commented Apr 7, 2023

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated (UTC)
langchainjs-docs ✅ Ready (Inspect) Visit Preview May 18, 2023 4:08pm

@zacharypfiz zacharypfiz changed the title refined mergeSplits condition to align with langchain changed TextSplitter class to align with langchain Apr 10, 2023
@nfcampos
Copy link
Collaborator

Thanks!

@nfcampos nfcampos added the lgtm PRs that are ready to be merged as-is label Apr 13, 2023
@nfcampos nfcampos removed the lgtm PRs that are ready to be merged as-is label Apr 17, 2023
@nfcampos nfcampos added the lgtm PRs that are ready to be merged as-is label May 18, 2023
@nfcampos nfcampos merged commit c7fc06c into langchain-ai:main May 18, 2023
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lgtm PRs that are ready to be merged as-is
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants