RecursiveCharacterTextSplitter uses regex value instead of original separator when merging and keep_separator is false #23394
Labels
🤖:bug (related to a bug, vulnerability, unexpected error with an existing feature)
Ɑ: text splitters (related to text splitters package)
Checked other resources
Example Code
Error Message and Stack Trace (if applicable)
No response
Description
I am trying to use the `langchain` library to split a text using regex separators. I expect the output strings to contain the original separators, but when the `keep_separator` flag is set to `False`, the splitter uses the regex value instead of the original separator when merging.

Possible code pointer where the problem might be coming from: libs/text-splitters/langchain_text_splitters/character.py#L98
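To make the report concrete, here is a minimal standard-library sketch of the behavior described above. It does not call `langchain` and is not the library's code: `merge_with_pattern` and `merge_with_matches` are hypothetical helpers, and the sample text and the pattern `,\s*` are assumptions chosen for illustration. The first helper re-joins the pieces with the regex pattern string, which is what the output appears to contain; the second re-joins them with the text each match actually consumed, which is what I would expect. The second helper assumes the pattern has no capturing groups of its own.

```python
import re


def merge_with_pattern(text: str, pattern: str) -> str:
    """Mimic the reported behavior: re-join pieces with the pattern string."""
    pieces = [p for p in re.split(pattern, text) if p]
    return pattern.join(pieces)


def merge_with_matches(text: str, pattern: str) -> str:
    """Expected behavior: re-join pieces with the separators that matched.

    Wrapping the pattern in a capturing group makes re.split return the
    matched separators between the pieces, so joining restores the text.
    Assumes `pattern` contains no capturing groups of its own.
    """
    parts = re.split(f"({pattern})", text)
    return "".join(parts)


text = "alpha, beta, gamma"
pattern = r",\s*"

print(merge_with_pattern(text, pattern))  # alpha,\s*beta,\s*gamma
print(merge_with_matches(text, pattern))  # alpha, beta, gamma
```

In the first output the literal characters `,\s*` appear between the words, which matches the symptom in this issue: the regex source ends up in the merged chunk instead of the `, ` that was in the input.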
System Info
langchain==0.2.5
langchain-core==0.2.9
langchain-text-splitters==0.2.1
Platform: Apple M1 Pro
macOS: 14.5 (23F79)
python version: Python 3.12.3