Python: Fix inaccurate split flag in TextChunker to prevent redundant re-splitting #12623

mohiuddin-khan-shiam · 2025-06-29T12:49:17Z

Description

`semantic_kernel.text.text_chunker._split_str` always returned `input_was_split=False`, even after it had split the text. As a result, higher-level routines kept searching for separators and re-split text unnecessarily.
The function now sets `input_was_split=True` as soon as it performs the initial split, and it continues to propagate the flags returned by deeper recursive calls. This improves performance and preserves the intended chunk boundaries.
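
To illustrate the flag behaviour described above, here is a minimal, self-contained sketch. It is not the actual Semantic Kernel implementation: `split_str_sketch` is a hypothetical stand-in for `_split_str`, it measures length in characters rather than tokens, and it cuts at the separator nearest the midpoint. Only the handling of `input_was_split` is meant to mirror the fix.

```python
from typing import List, Tuple


def split_str_sketch(text: str, max_len: int, separators: str) -> Tuple[List[str], bool]:
    """Hypothetical stand-in for _split_str (assumption, not the library code).

    Returns (chunks, input_was_split). Length is counted in characters here,
    which is a simplification for the sake of the example.
    """
    input_was_split = False
    text = text.strip()
    if len(text) <= max_len:
        return [text], input_was_split

    # Find the separator closest to the midpoint. The last character is
    # excluded so that both halves are always non-empty.
    half = len(text) // 2
    cut = -1
    for i in range(len(text) - 1):
        if text[i] in separators and (cut == -1 or abs(i - half) < abs(cut - half)):
            cut = i

    if cut == -1:
        # No usable separator: nothing was split, so the flag stays False
        # and the caller may try the next separator set.
        return [text], input_was_split

    # The behaviour described in this PR: record the split as soon as it
    # happens, instead of always reporting False.
    input_was_split = True

    chunks: List[str] = []
    for part in (text[: cut + 1], text[cut + 1 :]):
        sub_chunks, sub_was_split = split_str_sketch(part, max_len, separators)
        chunks.extend(sub_chunks)
        # Keep propagating flags from deeper recursive calls.
        input_was_split = input_was_split or sub_was_split

    return chunks, input_was_split


if __name__ == "__main__":
    print(split_str_sketch("alpha beta gamma delta", 12, " "))
    print(split_str_sketch("abcdefghij", 4, " "))
```

With the flag reported correctly, a caller that tries several separator sets in order can stop at the first one that returns `True` instead of re-splitting already-chunked text with the remaining sets.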

…tting `semantic_kernel.text.text_chunker._split_str` always returned `input_was_split=False` even after splitting, causing higher-level routines to keep searching separators and unnecessarily re-split text. The function now sets `input_was_split=True` as soon as it performs the initial split and continues to propagate deeper recursive flags, improving performance and preserving intended chunk boundaries. Co-Authored-By: S. M. Mohiuddin Khan Shiam <147746955+mohiuddin-khan-shiam@users.noreply.github.com>

moonbox3 · 2025-07-10T01:47:57Z

Hi @mohiuddin-khan-shiam thanks for the contribution. Can you please have a look at the failing unit tests?

odiomarcelino and others added 2 commits June 29, 2025 18:46

mohiuddin-khan-shiam requested a review from a team as a code owner June 29, 2025 12:49

markwallace-microsoft added the python Pull requests for the Python Semantic Kernel label Jun 29, 2025

github-actions bot changed the title ~~Fix inaccurate split flag in TextChunker to prevent redundant re-splitting~~ Python: Fix inaccurate split flag in TextChunker to prevent redundant re-splitting Jun 29, 2025

Merge branch 'main' into main

0234c90
