fix: dataset code block splitting #6998

Merged
c121914yu merged 1 commit into labring:main from YYH211:fix/dataset-code-block-split on May 27, 2026

Collaborator

@YYH211 YYH211 commented May 27, 2026

What

  • Prevent fenced code blocks from swallowing large preceding text during dataset chunking.
  • Cap code block preservation by chunk size instead of model max context.
  • Add regression coverage for long text followed by markdown image code blocks.
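
The behavior described above can be illustrated with a minimal sketch. This is not the code in `packages/global/common/string/textSplitter.ts`; the names `toUnits`, `hardSplit`, and `splitKeepingCodeBlocks` are hypothetical, and chunk size is measured in characters for simplicity. The sketch assumes two rules: a fenced code block stays intact only if it fits within `chunkSize`, and prose before a code block is split on its own, so it can never be absorbed into one oversized chunk alongside the block.

```typescript
// Hypothetical sketch, not the FastGPT implementation.
const FENCE = '`' + '`' + '`';

type Unit = { text: string; isCode: boolean };

// Separate fenced code blocks from the surrounding prose.
function toUnits(text: string): Unit[] {
  const units: Unit[] = [];
  let pos = 0;
  while (pos < text.length) {
    const open = text.indexOf(FENCE, pos);
    const close = open === -1 ? -1 : text.indexOf(FENCE, open + FENCE.length);
    if (open === -1 || close === -1) {
      // No complete fenced block remains: the rest is prose.
      units.push({ text: text.slice(pos), isCode: false });
      break;
    }
    if (open > pos) units.push({ text: text.slice(pos, open), isCode: false });
    const end = close + FENCE.length;
    units.push({ text: text.slice(open, end), isCode: true });
    pos = end;
  }
  return units;
}

// Cut a string into pieces of at most `size` characters.
function hardSplit(text: string, size: number): string[] {
  const out: string[] = [];
  for (let i = 0; i < text.length; i += size) out.push(text.slice(i, i + size));
  return out;
}

function splitKeepingCodeBlocks(text: string, chunkSize: number): string[] {
  if (chunkSize <= 0) throw new Error('chunkSize must be positive');

  const pieces: string[] = [];
  for (const unit of toUnits(text)) {
    if (unit.isCode && unit.text.length <= chunkSize) {
      // The block fits in one chunk, so it is kept as an atomic piece.
      pieces.push(unit.text);
    } else {
      // Prose, or a code block larger than chunkSize, is split normally.
      pieces.push(...hardSplit(unit.text, chunkSize));
    }
  }

  // Greedily pack pieces without ever exceeding chunkSize.
  const chunks: string[] = [];
  let current = '';
  for (const piece of pieces) {
    if (current.length > 0 && current.length + piece.length > chunkSize) {
      chunks.push(current);
      current = '';
    }
    current += piece;
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

With long text followed by a short fenced block containing a markdown image, the prose is cut into `chunkSize`-bounded pieces and the code block lands whole in the final chunk, rather than the whole input becoming a single chunk.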

Tests

  • pnpm exec prettier --check packages/global/common/string/textSplitter.ts packages/global/test/common/string/textSplitter.test.ts
  • pnpm exec eslint --fix packages/global/common/string/textSplitter.ts packages/global/test/common/string/textSplitter.test.ts
  • pnpm --filter @fastgpt/global test -- common/string/textSplitter.test.ts
  • pnpm --filter @fastgpt/app build:workers

@github-actions

Coverage Report

| Status | Category | Percentage | Covered / Total |
| --- | --- | --- | --- |
| 🔵 | Lines | 13.96% | 1140 / 8161 |
| 🔵 | Statements | 13.95% | 1195 / 8564 |
| 🔵 | Functions | 12.47% | 245 / 1964 |
| 🔵 | Branches | 11.94% | 536 / 4489 |

File Coverage: No changed files found.
Generated in workflow #454 for commit 8b5b89f by the Vitest Coverage Report Action

@github-actions

Admin Preview Image Ready!

ghcr.io/labring/fastgpt-pr:admin_8b5b89f4f4bd05405f822248124a8b7092e870b5

🕒 Time: 2026-05-27 14:49:11 (UTC+8)

@github-actions

Build Successful - Preview fastgpt Image for this PR:

ghcr.io/labring/fastgpt-pr:fastgpt_8b5b89f4f4bd05405f822248124a8b7092e870b5

🕒 Time: 2026-05-27 14:49:50 (UTC+8)

@c121914yu c121914yu merged commit 9f07741 into labring:main May 27, 2026
10 checks passed
2 participants