Add Paragraph#substitute for placeholders split across runs (#147) #181

Merged: satoryu merged 2 commits into master from fix-147-cross-run-substitution on May 31, 2026

Conversation

@satoryu satoryu commented May 31, 2026

Summary

Fixes #147. Adds Paragraph#substitute(pattern, replacement), which can replace a placeholder even when Word has split it across multiple text runs.

The bug

Word frequently splits a placeholder such as {{first_name}} across several <w:r> runs (e.g. {{fi, rst_na, me}}). The existing TextRun#substitute works on a single run, so it can never match a placeholder that spans runs:

doc.paragraphs.each { |p| p.each_text_run { |tr| tr.substitute('{{first_name}}', 'World') } }
# => no change; the placeholder is split across runs

Replacing at the paragraph level via paragraph.text = ... does replace the text, but it collapses the whole paragraph into one run and loses all formatting (as the reporter noted).
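
The per-run limitation can be reproduced without the gem. The snippet below is a minimal sketch that uses plain strings in place of text runs; the fragment values are the ones listed for the split_placeholder.docx fixture in this PR, and String#sub stands in for TextRun#substitute as an assumption.

```ruby
# Run texts as described for the split_placeholder.docx fixture.
run_texts = ['Hello ', '{{fi', 'rst_na', 'me}}', '!']

# Per-run substitution: each fragment is searched in isolation.
per_run = run_texts.map { |text| text.sub('{{first_name}}', 'World') }
unchanged = (per_run == run_texts)
puts unchanged # prints true: no single fragment contains the placeholder

# The placeholder only exists once the fragments are joined.
found_when_joined = run_texts.join.include?('{{first_name}}')
puts found_when_joined # prints true
```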

The fix

Paragraph#substitute(pattern, replacement):

  1. Joins the run texts into the full paragraph string.
  2. Finds matches across run boundaries.
  3. Collapses each match into the first run it touches (keeping that run's formatting) and empties the other runs the match spanned. Runs outside the match are untouched, so unrelated formatting is preserved.
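
The three steps above can be sketched in plain Ruby. This is an illustration under stated assumptions, not the gem's implementation: Run is a hypothetical stand-in for a text run (text plus an opaque formatting label), substitute_across_runs is a hypothetical helper name, and the sketch replaces every match and keeps any text that follows the match in the last spanned run.

```ruby
# Hypothetical stand-in for a <w:r> run: its text plus an opaque formatting label.
Run = Struct.new(:text, :format)

# Illustrative only; not the gem's implementation.
def substitute_across_runs(runs, pattern, replacement)
  regex = pattern.is_a?(Regexp) ? pattern : Regexp.new(Regexp.escape(pattern))

  # Step 1: join the run texts and record each run's [start, stop) offsets.
  bounds = []
  full = +''
  runs.each do |run|
    bounds << [full.length, full.length + run.text.length]
    full << run.text
  end

  # Step 2: find matches in the joined string, ignoring run boundaries.
  matches = full.to_enum(:scan, regex).map { Regexp.last_match }

  # Step 3: rewrite from the last match to the first so earlier offsets stay valid.
  matches.reverse_each do |m|
    next if m[0].empty?

    m_start = m.begin(0)
    m_stop = m.end(0)
    new_text = m[0].sub(regex, replacement) # String#sub expands \1-style backreferences
    first = true

    runs.each_with_index do |run, i|
      r_start, r_stop = bounds[i]
      next unless r_start < m_stop && r_stop > m_start # run does not touch the match

      # The first spanned run receives the replacement; later spanned runs are emptied,
      # except for any text that follows the end of the match.
      head = first ? run.text[0, m_start - r_start] + new_text : ''
      tail = m_stop < r_stop ? run.text[(m_stop - r_start)..] : ''
      run.text = head + tail
      first = false
    end
  end
  runs
end

texts   = ['Hello ', '{{fi', 'rst_na', 'me}}', '!']
formats = [:plain, :bold, :bold, :bold, :plain]
runs = texts.zip(formats).map { |text, format| Run.new(text, format) }

substitute_across_runs(runs, '{{first_name}}', 'World')
p runs.map(&:text) # prints ["Hello ", "World", "", "", "!"]
```

Because only the text of the spanned runs is rewritten, the Run objects outside the match (and every run's formatting label) are left as they were.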

Supports both string and regex patterns, including capture-group backreferences in the replacement (uses String#sub semantics).
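
For reference, this is how Ruby's own String#sub treats the two pattern kinds. The snippet only exercises core Ruby on plain strings; how Paragraph#substitute applies these semantics internally is not shown here.

```ruby
# Regex pattern: \1 in the replacement refers to the first capture group.
with_regex = '{{first_name}}'.sub(/\{\{(\w+)\}\}/, '[\1]')
puts with_regex # prints [first_name]

# String pattern: matched literally, so the braces need no escaping.
with_string = 'Hello {{first_name}}!'.sub('{{first_name}}', 'World')
puts with_string # prints Hello World!
```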

doc.paragraphs.each { |p| p.substitute('{{first_name}}', 'World') }

Tests

  • New fixture split_placeholder.docx — paragraph "Hello {{first_name}}!" with the placeholder split across runs ["Hello ", "{{fi", "rst_na", "me}}", "!"].
  • Specs cover: the fixture really is split; string substitution preserving surrounding text; regex with a capture group; and a save/reopen round-trip.
  • Confirmed that the split-run case cannot be handled by the old per-run API and that it now passes with Paragraph#substitute. The full suite is green locally (156 examples, 0 failures).

Notes / limitations

  • The match region adopts the first spanned run's formatting (a placeholder is meant to be replaced wholesale, so this is the desired behavior).
  • The existing TextRun#substitute / #substitute_with_block are unchanged.

Credit to @manovasanth1227 for the clear report and the run-healing approach that informed this implementation.

Closes #147

🤖 Generated with Claude Code

satoryu and others added 2 commits June 1, 2026 00:17
Word often splits a placeholder like {{first_name}} across several <w:r>
runs. TextRun#substitute works per run, so it can never match such a
placeholder, and replacing at paragraph level via #text= loses formatting.

Adds Paragraph#substitute(pattern, replacement), which joins the runs,
finds matches across run boundaries, and collapses each match into the
first run it touches (emptying the other spanned runs). Runs outside the
match keep their text and formatting. Supports string and regex patterns,
including capture-group backreferences in the replacement.

Adds a fixture (split_placeholder.docx) whose paragraph "Hello
{{first_name}}!" has the placeholder split across runs, and specs covering
string/regex substitution and a save round-trip.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Add a README example for Paragraph#substitute (split-across-runs case) and
expand the method's doc comment with concrete usage instead of only a "See
#147" reference.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@satoryu satoryu merged commit 56bec8b into master May 31, 2026
6 checks passed
@satoryu satoryu deleted the fix-147-cross-run-substitution branch May 31, 2026 15:34

Linked issue: Text Replacement not working as Expected