Summary
DocxAdapter.set_cell() and PptxAdapter.set_cell() use cell.text = value to replace cell content. In both python-docx and python-pptx, this setter discards the cell's existing paragraphs and runs and writes the value into a single new run with no explicit formatting, so any formatting on the original cell (font family, size, bold, color, etc.) is lost.
HwpxAdapter is not affected: it already uses paragraphs[0].text = value, which preserves run structure.
Affected code
document_adapter/docx_adapter.py
    def set_cell(self, table_index: int, row: int, col: int, value: str) -> str:
        cell = self._doc.tables[table_index].rows[row].cells[col]
        old = cell.text
        cell.text = value  # ← drops all runs, resets formatting
        return old
document_adapter/pptx_adapter.py
    def set_cell(self, table_index: int, row: int, col: int, value: str) -> str:
        table = self._get_table(table_index)
        cell = table.cell(row, col)
        old = cell.text
        cell.text = value  # ← same issue via python-pptx
        return old
DocxAdapter.append_row() has the same issue — it calls new_row.cells[i].text = v on each cell, so newly added rows ignore any run-level template formatting.
Reproduction
    from docx import Document
    from docx.shared import Pt

    doc = Document()
    table = doc.add_table(rows=1, cols=1)
    cell = table.cell(0, 0)
    run = cell.paragraphs[0].add_run('original')
    run.font.name = 'Malgun Gothic'
    run.font.size = Pt(18)
    run.bold = True

    # simulating our adapter
    cell.text = 'replaced'

    new_run = cell.paragraphs[0].runs[0]
    print(new_run.font.name, new_run.font.size, new_run.bold)
    # → None None None (formatting gone)
Same behavior in python-pptx with shape.table.cell(r, c).text = value.
Impact
Any downstream application that cares about preserving the visual style of a template (fonts for Korean/CJK text, header bold, numeric alignment, brand colors, etc.) gets a degraded result after set_cell or append_row. This is especially visible in real-world .docx / .pptx office templates, where the original cell runs carry non-default fonts.
Currently in xgen-workflow we work around this with a monkey patch on the installed document_adapter package; the fix belongs upstream.
Proposed fix
Mirror HwpxAdapter.set_cell's run-preserving strategy.
DOCX
    def set_cell(self, table_index: int, row: int, col: int, value: str) -> str:
        cell = self._doc.tables[table_index].rows[row].cells[col]
        old = cell.text
        paragraphs = cell.paragraphs
        if not paragraphs or not paragraphs[0].runs:
            # fall back only when the first paragraph has no run to reuse
            cell.text = value
            return old
        first_para = paragraphs[0]
        # reuse the first run so its font, size, bold, color, etc. survive
        first_run = first_para.runs[0]
        first_run.text = value
        # blank the remaining runs instead of deleting them
        for run in first_para.runs[1:]:
            run.text = ""
        for para in paragraphs[1:]:
            for run in para.runs:
                run.text = ""
        return old
PPTX
Apply the same strategy to cell.text_frame.paragraphs: write the value into paragraphs[0].runs[0] and blank every other run.
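Since both libraries expose paragraphs with a .runs collection and runs with a writable .text, the strategy could be factored into one helper shared by both adapters. The sketch below is only an illustration: the name replace_text_preserving_runs is hypothetical and not part of document_adapter, and the stand-in objects exist only so the example runs without either library installed.

```python
from types import SimpleNamespace
from typing import Any, Sequence


def replace_text_preserving_runs(paragraphs: Sequence[Any], value: str) -> bool:
    """Write value into the first run and blank every other run.

    Returns False when there is no run to reuse, so the caller can fall
    back to the library's plain `cell.text = value` setter.
    """
    if not paragraphs:
        return False
    first_runs = list(paragraphs[0].runs)
    if not first_runs:
        return False
    first_runs[0].text = value  # formatting lives on the run, so it is kept
    for run in first_runs[1:]:
        run.text = ""
    for para in paragraphs[1:]:
        for run in para.runs:
            run.text = ""
    return True


# Stand-in objects (not real python-docx / python-pptx classes).
bold_run = SimpleNamespace(text="old", bold=True)
tail_run = SimpleNamespace(text=" tail", bold=None)
paragraphs = [SimpleNamespace(runs=[bold_run, tail_run])]

changed = replace_text_preserving_runs(paragraphs, "new")
print(changed, repr(bold_run.text), bold_run.bold, repr(tail_run.text))
# → True 'new' True ''
```

A DOCX adapter would pass cell.paragraphs and a PPTX adapter cell.text_frame.paragraphs, each falling back to cell.text = value when the helper returns False.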
append_row
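Replace the per-cell cell.text = v with the same run-preserving helper, so that newly added rows keep the template row's formatting. This only helps when the new cells already contain formatted runs, i.e. when append_row copies an existing row; as far as we can tell, python-docx's Table.add_row() creates empty cells without runs, in which case the helper simply falls back to cell.text = v.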
Acceptance criteria
- DocxAdapter.set_cell preserves font name/size/bold/italic/color of the first run when the cell had existing runs
- PptxAdapter.set_cell preserves the same run-level attributes
- DocxAdapter.append_row does not drop formatting on newly added rows
- tests/test_smoke.py asserts preserved font.size / font.name after set_cell
Related
HwpxAdapter.set_cell: already uses the correct pattern; keep it as the reference implementation.