Enable rich text formatting in generated DOCX documents #43

Merged
erseco merged 1 commit into main from feature/fix-opentbs-html-formatting-issue
Oct 14, 2025
@erseco erseco commented Oct 14, 2025

Summary

  • track rich text field values during document merge preparation so they can be post-processed
  • convert merged HTML fragments into WordprocessingML runs when rendering DOCX templates with OpenTBS
  • add unit coverage for the DOCX rich text converter
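
The second bullet can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the plugin's converter: the function name `sketch_text_run` and the constant `SKETCH_W_NS` are hypothetical. Only the WordprocessingML namespace URI and the `w:r`, `w:rPr`, `w:b`, and `w:t` element names come from the OOXML format itself.

```php
<?php
// Standard WordprocessingML namespace; the constant name is hypothetical.
const SKETCH_W_NS = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';

/**
 * Hypothetical helper: build one <w:r> run holding $text, optionally bold.
 */
function sketch_text_run( DOMDocument $doc, string $text, bool $bold = false ): DOMElement {
    $run = $doc->createElementNS( SKETCH_W_NS, 'w:r' );

    if ( $bold ) {
        // Run properties come first inside the run; <w:b/> turns bold on.
        $rpr = $doc->createElementNS( SKETCH_W_NS, 'w:rPr' );
        $rpr->appendChild( $doc->createElementNS( SKETCH_W_NS, 'w:b' ) );
        $run->appendChild( $rpr );
    }

    $t = $doc->createElementNS( SKETCH_W_NS, 'w:t' );
    // Ask Word to keep leading and trailing spaces in the text.
    $t->setAttributeNS( 'http://www.w3.org/XML/1998/namespace', 'xml:space', 'preserve' );
    $t->appendChild( $doc->createTextNode( $text ) );
    $run->appendChild( $t );

    return $run;
}

// Roughly what <strong>Hello</strong> could become.
$doc = new DOMDocument();
$run = sketch_text_run( $doc, 'Hello', true );
echo $doc->saveXML( $run ), "\n";
```

A fragment such as `<strong>Hello</strong> world` would then map to two sibling runs, one bold and one plain, inserted in place of the run that held the merged HTML text.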

Testing

  • ./vendor/bin/phpcs --standard=.phpcs.xml.dist includes/class-resolate-document-generator.php
  • ./vendor/bin/phpcs --standard=.phpcs.xml.dist includes/class-resolate-opentbs.php
  • composer test (fails: WordPress test bootstrap not available in this environment)

https://chatgpt.com/codex/tasks/task_e_68ee7fdd11fc83228913324cea0d1b59

@erseco erseco merged commit 0062425 into main Oct 14, 2025
2 of 3 checks passed
Comment on lines +160 to +208
public static function convert_docx_part_rich_text( $xml, $lookup ) {
    $rich_lookup = self::prepare_rich_lookup( $lookup );
    if ( empty( $rich_lookup ) ) {
        return $xml;
    }
    $dom = new DOMDocument();
    $dom->preserveWhiteSpace = false; // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
    $dom->formatOutput = false; // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
    libxml_use_internal_errors( true );
    $loaded = $dom->loadXML( $xml );
    libxml_clear_errors();
    if ( ! $loaded ) {
        return $xml;
    }
    $xpath = new DOMXPath( $dom );
    $xpath->registerNamespace( 'w', self::WORD_NAMESPACE );
    $nodes = $xpath->query( '//w:t' );
    $modified = false;
    if ( $nodes instanceof DOMNodeList ) {
        foreach ( $nodes as $node ) {
            if ( ! $node instanceof DOMElement ) {
                continue;
            }
            $value = html_entity_decode( $node->textContent, ENT_QUOTES | ENT_XML1, 'UTF-8' ); // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
            if ( '' === $value || ! isset( $rich_lookup[ $value ] ) ) {
                continue;
            }
            $run = $node->parentNode; // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
            if ( ! $run instanceof DOMElement ) {
                continue;
            }
            $base_rpr = self::clone_run_properties( $run );
            $runs = self::build_docx_runs_from_html( $dom, $value, $base_rpr );
            if ( empty( $runs ) ) {
                continue;
            }
            $parent = $run->parentNode; // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
            if ( ! $parent ) {
                continue;
            }
            foreach ( $runs as $new_run ) {
                $parent->insertBefore( $new_run, $run );
            }
            $parent->removeChild( $run );
            $modified = true;
        }
    }
    return $modified ? $dom->saveXML() : $xml;
}

Check warning

Code scanning / PHPMD

Code Size Rules: CyclomaticComplexity Warning

The method convert_docx_part_rich_text() has a Cyclomatic Complexity of 13. The configured cyclomatic complexity threshold is 10.

Check warning

Code scanning / PHPMD

Code Size Rules: NPathComplexity Warning

The method convert_docx_part_rich_text() has an NPath complexity of 784. The configured NPath complexity threshold is 200.
Comment on lines +277 to +344
private static function append_html_nodes_to_runs( DOMDocument $doc, array &$runs, $nodes, $base_rpr, array $formatting ) {
    if ( ! $nodes instanceof DOMNodeList ) {
        return;
    }
    foreach ( $nodes as $node ) {
        if ( XML_TEXT_NODE === $node->nodeType ) { // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
            $text = str_replace( array( "\r\n", "\r" ), "\n", $node->nodeValue ); // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
            $parts = explode( "\n", $text );
            foreach ( $parts as $index => $part ) {
                $part = (string) $part;
                if ( '' !== $part ) {
                    $run = self::create_text_run( $doc, $part, $base_rpr, $formatting );
                    if ( $run ) {
                        $runs[] = $run;
                    }
                }
                if ( $index < count( $parts ) - 1 ) {
                    $runs[] = self::create_break_run( $doc, $base_rpr );
                }
            }
            continue;
        }
        if ( ! $node instanceof DOMElement ) {
            continue;
        }
        $tag = strtolower( $node->nodeName ); // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
        switch ( $tag ) {
            case 'strong':
            case 'b':
                self::append_html_nodes_to_runs( $doc, $runs, $node->childNodes, $base_rpr, self::with_format_flag( $formatting, 'bold', true ) ); // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
                break;
            case 'em':
            case 'i':
                self::append_html_nodes_to_runs( $doc, $runs, $node->childNodes, $base_rpr, self::with_format_flag( $formatting, 'italic', true ) ); // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
                break;
            case 'u':
                self::append_html_nodes_to_runs( $doc, $runs, $node->childNodes, $base_rpr, self::with_format_flag( $formatting, 'underline', true ) ); // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
                break;
            case 'br':
                $runs[] = self::create_break_run( $doc, $base_rpr );
                break;
            case 'p':
            case 'div':
            case 'section':
            case 'article':
            case 'blockquote':
            case 'address':
            case 'span':
                self::append_html_nodes_to_runs( $doc, $runs, $node->childNodes, $base_rpr, self::extract_span_formatting( $formatting, $node ) ); // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
                if ( 'span' !== $tag ) {
                    $runs[] = self::create_break_run( $doc, $base_rpr );
                }
                break;
            case 'ul':
            case 'ol':
                self::append_list_runs( $doc, $runs, $node, $base_rpr, $formatting, 'ol' === $tag );
                break;
            case 'li':
                self::append_html_nodes_to_runs( $doc, $runs, $node->childNodes, $base_rpr, $formatting ); // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
                break;
            case 'a':
                self::append_html_nodes_to_runs( $doc, $runs, $node->childNodes, $base_rpr, $formatting ); // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
                break;
            default:
                self::append_html_nodes_to_runs( $doc, $runs, $node->childNodes, $base_rpr, $formatting ); // phpcs:ignore WordPress.NamingConventions.ValidVariableName.UsedPropertyNotSnakeCase
        }
    }
}

Check warning

Code scanning / PHPMD

Code Size Rules: CyclomaticComplexity Warning

The method append_html_nodes_to_runs() has a Cyclomatic Complexity of 38. The configured cyclomatic complexity threshold is 10.

Check warning

Code scanning / PHPMD

Code Size Rules: NPathComplexity Warning

The method append_html_nodes_to_runs() has an NPath complexity of 1346. The configured NPath complexity threshold is 200.
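
One possible way to bring these numbers down, offered only as a sketch and not as the change made in this PR, is to replace the inline-formatting cases of the `switch` with a lookup table. The constant `SKETCH_INLINE_FLAGS` and the function `sketch_formatting_for_tag` are hypothetical names. The tag-to-flag pairs mirror the `strong`/`b`, `em`/`i`, and `u` cases quoted above.

```php
<?php
// Hypothetical lookup table: HTML tag name => formatting flag it switches on.
const SKETCH_INLINE_FLAGS = array(
    'strong' => 'bold',
    'b'      => 'bold',
    'em'     => 'italic',
    'i'      => 'italic',
    'u'      => 'underline',
);

/**
 * Hypothetical helper: return the formatting that applies to the children
 * of $tag. Tags missing from the table inherit $formatting unchanged.
 */
function sketch_formatting_for_tag( array $formatting, string $tag ): array {
    $tag = strtolower( $tag );
    if ( array_key_exists( $tag, SKETCH_INLINE_FLAGS ) ) {
        $formatting[ SKETCH_INLINE_FLAGS[ $tag ] ] = true;
    }
    return $formatting;
}

// <strong><em>text</em></strong> accumulates both flags.
$nested = sketch_formatting_for_tag( sketch_formatting_for_tag( array(), 'STRONG' ), 'em' );
var_dump( $nested );
```

With a table like this, five `case` labels collapse into one branch, which lowers the count of decision points that PHPMD measures. Block-level tags, lists, and `br` would still need their own handling.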
Comment on lines +380 to +409
private static function extract_span_formatting( array $formatting, DOMElement $node ) {
    $style = $node->getAttribute( 'style' );
    if ( $style ) {
        $styles = array_map( 'trim', explode( ';', strtolower( $style ) ) );
        foreach ( $styles as $rule ) {
            if ( '' === $rule ) {
                continue;
            }
            list( $prop, $val ) = array_map( 'trim', explode( ':', $rule ) + array( '', '' ) );
            switch ( $prop ) {
                case 'font-weight':
                    if ( 'bold' === $val || '700' === $val ) {
                        $formatting['bold'] = true;
                    }
                    break;
                case 'font-style':
                    if ( 'italic' === $val ) {
                        $formatting['italic'] = true;
                    }
                    break;
                case 'text-decoration':
                    if ( false !== strpos( $val, 'underline' ) ) {
                        $formatting['underline'] = true;
                    }
                    break;
            }
        }
    }
    return $formatting;
}

Check warning

Code scanning / PHPMD

Code Size Rules: CyclomaticComplexity Warning

The method extract_span_formatting() has a Cyclomatic Complexity of 11. The configured cyclomatic complexity threshold is 10.
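
A hedged sketch of one way to split this method: parse the `style` attribute into a property map first, then derive the flags from the map. `sketch_parse_inline_style` is a hypothetical name, and unlike the quoted method this version produces explicit `false` values rather than leaving inherited flags untouched, so it is not a drop-in replacement.

```php
<?php
/**
 * Hypothetical helper: parse an inline style attribute into a
 * property => value map, with both sides lower-cased and trimmed.
 */
function sketch_parse_inline_style( string $style ): array {
    $map = array();
    foreach ( explode( ';', strtolower( $style ) ) as $rule ) {
        // A limit of 2 keeps values that themselves contain ':' intact.
        $pair = array_map( 'trim', explode( ':', $rule, 2 ) );
        if ( 2 !== count( $pair ) || '' === $pair[0] ) {
            continue; // Skip empty or malformed declarations.
        }
        $map[ $pair[0] ] = $pair[1];
    }
    return $map;
}

$map = sketch_parse_inline_style( 'Font-Weight: 700; text-decoration: underline dotted;' );

// Same three checks as the quoted method, expressed as plain expressions.
$formatting = array(
    'bold'      => in_array( $map['font-weight'] ?? '', array( 'bold', '700' ), true ),
    'italic'    => 'italic' === ( $map['font-style'] ?? '' ),
    'underline' => false !== strpos( $map['text-decoration'] ?? '', 'underline' ),
);
var_dump( $formatting );
```

Separating parsing from interpretation moves the loop and the `switch` into different functions, so each one stays below the configured threshold of 10.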
Comment on lines +1144 to +1242
public static function get_term_schema( $term_id ) {
    $raw = get_term_meta( $term_id, 'schema', true );
    if ( ! is_array( $raw ) ) {
        $raw = get_term_meta( $term_id, 'resolate_type_fields', true );
    }
    if ( ! is_array( $raw ) ) {
        return array();
    }

    $out = array();
    foreach ( $raw as $item ) {
        if ( ! is_array( $item ) ) {
            continue;
        }

        $slug        = isset( $item['slug'] ) ? sanitize_key( $item['slug'] ) : '';
        $label       = isset( $item['label'] ) ? sanitize_text_field( $item['label'] ) : '';
        $type        = isset( $item['type'] ) ? sanitize_key( $item['type'] ) : 'textarea';
        $placeholder = isset( $item['placeholder'] ) ? preg_replace( '/[^A-Za-z0-9._:-]/', '', (string) $item['placeholder'] ) : '';
        $data_type   = isset( $item['data_type'] ) ? sanitize_key( $item['data_type'] ) : '';

        if ( '' === $slug ) {
            continue;
        }

        if ( '' === $label ) {
            $label = self::humanize_schema_label( $slug );
        }

        if ( '' === $label ) {
            continue;
        }

        if ( '' === $placeholder ) {
            $placeholder = $slug;
        }

        if ( 'array' === $type ) {
            $item_schema = array();
            if ( isset( $item['item_schema'] ) && is_array( $item['item_schema'] ) ) {
                foreach ( $item['item_schema'] as $key => $definition ) {
                    $item_key = sanitize_key( $key );
                    if ( '' === $item_key ) {
                        continue;
                    }

                    $item_label = isset( $definition['label'] ) ? sanitize_text_field( $definition['label'] ) : '';
                    if ( '' === $item_label ) {
                        $item_label = self::humanize_schema_label( $item_key );
                    }

                    $item_type = isset( $definition['type'] ) ? sanitize_key( $definition['type'] ) : 'textarea';
                    if ( ! in_array( $item_type, array( 'single', 'textarea', 'rich' ), true ) ) {
                        $item_type = 'textarea';
                    }

                    $item_data_type = isset( $definition['data_type'] ) ? sanitize_key( $definition['data_type'] ) : 'text';
                    if ( ! in_array( $item_data_type, array( 'text', 'number', 'boolean', 'date' ), true ) ) {
                        $item_data_type = 'text';
                    }

                    $item_schema[ $item_key ] = array(
                        'label'     => $item_label,
                        'type'      => $item_type,
                        'data_type' => $item_data_type,
                    );
                }
            }

            $out[] = array(
                'slug'        => $slug,
                'label'       => $label,
                'type'        => 'array',
                'placeholder' => $placeholder,
                'data_type'   => 'array',
                'item_schema' => $item_schema,
            );
            continue;
        }

        if ( ! in_array( $type, array( 'single', 'textarea', 'rich' ), true ) ) {
            $type = 'textarea';
        }

        if ( ! in_array( $data_type, array( 'text', 'number', 'boolean', 'date' ), true ) ) {
            $data_type = 'text';
        }

        $out[] = array(
            'slug'        => $slug,
            'label'       => $label,
            'type'        => $type,
            'placeholder' => $placeholder,
            'data_type'   => $data_type,
        );
    }

    return $out;
}

Check warning

Code scanning / PHPMD

Code Size Rules: CyclomaticComplexity Warning

The method get_term_schema() has a Cyclomatic Complexity of 27. The configured cyclomatic complexity threshold is 10.

Check warning

Code scanning / PHPMD

Code Size Rules: NPathComplexity Warning

The method get_term_schema() has an NPath complexity of 2162692. The configured NPath complexity threshold is 200.
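
Much of this path count comes from the repeated "if the value is not in the allow-list, fall back to a default" checks. A minimal sketch of factoring that pattern out, assuming a hypothetical helper named `sketch_pick_allowed` (it is not part of the plugin):

```php
<?php
/**
 * Hypothetical helper: return $value when it is in $allowed, else $fallback.
 */
function sketch_pick_allowed( string $value, array $allowed, string $fallback ): string {
    return in_array( $value, $allowed, true ) ? $value : $fallback;
}

// The allow-lists below are the ones quoted in get_term_schema().
$type      = sketch_pick_allowed( 'wysiwyg', array( 'single', 'textarea', 'rich' ), 'textarea' );
$data_type = sketch_pick_allowed( 'date', array( 'text', 'number', 'boolean', 'date' ), 'text' );

var_dump( $type, $data_type );
```

Each call replaces one `if` block in the method body. Because NPath multiplies the path counts of sequential branches, removing the four such checks from `get_term_schema()` would shrink the reported figure considerably, though extracting the `item_schema` loop into its own method would likely be needed as well.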

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

https://github.com/erseco/wp-resolate/blob/61f97a79370654752ffc0871667645ba6a393873/tests/unit/includes/ResolateOpenTBSTest.php#L1-L16
P1: Load OpenTBS helper before invoking it in unit tests

The new test class calls Resolate_OpenTBS::convert_docx_part_rich_text() directly, but there is no require or autoloader entry that loads includes/class-resolate-opentbs.php. The plugin bootstrap (resolate.php) never includes that helper either, so running this test will hit Error: Class 'Resolate_OpenTBS' not found before any assertions execute. Explicitly include the file (or adjust the bootstrap) before calling the static method so the test suite can run.
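
A hedged sketch of the explicit include the review asks for. The function `sketch_load_class_file` is hypothetical, and the path passed to it is an assumption: the review names `includes/class-resolate-opentbs.php`, but its location relative to the test file depends on the repository layout.

```php
<?php
/**
 * Hypothetical helper: make sure $class is defined, loading $file if needed.
 * Returns false instead of raising a fatal error when the file is missing.
 */
function sketch_load_class_file( string $class, string $file ): bool {
    if ( class_exists( $class, false ) ) {
        return true; // Already loaded by the bootstrap or an earlier test.
    }
    if ( ! is_readable( $file ) ) {
        return false;
    }
    require_once $file;
    return class_exists( $class, false );
}

// Assumed relative path; adjust to wherever the test file actually lives.
$loaded = sketch_load_class_file(
    'Resolate_OpenTBS',
    __DIR__ . '/includes/class-resolate-opentbs.php'
);
var_dump( $loaded );
```

In a PHPUnit test class, a guard like this could run in a setup method, with the test marked as skipped when it returns `false`, so a missing helper shows up as a skip rather than a class-not-found error.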

@erseco erseco deleted the feature/fix-opentbs-html-formatting-issue branch November 30, 2025 18:28