addHtml() inconsistency across various export formats #2232

@sirfragalot

Describe the Bug

Inconsistency in exporting HTML to various output formats.

I develop the WIKINDX bibliographic management software, which includes a word processor built on TinyMCE. Currently we export the word processor output to RTF using code we developed ourselves (so we can appreciate the difficulties). We'd like to export to more formats, such as DOCX and ODT, so I decided to give PHPWord a go. Unfortunately, the results are not yet usable, as exporting the same HTML to RTF, DOCX and ODT gives wildly different results.

A related issue is the documentation. For example, a search for 'addhtml' produces no results, and I only came across addHtml quite by chance while looking through the bug reports here. It seems a lot of (important) PHPWord functionality is simply not documented. (Yes, I know from experience that free and open source software relies on the kindness of strangers.) I mention this because the inconsistency I note might be solved by some method or setting that I cannot find in the documentation.

Steps to Reproduce

The following code uses HTML output by TinyMCE to export to DOCX, RTF, and ODT.

require "core/libs/vendor/PHPWord/bootstrap.php";
$phpWord = new \PhpOffice\PhpWord\PhpWord();
\PhpOffice\PhpWord\Settings::setOutputEscapingEnabled(true);

echo "here";

$section = $phpWord->addSection();

$text = "
<p>Is it possible to <strong>im<em>agi</em>ne</strong> an <span style=\"text-decoration: underline;\">unimaginable</span> sound<sup>2</sup>?</p>
<p><span style=\"font-family: 'courier new', courier, monospace; font-size: 36pt;\">large text different font</span></p>
<p>A number of ways to approach this question beginning with semantics (the meanings of the words in the questions):</p>
<ul>
<li>What is the definition of 'imagination' or 'to imagine' that is being used?</li>
<li><em>unimaginable/unimagined</em> by one person or within a culture?</li>
<li>There is a difference between <em>unimaginable</em> & <em>unimagined</em>.</li>
</ul>
<p><img src=\"data/images/winnmp_project_1_a8b8fcce8f468f33fff821212dcf9ebc1ea7274d.png\" width=\"400\" height=\"296\" /></p>
<p> </p>
<table style=\"border-collapse: collapse; width: 100%; height: 40.3334px; border-style: solid;\" border=\"1\">
<tbody>
<tr>
<td style=\"width: 48.1085%;\">cell 1</td>
<td style=\"width: 48.1085%;\"><span style=\"color: #e03e2d;\">cell 2</span></td>
</tr>
<tr>
<td style=\"width: 48.1085%;\">cell 3: ōöéÊÉï</td>
<td style=\"width: 48.1085%;\">cell 4: 𞤔𞤕</td>
</tr>
</tbody>
</table>
<p><a title=\"wikindx\" href=\"https://wikindx.sourceforge.io\">wikindx</a></p>
<p> </p>
<p style=\"text-align: left;\">Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified Left justified </p>
<p style=\"text-align: right;\">Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified Right justified </p>
<p>Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified Justified </p>
<p>Centered</p>
";

\PhpOffice\PhpWord\Shared\Html::addHtml($section, $text);

$file = 'helloWorld.docx';
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'Word2007');
$objWriter->save($file);

$file = 'helloWorld.rtf';
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'RTF');
$objWriter->save($file);

$file = 'helloWorld.odt';
$objWriter = \PhpOffice\PhpWord\IOFactory::createWriter($phpWord, 'ODText');
$objWriter->save($file);

Expected Behavior

Consistent and accurate export across the three outputs.

Current Behavior

Inconsistent and inaccurate export across the three outputs.

  1. Image. Not output in ODT
  2. List. Not output in RTF and ODT
  3. Table. No border in RTF, DOCX, or ODT (the RTF case was fixed in "RTF Writer: Support for Table Border Style", #2656)
  4. Table. Not 100% width in RTF, DOCX, ODT
  5. Table. Cells completely messed up in RTF
  6. Coloured font. Only in DOCX
  7. Hyperlink. Generally fine but only ODT has link in blue and underlined
  8. Paragraph. No justified or centered text in any of the three outputs
  9. UTF-8 characters (e.g. the non-ASCII text in cells 3 and 4). Only rendered correctly in ODT

Context

Please fill in your environment information:

  • PHP Version: 7.4.9
  • PHPWord Version: 0.18.3

Regards,

Mark
