Table Borders Fixes
Fix #2402. Fix #2474. Both issues deal with borders around tables when they aren't wanted. There are 3 big issues in the code, and several minor ones.

First big issue - Word table styles can have both a `styleId` and a `name`, which are often different from each other. Each is used by various Word functions, and what documentation I can find is far from clear on the difference. I have added a `tableStyle` property (for styleId) to Style/Table, and the reader will now preserve both `styleId` and `name`. It will similarly preserve `basedOn`, which is now a private property in Style/Paragraph, but is changed to be a protected property in Style.
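
To illustrate the `styleId`/`name` distinction, here is a minimal sketch that is not part of this change. It reads both values, plus `basedOn`, from a hand-written `styles.xml` fragment using PHP's DOM extension. The fragment is an assumption modeled on Word's built-in Table Grid style, and the variable names are illustrative only; PhpWord's reader uses its own XMLReader wrapper instead.

```php
<?php
// Assumed styles.xml fragment, modeled on Word's built-in "Table Grid" style.
$xml = <<<'XML'
<w:styles xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:style w:type="table" w:styleId="TableGrid">
    <w:name w:val="Table Grid"/>
    <w:basedOn w:val="TableNormal"/>
  </w:style>
</w:styles>
XML;

$ns = 'http://schemas.openxmlformats.org/wordprocessingml/2006/main';
$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);
$xpath->registerNamespace('w', $ns);

$found = [];
foreach ($xpath->query('//w:style[@w:type="table"]') as $style) {
    // The id is an attribute of w:style; name and basedOn are child elements.
    $nameNode = $xpath->query('w:name', $style)->item(0);
    $basedOnNode = $xpath->query('w:basedOn', $style)->item(0);
    $found[] = [
        'styleId' => $style->getAttributeNS($ns, 'styleId'),
        'name' => $nameNode instanceof DOMElement ? $nameNode->getAttributeNS($ns, 'val') : null,
        'basedOn' => $basedOnNode instanceof DOMElement ? $basedOnNode->getAttributeNS($ns, 'val') : null,
    ];
}

print_r($found);
```

In this fragment the id is `TableGrid` while the name is `Table Grid`, which is the kind of mismatch that makes it necessary to keep both.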

Second big issue - Word2007 Reader assumes that table style can be specified either by name or by inline declarations, but not both. Guess what? It is now changed to support both. This makes the delta for Reader/Word2007/AbstractPart appear to be much more complicated than it actually is. The change is almost entirely of the form:
```
if (condition) {short_code_block} else {long_code_block}
```
to
```
long_code_block
```

Third big issue - In html, td does not inherit border styles from table. In Word, cell border styles are specified in table styles (as insideH/V), so they do, in effect, inherit. This is resolved, as best I can, by having each td/th without its own style use the table border style. So adding an html border style should produce a consistent result in Html and Docx output.
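
The fallback rule can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this commit: `effectiveCellBorder` is a hypothetical helper, and styles are modeled as plain arrays rather than PhpWord style objects.

```php
<?php
/**
 * Hypothetical helper: pick the border settings a td/th should be rendered with.
 * A cell that has any border setting of its own keeps it; otherwise the cell
 * takes over the table's border settings.
 */
function effectiveCellBorder(array $tableStyle, array $cellStyle): array
{
    $borderKeys = ['borderStyle', 'borderSize', 'borderColor'];
    foreach ($borderKeys as $key) {
        if (isset($cellStyle[$key])) {
            return $cellStyle; // the cell defines its own border
        }
    }

    // Copy only the border entries from the table style into the cell style.
    return $cellStyle + array_intersect_key($tableStyle, array_flip($borderKeys));
}

$table = ['borderStyle' => 'single', 'borderSize' => 15, 'borderColor' => 'FF0000', 'width' => 5000];
$plain = effectiveCellBorder($table, ['bgColor' => 'EEEEEE']);  // picks up the table border
$own = effectiveCellBorder($table, ['borderStyle' => 'dashed']); // keeps its own border

var_export($plain);
var_export($own);
```

Non-border entries of the table style, such as `width` here, are deliberately not copied to the cell.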

Minor issues:
- Html table (not css) attribute border=0 should set borderStyle none on all borders; any other value should set borderStyle single.
- PhpWord accepts named colors from html styles. According to the documentation that I can find, Word does not recognize those, but, in practice, it often does. Nevertheless, I have added translation to hex (borrowed from PhpSpreadsheet). If nothing else, this will increase interoperability (e.g. RTF doesn't accept named colors, and html 3-hex-digit short forms are now permitted). If a color is not found in the translation table, it will be left unchanged, so there should be no impact.
- Writer/Html/Style/Table now accepts colors as 6 hex digits, as well as strings.
- The parsing of border css attributes is not accurate. It rejects legitimate values. One example is `2px solid red`, since PhpWord, unlike html, insists on color before style. It rejects `2px #ff0000 solid` because it doesn't accept colors as hex strings. It does not allow the omission of the size and color attributes, but css does. The parsing is rewritten to try to overcome these deficiencies. Note, BTW, that css `border:0` is not acceptable css (size needs a unit and style is omitted); this was mentioned in one of the issues as not being handled correctly, but, since it is invalid, there should be no expectation of its being handled in any particular way.
- Style/Border::hasBorder is expanded to test all of Size, Color, and Style, rather than limiting its test to Size.
- Properties insideHStyle and insideVStyle are added to Style/Table. Their Color and Size equivalents already existed.
- If border is not specified as an Html or css attribute on a table, it is not the same as specifying html border=0 or css border:none. The end result will be whatever the app that reads the result defaults to. The results may not be consistent between, say, Html and Docx. This is already addressed in part by setting default styling for table and td in the html head section to match the Word defaults. However, there may still be differences; the way to (mostly) avoid them is to specify a table style.
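
The order-independent border parsing and the color translation from the list above can be sketched as below. The three patterns follow the ones in the Shared/Html.php change; everything else is an assumption for illustration: `parseBorderShorthand` and `toHex6` are hypothetical names, and `NAMED_COLORS` is a three-entry stand-in for the real translation table.

```php
<?php
// Hypothetical subset of a named-color table; the real table is much larger.
const NAMED_COLORS = ['red' => 'FF0000', 'green' => '008000', 'blue' => '0000FF'];

/** Normalize a CSS color token to 6 hex digits; unknown values are returned unchanged. */
function toHex6(string $color): string
{
    $named = NAMED_COLORS[strtolower($color)] ?? null;
    if ($named !== null) {
        return $named;
    }
    if (preg_match('/^#([0-9a-f])([0-9a-f])([0-9a-f])$/i', $color, $m)) {
        // Expand the 3-digit short form, e.g. #abc -> AABBCC.
        return strtoupper($m[1] . $m[1] . $m[2] . $m[2] . $m[3] . $m[3]);
    }
    if (preg_match('/^#([0-9a-f]{6})$/i', $color, $m)) {
        return strtoupper($m[1]);
    }

    return $color;
}

/**
 * Parse a CSS border shorthand whose tokens may come in any order.
 * Returns null when no border-style keyword is present.
 */
function parseBorderShorthand(string $value): ?array
{
    $stylePattern = '/(^|\s)(none|hidden|dotted|dashed|solid|double|groove|ridge|inset|outset)(\s|$)/';
    if (!preg_match($stylePattern, $value, $m)) {
        return null;
    }
    $result = ['style' => $m[2], 'size' => null, 'color' => null];
    // Remove each recognized token so that the later patterns cannot match it again.
    $value = preg_replace($stylePattern, ' ', $value) ?? '';

    $sizePattern = '/(^|\s)([0-9]+([.][0-9]+)?(%|[a-z]*)|thick|thin|medium)(\s|$)/';
    if (preg_match($sizePattern, $value, $m)) {
        $result['size'] = $m[2];
        $value = preg_replace($sizePattern, ' ', $value) ?? '';
    }

    $colorPattern = '/(^|\s)(#[a-fA-F0-9]{6}|#[a-fA-F0-9]{3}|[a-z][a-z0-9]+)(\s|$)/';
    if (preg_match($colorPattern, $value, $m)) {
        $result['color'] = toHex6($m[2]);
    }

    return $result;
}

var_export(parseBorderShorthand('2px solid red'));
var_export(parseBorderShorthand('2px #ff0000 solid'));
```

Both calls yield style `solid`, size `2px`, and color `FF0000`. A value with no style keyword, such as `0`, returns null, which mirrors the early `break` in the rewritten parser.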
oleibman committed Dec 29, 2023
1 parent 11a7aaa commit 5f42638
Showing 20 changed files with 1,087 additions and 112 deletions.
5 changes: 0 additions & 5 deletions phpstan-baseline.neon
@@ -405,11 +405,6 @@ parameters:
         count: 1
         path: src/PhpWord/Shared/Html.php
 
-    -
-        message: "#^Cannot call method setBorderSize\\(\\) on PhpOffice\\\\PhpWord\\\\Style\\\\Table\\|string\\.$#"
-        count: 1
-        path: src/PhpWord/Shared/Html.php
-
     -
         message: "#^Cannot call method setStyleName\\(\\) on PhpOffice\\\\PhpWord\\\\Style\\\\Table\\|string\\.$#"
         count: 1
65 changes: 38 additions & 27 deletions src/PhpWord/Reader/Word2007/AbstractPart.php
@@ -592,35 +592,46 @@ protected function readTableStyle(XMLReader $xmlReader, DOMElement $domNode)
     $borders = array_merge($margins, ['insideH', 'insideV']);
 
     if ($xmlReader->elementExists('w:tblPr', $domNode)) {
+        $tblStyleName = '';
         if ($xmlReader->elementExists('w:tblPr/w:tblStyle', $domNode)) {
-            $style = $xmlReader->getAttribute('w:val', $domNode, 'w:tblPr/w:tblStyle');
-        } else {
-            $styleNode = $xmlReader->getElement('w:tblPr', $domNode);
-            $styleDefs = [];
-            foreach ($margins as $side) {
-                $ucfSide = ucfirst($side);
-                $styleDefs["cellMargin$ucfSide"] = [self::READ_VALUE, "w:tblCellMar/w:$side", 'w:w'];
-            }
-            foreach ($borders as $side) {
-                $ucfSide = ucfirst($side);
-                $styleDefs["border{$ucfSide}Size"] = [self::READ_VALUE, "w:tblBorders/w:$side", 'w:sz'];
-                $styleDefs["border{$ucfSide}Color"] = [self::READ_VALUE, "w:tblBorders/w:$side", 'w:color'];
-                $styleDefs["border{$ucfSide}Style"] = [self::READ_VALUE, "w:tblBorders/w:$side", 'w:val'];
-            }
-            $styleDefs['layout'] = [self::READ_VALUE, 'w:tblLayout', 'w:type'];
-            $styleDefs['bidiVisual'] = [self::READ_TRUE, 'w:bidiVisual'];
-            $styleDefs['cellSpacing'] = [self::READ_VALUE, 'w:tblCellSpacing', 'w:w'];
-            $style = $this->readStyleDefs($xmlReader, $styleNode, $styleDefs);
-
-            $tablePositionNode = $xmlReader->getElement('w:tblpPr', $styleNode);
-            if ($tablePositionNode !== null) {
-                $style['position'] = $this->readTablePosition($xmlReader, $tablePositionNode);
-            }
+            $tblStyleName = $xmlReader->getAttribute('w:val', $domNode, 'w:tblPr/w:tblStyle');
+        }
+        $styleNode = $xmlReader->getElement('w:tblPr', $domNode);
+        $styleDefs = [];
 
-            $indentNode = $xmlReader->getElement('w:tblInd', $styleNode);
-            if ($indentNode !== null) {
-                $style['indent'] = $this->readTableIndent($xmlReader, $indentNode);
-            }
+        foreach ($margins as $side) {
+            $ucfSide = ucfirst($side);
+            $styleDefs["cellMargin$ucfSide"] = [self::READ_VALUE, "w:tblCellMar/w:$side", 'w:w'];
+        }
+        foreach ($borders as $side) {
+            $ucfSide = ucfirst($side);
+            $styleDefs["border{$ucfSide}Size"] = [self::READ_VALUE, "w:tblBorders/w:$side", 'w:sz'];
+            $styleDefs["border{$ucfSide}Color"] = [self::READ_VALUE, "w:tblBorders/w:$side", 'w:color'];
+            $styleDefs["border{$ucfSide}Style"] = [self::READ_VALUE, "w:tblBorders/w:$side", 'w:val'];
+        }
+        $styleDefs['layout'] = [self::READ_VALUE, 'w:tblLayout', 'w:type'];
+        $styleDefs['bidiVisual'] = [self::READ_TRUE, 'w:bidiVisual'];
+        $styleDefs['cellSpacing'] = [self::READ_VALUE, 'w:tblCellSpacing', 'w:w'];
+        $style = $this->readStyleDefs($xmlReader, $styleNode, $styleDefs);
+
+        $tablePositionNode = $xmlReader->getElement('w:tblpPr', $styleNode);
+        if ($tablePositionNode !== null) {
+            $style['position'] = $this->readTablePosition($xmlReader, $tablePositionNode);
+        }
+
+        $indentNode = $xmlReader->getElement('w:tblInd', $styleNode);
+        if ($indentNode !== null) {
+            $style['indent'] = $this->readTableIndent($xmlReader, $indentNode);
+        }
+        if ($xmlReader->elementExists('w:basedOn', $domNode)) {
+            $style['basedOn'] = $xmlReader->getAttribute('w:val', $domNode, 'w:basedOn');
+        }
+        if ($tblStyleName !== '') {
+            $style['tblStyle'] = $tblStyleName;
+        }
+        // this may be unneeded
+        if ($xmlReader->elementExists('w:name', $domNode)) {
+            $style['styleName'] = $xmlReader->getAttribute('w:val', $domNode, 'w:name');
         }
     }
 
6 changes: 4 additions & 2 deletions src/PhpWord/Reader/Word2007/Styles.php
@@ -63,8 +63,9 @@ public function read(PhpWord $phpWord): void
     foreach ($nodes as $node) {
         $type = $xmlReader->getAttribute('w:type', $node);
         $name = $xmlReader->getAttribute('w:val', $node, 'w:name');
+        $styleId = $xmlReader->getAttribute('w:styleId', $node);
         if (null === $name) {
-            $name = $xmlReader->getAttribute('w:styleId', $node);
+            $name = $styleId;
         }
         $headingMatches = [];
         preg_match('/Heading\s*(\d)/i', $name, $headingMatches);
@@ -96,7 +97,8 @@ public function read(PhpWord $phpWord): void
     case 'table':
         $tStyle = $this->readTableStyle($xmlReader, $node);
         if (!empty($tStyle)) {
-            $phpWord->addTableStyle($name, $tStyle);
+            $newTable = $phpWord->addTableStyle($styleId, $tStyle);
+            $newTable->setStyleName($name);
         }
 
         break;
83 changes: 53 additions & 30 deletions src/PhpWord/Shared/Html.php
@@ -26,6 +26,7 @@
 use PhpOffice\PhpWord\Element\Row;
 use PhpOffice\PhpWord\Element\Table;
 use PhpOffice\PhpWord\Settings;
+use PhpOffice\PhpWord\SimpleType\Border;
 use PhpOffice\PhpWord\SimpleType\Jc;
 use PhpOffice\PhpWord\SimpleType\NumberFormat;
 use PhpOffice\PhpWord\Style\Paragraph;
@@ -37,6 +38,8 @@
  */
 class Html
 {
+    private const SPECIAL_BORDER_WIDTHS = ['thin' => '0.5pt', 'thick' => '3.5pt', 'medium' => '2.0pt'];
+
     protected static $listIndex = 0;
 
     protected static $xpath;
@@ -142,7 +145,7 @@ protected static function parseInlineStyle($node, $styles = [])
         break;
     case 'bgcolor':
         // tables, rows, cells e.g. <tr bgColor="#FF0000">
-        $styles['bgColor'] = trim($val, '# ');
+        HtmlColours::setArrayColour($styles, 'bgColor', $val);
 
         break;
     case 'valign':
@@ -421,9 +424,10 @@ protected static function parseTable($node, $element, &$styles)
     }
 
     $attributes = $node->attributes;
-    if ($attributes->getNamedItem('border') !== null) {
+    if ($attributes->getNamedItem('border') !== null && is_object($newElement->getStyle())) {
         $border = (int) $attributes->getNamedItem('border')->value;
-        $newElement->getStyle()->setBorderSize(Converter::pixelToTwip($border));
+        $newElement->getStyle()->setBorderSize((int) Converter::pixelToTwip($border));
+        $newElement->getStyle()->setBorderStyle(($border === 0) ? 'none' : 'single');
     }
 
     return $newElement;
@@ -720,11 +724,11 @@ protected static function parseStyleDeclarations(array $selectors, array $styles
 
         break;
     case 'color':
-        $styles['color'] = trim($value, '#');
+        HtmlColours::setArrayColour($styles, 'color', $value);
 
         break;
     case 'background-color':
-        $styles['bgColor'] = trim($value, '#');
+        HtmlColours::setArrayColour($styles, 'bgColor', $value);
 
         break;
     case 'line-height':
@@ -804,7 +808,7 @@ protected static function parseStyleDeclarations(array $selectors, array $styles
 
         break;
     case 'border-width':
-        $styles['borderSize'] = Converter::cssToPoint($value);
+        $styles['borderSize'] = Converter::cssToPoint(self::SPECIAL_BORDER_WIDTHS[$value] ?? $value);
 
         break;
     case 'border-style':
@@ -834,29 +838,46 @@ protected static function parseStyleDeclarations(array $selectors, array $styles
     case 'border-bottom':
     case 'border-right':
     case 'border-left':
-        // must have exact order [width color style], e.g. "1px #0011CC solid" or "2pt green solid"
-        // Word does not accept shortened hex colors e.g. #CCC, only full e.g. #CCCCCC
-        if (preg_match('/([0-9]+[^0-9]*)\s+(\#[a-fA-F0-9]+|[a-zA-Z]+)\s+([a-z]+)/', $value, $matches)) {
-            if (false !== strpos($property, '-')) {
-                $tmp = explode('-', $property);
-                $which = $tmp[1];
-                $which = ucfirst($which); // e.g. bottom -> Bottom
-            } else {
-                $which = '';
-            }
-            // Note - border width normalization:
-            // Width of border in Word is calculated differently than HTML borders, usually showing up too bold.
-            // Smallest 1px (or 1pt) appears in Word like 2-3px/pt in HTML once converted to twips.
-            // Therefore we need to normalize converted twip value to cca 1/2 of value.
-            // This may be adjusted, if better ratio or formula found.
-            // BC change: up to ver. 0.17.0 was $size converted to points - Converter::cssToPoint($size)
-            $size = Converter::cssToTwip($matches[1]);
+        $stylePattern = '/(^|\\s)(none|hidden|dotted|dashed|solid|double|groove|ridge|inset|outset)(\\s|$)/';
+        if (!preg_match($stylePattern, $value, $matches)) {
+            break;
+        }
+        $borderStyle = $matches[2];
+        $value = preg_replace($stylePattern, ' ', $value) ?? '';
+        $borderSize = $borderColor = null;
+        $sizePattern = '/(^|\\s)([0-9]+([.][0-9]+)?+(%|[a-z]*)|thick|thin|medium)(\\s|$)/';
+        if (preg_match($sizePattern, $value, $matches)) {
+            $borderSize = $matches[2];
+            $borderSize = self::SPECIAL_BORDER_WIDTHS[$borderSize] ?? $borderSize;
+            $value = preg_replace($sizePattern, ' ', $value) ?? '';
+        }
+        $colorPattern = '/(^|\\s)([#][a-fA-F0-9]{6}|[#][a-fA-F0-9]{3}|[a-z][a-z0-9]+)(\\s|$)/';
+        if (preg_match($colorPattern, $value, $matches)) {
+            $borderColor = HtmlColours::convertColour($matches[2]);
+        }
+        if (false !== strpos($property, '-')) {
+            $tmp = explode('-', $property);
+            $which = $tmp[1];
+            $which = ucfirst($which); // e.g. bottom -> Bottom
+        } else {
+            $which = '';
+        }
+        // Note - border width normalization:
+        // Width of border in Word is calculated differently than HTML borders, usually showing up too bold.
+        // Smallest 1px (or 1pt) appears in Word like 2-3px/pt in HTML once converted to twips.
+        // Therefore we need to normalize converted twip value to cca 1/2 of value.
+        // This may be adjusted, if better ratio or formula found.
+        // BC change: up to ver. 0.17.0 was $size converted to points - Converter::cssToPoint($size)
+        if ($borderSize !== null) {
+            $size = Converter::cssToTwip($borderSize);
             $size = (int) ($size / 2);
             // valid variants may be e.g. borderSize, borderTopSize, borderLeftColor, etc ..
             $styles["border{$which}Size"] = $size; // twips
-            $styles["border{$which}Color"] = trim($matches[2], '#');
-            $styles["border{$which}Style"] = self::mapBorderStyle($matches[3]);
         }
+        if (!empty($borderColor)) {
+            $styles["border{$which}Color"] = $borderColor;
+        }
+        $styles["border{$which}Style"] = self::mapBorderStyle($borderStyle);
 
         break;
     case 'vertical-align':
@@ -1006,21 +1027,23 @@ protected static function mapBorderStyle($cssBorderStyle)
             case 'dotted':
             case 'double':
                 return $cssBorderStyle;
+            case 'hidden':
+                return 'none';
             default:
                 return 'single';
         }
     }
 
     protected static function mapBorderColor(&$styles, $cssBorderColor): void
     {
-        $numColors = substr_count($cssBorderColor, '#');
+        $colors = explode(' ', $cssBorderColor);
+        $numColors = count($colors);
         if ($numColors === 1) {
-            $styles['borderColor'] = trim($cssBorderColor, '#');
-        } elseif ($numColors > 1) {
-            $colors = explode(' ', $cssBorderColor);
+            HtmlColours::setArrayColour($styles, 'borderColor', $cssBorderColor);
+        } else {
             $borders = ['borderTopColor', 'borderRightColor', 'borderBottomColor', 'borderLeftColor'];
             for ($i = 0; $i < min(4, $numColors, count($colors)); ++$i) {
-                $styles[$borders[$i]] = trim($colors[$i], '#');
+                HtmlColours::setArrayColour($styles, $borders[$i], $colors[$i]);
             }
         }
     }
