Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Xlsx writer does not escape or remove control characters that cannot be represented in XML #212

Closed
1 of 3 tasks
athanasius-kircher opened this issue Sep 1, 2017 · 3 comments

Comments

@athanasius-kircher
Copy link

This is:

What is the expected behavior?

If you put a string containing control characters as vertical tabs into a cell, the resulting xlsx using the corresponding writer should not be corrupted.

What is the current behavior?

The xlsx file is corrupted when opening it in Excel.

What are the steps to reproduce?

<?php

require __DIR__ . '/vendor/autoload.php';

// Create new Spreadsheet object
$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();
$worksheet = $spreadsheet -> getActiveSheet();
$excelCell = $worksheet -> getCellByColumnAndRow('A',1);
$excelCell -> setValueExplicit(
  'test '."\v" . 'test',
  \PhpOffice\PhpSpreadsheet\Cell\DataType::TYPE_STRING2
);
$excelWriter = \PhpOffice\PhpSpreadsheet\IOFactory::createWriter($spreadsheet, 'Xlsx');
$excelWriter -> save('test.xlsx');

There seems to be some escaping intended:

Compare \PhpOffice\PhpSpreadsheet\Shared\StringHelper::controlCharacterPHP2OOXML(). But even if I use the following code, the control character looks weird in Excel. But the file is not corrupted. If that is intended, then should the proper escaping not be active by default, without using \PhpOffice\PhpSpreadsheet\Shared\StringHelper::buildCharacterSets().

<?php

require __DIR__ . '/vendor/autoload.php';

// Create new Spreadsheet object
\PhpOffice\PhpSpreadsheet\Shared\StringHelper::buildCharacterSets();
$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();
$worksheet = $spreadsheet -> getActiveSheet();
$excelCell = $worksheet -> getCellByColumnAndRow('A',1);
$excelCell -> setValueExplicit(
  'test '."\v" . 'test',
  \PhpOffice\PhpSpreadsheet\Cell\DataType::TYPE_STRING2
);
$excelWriter = \PhpOffice\PhpSpreadsheet\IOFactory::createWriter($spreadsheet, 'Xlsx');
$excelWriter -> save('test.xlsx');

Which versions of PhpSpreadsheet and PHP are affected?

"phpoffice/phpspreadsheet": "1.0.0-beta"
PHP 5.6/7.0/7.1

@thommyhh
Copy link

Actually the fix is not working in my case. I'm exporting json encoded data from a mysql database. In the database the text contains ^K, which should be vertical tab control character. The data is read from the database and then json-decoded. Created XLSX file is corrupted because of that character.

I'd like to provide more information, if you need that to investigate this further.

@PowerKiKi
Copy link
Member

@thommyhh, StringHelperTest::testControlCharacterOOXML2PHP() seems to suggests that it should work. Would you be able to debug further and come up with a failing tests case that cover your case ?

@thommyhh
Copy link

I will try to do so, yes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

3 participants