-
Couldn't load subscription status.
- Fork 3.6k
Description
Prerequisites
- phpoffice/phpspreadsheet version:
4.4.0(and confirmed to still exist in the latest official5.1.0). - PHP Version: 8.x
- Microsoft Excel Version: Microsoft 365 / Excel 2019 or newer.
Description
The Xlsx writer corrupts valid formulas containing the BASE function when loading an existing Excel file and re-saving it. Crucially, this corruption happens even if the cell containing the formula is never edited by the PHP script.
This indicates a fundamental issue in how the writer component parses and rewrites certain existing formulas, causing a perfectly functional spreadsheet to become broken after a simple load-and-save operation.
Steps to Reproduce
This issue is best demonstrated in two parts:
Part 1: Create a valid starting file
- Open Microsoft Excel.
- In cell
A1, enter the formula:=BASE(RANDBETWEEN(0,36^8-1),36,8) - Save this file as
template_with_base_function.xlsx. - Close and reopen the file to confirm it works perfectly. Cell A1 displays a random string.
Part 2: Use PhpSpreadsheet to simply load and save the file
- Place the
template_with_base_function.xlsxfile where the script can access it. - Run the following minimal PHP script:
<?php
require 'vendor/autoload.php';
use PhpOffice\PhpSpreadsheet\IOFactory;
// 1. Load the perfectly valid Excel file
$spreadsheet = IOFactory::load('template_with_base_function.xlsx');
$sheet = $spreadsheet->getActiveSheet();
// 2. IMPORTANT: We do NOT edit cell A1.
// To prove the file is being processed, we can edit a different cell.
$sheet->setCellValue('B1', 'This cell was edited by the script.');
// 3. Save the spreadsheet to a new file
$writer = IOFactory::createWriter($spreadsheet, 'Xlsx');
$writer->save('corrupted_output.xlsx');
echo "File re-saved. Please compare the template with the corrupted output.";Expected result
When opening corrupted_output.xlsx in Excel:
- Cell A1 should still contain the correctly functioning
BASEformula and display a new random string. - The functionality should be identical to the original
template_with_base_function.xlsx. - Cell B1 should contain the text "This cell was edited by the script."
Actual result
When opening corrupted_output.xlsx in Excel:
- The formula in cell A1 is broken and displays the
#NAME?error. - Cell B1 correctly shows the new text.
- Inspecting the
xl/worksheets/sheet1.xmlinsidecorrupted_output.xlsxreveals that the formula tag for cell A1 has been written incorrectly by PhpSpreadsheet (e.g.,<f>@BASE(...)</f>or similar corruption), whereas the original template file had the correct structure.
Analysis & Conclusion
This demonstrates that the issue is not in the reader (it correctly identifies the formula) but lies specifically within the Xlsx writer's formula serialization logic. The writer fails to correctly handle the BASE function (and likely other newer functions) when it encounters them in existing cells, leading to data corruption on save.
The fact that this worked in much older versions (e.g., 1.2.6) strongly suggests a regression was introduced in the formula parsing/writing engine at some point.
(The workaround section remains the same as it's still the valid solution)
Workaround
A reliable workaround is to proactively find such cells, manually rebuild the formula string with the _xlfn. prefix, and then use a two-step set value/set data type process to bypass the faulty writer logic.
// Inside a loop iterating over cells...
$cell = $sheet->getCell('A1');
if ($cell->isFormula() && str_contains($cell->getValue(), 'BASE(')) {
$formula = $cell->getValue();
$formulaWithoutEquals = substr($formula, 1);
$correctFormulaString = '=_xlfn.' . $formulaWithoutEquals;
$cell->setValueExplicit($correctFormulaString, DataType::TYPE_FORMULA);
}This two-step process generates an XML structure that Excel interprets correctly, solving the problem without modifying the library's source code.
Hopefully, this detailed report helps in diagnosing and fixing this issue in the formula writer. Thank you for your work on this great library.