Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

UNIQUE and SORT are amended with a "@" making the calculation incorrect. #3901

Closed
2 of 7 tasks
roland-jungwirth opened this issue Feb 14, 2024 · 2 comments
Closed
2 of 7 tasks

Comments

@roland-jungwirth
Copy link

This is:

- [X] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

The function "UNIQUE" or "SORT" should not append a "@" breaking it.

What is the current behavior?

If I read a sheet with the formula "=UNIQUE(A1:A19)" or "=SORT(A1:A19)", the resulting formulas are "=@unique(A1:A19)" and "=@sort(A1:A19)" respectively. This is irrelevant of whether the sheet already has that formula in a cell, or I set the cell's value with the formula.

What are the steps to reproduce?

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

<?php

require __DIR__ . '/vendor/autoload.php';

$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();

$test_numbers = array( 41, 57, 51, 54, 49, 43, 35, 35, 44, 47, 48, 26, 57, 34, 61, 34, 28, 29, 41 );
$columnArray = array_chunk($test_numbers, 1);
$spreadsheet->setActiveSheetIndex( 0 )->fromArray( $columnArray, 'A1' );

$spreadsheet->setActiveSheetIndex( 0 )->setCellValue('C1', '=UNIQUE(A1:A19)');
$spreadsheet->setActiveSheetIndex( 0 )->setCellValue('D1', '=SORT(A1:A19)');

$writer = new Xlsx( $spreadsheet );
ob_end_clean();
$writer->save( 'php://output' );

If this is an issue with reading a specific spreadsheet file, then it may be appropriate to provide a sample file that demonstrates the problem; but please keep it as small as possible, and sanitize any confidential information before uploading.

What features do you think are causing the issue

  • [X ] Reader
  • Writer
  • Styles
  • Data Validations
  • Formula Calculations
  • Charts
  • AutoFilter
  • Form Elements

Does an issue affect all spreadsheet file formats? If not, which formats are affected?

No idea, I only tested it with .xlsx files.

Which versions of PhpSpreadsheet and PHP are affected?

PHP 8.1, PhpSpreadsheet: whatever this version is: #3767

test.xlsx

@oleibman
Copy link
Collaborator

See #3659 (comment) for an explanation of the at sign. As I mentioned in that comment, I haven't been able to figure out what has to be done to keep Excel from adding it (it is not present in the spreadsheet generated by PhpSpreadsheet).

@roland-jungwirth
Copy link
Author

Thank you for the explanation. I'll extract the data via PHP and write them as values.

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Mar 27, 2024
This has come up a number of times, most recently with issue PHPOffice#3901, and also issue PHPOffice#3659. It will certainly come up more often in days to come. Excel is changing formulas which PhpSpreadsheet has output as `=UNIQUE(A1:A19)`; Excel is processing the formula as it were `=@unique(A1:A19)`. This behavior is explained, in part, by PHPOffice#3659 (comment). It is doing so in order to ensure that the function returns only a single value rather than an array of values, in case the spreadsheet is being processed (or possibly was created) by a less current version of Excel which cannot handle the array result.

PhpSpreadsheet follows Excel to a certain extent; it defaults to returning a single calculated value when an array would be returned. Further, its support for outputting an array even when that default is overridden is incomplete. I am not prepared to do everything that Excel does for the array functions (details below), but this PR is a start in that direction. If the default is changed via:
```php
use PhpOffice\PhpSpreadsheet\Calculation\Calculation;
Calculation::setArrayReturnType(Calculation::RETURN_ARRAY_AS_ARRAY);
```
When that is done, `getCalculatedValue` will return an array (no code change necessary). However, Writer/Xlsx will now be updated to look at that value, and if an array is returned in that circumstance, will indicate in the Xml that the result is an array *and* will include a reference to the bounds of the array. This gets us close, although not completely there, to what Excel does, and may be good enough for now. Excel will still mess with the formula, but now it will treat it as `{=UNIQUE(A1:A19)}`. This means that the spreadsheet will now look correct; there will be superficial differences, but all cells will have the expected value.

Technically, the major difference between what PhpSpreadsheet will output now, and what Excel does on its own, is that Excel supplies values in the xml for all the cells in the range. That would be difficult for PhpSpreadsheet to do; that could be a project for another day. Excel will treat the output from PhpSpreadsheet as "Array Formulas" (a.k.a. CSE (control shift enter) formulas because you need to use that combination of keys to manually enter them in older versions of Excel). Current versions of Excel will instead use "Dynamic Array Formulas". Dynamic Array Formulas can be changed by the user; Array Formulas need to be deleted and re-entered if you want to change them. I don't know what else might have to change to get Excel to use the latter for PhpSpreadsheet formulas, and I will probably not even try to look now, saving it for a future date.

Unit testing of this change uncovered a bug in Calculation::calculateCellValue. That routine saves off ArrayReturnType, and may change it, and is supposed to restore it. But it does not do the restore if the calculation throws an exception. It is changed to do so.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants