Commit 9fc8e50

Extremely Limited Support for GROUPBY Function
This is a partial response to issue #4282. The actual logic to implement GROUPBY is probably very complicated. Even worse, Excel has thrown a whole new way of (internally) specifying one of the arguments into the mix. That argument is a function name, expressed not as a mapped integer (as SUBTOTAL does), nor even as a string, but as the unquoted function name prefixed by `_xleta.`. And, unlike its `_xlfn.` and `_xlws.` predecessors, it is difficult to figure out when the new prefix needs to be added and when it needs to be ignored. I am not even going to attempt that task with this ticket.

So, what does this change do? Like earlier attempts to introduce limited functionality (such as with form controls), it is there so that GROUPBY can be passed through: you can load a spreadsheet that contains it and save it to a new spreadsheet, and the function and its results are preserved.

Some cautionary notes. Dynamic arrays must be enabled (the function makes no sense without doing that). Changing any of the inputs used in the function may result in internal inconsistencies between PhpSpreadsheet and Excel; this is especially so if the dimensions of the returned array change as a result of changes to the input data. The programmer can avoid some of these problems by changing the formulaAttributes of the cell where the function is used; this may be difficult to do in practice. Oh, yes, using the GROUPBY cell as an argument in another formula will probably lead to problems. Finally, I confess that part of this solution looks awfully kludgey to me.

With its limitations and those cautions, is it worth proceeding with this change? My gut feel is that it is more useful to proceed than not. However, I will give others the opportunity to weigh in. I will wait at least a couple of weeks into the new year before proceeding with this.
1 parent eccbcce commit 9fc8e50

8 files changed: +57 -3 lines changed

docs/references/function-list-by-category.md

Lines changed: 1 addition & 0 deletions
@@ -245,6 +245,7 @@ COLUMNS | \PhpOffice\PhpSpreadsheet\Calculation\LookupRef\RowCo
 FILTER | \PhpOffice\PhpSpreadsheet\Calculation\LookupRef\Filter::filter
 FORMULATEXT | \PhpOffice\PhpSpreadsheet\Calculation\LookupRef\Formula::text
 GETPIVOTDATA | **Not yet Implemented**
+GROUPBY | **Not yet Implemented**
 HLOOKUP | \PhpOffice\PhpSpreadsheet\Calculation\LookupRef\HLookup::lookup
 HYPERLINK | \PhpOffice\PhpSpreadsheet\Calculation\LookupRef\Hyperlink::set
 INDEX | \PhpOffice\PhpSpreadsheet\Calculation\LookupRef\Matrix::index

docs/references/function-list-by-name.md

Lines changed: 1 addition & 0 deletions
@@ -239,6 +239,7 @@ GCD | CATEGORY_MATH_AND_TRIG | \PhpOffice\PhpSpread
 GEOMEAN | CATEGORY_STATISTICAL | \PhpOffice\PhpSpreadsheet\Calculation\Statistical\Averages\Mean::geometric
 GESTEP | CATEGORY_ENGINEERING | \PhpOffice\PhpSpreadsheet\Calculation\Engineering\Compare::GESTEP
 GETPIVOTDATA | CATEGORY_LOOKUP_AND_REFERENCE | **Not yet Implemented**
+GROUPBY | CATEGORY_LOOKUP_AND_REFERENCE | **Not yet Implemented**
 GROWTH | CATEGORY_STATISTICAL | \PhpOffice\PhpSpreadsheet\Calculation\Statistical\Trends::GROWTH
 
 ## H

src/PhpSpreadsheet/Calculation/Calculation.php

Lines changed: 10 additions & 1 deletion
@@ -1256,6 +1256,11 @@ public static function getExcelConstants(string $key): bool|null
             'functionCall' => [Functions::class, 'DUMMY'],
             'argumentCount' => '2+',
         ],
+        'GROUPBY' => [
+            'category' => Category::CATEGORY_LOOKUP_AND_REFERENCE,
+            'functionCall' => [Functions::class, 'DUMMY'],
+            'argumentCount' => '3-7',
+        ],
         'GROWTH' => [
             'category' => Category::CATEGORY_STATISTICAL,
             'functionCall' => [Statistical\Trends::class, 'GROWTH'],
@@ -4601,7 +4606,7 @@ private static function dataTestReference(array &$operandData): mixed
     private static int $matchIndex10 = 10;
 
     /**
-     * @return array<int, mixed>|false
+     * @return array<int, mixed>|false|string
      */
     private function processTokenStack(mixed $tokens, ?string $cellID = null, ?Cell $cell = null)
     {
@@ -5182,6 +5187,9 @@ private function processTokenStack(mixed $tokens, ?string $cellID = null, ?Cell
             } elseif (preg_match('/^' . self::CALCULATION_REGEXP_DEFINEDNAME . '$/miu', $token, $matches)) {
                 // if the token is a named range or formula, evaluate it and push the result onto the stack
                 $definedName = $matches[6];
+                if (str_starts_with($definedName, '_xleta')) {
+                    return Functions::NOT_YET_IMPLEMENTED;
+                }
                 if ($cell === null || $pCellWorksheet === null) {
                     return $this->raiseFormulaError("undefined name '$token'");
                 }
@@ -5214,6 +5222,7 @@ private function processTokenStack(mixed $tokens, ?string $cellID = null, ?Cell
                 }
 
                 $result = $this->evaluateDefinedName($cell, $namedRange, $pCellWorksheet, $stack, $specifiedWorksheet !== '');
+
                 if (isset($storeKey)) {
                     $branchStore[$storeKey] = $result;
                 }
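
The `'argumentCount' => '3-7'` entry registered for GROUPBY (and the `'2+'` entry just before it) are range specifications for how many arguments a call may take. As a rough illustration of how such a spec string could be interpreted, here is a minimal sketch. The helper name `argumentCountAllowed` is hypothetical and this is not PhpSpreadsheet's own validation code; the plain-number and comma-list forms are an assumption that does not appear in this diff.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, not part of PhpSpreadsheet: checks a supplied
 * argument count against a spec string such as '2+' or '3-7'.
 */
function argumentCountAllowed(string $spec, int $count): bool
{
    // 'n+' means "at least n arguments".
    if (preg_match('/^(\d+)\+$/', $spec, $m) === 1) {
        return $count >= (int) $m[1];
    }

    // 'n-m' means "between n and m arguments, inclusive".
    if (preg_match('/^(\d+)-(\d+)$/', $spec, $m) === 1) {
        return $count >= (int) $m[1] && $count <= (int) $m[2];
    }

    // Assumed extra forms: an exact count ('2') or a comma list ('1,3').
    return in_array((string) $count, explode(',', $spec), true);
}
```

Under this reading, a GROUPBY call with three to seven arguments passes the check, and anything shorter or longer is rejected before the (dummy) function is ever reached.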

src/PhpSpreadsheet/Worksheet/Worksheet.php

Lines changed: 5 additions & 1 deletion
@@ -49,6 +49,8 @@ class Worksheet
     public const MERGE_CELL_CONTENT_HIDE = 'hide';
     public const MERGE_CELL_CONTENT_MERGE = 'merge';
 
+    public const FUNCTION_LIKE_GROUPBY = '/\\b(groupby|_xleta)\\b/i'; // weird new syntax
+
     protected const SHEET_NAME_REQUIRES_NO_QUOTES = '/^[_\p{L}][_\p{L}\p{N}]*$/mui';
 
     /**
@@ -3701,7 +3703,9 @@ public function calculateArrays(bool $preCalculateFormulas = true): void
             $keys = $this->cellCollection->getCoordinates();
             foreach ($keys as $key) {
                 if ($this->getCell($key)->getDataType() === DataType::TYPE_FORMULA) {
-                    $this->getCell($key)->getCalculatedValue();
+                    if (preg_match(self::FUNCTION_LIKE_GROUPBY, $this->getCell($key)->getValue()) !== 1) {
+                        $this->getCell($key)->getCalculatedValue();
+                    }
                 }
             }
         }
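
To see which formulas the `FUNCTION_LIKE_GROUPBY` pattern would exclude from recalculation, the following standalone sketch applies the same regular expression outside the library. The function name `shouldSkipCalculation` and the sample formulas are hypothetical, chosen only for illustration.

```php
<?php

declare(strict_types=1);

// Same pattern as the FUNCTION_LIKE_GROUPBY constant added to Worksheet.
const FUNCTION_LIKE_GROUPBY = '/\\b(groupby|_xleta)\\b/i';

/**
 * Hypothetical helper: true when a formula mentions GROUPBY or the
 * `_xleta` prefix as a whole word, so it should not be recalculated.
 */
function shouldSkipCalculation(string $formula): bool
{
    return preg_match(FUNCTION_LIKE_GROUPBY, $formula) === 1;
}
```

Because of the `\b` word boundaries, a name that merely contains the text (for example a hypothetical `MYGROUPBY`) does not match, while any formula that uses `_xleta.` anywhere is skipped even if GROUPBY itself is absent.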

src/PhpSpreadsheet/Writer/Xlsx/FunctionPrefix.php

Lines changed: 1 addition & 0 deletions
@@ -142,6 +142,7 @@ class FunctionPrefix
         . '|drop'
         . '|expand'
         . '|filter'
+        . '|groupby'
         . '|hstack'
         . '|isomitted'
         . '|lambda'
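
Adding `groupby` to this alternation is what lets the writer emit the `_xlfn.` prefix for it. The sketch below shows the general idea with a small subset of names; `addXlfnPrefix` is a hypothetical function and the regular expression is a simplified assumption, not the actual `FunctionPrefix` implementation.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical sketch, not PhpSpreadsheet's FunctionPrefix code: prepend
 * `_xlfn.` to a few newer function names when they are called and are
 * not already prefixed.
 */
function addXlfnPrefix(string $formula): string
{
    // Illustrative subset of the alternation shown in the diff.
    $names = 'drop|expand|groupby|hstack';

    // (?<![.\w]) rejects names that follow a prefix or sit inside a longer name;
    // (?=\s*\() requires an opening parenthesis, i.e. a function call.
    $pattern = '/(?<![.\w])(' . $names . ')(?=\s*\()/i';

    return preg_replace($pattern, '_xlfn.$1', $formula) ?? $formula;
}
```

Note that this sketch deliberately leaves the `_xleta.` argument alone, in line with the commit message's statement that deciding when that prefix is needed is out of scope.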

src/PhpSpreadsheet/Writer/Xlsx/Worksheet.php

Lines changed: 5 additions & 1 deletion
@@ -1578,7 +1578,11 @@ private function writeCell(XMLWriter $objWriter, PhpspreadsheetWorksheet $worksh
         $mappedType = $pCell->getDataType();
         if ($mappedType === DataType::TYPE_FORMULA) {
             if ($this->useDynamicArrays) {
-                $tempCalc = $pCell->getCalculatedValue();
+                if (preg_match(PhpspreadsheetWorksheet::FUNCTION_LIKE_GROUPBY, $cellValue) === 1) {
+                    $tempCalc = [];
+                } else {
+                    $tempCalc = $pCell->getCalculatedValue();
+                }
                 if (is_array($tempCalc)) {
                     $objWriter->writeAttribute('cm', '1');
                 }
Lines changed: 34 additions & 0 deletions
@@ -0,0 +1,34 @@
+<?php
+
+declare(strict_types=1);
+
+namespace PhpOffice\PhpSpreadsheetTests\Reader\Xlsx;
+
+use PhpOffice\PhpSpreadsheet\Reader\Xlsx;
+use PhpOffice\PhpSpreadsheetTests\Functional\AbstractFunctional;
+
+class GroupByLimitedTest extends AbstractFunctional
+{
+    private static string $testbook = 'tests/data/Reader/XLSX/excel-groupby-one.xlsx';
+
+    public function testRowBreaks(): void
+    {
+        $reader = new Xlsx();
+        $spreadsheet = $reader->load(self::$testbook);
+        $reloadedSpreadsheet = $this->writeAndReload($spreadsheet, 'Xlsx');
+        $spreadsheet->disconnectWorksheets();
+        $reloadedSheet = $reloadedSpreadsheet->getActiveSheet();
+        self::assertSame(['t' => 'array', 'ref' => 'E3:F7'], $reloadedSheet->getCell('E3')->getFormulaAttributes());
+        $group = $reloadedSheet->rangeToArray('E3:F8');
+        $expected = [
+            ['Design', '$505,000 '],
+            ['Development', '$346,000 '],
+            ['Marketing', '$491,000 '],
+            ['Research', '$573,000 '],
+            ['Total', '$1,915,000 '],
+            [null, null],
+        ];
+        self::assertSame($expected, $reloadedSheet->rangeToArray('E3:F8'));
+        $reloadedSpreadsheet->disconnectWorksheets();
+    }
+}
13.3 KB
Binary file not shown.
