Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed collection sorting error #2020

Conversation

@afbora
Copy link
Contributor

commented Aug 19, 2019

Describe the PR

This fix is used for cases when main sorting specified by columns has equal values
Without it it will lead to Fatal Error: Nesting level too deep - recursive dependency?

Tested with Kirby 3.2.3

Waiting your reviews.

Related issues

Reference solution

yiisoft/yii2@78de315

Todos

  • Add unit tests for fixed bug/feature
  • Pass all unit tests
  • Fix code style issues with CS fixer and composer fix
  • If needeed, in-code documentation (DocBlocks etc.)
src/Toolkit/Collection.php Outdated Show resolved Hide resolved
src/Toolkit/Collection.php Outdated Show resolved Hide resolved
src/Toolkit/Collection.php Outdated Show resolved Hide resolved
Copy link
Contributor

left a comment

Thanks a lot for tracking down this bug and finding a potential solution (also thanks for linking to the source, that's very helpful!).

Unfortunately I can't reproduce it myself, so I can't really test if this fixes it.

One general question though: Is it really necessary to first check if the data has duplicate values? Because that makes the whole logic a lot more complex and unpredictable. Also I imagine that this change will be quite a performance hit for large data sets as every value list for every attribute/field needs to be checked for duplicates.
My thinking is: If we add the additional "artificial" sorting attribute as the last attribute, it should only ever be used by array_multisort if all prior attributes are duplicates. So it shouldn't matter if it's there if it's not needed.

You write that the sorting will be wrong if the check is omitted – wrong in what way and why?

@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Aug 20, 2019

I have created virtual pages section to reproduce, I hope it helps.

$kirby->impersonate("kirby");

$page = new Page([
    'slug' => 'test'
]);

$subpage1 = Page::create([
    'parent' => $page,
    'slug' => 'subpage-1'
]);

$subpage2 = Page::create([
    'parent' => $page,
    'slug' => 'subpage-2'
]);

 $subpage3 = Page::create([
    'parent' => $page,
    'slug' => 'subpage-3',
]);

$subpage1 = $subpage1->changeStatus('listed');
$subpage2 = $subpage2->changeStatus('listed');
$subpage3 = $subpage3->changeStatus('listed');

$section = new \Kirby\Cms\Section('pages', [
    'name'  => 'test',
    'model' => $page,
    'sortBy' => 'status desc'
]);

foreach($section->data() as $item) {
    var_dump($item['status']);
}

FireShot Capture 736 - Kirby CMS Debugger - http___localhost_test_kirby_1790_

@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Aug 20, 2019

One general question though: Is it really necessary to first check if the data has duplicate values?

This issue about equal values as you can see in reference issue/solution. As you know, the following codes add a unique sorting feature.

$params[] = range(1, count($array));
$params[] = SORT_ASC;
$params[] = SORT_NUMERIC;

You're right about performance, unpredictable. But when we don't check the duplicates, another problem arises. The answer is related to the second question below.

You write that the sorting will be wrong if the check is omitted – wrong in what way and why?

If we apply this feature always without checking duplicate, following tests failed:

FilesSectionTest::testSortBy()

https://github.com/getkirby/kirby/blob/master/tests/Cms/Sections/FilesSectionTest.php#L230-L290

Current output:

  • b.jpg
  • z.jpg
  • ä.jpg

New sort output:

  • z.jpg
  • ä.jpg
  • b.jpg

CollectionSorterTest::testSortCases()

public function testSortCases()
{
$collection = new Collection([
['name' => 'a'],
['name' => 'c'],
['name' => 'A'],
['name' => 'b']
]);
$sorted = $collection->sortBy('name', 'asc');
$this->assertEquals('A', $sorted->nth(0)['name']);
$this->assertEquals('a', $sorted->nth(1)['name']);
$this->assertEquals('b', $sorted->nth(2)['name']);
$this->assertEquals('c', $sorted->nth(3)['name']);
}

Current output:

  • A
  • a
  • b
  • c

New sort output:

  • a
  • A
  • b
  • c
@lukasbestle

This comment has been minimized.

Copy link
Contributor

commented Aug 20, 2019

Thanks for the code example – I can reproduce the issue now.

Regarding the wrong sorting: That is very very weird. To be honest, I think we need to find out why this happens first, because it really shouldn't. Otherwise, the "simple" and high-performance solution will be wrong and the complex solution will be a performance issue and potentially not even a working solution either.

@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Aug 21, 2019

https://www.php.net/manual/tr/function.array-multisort.php#113445

When sorting an array of (complex) objects, this function can give you a "Fatal error: Nesting level too deep" since it directly compares elements in later arrays if the elements in earlier ones compare equal. This can be worked around with the Flag-Parameter

Btw there is a really strange situation. Don't work when they're all listed, but work when all pages draft. In both cases the values are equal but one works, the other does not 🤔

Works with draft status

$kirby->impersonate("kirby");

$page = new Page([
    'slug' => 'test'
]);

$subpage1 = Page::create([
    'parent' => $page,
    'slug' => 'subpage-1'
]);

$subpage2 = Page::create([
    'parent' => $page,
    'slug' => 'subpage-2'
]);

 $subpage3 = Page::create([
    'parent' => $page,
    'slug' => 'subpage-3',
]);

$section = new \Kirby\Cms\Section('pages', [
    'name'  => 'test',
    'model' => $page,
    'sortBy' => 'status desc'
]);

foreach($section->data() as $item) {
    var_dump($item['status']);
}

$page->delete(true);

@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Aug 21, 2019

I've done dozens of tests with Page, Section and Collection classes, following informations can give you an idea.

Important!
I reproduced the issue only sorting with listed pages that have equal values count more than one (>= 2) in sorting field:

Failed status:

  • listed
  • listed
  • listed

Failed status:

  • unlisted
  • listed
  • listed

Failed parent with listed pages:

  • test
  • test
  • test

Failed null text with listed pages:

  • null
  • null
  • null

Works parent with not listed pages:

  • test
  • test
  • test

Works status:

  • draft
  • draft
  • draft

Works status:

  • unlisted
  • unlisted
  • unlisted

Works status:

  • unlisted
  • unlisted
  • listed

Works null text with not listed pages:

  • null
  • null
  • null
@lukasbestle

This comment has been minimized.

Copy link
Contributor

commented Aug 21, 2019

Thanks for the link to the PHP docs. That's interesting and makes a lot of sense now that I think about it: Because we give array_multisort() the actual array of data objects as the last param, it uses that for comparison if all previous fields are equal. And that fails because PHP compares objects now. Most of the time I love PHP, but sometimes I really hate it. 😢

Idea: What if we simply append a flag SORT_STRING to the end of the $params array (after the &$array param) so that PHP won't try to sort the objects directly? Pretty much all of our objects should have a __toString() method, so that should actually work quite well (unless I still misunderstand how the array_multisort() function works 😆).

@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Aug 22, 2019

What a great idea @lukasbestle 👍 and actually partially working.

Pages, section works but didnt work on array collection on casting array to string like following implement and test:

Implement:

foreach ($fields as $field) {
    $params[] = $field['values']    ?? [];
    $params[] = $field['direction'] ?? SORT_ASC;
    $params[] = $field['flags']     ?? SORT_NATURAL | SORT_FLAG_CASE;
}

$params[] = &$array;

// To compare strings instead of objects directly
// Fix "Nesting level too deep - recursive dependency?" error
$params[] = SORT_STRING | SORT_FLAG_CASE;

Test:

$collection = new Collection([
    [
        'name'  => 'Bastian',
        'role'  => 'developer',
        'color' => 'red'
    ],
    [
        'name' => 'Nico',
        'role' => 'developer',
        'color' => 'green'
    ],
    [
        'name' => 'nico',
        'role' => 'support',
        'color' => 'blue'
    ],
    [
        'name'  => 'Sonja',
        'role'  => 'support',
        'color' => 'red'
    ]
]);
$sorted = $collection->sortBy('name', 'asc');

foreach($sorted as $item) {
    echo $item['name'] . PHP_EOL;
}

Whoops\Exception\ErrorException: Array to string conversion in file \kirby\src\Toolkit\Collection.php on line 958 (array_multisort() line)

@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Aug 22, 2019

I have some tests about your idea.
I can ignore array data with checking is object:

$params[] = &$array;

if ($firstElement = $this->first()) {
    if (is_object($firstElement) === true && method_exists($firstElement, '__toString') === true) {
        $params[] = SORT_STRING;
    }
}

Even if it works for all data types, some tests failed with wrong sort like following my first implement too:

$params[] = range(1, count($array));
$params[] = SORT_ASC;
$params[] = SORT_NUMERIC;
@lukasbestle

This comment has been minimized.

Copy link
Contributor

commented Aug 22, 2019

Oh noes, you are right. It won't work for all collections. :(

I can ignore array data with checking is object:

We can't really do it that way because you only check for the first element. If the collection elements are of different types (which shouldn't but can happen), it will break again. But we also can't check all elements, that would be pretty wasteful.

I'm really not sure what to do to be honest. The best solution so far is your first as that actually works for all types of collections, but we still don't know why it breaks the sorting order. Have you found any clues on that?

@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Aug 23, 2019

Implement

foreach ($fields as $index => $field) {
    $values = [];
    // To prevent empty values for files collection
    if (empty($field['values']) === false) {
        $values = array_filter($field['values']) ? array_values($field['values']) : array_keys($field['values']);
    }
    
    $params[] = $values;
    $params[] = $field['direction'] ?? SORT_ASC;
    $params[] = $field['flags'] ?? SORT_NATURAL | SORT_FLAG_CASE;
}

$params[] = range(1, count($array));
$params[] = SORT_ASC;
$params[] = SORT_NUMERIC;

$params[] = &$array;

Test 1: New Implement
CollectionSorterTest::testSortCases()

$collection = new Collection([
    ['name' => 'a'],
    ['name' => 'B'],
    ['name' => 'b'],
    ['name' => 'A'],
]);

$sorted = $collection->sortBy('name', 'asc');

foreach($sorted as $item) {
    echo $item['name'] . PHP_EOL;
}

Output:

  1. a
  2. A
  3. B
  4. b

Expected:

  1. A
  2. a
  3. B
  4. b

Fail Reason:

We use SORT_FLAG_CASE flag, so a == A and a sort number (new implement as SORT_NUMERIC) is less than A as 1 < 4. Therefore, this error occurs when use case insensitive.

Test 2: New Implement without SORT_FLAG_CASE to fix test one
CollectionSorterTest::testSortBy()

$collection = new Collection([
    [
        'name'  => 'Bastian',
        'role'  => 'developer',
        'color' => 'red'
    ],
    [
        'name' => 'Nico',
        'role' => 'developer',
        'color' => 'green'
    ],
    [
        'name' => 'nico',
        'role' => 'support',
        'color' => 'blue'
    ],
    [
        'name'  => 'Sonja',
        'role'  => 'support',
        'color' => 'red'
    ]
]);

$sorted = $collection->sortBy('name', 'asc', 'color', SORT_ASC);

foreach($sorted as $item) {
    echo $item['name'] . '(' . $item['color'] . ')' . PHP_EOL;
}

Output:

  1. Bastian(red)
  2. Nico(green)
  3. Sonja(red)
  4. nico(blue)

Expected:

  1. Bastian(red)
  2. nico(blue)
  3. Nico(green)
  4. Sonja(red)

Fail Reason:

We removed the SORT_FLAG_CASE, so the priority was shifted to capital letters and test sort failed.


To be frank, my brain burned too.

I noticed that when I fix a place, the other side breaks, when I fix the other side, another place breaks down 😄

@lukasbestle

This comment has been minimized.

Copy link
Contributor

commented Aug 23, 2019

Thanks for the additional insights, now I get it.

I think the sorting with SORT_FLAG_CASE isn't actually wrong, it's just different than before. If we give PHP that flag, we tell it that we don't care about case, so that's exactly what it does when sorting.

Maybe this isn't even an issue after all. If the user wants case-sensitive sorting, they can pass a custom sorting flag.

What do you think @distantnative @bastianallgeier?

@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Aug 25, 2019

It makes sense to me.
So we need something like that SORT_FLAG_NOCASE

$collection->sortBy('name', 'asc', 'SORT_FLAG_NOCASE');

I couldn't find another solution.
This seems to be the most effective solution.

$params[] = range(1, count($array));
$params[] = SORT_ASC;
$params[] = SORT_NUMERIC;
@lukasbestle

This comment has been minimized.

Copy link
Contributor

commented Aug 25, 2019

So we need something like that SORT_FLAG_NOCASE

What do you mean exactly? Case-sensitive sorting is the default in PHP anyway, so if the SORT_FLAG_CASE constant is not given, PHP will sort case-sensitively. This can be achieved by passing a custom flag attribute:

$collection->sortBy('name', 'asc', SORT_NATURAL);
@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Aug 25, 2019

Never mind :)) You're right. SORT_NATURAL | SORT_FLAG_CASE as default flags will be ignored when a custom flag is already sent 👍 I have to think simple like you 😄

@lukasbestle

This comment has been minimized.

Copy link
Contributor

commented Aug 25, 2019

I thought about it again. Maybe the result with the new implementation is wrong:

Before the change, the sorting algorithm worked like this:

  1. First sort by all configured fields case-insensitively
  2. If all fields are equal, compare the items themselves (= where case-sensitivity is implied)

➡️ Within the list of otherwise equal items, the items are now sorted by case (like in your A and a example above).

After the change, it changed to this:

  1. First sort by all configured fields case-insensitively
  2. If all fields are equal, compare arbitrary numbers that have been added to fix the sorting bug

➡️ Case-sensitivity is not respected when it's needed.

@lukasbestle

This comment has been minimized.

Copy link
Contributor

commented Aug 25, 2019

But unfortunately there is no way around that. It's unfortunately a limitation of array_multisort(). Either get correct sorting results or fix the fatal error. We cannot have both.

I think we need to wait for the input from @distantnative and @bastianallgeier. If they agree that it's more important to fix the fatal error than to have correct sorting results in all cases, we can go forward with the implementation.

@lukasbestle

This comment has been minimized.

Copy link
Contributor

commented Aug 25, 2019

Oh, one idea: We could check if the items can be converted to string by checking if they have a __toString() method (exactly like you suggested above). If they do, we could add the SORT_STRING flag and if they don't, we would fall back to the arbitrary numbers. If the items are scalar values (string, int etc.) anyway, we don't need to add anything as PHP will already sort correctly.

Now that I think about it, it's probably fine to check only for the first item in the collection. Collections can generally be expected to only have items of the same type.

@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Sep 9, 2019

@bastianallgeier @distantnative Need your reviews.

@distantnative

This comment has been minimized.

Copy link
Contributor

commented Sep 11, 2019

To be honest, you lost me on the way. I'm way too little familiar with the sorting flags yet.

@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Sep 12, 2019

Recently I started to see this error more often when I was doing bug-fix tests in different classes.
It would be great if we could figure this out in the next version as 3.2.5 🔥

@lukasbestle

This comment has been minimized.

Copy link
Contributor

commented Sep 12, 2019

@afbora Definitely! I think we can go forward with the implementation in #2020 (comment). I think that should be alright for now until we find a better solution.

@lukasbestle lukasbestle added this to the 3.2.5 milestone Sep 12, 2019
@lukasbestle lukasbestle self-assigned this Sep 12, 2019
@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Sep 12, 2019

@lukasbestle I didn't see that assigned it to yourself, and I updated PR 😇

Btw all tests passed successfully, but I haven't written any new tests yet.

@lukasbestle

This comment has been minimized.

Copy link
Contributor

commented Sep 13, 2019

@afbora That I assigned the PR to myself just means that I'm the one who reviews it.

Let me know when you are ready, I will then review it and add missing tests etc. That's not an issue. :)

@bastianallgeier

This comment has been minimized.

Copy link
Contributor

commented Sep 13, 2019

TBH, I find it hard to follow this PR. I'm worried that we run into performance issues when the loop gets too complex. But otherwise I'd trust your experience with this.

@lukasbestle

This comment has been minimized.

Copy link
Contributor

commented Sep 13, 2019

@bastianallgeier Yes, it's unfortunately a quite complex issue and I think that there's no perfect solution.

I will do performance tests when reviewing.

@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Sep 13, 2019

@bastianallgeier I don't think it's going to be a performance problem. Thanks to @lukasbestle , we have simplified the solution quite simply. We have just prevented with SORT_STRING from directly comparing objects if the collection consists only of objects.

The update I made in the above loop was related to the empty return of the values in the files collection.

@lukasbestle, I'm just gonna write a unit test.

@bastianallgeier

This comment has been minimized.

Copy link
Contributor

commented Sep 13, 2019

@afbora that's awesome! Great work guys! 👏

@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Sep 13, 2019

@lukasbestle I leave it in your hands 😅

Since I only reproduced the error when sorting collection with section, I wasn't sure whether it should be under the \Toolkit\Collection or under the \Cms\Section tests.

@lukasbestle lukasbestle force-pushed the afbora:fix/1790-collection-recursive-dependency-error branch from d4da7f3 to f46dc63 Sep 13, 2019
Copy link
Contributor

left a comment

Thank you very much! 👍❤️

I have modified the code to work for a few more edge cases according to my proposal in #2020 (comment).

I have also changed the unit tests to use only mock objects. There are two reasons for that:

  1. We want to decouple the Toolkit (and other non-CMS) tests from the CMS as our other code packages may be used outside of a CMS installations. The tests should therefore be able to run without the Cms classes being present. It also improves the reliability of the tests as less code is tested at once.
  2. Your tests were creating the actual pages in the file system and therefore couldn't run more than once without manual intervention (deleting the pages again). We generally use a fixtures directory inside the test directories for that. That directory is then cleaned up automatically after each test. But it's even better to write the tests in a way that they don't require changes to the file system at all (like with the mock classes).

Sorry that I had to throw away the tests you wrote, but you couldn't know the first reason, so it's no problem at all. :)


Now only my question about your "Rewrite loop for collection data values" commit and after that we should be ready to go. 🎉

src/Toolkit/Collection.php Outdated Show resolved Hide resolved
src/Toolkit/Collection.php Outdated Show resolved Hide resolved
@lukasbestle lukasbestle marked this pull request as ready for review Sep 13, 2019
@afbora

This comment has been minimized.

Copy link
Contributor Author

commented Sep 14, 2019

@lukasbestle First of all, I love your commits ❤️
I couldn't write like that 👍 Well, I'm not good at the second point too, so it's good to throw mine 😅


Let me explain by giving an example.

This is an update to work with the right data, not about recursive dependency error.
And to ensure that the values work correctly when they are empty, as in the following example.

/tests/Cms/Sections/FilesSectionTest::testSortBy()

$locale = setlocale(LC_ALL, 0);
setlocale(LC_ALL, ['de_DE.ISO8859-1', 'de_DE']);

$model = new Page([
    'slug'  => 'test',
    'files' => [
        [
            'filename' => 'z.jpg'
        ],
        [
            'filename' => 'ä.jpg'
        ],
        [
            'filename' => 'b.jpg'
        ]
    ]
]);

// no settings
$section = new \Kirby\Cms\Section('files', [
    'name'  => 'test',
    'model' => $model
]);

Dump sort values for above example:

foreach ($fields as $field) {
    var_dump($field['values'] ?? []);
    
    $params[] = $field['values']    ?? [];
    $params[] = $field['direction'] ?? SORT_ASC;
    $params[] = $field['flags']     ?? SORT_NATURAL | SORT_FLAG_CASE;
}        
exit;

And output:

array(3) {
  ["test/z.jpg"]=>
  string(0) ""
  ["test/ä.jpg"]=>
  string(0) ""
  ["test/b.jpg"]=>
  string(0) ""
}

As you can see, values are empty and test will fail when implement new feature that SORT_STRING and SORT_NUMERIC. You can test yourself how you gave the error with revert loop (original) and SORT_STRING implement.

With rewrite loop output:

array(3) {
  [0]=>
  string(10) "test/z.jpg"
  [1]=>
  string(11) "test/ä.jpg"
  [2]=>
  string(10) "test/b.jpg"
}

I think this update is always necessary for stable work.
In fact, if I can explain what the problem is, I think you have a better idea.

src/Toolkit/Collection.php Outdated Show resolved Hide resolved
lukasbestle added a commit to afbora/kirby that referenced this pull request Sep 14, 2019
Fix recursive dependency error on sorting collection

getkirby#2020
lukasbestle added a commit to afbora/kirby that referenced this pull request Sep 14, 2019
If the `sort` field is not set for some or all files, the sorting would be arbitrary. By sorting by filename as a fallback, there is always a predictable order.

getkirby#2020
@lukasbestle lukasbestle force-pushed the afbora:fix/1790-collection-recursive-dependency-error branch from b3fda62 to 6687a5e Sep 14, 2019
@lukasbestle

This comment has been minimized.

Copy link
Contributor

commented Sep 14, 2019

Ah, I see!

I have fixed this directly in the files section instead as that was the origin for the bug. It's a bug that is only visible now with the new implementation, but it was there before. So it's better to fix it directly.

Your fix in the $collection->sortBy() method would have worked for this case, but for other cases it may have resulted in unexpected sorting results.

afbora and others added 2 commits Aug 19, 2019
Fix recursive dependency error on sorting collection

#2020
If the `sort` field is not set for some or all files, the sorting would be arbitrary. By sorting by filename as a fallback, there is always a predictable order.

#2020
@lukasbestle lukasbestle force-pushed the afbora:fix/1790-collection-recursive-dependency-error branch from 6687a5e to 46703f6 Sep 14, 2019
Copy link
Contributor

left a comment

This is now ready for 3.2.5. Thanks again @afbora, your help was amazing, we wouldn't have found the issue and a solution without you. 👍

@bastianallgeier @distantnative We need to keep an eye on this during the RC phase, maybe explicitly ask the users to test sorting in their sites. There are quite a few edge-cases here and we might have missed some with the fix.

…error
@bastianallgeier bastianallgeier merged commit 759afa7 into getkirby:develop Sep 16, 2019
1 check failed
1 check failed
Travis CI - Pull Request Build Errored
Details
bastianallgeier added a commit that referenced this pull request Sep 16, 2019
Fix recursive dependency error on sorting collection

#2020
@afbora afbora deleted the afbora:fix/1790-collection-recursive-dependency-error branch Sep 16, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.