IBX-704: Added aggregation support to NodeFactory #1792

Merged

Contributor

@mateuszbieniek mateuszbieniek commented Jul 2, 2021

Jira issue: https://issues.ibexa.co/browse/IBX-704

requires:
ezsystems/ezplatform-kernel#215
ezsystems/ezplatform-solr-search-engine#215

The calls to Solr that count children can add overhead to load-subtree requests.
This PR adds aggregation support to NodeFactory to speed those requests up when the search engine supports aggregations.

@mateuszbieniek mateuszbieniek marked this pull request as draft July 2, 2021 09:50
@lserwatka
Member

@webhdx it would be great if you could review this next week.

@mateuszbieniek mateuszbieniek changed the title [WIP] Added aggregation support to NodeFactory IBX-704: Added aggregation support to NodeFactory Jul 19, 2021
@mateuszbieniek mateuszbieniek marked this pull request as ready for review July 19, 2021 09:23
@mateuszbieniek
Contributor Author

ping @ezsystems/engineering-team

Contributor

@webhdx webhdx left a comment

I'm not very experienced with Solr/Elasticsearch usage, so this needs attention from Adam or Andrew. That said, I don't quite like the general idea here. NodeFactory is becoming more and more complicated and bloated with all the performance improvements. It would be nice to provide an alternative NodeFactory that could use different handling when ES/Solr is in use, but I understand this would require more refactoring and some extension points. I've put some remarks in the comments; please consider them.
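
One way to picture the alternative suggested here is a small strategy interface that hides how children counts are obtained. The sketch below is purely illustrative: `ChildrenCountProvider`, `PerLocationCountProvider` and `AggregatedCountProvider` are hypothetical names, not classes from this PR or from the Ibexa codebase, and the callables stand in for the real search calls.

```php
<?php

// Hypothetical extension point; not part of the actual admin-ui API.
interface ChildrenCountProvider
{
    /**
     * @param int[] $locationIds
     *
     * @return array<int, int> children count keyed by location ID
     */
    public function countChildren(array $locationIds): array;
}

// Legacy path: one count lookup per location.
final class PerLocationCountProvider implements ChildrenCountProvider
{
    /** @var callable takes a location ID, returns its children count */
    private $countSubitems;

    public function __construct(callable $countSubitems)
    {
        $this->countSubitems = $countSubitems;
    }

    public function countChildren(array $locationIds): array
    {
        $counts = [];
        foreach ($locationIds as $locationId) {
            $counts[$locationId] = ($this->countSubitems)($locationId);
        }

        return $counts;
    }
}

// Aggregation path: one lookup for the whole batch of locations.
final class AggregatedCountProvider implements ChildrenCountProvider
{
    /** @var callable takes location IDs, returns counts keyed by location ID */
    private $aggregate;

    public function __construct(callable $aggregate)
    {
        $this->aggregate = $aggregate;
    }

    public function countChildren(array $locationIds): array
    {
        $found = ($this->aggregate)($locationIds);
        $counts = [];
        foreach ($locationIds as $locationId) {
            // A location missing from the aggregation result has no children.
            $counts[$locationId] = $found[$locationId] ?? 0;
        }

        return $counts;
    }
}
```

With such a split, the factory would depend only on the interface, and the choice between the two implementations could be made once, based on what the configured search engine supports.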

@@ -33,6 +34,7 @@ final class NodeFactory
'DatePublished' => SortClause\DatePublished::class,
'ContentName' => SortClause\ContentName::class,
];
private const MAX_AGGREGATED_LOCATION_IDS = 100;
Contributor

I think this should be configurable, so that performance can be fine-tuned on different setups.
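
A minimal sketch of how a configurable limit could be applied, assuming the constant caps the number of location IDs sent in one aggregation query. `batchLocationIds()` and its parameter names are hypothetical; only `array_chunk()` is a real PHP function.

```php
<?php

/**
 * Splits location IDs into batches of at most $maxIdsPerAggregation IDs.
 * Hypothetical helper: the limit would come from configuration
 * instead of a hard-coded class constant.
 *
 * @param int[] $locationIds
 *
 * @return int[][]
 */
function batchLocationIds(array $locationIds, int $maxIdsPerAggregation): array
{
    if ($maxIdsPerAggregation < 1) {
        throw new InvalidArgumentException('The batch size must be at least 1.');
    }

    return array_chunk($locationIds, $maxIdsPerAggregation);
}

// 250 IDs with a limit of 100 give three batches of 100, 100 and 50 IDs.
$batches = batchLocationIds(range(1, 250), 100);
```

A lower limit means more, smaller queries; a higher one means fewer, larger queries. Which is faster depends on the setup, which is the argument for making it a parameter.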

src/lib/UI/Module/ContentTree/NodeFactory.php (outdated, resolved)
*
* @throws \eZ\Publish\API\Repository\Exceptions\InvalidArgumentException
*/
private function supplyChildrenCount(Node $node, array $aggregationResult = null): void
Contributor

It would be cleaner if you used an empty array instead of a null value.

Contributor Author

Passing null here indicates that aggregation results are not available (for example, when using a legacy search engine), so the "legacy" way of counting children should be used inside the supplyChildrenCount method. An empty array is a perfectly valid aggregation result. I can change it to pass an empty array, but then we would have to introduce something like $isAggregationSupported and pass it alongside to supplyChildrenCount.

Member

It would be cleaner if you used an empty array instead of a null value.

Passing null here indicates that aggregation results are not available (for example, when using a legacy search engine), so the "legacy" way of counting children should be used inside the supplyChildrenCount method. An empty array is a perfectly valid aggregation result. I can change it to pass an empty array, but then we would have to introduce something like $isAggregationSupported and pass it alongside to supplyChildrenCount.

@mateuszbieniek I don't see how distinguishing between [] and null would change the behavior of this method in its current state. Could you show a snippet with [] instead of null?

Contributor Author

There are two possibilities:

  1. Aggregations are not supported ($aggregationResult is null). $this->countSubitems($node->locationId) is called for every single container node.

  2. Aggregations are supported and $aggregationResult is an array:

$totalCount = isset($aggregationResult[$node->locationId]) ?
                    $aggregationResult[$node->locationId] :
                    0;

If $aggregationResult has the key $node->locationId, the node has children; if the key does not exist, its children count is 0.

So if I pass an empty array when aggregations are not supported, the method in its current state will report a children count of 0 for all nodes.
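
The two cases above can be sketched as a standalone function. This is an illustration, not the PR's code: `resolveChildrenCount()` is a hypothetical name, and the `$countSubitems` callable stands in for `$this->countSubitems()`.

```php
<?php

/**
 * Hypothetical stand-in for the branching inside supplyChildrenCount().
 *
 * @param array<int, int>|null $aggregationResult null when aggregations are unavailable
 * @param callable $countSubitems takes a location ID, returns its children count
 */
function resolveChildrenCount(int $locationId, ?array $aggregationResult, callable $countSubitems): int
{
    if ($aggregationResult === null) {
        // No aggregation support: fall back to a per-node count.
        return $countSubitems($locationId);
    }

    // Aggregation ran: a missing key means the node has no children.
    return $aggregationResult[$locationId] ?? 0;
}

// Pretend the legacy per-node lookup always finds 7 children.
$legacyCount = static function (int $locationId): int {
    return 7;
};

$viaLegacy = resolveChildrenCount(42, null, $legacyCount);           // legacy path
$viaEmptyResult = resolveChildrenCount(42, [], $legacyCount);        // aggregation path, no children
$viaAggregation = resolveChildrenCount(42, [42 => 3], $legacyCount); // aggregation path, 3 children
```

The point of the sketch is that null and an empty array lead to different branches, so replacing null with an empty array would need a separate flag to carry the "not supported" information.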

Contributor Author

I can change it to an empty array, but then I will have to add a check for aggregation support inside the supplyChildrenCount method.

Contributor Author

I've changed the check inside supplyChildrenCount from:

if ($aggregationResult) {

to

if ($aggregationResult !== null) {

This should reduce the confusion.
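
The strict comparison matters because PHP treats an empty array as falsy, so the original check could not tell "aggregation ran and found no children" apart from "aggregation unavailable". A small sketch, with hypothetical helper names mirroring the two versions of the condition:

```php
<?php

// Old condition: false for both null and an empty array.
function looseCheck(?array $aggregationResult): bool
{
    return (bool) $aggregationResult;
}

// New condition: false only for null.
function strictCheck(?array $aggregationResult): bool
{
    return $aggregationResult !== null;
}

var_dump(looseCheck([]));    // bool(false)
var_dump(strictCheck([]));   // bool(true)
var_dump(strictCheck(null)); // bool(false)
```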

src/lib/UI/Module/ContentTree/NodeFactory.php (outdated, resolved)
src/lib/UI/Module/ContentTree/NodeFactory.php (outdated, resolved)
src/lib/UI/Module/ContentTree/NodeFactory.php (outdated, resolved)

Member

@adamwojs adamwojs left a comment

Could you please share some basic numbers in the PR description, as this is an optimisation patch?

src/lib/UI/Module/ContentTree/NodeFactory.php (outdated, resolved)
src/lib/UI/Module/ContentTree/NodeFactory.php (outdated, resolved)
src/lib/UI/Module/ContentTree/NodeFactory.php (outdated, resolved)
@mateuszbieniek
Contributor Author

SonarCloud Quality Gate failed.

Bug: C (1 Bug)
Vulnerability: A (0 Vulnerabilities)
Security Hotspot: E (2 Security Hotspots)
Code Smell: A (83 Code Smells)

No Coverage information
2.4% Duplication

Those are false positives - SonarCloud messes something up...

Co-authored-by: Adam Wójs <adam@wojs.pl>
@mateuszbieniek mateuszbieniek changed the base branch from master to 2.2 August 5, 2021 08:45
Member

@alongosz alongosz left a comment

With the change to the method's behavior, it's clear why you need it there. With test coverage we could have avoided this discussion, because you would have spotted the mistake immediately.
One final request, related to forward compatibility with the rebranding:

src/bundle/Resources/config/default_parameters.yaml (outdated, resolved)
@mateuszbieniek mateuszbieniek changed the base branch from 2.2 to 2.3 August 5, 2021 09:03
@sonarcloud

sonarcloud bot commented Aug 5, 2021

Kudos, SonarCloud Quality Gate passed!

Bug: A (0 Bugs)
Vulnerability: A (0 Vulnerabilities)
Security Hotspot: A (0 Security Hotspots)
Code Smell: A (0 Code Smells)

No Coverage information
No Duplication information

@bogusez bogusez self-assigned this Aug 5, 2021
@lserwatka lserwatka merged commit 64b5a09 into ezsystems:2.3 Aug 9, 2021
@lserwatka
Member

You can merge it up now.

@mateuszbieniek mateuszbieniek deleted the load_subtree_aggregation_support branch August 9, 2021 12:11
mnocon pushed a commit that referenced this pull request Oct 20, 2021
* Added aggregation support to NodeFactory

* Changed to use LocationChildrenTermAggregation

* LocationChildrenTermAggregation moved to different namespace in kernel

* Changes after CR

* CS Fix

* Changes after CR#3

* CS fix

* Fixups after CR

* Update src/lib/UI/Module/ContentTree/NodeFactory.php

Co-authored-by: Adam Wójs <adam@wojs.pl>

* Changed the parameter name to ibexa.admin_ui.content_tree.node_factory.max_location_ids_in_single_aggregation

Co-authored-by: Adam Wójs <adam@wojs.pl>
(cherry picked from commit 64b5a09)
7 participants