You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
PR #53 has fixed a memory issue, but it does not resolve performance issues for certain things. Let's list them:
When dealing with ProductAttributeValue filtering - we are running a GROUP BY on values and in some cases, for more common attribute across the database, that results in a "full scan" situation that is horrible on performance - and issues scales with the number of records that need to be scanned. Also, since the data structure actually uses different columns for different data types - not all fields can be indexed or even GROUP BY be applied (json/array columns definitely an issue).
Also right now boolean fields filtering is not handled at all, so it does not work properly at all and that needs to be handled separately. Also if a boolean type attribute is applied to most products - the group by performance is horrible due to a full scan of the whole table (in our case that is basically group by to 2 objects from a 4 million record table - no fun).
All in all, I think this needs to be implemented to be selected via faceted navigation on ElasticSearch side - be it dedicated index or based of the index of the attributes, but this whole thing needs it's own data structure and filtering based on already applied filters and stuff. It does not work on the database on any decent size DB or where there are numerous and various attributes. Just to give an idea - the DB I tested this has over 3000 attributes spread out over 350k+ products resulting in the ProductAttributeValue table being in the range of 4-5 million records (I will look up the exact amount and update the value).
Personally, I do not really have any implementation ideas at this point, but I can test it against a big and somewhat convoluted DB :)
One thing for sure - with Sylius'es DB structure it is not possible to have working filtering on any decent sized database that does work under a second, and it is not very hard to go into PHP timing out - we just decided to remove quite a few attributes from filtering because there were no easy fixes.
The text was updated successfully, but these errors were encountered:
psihius
changed the title
[Performance][Data structure adjustments]
[Performance][Data structure adjustments] Implementing proper attribute filtering for all attribute types
Jun 3, 2019
PR #53 has fixed a memory issue, but it does not resolve performance issues for certain things. Let's list them:
When dealing with ProductAttributeValue filtering - we are running a GROUP BY on values and in some cases, for more common attribute across the database, that results in a "full scan" situation that is horrible on performance - and issues scales with the number of records that need to be scanned. Also, since the data structure actually uses different columns for different data types - not all fields can be indexed or even GROUP BY be applied (json/array columns definitely an issue).
Also right now boolean fields filtering is not handled at all, so it does not work properly at all and that needs to be handled separately. Also if a boolean type attribute is applied to most products - the group by performance is horrible due to a full scan of the whole table (in our case that is basically group by to 2 objects from a 4 million record table - no fun).
All in all, I think this needs to be implemented to be selected via faceted navigation on ElasticSearch side - be it dedicated index or based of the index of the attributes, but this whole thing needs it's own data structure and filtering based on already applied filters and stuff. It does not work on the database on any decent size DB or where there are numerous and various attributes. Just to give an idea - the DB I tested this has over 3000 attributes spread out over 350k+ products resulting in the ProductAttributeValue table being in the range of 4-5 million records (I will look up the exact amount and update the value).
Personally, I do not really have any implementation ideas at this point, but I can test it against a big and somewhat convoluted DB :)
One thing for sure - with Sylius'es DB structure it is not possible to have working filtering on any decent sized database that does work under a second, and it is not very hard to go into PHP timing out - we just decided to remove quite a few attributes from filtering because there were no easy fixes.
The text was updated successfully, but these errors were encountered: