Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch products together during catalog rule indexer calculation to improve performance #37889

Conversation

aligent-lturner
Copy link
Contributor

@aligent-lturner aligent-lturner commented Aug 17, 2023

Description (*)

Improves performance of the Catalog Rule scheduled indexer by batching products together.
Currently, products are processed one by one. In the case of a large number of product changes, this can take a long time to process, even if there are no catalog rules present, as there will need to be at least 2 * number of products database queries (so with 10k products to process, there are at least 20k queries being performed).

This PR introduces a new class, ReindexRuleProductsPrice to be used in place of ReindexRuleProductPrice. Additionally, RuleProductsSelectBuilder has been modified to handle multiple products instead of a single one.

Fixed Issues (if relevant)

  1. Fixes Very slow catalogrule_product index even when no catalog rule exists #34784

Manual testing scenarios (*)

  1. Process catalog rule indexers (indexer_update_all_views cron job) with a large number of products in the backlog (performance test)
  2. Ensure indexed data is the same with or without this PR

Questions or comments

I introduced a new class, rather than modifying the existing ReindexRuleProductPrice class. I could have potentially added a new function there to handle the case with multiple products, but it would have been a little messy in terms of updating associated tests.

Contribution checklist (*)

  • Pull request has a meaningful description of its purpose
  • All commits are accompanied by meaningful commit messages
  • All new or changed code is covered with unit/integration tests (if applicable)
  • All automated tests passed successfully (all builds are green)

@m2-assistant
Copy link

m2-assistant bot commented Aug 17, 2023

Hi @aligent-lturner. Thank you for your contribution!
Here are some useful tips on how you can test your changes using Magento test environment.

Add the comment under your pull request to deploy test or vanilla Magento instance:
  • @magento give me test instance - deploy test instance based on PR changes
  • @magento give me 2.4-develop instance - deploy vanilla Magento instance

❗ Automated tests can be triggered manually with an appropriate comment:

  • @magento run all tests - run or re-run all required tests against the PR changes
  • @magento run <test-build(s)> - run or re-run specific test build(s)
    For example: @magento run Unit Tests

<test-build(s)> is a comma-separated list of build names.

Allowed build names are:
  1. Database Compare
  2. Functional Tests CE
  3. Functional Tests EE
  4. Functional Tests B2B
  5. Integration Tests
  6. Magento Health Index
  7. Sample Data Tests CE
  8. Sample Data Tests EE
  9. Sample Data Tests B2B
  10. Static Tests
  11. Unit Tests
  12. WebAPI Tests
  13. Semantic Version Checker

You can find more information about the builds here
ℹ️ Run only required test builds during development. Run all test builds before sending your pull request for review.


For more details, review the Code Contributions documentation.
Join Magento Community Engineering Slack and ask your questions in #github channel.

@m2-community-project m2-community-project bot added the Priority: P3 May be fixed according to the position in the backlog. label Aug 17, 2023
@m2-github-services m2-github-services added Partner: Aligent Consulting partners-contribution Pull Request is created by Magento Partner labels Aug 17, 2023
@m2-community-project m2-community-project bot added this to Pending Review in Pull Requests Dashboard Aug 17, 2023
@aligent-lturner
Copy link
Contributor Author

There was a previous PR for this issue (#34785), but it has since been closed, and the changes are no longer visible. Based on the description and comments, it was handling a specific case where there are no active rules. This PR is a more general performance improvement, and should help regardless of how many rules are present/active.

@aligent-lturner
Copy link
Contributor Author

@magento run all tests

@magento-automated-testing
Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@aligent-lturner
Copy link
Contributor Author

@magento run all tests

@magento-automated-testing
Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@aligent-lturner
Copy link
Contributor Author

@magento run all tests

@magento-automated-testing
Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@aligent-lturner aligent-lturner force-pushed the feature/catalogrule_product_indexer_performance branch from caa9f66 to d7a2d0d Compare August 18, 2023 04:05
@aligent-lturner
Copy link
Contributor Author

@magento run all tests

@magento-automated-testing
Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@aligent-lturner aligent-lturner force-pushed the feature/catalogrule_product_indexer_performance branch from d7a2d0d to 9ba77b1 Compare August 18, 2023 04:57
@aligent-lturner
Copy link
Contributor Author

@magento run all tests

@magento-automated-testing
Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@aligent-lturner aligent-lturner force-pushed the feature/catalogrule_product_indexer_performance branch from 9ba77b1 to 4e189d2 Compare August 18, 2023 06:00
@aligent-lturner
Copy link
Contributor Author

@magento run all tests

@magento-automated-testing
Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@aligent-lturner aligent-lturner force-pushed the feature/catalogrule_product_indexer_performance branch from 4e189d2 to a6be322 Compare August 18, 2023 07:03
@aligent-lturner
Copy link
Contributor Author

@magento run all tests

@magento-automated-testing
Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

aligent-lturner and others added 3 commits August 21, 2023 09:11
…xer_performance' into feature/catalogrule_product_indexer_performance

# Conflicts:
#	app/code/Magento/CatalogRule/Test/Unit/Model/Indexer/ReindexRuleProductsPriceProcessorTest.php
#	app/code/Magento/CatalogRule/Test/Unit/Model/Indexer/ReindexRuleProductsPriceTest.php
@aligent-lturner
Copy link
Contributor Author

@magento run all tests

@engcom-Hotel
Copy link
Contributor

✔️ QA Passed

Preconditions:
Setup Magento with a good amount of products

Manual testing scenario:

  1. Process catalog rule indexers (indexer_update_all_views cron job) with a large number of products in the backlog (performance test)
  2. Ensure indexed data is the same with or without this PR

Actual Result: ✔️
indexer_update_all_views Should complete with batch products together.

After: ✔️

After PR changes the indexer took less time. Please refer to the below screenshot from cron.log file:

image

Before: ✖️

It took comparatively more time
image

Tested all the manual scenarios, no impact on regression testing.

As some tests are still failing hence moving this PR in Extended Testing.

Thanks

@engcom-Hotel engcom-Hotel moved this from Testing in Progress to Extended testing (optional) in Community Dashboard Jan 4, 2024
@engcom-Hotel
Copy link
Contributor

@magento run Functional Tests B2B, Functional Tests CE, Integration Tests

Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@engcom-Hotel
Copy link
Contributor

@magento run Functional Tests B2B, Functional Tests CE

Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@engcom-Hotel
Copy link
Contributor

@magento run Functional Tests B2B

Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@engcom-Hotel
Copy link
Contributor

In 2 consecutive build runs, it was found that the build failures were flaky. Please refer to the below screenshots:

Functional Tests CE
Build Run 1
image

Build Run 2
image

Functional Tests B2b
Build Run 1
image

Build Run 2
image

Hence moving this PR in merge in progress.

@engcom-Hotel engcom-Hotel moved this from Extended testing (optional) to Merge in Progress in Community Dashboard Jan 9, 2024
@engcom-Charlie
Copy link
Contributor

@magento run all tests

Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@engcom-Charlie
Copy link
Contributor

@magento run all tests

Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@engcom-Charlie
Copy link
Contributor

@magento run Functional Tests CE, Functional Tests B2B, Functional Tests EE, Integration Tests, WebAPI Tests

Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@andrewbess
Copy link
Contributor

@magento run Functional Tests CE, Functional Tests B2B, Functional Tests EE

Copy link

The requested builds are added to the queue. You should be able to see them here within a few minutes. Please message the #magento-devops slack channel if they don't show in a reasonable amount of time and a representative will look into any issues.

@kamilmaliszewski
Copy link

Hey @aligent-lturner I've created patch from your changes and applied it for our M2 instance. There is almost no benefit in scope of catalog-rule reindex.

We have 500k products and there is 1 686 929 records in catalog_rule_product.

Catalog Rule Product index has been rebuilt successfully in 01:16:59 with this patch
vs
Catalog Rule Product index has been rebuilt successfully in 01:18:04 without this patch...

Have u tested your changes? Have u had better results?

@aligent-lturner
Copy link
Contributor Author

Hey @aligent-lturner I've created patch from your changes and applied it for our M2 instance. There is almost no benefit in scope of catalog-rule reindex.

We have 500k products and there is 1 686 929 records in catalog_rule_product.

Catalog Rule Product index has been rebuilt successfully in 01:16:59 with this patch vs Catalog Rule Product index has been rebuilt successfully in 01:18:04 without this patch...

Have u tested your changes? Have u had better results?

Hi @kamilmaliszewski
In my limited testing, this patch actually produces a bigger benefit for instances where catalog price rules are not used (but there are a lot of products). The reason for this is that the existing code still processes products one at a time, so for n products, there are at least n database queries still. We have seen instances of the catalog price rule indexer taking upwards of 20 minutes to run, even when there are no catalog price rules at all!.

I have not done extensive performance testing of various scenarios, but I cannot imagine there is any realistic scenario where performance is not improved, even if the improvement is small.

@kamilmaliszewski
Copy link

Yeah @aligent-lturner. It helps, but profit is not a big. Unfortunately it is not solution for catalog-rule problem of Magento with big catalog and many customer groups. I was debugging this catalog-rule and it seems that for existing catalog-rule the biggest problem is method $rule->getMatchingProductIds() which needs to be processed at every reindex. When customer create a huge condition here, then this time is growing.

if I understand this logic correctly, then when the catalog-rule it empty it seems that simple solution like

protected function doReindexFull()
{
   $rules = $this->getAllRules();
   if (empty($rules) {
       return;
   }
.....
}

should solve the problem.

@aligent-lturner
Copy link
Contributor Author

Yeah @aligent-lturner. It helps, but profit is not a big. Unfortunately it is not solution for catalog-rule problem of Magento with big catalog and many customer groups. I was debugging this catalog-rule and it seems that for existing catalog-rule the biggest problem is method $rule->getMatchingProductIds() which needs to be processed at every reindex. When customer create a huge condition here, then this time is growing.

if I understand this logic correctly, then when the catalog-rule it empty it seems that simple solution like

protected function doReindexFull()
{
   $rules = $this->getAllRules();
   if (empty($rules) {
       return;
   }
.....
}

should solve the problem.

Yes, that would solve that particular issue, but would still result in very poor performance in many other similar situations. I'm not suggesting this PR is a panacea for all catalog price rule performance issues - only that it will improve performance in general.

@magento-devops-reposync-svc magento-devops-reposync-svc merged commit a01560a into magento:2.4-develop Feb 24, 2024
10 of 12 checks passed
@engcom-Charlie engcom-Charlie moved this from Merge in Progress to Recently Merged in Community Dashboard Feb 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Partner: Aligent Consulting partners-contribution Pull Request is created by Magento Partner Priority: P3 May be fixed according to the position in the backlog. Progress: accept Project: Community Picked PRs upvoted by the community Triage: Dev.Experience Issue related to Developer Experience and needs help with Triage to Confirm or Reject it
Projects
Community Dashboard
Recently Merged
Development

Successfully merging this pull request may close these issues.

Very slow catalogrule_product index even when no catalog rule exists
9 participants