Implemented pausing mechanisms into cli/finder_indexer.php. #13502

Merged
merged 21 commits into from Sep 17, 2019

@frankmayer
Contributor

commented Jan 7, 2017

New issue:
When indexing thousands of articles or K2 items, each with a lot of distinct text, the server might become unresponsive, mostly because of heavy processing and I/O wait.

Summary of Changes

This patch implements a pausing mechanism that waits for a fixed or dynamically adjusted amount of time between batches, giving the server a little time to catch up.
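
The idea can be sketched as a loop that times each batch and then sleeps before starting the next one. This is a minimal illustration in Python, not the PHP code from the patch: `index_in_batches`, `process_batch`, and `pause_for` are hypothetical names, and the sleep function is injectable so the sketch can be exercised without actually waiting. For simplicity it also pauses after the final batch.

```python
import time
from typing import Callable, Iterable, List


def index_in_batches(
    batches: Iterable[list],
    process_batch: Callable[[list], None],
    pause_for: Callable[[float], float],
    sleep: Callable[[float], None] = time.sleep,
) -> List[float]:
    """Process batches one by one, pausing after each so the server can catch up.

    pause_for maps the running time of the finished batch (in seconds)
    to the pause length (in seconds). Returns the pauses that were chosen.
    """
    pauses = []
    for batch in batches:
        started = time.monotonic()
        process_batch(batch)
        elapsed = time.monotonic() - started

        pause = pause_for(elapsed)
        if pause > 0:
            # Skip the sleep call entirely when no pause is wanted.
            sleep(pause)
        pauses.append(pause)
    return pauses
```

Passing a constant function such as `lambda elapsed: 3.0` as `pause_for` gives a fixed pause, while a function that divides `elapsed` by some divisor gives a pause that grows with the cost of the batch.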

Testing Instructions

Note:
For all of the tests, you may interrupt processing at any point once the findings are confirmed. There is no need to run them to completion on very large data sets.

  1. Do not apply the patch yet. Run the command (cli/finder_indexer.php) with the --purge parameter to also clean up the database.

Note: With a very large number of articles containing a lot of distinct text, you may or may not notice the machine (server) becoming unresponsive.

Even if you don't notice that, you will notice that the Apache/PHP and MySQL processes each constantly consume a full CPU core. Depending on the cores available (and on MySQL's core-usage settings), one fully used core translates to 50% on a dual-core, 25% on a quad-core, or 12.5% on an octa-core processor. While this alone might not be an issue, it does add to the problem at hand.

You also may or may not notice (depending on the storage subsystem: HDD, SSD, or hybrid) a lot of I/O wait during the task. It gets worse if the task runs for a very long time. I have seen this result in an unresponsive site during indexing, where unresponsive can mean that a response takes anywhere from 5-10 seconds to a lot more, or times out completely.

  2. Now apply the patch and run the same command again (with the --purge argument).
    You should notice a pause being inserted after each batch, its length depending on how long the finished batch took to process. The pause length is calculated as round(batch running time / divisor). The default divisor is 5.

You should also notice that the site/admin interface is now a lot more responsive, as the machine spends less time in I/O wait and the CPU cores are no longer constantly fully loaded.

  3. Now run the same command again (with the --purge argument), and add --pause=division --divisor=3
    Notice that the pauses are correctly calculated using the new divisor.

  4. Now run the same command again (with the --purge argument), and replace the --pause=division --divisor=3 with --pause=3
    Notice that the pauses are always three seconds long.
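
The two pause modes exercised in these steps can be summarized in a few lines. The sketch below is in Python rather than the script's PHP, and it assumes details the description does not spell out: that the result is rounded to whole seconds, that halves round up, and that a divisor below 1 is rejected. `pause_seconds` is a hypothetical helper, not a function from the patch.

```python
import math

DEFAULT_DIVISOR = 5  # default divisor stated in the PR description


def pause_seconds(pause: str, batch_time: float, divisor: float = DEFAULT_DIVISOR) -> int:
    """Return the pause length in whole seconds for one finished batch.

    pause is the value given to --pause: either "division" or a fixed
    number of seconds such as "3".
    """
    if pause == "division":
        if divisor < 1:
            # Assumption: small or zero divisors are treated as invalid.
            raise ValueError("divisor must be at least 1")
        # round(batch running time / divisor), rounding halves up (assumed)
        return math.floor(batch_time / divisor + 0.5)
    # Fixed mode: the same pause after every batch (assumed whole seconds)
    return int(pause)


print(pause_seconds("division", 12.0))             # 12 / 5 = 2.4 -> 2
print(pause_seconds("division", 12.0, divisor=3))  # 12 / 3 = 4.0 -> 4
print(pause_seconds("3", 12.0))                    # fixed pause  -> 3
```

With the default divisor, a batch that took 12 seconds is followed by a 2-second pause; with --divisor=3 the same batch is followed by a 4-second pause, which is the kind of difference step 3 asks the tester to confirm.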

Documentation Changes Required

Yes. The new arguments should be included in the documentation here: https://docs.joomla.org/Setting_up_automatic_Smart_Search_indexing

The reason is that, with thousands of articles or K2 items, the server would often become unresponsive because of heavy processing and I/O wait on the MySQL server.
This patch implements a pausing mechanism that pauses for a defined or dynamically adjusted amount of time between batches, giving the server a little time to catch up.
- type safe comparison of empty string
@frankmayer frankmayer changed the title Implemented pausing mechanisms into cli/finder_indexer.php. [Smart Search] Implemented pausing mechanisms into cli/finder_indexer.php. Jun 11, 2017
@frankmayer frankmayer requested a review from brianteeman as a code owner May 17, 2018
@Quy

Contributor

commented May 18, 2018

I have tested this item successfully on b7134e2


This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/13502.

@Hackwar

Member

commented Jul 19, 2019

Thank you for this feature. I do consider it a new feature and not just a bugfix, and thus I would recommend adding this to 4.0 instead of 3.x. I would actually be interested to know whether this issue persists with the new DB structure in 4.0.

And yes, some things take a VERY long time. sigh

@frankmayer

Contributor Author

commented Jul 19, 2019

@Hackwar Yes, it can be seen both ways. But it fixes a problem that is happening on the 3.x version. And since people will be on that version for some time to come, we might as well provide a better experience.

As for 4.0, I have not tested, and unfortunately I didn't have any time to check out all the great stuff you people have done with the new version. But if we have this on 3.x it shouldn't be too much of an effort, if needed, to port it to 4.0.

# Conflicts:
#	cli/finder_indexer.php
@frankmayer

Contributor Author

commented Jul 19, 2019

@franz-wohlkoenig resolved conflicts.
@SharkyKZ want to give it a try?
Thanks

frankmayer added 3 commits Jul 20, 2019
- Set correct default for class property.
…run through the pausing part, if we don't want to pause.
- Changes in script documentation
frankmayer added 2 commits Jul 22, 2019
@SharkyKZ

Contributor

commented Jul 22, 2019

I have tested this item successfully on 9b4cd19



@franz-wohlkoenig

Member

commented Jul 22, 2019

@Quy can you please retest?

@Quy

Contributor

commented Jul 22, 2019

I have tested this item successfully on 9b4cd19



@Quy

Contributor

commented Jul 22, 2019

RTC



@joomla-cms-bot joomla-cms-bot added the RTC label Jul 22, 2019
@HLeithner HLeithner merged commit fb2b123 into joomla:staging Sep 17, 2019
5 checks passed
Hound: No violations found. Woof!
JTracker/HumanTestResults: Human Test Results: 2 Successful 0 Failed.
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/drone/pr: Build is passing
continuous-integration/travis-ci/pr: The Travis CI build passed
@joomla-cms-bot joomla-cms-bot removed the RTC label Sep 17, 2019
@HLeithner

Member

commented Sep 17, 2019

Thanks for this PR. The reason I'm merging this into the 3.9 series is to protect servers from overloading.

@HLeithner HLeithner added this to the Joomla! 3.9.12 milestone Sep 17, 2019
9 participants