-
Notifications
You must be signed in to change notification settings - Fork 24.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TEST] Fix slowness of AutoDateHistogramAggregatorTests #35072
[TEST] Fix slowness of AutoDateHistogramAggregatorTests #35072
Conversation
Randomize test assertion and test set size instead of asserting on an exhaustive list of dates with fixed test set size. Also refactor common objects used to avoid recreating them, avoid date to string conversion and reduce duplicate test code
Pinging @elastic/es-search-aggs |
@ekalgolas thanks for opening the PR looks really great at first glance but since this is a larger change it will require some more detailed attention. |
@ekalgolas there are a couple of checkstyle errors coming from AutoDateHistogramAggregatorTests (see https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+pull-request/1581/console). Could you fix those so I can re-run the tests? In general, if you run |
Sure, thanks for the info. Will upload a patch |
Randomize test assertion and test set size instead of asserting on an exhaustive list of dates with fixed test set size. Also refactor common objects used to avoid recreating them, avoid date to string conversion and reduce duplicate test code
…ests' into fix-AutoDateHistogramAggregatorTests # Conflicts: # server/src/test/java/org/elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java
Fixed now |
@elasticmachine test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ekalgolas great change, I quickly compared the runtime of the test on my local machine and where it took ~25sec on average before it is now down to 10sec. Sorry it took a bit, but the change set is quite large due to the many formatting changes, but I think I got the gist of it now.
I left a few short questions and will kick of running the tests in the meantime.
...org/elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java
Outdated
Show resolved
Hide resolved
...org/elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java
Outdated
Show resolved
Hide resolved
...org/elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java
Show resolved
Hide resolved
histogram -> { | ||
final List<? extends Histogram.Bucket> buckets = histogram.getBuckets(); | ||
final int expectedDocCount = bucketsToExpectedDocCountMap.get(randomNumberOfBuckets); | ||
final int expectedSizeCeil = (seed + expectedDoc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where did this test move to? I'm not sure but I think it has a readon to be here and I cannot find its replacement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this test as it felt kind of redundant. In this test case we first set the buckets so that data point is a key with doc count as 1, which is tested in testRandomHourInterval and testIntervalHour. Next, this sets the number of buckets to 6, a number such that keys are now 3 hour intervals, which is again tested by testRandomHourInterval. I did not see a special case being tested cases which the existing hour test cases could not test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that part of this seems to be covered by the other randomized test, the case here seems to be somehow more carfuly crafted so I assumed it might have a special purpose. Maybe @colings86 who authored the initial commit of this test can elaborate more on this? If we want to keep it we should probably add a short note about the special purpose of this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm +1 on removing this test.
...org/elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java
Outdated
Show resolved
Hide resolved
...org/elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java
Outdated
Show resolved
Hide resolved
@elasticmachine test this please |
Also I see the build has failed due to "> Task :distribution:bwc:next-bugfix-snapshot:buildBwcVersion FAILED" which might be due to #28653 |
…ests' into fix-AutoDateHistogramAggregatorTests # Conflicts: # server/src/test/java/org/elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java
Uploaded changes for renaming seed and undoing the conditional operator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @ekalgolas, thanks for the update. I think the PR looks good in general but I'm wondering if we can simplify the new randomized tests a bit by making the dataset size fixed to avoid a few of the more complicated calculations now needed in the test. I left a comment along those lines, would be nice if you could try this on one method and if this still keeps runtime down apply this to all the other methods as well.
...org/elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java
Outdated
Show resolved
Hide resolved
histogram -> { | ||
final List<? extends Histogram.Bucket> buckets = histogram.getBuckets(); | ||
final int expectedDocCount = bucketsToExpectedDocCountMap.get(randomNumberOfBuckets); | ||
final int expectedSizeCeil = (seed + expectedDoc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that part of this seems to be covered by the other randomized test, the case here seems to be somehow more carfuly crafted so I assumed it might have a special purpose. Maybe @colings86 who authored the initial commit of this test can elaborate more on this? If we want to keep it we should probably add a short note about the special purpose of this test.
...org/elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this code vs master; cut down test run time from 30-40s to about 10s (local machine, small sample size, etc). Nice work! There's probably some further work we can do to reduce times, but we can tweak the randomization parameters to find a balance between test coverage and execution time.
histogram -> { | ||
final List<? extends Histogram.Bucket> buckets = histogram.getBuckets(); | ||
final int expectedDocCount = bucketsToExpectedDocCountMap.get(randomNumberOfBuckets); | ||
final int expectedSizeCeil = (seed + expectedDoc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm +1 on removing this test.
…ests' into fix-AutoDateHistogramAggregatorTests # Conflicts: # server/src/test/java/org/elasticsearch/search/aggregations/bucket/histogram/AutoDateHistogramAggregatorTests.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ekalgolas Thanks for the last update, LGTM now. I will kick of the CI tests and merge after we get a successful run.
@elasticmachine test this please |
@ekalgolas there were some unused imports creating issues with our "checkstyle" tests. I removed them with my last commit. I hope you don't mind me pushing to your branch to speed up getting a clean CI built. |
@elasticmachine test this please |
@ekalgolas thanks a lot for working on this, I have merged to master and will also backport to 6.x. Great speed improvement for this test indeed! |
Randomize test assertion and test set size instead of asserting on an exhaustive list of dates with fixed test set size. Also refactor common objects used to avoid recreating them, avoid date to string conversion and reduce duplicate test code Closes #33181
Thanks for removing the unused imports @cbuescher |
@ekalgolas thank you for your contribution here, much appreciated! |
Fixes #33181
Randomize test assertion and test set size instead of asserting on an
exhaustive list of dates with fixed test set size.
Also refactor common objects used to avoid recreating them, avoid date
to string conversion and reduce duplicate test code