Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

STORM-3121: Fix flaky metrics tests in storm-core #2735

Merged
merged 1 commit into from Jun 26, 2018

Conversation

srdo
Copy link
Contributor

@srdo srdo commented Jun 24, 2018

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks great and I ran Travis build 5 times and all succeed, though the test before the patch is also failing rarely so it may not be right approach to verify.

Just would like to confirm my understanding of the issue before giving +1.

(try
(do
(wait-for-atleast-N-buckets! min-buckets comp-id metric-name cluster)
(.until
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So as far as I understand the description of issue, the metric value may not be added to exactly N bucket though we simulate the time, so we need to wait more to check if condition is met, and though we are now having another waiting logic, we still have to call wait-for-atleast-N-buckets! because of advancing simulated time, correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The part about the metric value not being added to exactly bucket N is right. The reason we call wait-for-atleast-N-buckets is to preserve the existing test logic as much as possible. For example, one of the replaced assert-buckets were asserting that a metric had buckets [1 0 0 0 0 0]. The new assert would just assert that the running sum is 1. By also waiting for 6 buckets, we are able to assert that the running sum is 1 within the first 6 buckets, rather than potentially stopping the check as soon as the first bucket is available.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the details. Now I can understand what the patch is addressing clearer.

@srdo
Copy link
Contributor Author

srdo commented Jun 24, 2018

I saw failures in the metrics tests very often when adding Java 10 to the testing matrix, e.g. https://travis-ci.org/srdo/storm/builds/395663084. Maybe Travis runs the tests on the same machine, and the extra load happens to trigger the failure? I've also been unable to trigger the error locally or on master.

@HeartSaVioR
Copy link
Contributor

@srdo Yeah... strange that it fails on JDK 8 instead of JDK 10, which happened rarely. We would be better to fix it anyway.

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 LGTM

Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

<dependency>
<groupId>org.awaitility</groupId>
<artifactId>awaitility</artifactId>
</dependency>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: like in the other pull request, it would be nice to have this marked as test here too, just so it is obvious.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The scope is inherited from dependencyManagement. Do you want me to add it anyway?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we are consistent regarding this, but IMHO explicitly marking it looks better for me, though we know the scope is inherited. I'm OK to not adding here if we are already not consistent on this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added scope

Copy link
Contributor

@HeartSaVioR HeartSaVioR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 again

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants