STORM-2525: Fix flaky integration tests #2127
Conversation
```diff
@@ -32,18 +32,19 @@ function list_storm_processes() {
 list_storm_processes || true
 # increasing swap space so we can run lots of workers
-sudo dd if=/dev/zero of=/swapfile.img bs=8192 count=1M
+sudo dd if=/dev/zero of=/swapfile.img bs=4096 count=1M
```
The Ubuntu image has a 10 GB disk, so an 8 GB swapfile doesn't fit. I didn't see swap usage exceed a few hundred megs anyway, so I lowered this a bit.
```xml
  <scope>test</scope>
</dependency>
<dependency>
  <groupId>org.mockito</groupId>
  <artifactId>mockito-all</artifactId>
```
I'd like us to transition to using `mockito-core` instead of this. The `mockito-all` artifact is no longer published, and it can cause issues when Hamcrest libraries are also on the classpath.
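For reference, a sketch of what the replacement dependency might look like (the version element is omitted on the assumption it would be managed elsewhere in the build; this snippet is not from the PR itself):

```xml
<!-- Hypothetical replacement for mockito-all; version assumed to be
     managed in a parent pom or dependencyManagement section -->
<dependency>
  <groupId>org.mockito</groupId>
  <artifactId>mockito-core</artifactId>
  <scope>test</scope>
</dependency>
```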
+1 I am not an expert on the windowing functions, so I would prefer if someone who is more familiar with them also took a look. All the changes made sense to me.
Well, no one else has looked at it, and it does cut 30 minutes off of a build, so I am going to check it in; if we run into issues in the future we can fix them then.
Sounds good, thanks for the review.
https://issues.apache.org/jira/browse/STORM-2525
The test previously limited how many windows it verified, computed as `final int numberOfWindows = allBoltLog.size() - windowSec / slideSec;`, but it seems pretty reasonable to me to just check all the windows instead. I've run the integration tests 5 times with no errors, so it seems stable now.
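To make the old formula concrete, here is a small sketch; the window length, slide interval, and log size are assumed values for illustration and do not come from the actual test:

```java
// Hypothetical values illustrating the old windows-to-check formula;
// none of these numbers are taken from the integration test itself.
public class WindowCountExample {
    public static void main(String[] args) {
        int windowSec = 30;     // assumed window length in seconds
        int slideSec = 10;      // assumed slide interval in seconds
        int loggedWindows = 20; // stands in for allBoltLog.size()

        // The old formula skipped the first windowSec / slideSec = 3
        // (partially filled) windows instead of checking all of them.
        int numberOfWindows = loggedWindows - windowSec / slideSec;
        System.out.println(numberOfWindows); // prints 17
    }
}
```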
There is another issue semi-related to this that I'd like some input on. I'll raise another issue if this behavior seems broken to anyone else:
The windowing docs claim that windowing offers at least once processing. This is true for configurations using tuple timestamps (i.e. those ensuring that
- storm/storm-client/src/jvm/org/apache/storm/topology/IWindowedBolt.java, line 50 (commit 64e29f3)
- storm/storm-client/src/jvm/org/apache/storm/windowing/WindowManager.java, line 50 (commit a4afacd)
- storm/storm-client/src/jvm/org/apache/storm/windowing/WindowManager.java, line 215 (commit a4afacd)
When tuple timestamps are used, tuples will only expire when the watermark moves, and when the watermark moves, the trigger policy ensures all stored tuples get passed to the user's bolt in an appropriate window. When tuple timestamps, and thus watermarks, are disabled, tuples may get expired during compaction regardless of whether the user's bolt has seen those tuples yet.
I'm not sure if we should just put a note in the windowing docs about this, or if it is better to remove the compaction mechanism. I'm leaning towards removing it: I don't think it's actually saving much memory, since evicted tuples remain stored in a list until the user's bolt is passed the next window (at which point the expiration code would run again and the tuples would be expired anyway), and the behavior is not easy to understand without reading the source. For example, you can only get a count window larger than 100 if you use tuple timestamps, since the manager will expire tuples once the window reaches that size when watermarking is disabled.
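The eager-eviction behavior described above can be sketched with a simplified model. This is not Storm's actual WindowManager code; the threshold of 100 matches the figure mentioned above, and the buffer is just a plain deque standing in for the tuple queue:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Simplified model (NOT Storm's actual WindowManager) of how compaction
// can expire tuples before any trigger hands them to the user's bolt
// when watermarking is disabled.
public class CompactionSketch {
    // Assumed to correspond to the 100-tuple figure discussed above.
    static final int COMPACTION_THRESHOLD = 100;

    public static void main(String[] args) {
        Deque<Integer> buffer = new ArrayDeque<>();
        for (int i = 0; i < 150; i++) {
            buffer.addLast(i);
            // With watermarks disabled, compaction may evict eagerly,
            // before the trigger policy has fired for these tuples.
            while (buffer.size() > COMPACTION_THRESHOLD) {
                buffer.removeFirst(); // expired, never seen by the bolt
            }
        }
        // A count window of 150 can never fill: only 100 tuples survive.
        System.out.println(buffer.size()); // prints 100
    }
}
```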