New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[BEAM-2606] make WindowFnTestUtils use the value in addition to the timestamp of the elements #3592
[BEAM-2606] make WindowFnTestUtils use the value in addition to the timestamp of the elements #3592
Conversation
retest this please |
Changes Unknown when pulling 3f02004 on echauchot:use_timestampeValue_WindowFnTestUtils into ** on apache:master**. |
@kennknowles now that PR #3286 is merged, do you think it's useful to add tests for each public method |
+R: @bjchambers |
timestampedValues.add(TimestampedValue.of((T) null, new Instant(timestamp))); | ||
} | ||
return runWindowFnWithValue(windowFn, timestampedValues); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: unnecessary blank line
@@ -67,14 +68,30 @@ | |||
public static <T, W extends BoundedWindow> Map<W, Set<String>> runWindowFn( | |||
WindowFn<T, W> windowFn, | |||
List<Long> timestamps) throws Exception { | |||
// lift List<Timestamp> into List<TimestampedValue> to factorize implementation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: comment is probably unnecessary -- the code is relatively straightforward
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
// lift List<Timestamp> into List<TimestampedValue> to factorize implementation | ||
List<TimestampedValue<T>> timestampedValues = new ArrayList<>(); | ||
for (Long timestamp : timestamps){ | ||
timestampedValues.add(TimestampedValue.of((T) null, new Instant(timestamp))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this should take the value rather than using null
? Alternatively, it could take an Iterable values and pair them up. Then it could use Iterables.cycle((T) null)
as a default behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand. There is already a runWindowFn()
version that takes the values, it is runWindowFnWithValue()
. Similarly, I did 2 versions of each public method (one that takes only the timestamps and one that takes also the value with a TimestampedValue
). Indeed, I wanted to keep the existing API and enhance it with a version that also relies on values of elements for window assignment. The *withValue
version is called with null
by the non-value version just to factorize the code.
public static <T, W extends BoundedWindow> Collection<W> assignedWindows( | ||
WindowFn<T, W> windowFn, long timestamp) throws Exception { | ||
return windowFn.assignWindows(new TestAssignContext<T, W>(new Instant(timestamp), windowFn)); | ||
// lift Timestamp into TimestampedValue to factorize implementation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this comment is probably unnecessary too
@@ -197,9 +229,21 @@ public void merge(Collection<W> otherWindows, W window) { | |||
*/ | |||
public static <T, W extends BoundedWindow> void validateNonInterferingOutputTimes( | |||
WindowFn<T, W> windowFn, long timestamp) throws Exception { | |||
Collection<W> windows = WindowFnTestUtils.<T, W>assignedWindows(windowFn, timestamp); | |||
// lift Timestamp into TimestampedValue to factorize implementation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: ditto
*/ | ||
public static <T, W extends BoundedWindow> void validateNonInterferingOutputTimesWithValue( | ||
WindowFn<T, W> windowFn, TimestampedValue<T> timestampedValue) throws Exception { | ||
Collection<W> windows = assignedWindowsWithValue(windowFn, timestampedValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For methods like this where the value isn't used, I think it would be better to go the other direction -- move the logic into a single method that takes an Instant timestamp, and then call that using timestampedValue.getTimestamp()
.
It seems odd to have to create a TimestampedValue
using null
only to discard it and go back to just the timestamp.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this method the value is actually used in assignedWindowsWithValue()
which creates a TestAssignContext
initialized with a proper timestampedValue
that has non-null value. It is there to allow assigning elements to windows based on values in addition to timestamp (see WindowTest#testMergingCustomWindows
). If I move the code to a method that only takes the timestamp, I'm unable to assign windows based on values anymore. Of course, there is still the version that only takes the timestamp. In order to have for each public method the 2 versions, do you see a better way to factorize than generalizing timestampedValue
and pass it null
for the timestamp-only version?
@@ -269,15 +333,46 @@ void validateGetOutputTimestamps( | |||
TimestampCombiner timestampCombiner, | |||
List<List<Long>> timestampsPerWindow) throws Exception { | |||
|
|||
// lift List<List<Timestamp>> into List<List<TimestampedValue>> to factorize implementation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: same here
|
||
/** | ||
* Runs the {@link WindowFn} over the provided input, returning a map | ||
* of windows to the timestamps in those windows. This version allows to pass a list of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated, but odd, and maybe worth addressing. This says it is a map from windows to the timestamps in those windows, but it is a Set<String>
. Should this be Set<Instant>
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, this comes from the original API that transforms the timestamp into a String
see
private static String timestampValue(long timestamp) {
return "T" + new Instant(timestamp);
}
Once again, I tried to not alter the public test API. Maybe using String
instead of Instant
was chosen to avoid creating many instances of Instant
in the tests. See for example FixedWindowsTest
or other tests, they use long primitive type and use WindowFnTestUtils.set()
which uses WindowFnTestUtils.timestampValue()
to convert to String
.
@@ -67,14 +68,30 @@ | |||
public static <T, W extends BoundedWindow> Map<W, Set<String>> runWindowFn( | |||
WindowFn<T, W> windowFn, | |||
List<Long> timestamps) throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than using List<Long>
to represent timestamps, we should probably use List<Instant>
so the types are more explicit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I totally agree, more explicit indeed. I did not change it to keep the same test API. I guess it was originally chosen to use long
instead of Instant
to avoid creating a lot of Instant
instances in the tests and also make the test code more concise. But I can change it to Instant
and update all the tests that use WindowFnTestUtils
if you think it's better
Thanks Ben for your review |
…imestamp of the elements
3f02004
to
396977c
Compare
@bjchambers, I only rebased on master and cleaned unneeded comments. PTAL at my responses to your comments. |
@bjchambers can you please take a look at my answers to your comments? |
Merging |
Thank you guys :) |
Follow this checklist to help us incorporate your contribution quickly and easily:
[BEAM-XXX] Fixes bug in ApproximateQuantiles
, where you replaceBEAM-XXX
with the appropriate JIRA issue.mvn clean verify
to make sure basic checks pass. A more thorough check will be performed on your pull request automatically.R: @kennknowles
Besides, I saw that the current
WindowFnTestUtils
has no test. I guess that is because all windows useWindowFnTestUtils
to test themselves. So I did not add any test. That is said, the methods added in that PR are actually called by the current windows tests but always with anull
value inTimestampedValue
. I was thinking I could wait for that PR #3286 to be merged. Indeed it defines aCustomWindowFn
inWindowTest
that relies on values for window assignment. I could then add a test of thisCustomWindowFn
with the newWindowFnTestUtils
.@kennknowles WDYT?