-
Notifications
You must be signed in to change notification settings - Fork 219
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deterministic ordering in window tests #10205
Deterministic ordering in window tests #10205
Conversation
Fixes NVIDIA#10195. This is similar to the fix in NVIDIA#10143. This commit changes the test datagens used in the window function tests such that the order-by columns produce deterministic ordering. When the ordering is ambiguous, it can produce unexpected results from window functions, if the `order-by` spec includes the ambiguous columns. Signed-off-by: MithunR <mythrocks@gmail.com>
Build |
@@ -57,17 +57,17 @@ | |||
_grpkey_longs_with_decimals = [ | |||
('a', RepeatSeqGen(LongGen(nullable=False), length=20)), | |||
('b', DecimalGen(precision=18, scale=3, nullable=False)), | |||
('c', DecimalGen(precision=18, scale=3))] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Previously the key was a decimal but now it's a long, so are we losing test coverage with this change or are we not worried about that here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make a UniqueIntegerGen and a UniqueDecimalGen too then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jlowe, I did consider that lost coverage. There's a couple of things working in favour of this change:
- Note that in the tests that use these gens, the query orders by both
b
andc
, both beingDecimalGen
. We're only changingc
, so the test still orders byDECIMAL
. I don't think we're losing coverage. - The focus of the
negative_rows
test is on whether row offsets< 0
work. It isn't really affected much by the order-by column, as with Fixtest_window_aggs_for_batched_finite_row_windows_partitioned
fail #10143.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@revans2, we could consider introducing a UniqueDecimalGen
, but I don't think that's strictly necessary for this test.
I'm open to adding it, if you think it's wise.
Fixes #10195.
This is similar to the fix in #10143. This commit changes the test datagens used in the window function tests such that the order-by columns produce deterministic ordering.
When the ordering is ambiguous, it can produce unexpected results from window functions, if the
order-by
spec includes the ambiguous columns.