Deterministic ordering in window tests #10205

Merged
merged 1 commit into from
Jan 18, 2024

mythrocks
Collaborator

Fixes #10195.

This is similar to the fix in #10143. This commit changes the test datagens used in the window function tests such that the order-by columns produce deterministic ordering.

When the ordering is ambiguous (for example, when rows tie on the order-by columns), a window function whose order-by spec includes those columns can produce unexpected results.
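
To illustrate the failure mode described above, here is a minimal pure-Python sketch (not the project's test code). It assumes a row-based frame of "1 preceding to current row" and simulates an engine that breaks ties arbitrarily by permuting the input before a stable sort. The names `running_sum_rows` and `distinct_results` are made up for this illustration.

```python
from itertools import permutations

def running_sum_rows(rows, order_key):
    """Sum of `v` over a frame of (1 PRECEDING, CURRENT ROW), keyed by row id."""
    ordered = sorted(rows, key=order_key)  # stable sort: ties keep input order
    out = {}
    for i, row in enumerate(ordered):
        prev = ordered[i - 1]["v"] if i > 0 else 0
        out[row["id"]] = prev + row["v"]
    return out

def distinct_results(rows, order_key):
    """Collect every result reachable by breaking ties in a different way."""
    results = set()
    for perm in permutations(rows):
        res = running_sum_rows(list(perm), order_key)
        results.add(tuple(sorted(res.items())))
    return results

# Two rows share k == 1, so their relative order is ambiguous.
tied = [
    {"id": 0, "k": 1, "v": 10},
    {"id": 1, "k": 1, "v": 20},
    {"id": 2, "k": 2, "v": 30},
]
# Every k is distinct, so there is exactly one valid ordering.
unique = [
    {"id": 0, "k": 1, "v": 10},
    {"id": 1, "k": 2, "v": 20},
    {"id": 2, "k": 3, "v": 30},
]

ambiguous = distinct_results(tied, lambda r: r["k"])
deterministic = distinct_results(unique, lambda r: r["k"])
```

With tied keys, more than one result is valid, so two correct implementations can disagree. With unique keys only one result is possible, which is the property the PR description relies on.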

Signed-off-by: MithunR <mythrocks@gmail.com>
@mythrocks mythrocks self-assigned this Jan 16, 2024
@mythrocks mythrocks added the test Only impacts tests label Jan 16, 2024
@mythrocks
Collaborator Author

Build

@@ -57,17 +57,17 @@
_grpkey_longs_with_decimals = [
    ('a', RepeatSeqGen(LongGen(nullable=False), length=20)),
    ('b', DecimalGen(precision=18, scale=3, nullable=False)),
    ('c', DecimalGen(precision=18, scale=3))]
Member

Previously the key was a decimal but now it's a long, so are we losing test coverage with this change or are we not worried about that here?

Collaborator

Should we make a UniqueIntegerGen and a UniqueDecimalGen too then?

Collaborator Author

@mythrocks mythrocks Jan 17, 2024

@jlowe, I did consider that lost coverage. There are a couple of things working in favour of this change:

  1. In the tests that use these gens, the query orders by both b and c, both of which were DecimalGen. We're only changing c, so the test still orders by a DECIMAL column (b). I don't think we're losing coverage.
  2. The focus of the negative_rows test is on whether row offsets < 0 work. It isn't affected much by the type of the order-by column, as was the case with #10143 (Fix test_window_aggs_for_batched_finite_row_windows_partitioned fail).
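
The first point can be sketched in plain Python. This assumes, as the review comments suggest, that c becomes a column of unique values while b may still contain ties. The rows and the `orderings` helper are made up for illustration and are not taken from the test suite.

```python
from decimal import Decimal
from itertools import permutations

# (b, c) pairs: b is a decimal with a tie, c is assumed to be unique.
rows = [
    (Decimal("1.500"), 3),
    (Decimal("1.500"), 1),
    (Decimal("2.250"), 2),
]

def orderings(rows, key):
    """Every ordering a stable sort can produce over all input permutations."""
    return {tuple(sorted(p, key=key)) for p in permutations(rows)}

by_b_only = orderings(rows, key=lambda r: r[0])
by_b_then_c = orderings(rows, key=lambda r: (r[0], r[1]))
```

Ordering by b alone leaves the tied rows in either order. Adding the unique c as a secondary key gives a single total order, while the leading sort column is still a decimal.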

Collaborator Author

@mythrocks mythrocks Jan 17, 2024

@revans2, we could consider introducing a UniqueDecimalGen, but I don't think that's strictly necessary for this test.

I'm open to adding it, if you think it's wise.
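
For reference, a purely hypothetical sketch of what a unique decimal generator might do. This is not the spark-rapids datagen API: `unique_decimals` and its parameters are assumptions for illustration, built only on the Python standard library.

```python
import random
from decimal import Decimal

def unique_decimals(count, precision=18, scale=3, seed=0):
    """Hypothetical helper: return `count` distinct Decimals at a fixed scale."""
    limit = 10 ** precision  # unscaled values lie strictly inside (-limit, limit)
    if count > 2 * limit - 1:
        raise ValueError("not enough distinct values for this precision")
    rng = random.Random(seed)  # seeded so the sequence is reproducible
    seen = set()
    values = []
    while len(values) < count:
        unscaled = rng.randrange(-limit + 1, limit)
        if unscaled in seen:
            continue  # reject duplicates so every value is unique
        seen.add(unscaled)
        sign = 1 if unscaled < 0 else 0
        digits = tuple(int(d) for d in str(abs(unscaled)))
        # Build from (sign, digits, exponent) to avoid context rounding.
        values.append(Decimal((sign, digits, -scale)))
    return values

sample = unique_decimals(1000)
```

Because every value shares the same scale, distinct unscaled integers map to distinct decimals, so ordering by such a column would be unambiguous.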

@mythrocks mythrocks merged commit a168f6e into NVIDIA:branch-24.02 Jan 18, 2024
41 checks passed
@mythrocks mythrocks deleted the wintest-negative-rows branch January 18, 2024 18:16
Development

Successfully merging this pull request may close these issues.

[BUG] test_window_aggs_for_negative_rows_partitioned failure in CI
3 participants