[BUG] test_range_running_window_float_decimal_sum_runs_batched fails intermittently #10378

Closed
mythrocks opened this issue Feb 5, 2024 · 4 comments
Assignees
Labels
bug Something isn't working

Comments

@mythrocks
Collaborator

On certain CI runs, one sees failures in the window function test named test_range_running_window_float_decimal_sum_runs_batched:

[2024-02-04T23:11:11.547Z] FAILED ../../src/main/python/window_function_test.py::test_range_running_window_float_decimal_sum_runs_batched[1000][DATAGEN_SEED=1707064878, INJECT_OOM, IGNORE_ORDER({'local': True}), APPROXIMATE_FLOAT]

The reported diffs are as follows:

-Row(... double_sum=4.797632847838746e-69...)
+Row(... double_sum=4.797632847838745e-69...)
...
-Row(... double_sum=3.1738776725114095e+88...)
+Row(... double_sum=3.173877672511409e+88...)

This is a little strange, given that the test is marked as @approximate_float. So far, it has also proven impossible to reproduce locally.
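To illustrate why these diffs look like they should pass an approximate comparison, here is a minimal sketch that measures the relative difference of the two pairs quoted above. The tolerance `REL_TOL` and the helper `relative_diff` are hypothetical and chosen only for illustration; the actual tolerance applied by `@approximate_float` in the test framework may differ.

```python
import math

# Pairs copied from the diffs reported above, in the order they appear (-, +).
pairs = [
    (4.797632847838746e-69, 4.797632847838745e-69),
    (3.1738776725114095e+88, 3.173877672511409e+88),
]

# Hypothetical tolerance, for illustration only; not taken from the test suite.
REL_TOL = 1e-9


def relative_diff(a: float, b: float) -> float:
    """Return |a - b| scaled by the larger of the two magnitudes."""
    return abs(a - b) / max(abs(a), abs(b))


for first, second in pairs:
    close = math.isclose(first, second, rel_tol=REL_TOL)
    print(f"{first!r} vs {second!r}: "
          f"rel diff = {relative_diff(first, second):.3e}, close = {close}")
```

Both pairs differ only in the last printed digit, so any reasonable relative tolerance would accept them. That suggests the failure comes from some other row in the output.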

@mythrocks mythrocks added bug Something isn't working ? - Needs Triage Need team to review and classify labels Feb 5, 2024
@mythrocks mythrocks self-assigned this Feb 5, 2024
@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Feb 6, 2024
@mythrocks
Collaborator Author

Here is a sampling of the most egregious diffs in the output:

-Row(p=None, oby=None, short_double_sum=None, double_sum=3.320125371694111e-34, short_float_sum=None, float_sum=1.425792694093375e+33, dec_sum=None)
+Row(p=None, oby=None, short_double_sum=None, double_sum=3.3201253716941104e-34, short_float_sum=None, float_sum=1.425792694093375e+33, dec_sum=None) 
-Row(p=None, oby=None, short_double_sum=None, double_sum=1.423710382667759e+126, short_float_sum=None, float_sum=3.2242434807184336e-06, dec_sum=None) 
+Row(p=None, oby=None, short_double_sum=None, double_sum=1.423710382667759e+126, short_float_sum=None, float_sum=3.224243480718433e-06, dec_sum=None)
-Row(p=None, oby=None, short_double_sum=None, double_sum=1.9011342475589104e+267, short_float_sum=None, float_sum=1.3461754566575804e+19, dec_sum=None)
+Row(p=None, oby=None, short_double_sum=None, double_sum=1.9011342475589104e+267, short_float_sum=None, float_sum=1.3461754566575806e+19, dec_sum=None)
-Row(p=None, oby=None, short_double_sum=None, double_sum=3.536650741791709e+271, short_float_sum=None, float_sum=1.5652249836245965e+32, dec_sum=None)
+Row(p=None, oby=None, short_double_sum=None, double_sum=3.536650741791709e+271, short_float_sum=None, float_sum=1.5652249836245963e+32, dec_sum=None)
-Row(p=None, oby=-32768, short_double_sum=-360448.0, double_sum=4.977188684129671e-210, short_float_sum=-360448.0, float_sum=1.2578720332658084e+29, dec_sum=Decimal('-360448.0')) 
+Row(p=None, oby=-32768, short_double_sum=-360448.0, double_sum=4.977188684129671e-210, short_float_sum=-360448.0, float_sum=1.2578720332658082e+29, dec_sum=Decimal('-360448.0'))
-Row(p=None, oby=-32436, short_double_sum=-556519.0, double_sum=1.5976044957509647e-78, short_float_sum=-556519.0, float_sum=1857.6042175782272, dec_sum=Decimal('-556519.0'))
+Row(p=None, oby=-32436, short_double_sum=-556519.0, double_sum=1.5976044957509644e-78, short_float_sum=-556519.0, float_sum=1857.6042175782272, dec_sum=Decimal('-556519.0'))
-Row(p=None, oby=-30594, short_double_sum=-964181.0, double_sum=1.4251673808780018e+16, short_float_sum=-964181.0, float_sum=1.6525291491337176e-27, dec_sum=Decimal('-964181.0'))
+Row(p=None, oby=-30594, short_double_sum=-964181.0, double_sum=1.4251673808780016e+16, short_float_sum=-964181.0, float_sum=1.6525291491337176e-27, dec_sum=Decimal('-964181.0'))
-Row(p=None, oby=-30508, short_double_sum=-994689.0, double_sum=2.3460384328273062e+165, short_float_sum=-994689.0, float_sum=2.6449957901576276e-12, dec_sum=Decimal('-994689.0'))
+Row(p=None, oby=-30508, short_double_sum=-994689.0, double_sum=2.3460384328273062e+165, short_float_sum=-994689.0, float_sum=2.6449957901576272e-12, dec_sum=Decimal('-994689.0'))
-Row(p=None, oby=-28407, short_double_sum=-1997254.0, double_sum=1.575301213703218e+61, short_float_sum=-1997254.0, float_sum=3.940248080377917e+32, dec_sum=Decimal('-1997254.0'))
+Row(p=None, oby=-28407, short_double_sum=-1997254.0, double_sum=1.5753012137032178e+61, short_float_sum=-1997254.0, float_sum=3.940248080377916e+32, dec_sum=Decimal('-1997254.0'))
-Row(p=None, oby=-27989, short_double_sum=-2166308.0, double_sum=5.7222003320949304e-111, short_float_sum=-2166308.0, float_sum=1.8995236069283106e+32, dec_sum=Decimal('-2166308.0'))
+Row(p=None, oby=-27989, short_double_sum=-2166308.0, double_sum=5.72220033209493e-111, short_float_sum=-2166308.0, float_sum=1.8995236069283103e+32, dec_sum=Decimal('-2166308.0'))
-Row(p=None, oby=-22422, short_double_sum=-3762895.0, double_sum=1.3124717977250846e-206, short_float_sum=-3762895.0, float_sum=5.110702103270278e-35, dec_sum=Decimal('-3762895.0'))
+Row(p=None, oby=-22422, short_double_sum=-3762895.0, double_sum=1.3124717977250843e-206, short_float_sum=-3762895.0, float_sum=5.110702103270278e-35, dec_sum=Decimal('-3762895.0'))
-Row(p=None, oby=-19922, short_double_sum=-4435771.0, double_sum=2.9331653242274835e-241, short_float_sum=-4435771.0, float_sum=1.5316559583604917e+27, dec_sum=Decimal('-4435771.0'))
+Row(p=None, oby=-19922, short_double_sum=-4435771.0, double_sum=2.9331653242274832e-241, short_float_sum=-4435771.0, float_sum=1.5316559583604914e+27, dec_sum=Decimal('-4435771.0'))
-Row(p=None, oby=-14911, short_double_sum=-5405015.0, double_sum=2.9797860062816927e-167, short_float_sum=-5405015.0, float_sum=1.4775188260736515e-29, dec_sum=Decimal('-5405015.0'))
+Row(p=None, oby=-14911, short_double_sum=-5405015.0, double_sum=2.9797860062816923e-167, short_float_sum=-5405015.0, float_sum=1.4775188260736515e-29, dec_sum=Decimal('-5405015.0'))
-Row(p=None, oby=-14022, short_double_sum=-5591232.0, double_sum=2.3870085730304104e-255, short_float_sum=-5591232.0, float_sum=1.6656914299845682e+31, dec_sum=Decimal('-5591232.0'))
+Row(p=None, oby=-14022, short_double_sum=-5591232.0, double_sum=2.3870085730304104e-255, short_float_sum=-5591232.0, float_sum=1.6656914299845685e+31, dec_sum=Decimal('-5591232.0'))
-Row(p=None, oby=-12091, short_double_sum=-5889954.0, double_sum=7.01490562779971e+222, short_float_sum=-5889954.0, float_sum=1.2687089330344853e+31, dec_sum=Decimal('-5889954.0'))
+Row(p=None, oby=-12091, short_double_sum=-5889954.0, double_sum=7.01490562779971e+222, short_float_sum=-5889954.0, float_sum=1.2687089330344856e+31, dec_sum=Decimal('-5889954.0'))
-Row(p=None, oby=-10653, short_double_sum=-6057595.0, double_sum=1347267998.1720455, short_float_sum=-6057595.0, float_sum=1.495041434408932e+17, dec_sum=Decimal('-6057595.0'))
+Row(p=None, oby=-10653, short_double_sum=-6057595.0, double_sum=1347267998.1720452, short_float_sum=-6057595.0, float_sum=1.495041434408932e+17, dec_sum=Decimal('-6057595.0'))
-Row(p=None, oby=4621, short_double_sum=-6544450.0, double_sum=9.01343733242911e+182, short_float_sum=-6544450.0, float_sum=2.1577671275371775e-08, dec_sum=Decimal('-6544450.0'))
+Row(p=None, oby=4621, short_double_sum=-6544450.0, double_sum=9.01343733242911e+182, short_float_sum=-6544450.0, float_sum=2.1577671275371772e-08, dec_sum=Decimal('-6544450.0'))

I'm about halfway through eyeballing the output. I'll post here if I find any deviations that are worse. I think the above should have passed the @approximate_float test.

It appears that this error didn't occur on the last run, although a different test did fail: #10388.

@mythrocks
Collaborator Author

Ah, shoot. Here it is:

-Row(p=-1537828595, oby=26650, short_double_sum=32330.0, double_sum=inf, short_float_sum=32330.0, float_sum=7.066224196393988e+23, dec_sum=Decimal('32330.0'))
+Row(p=-1537828595, oby=26650, short_double_sum=32330.0, double_sum=1.7976931348623157e+308, short_float_sum=32330.0, float_sum=7.066224196393988e+23, dec_sum=Decimal('32330.0'))

Looks like the GPU result is the largest finite double (1.7976931348623157e+308), not inf. I'll try to repro this.
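As a sketch of why this row fails an approximate comparison while the others do not: infinity is not within any relative tolerance of a finite value, even the largest one. The second half shows one way a floating-point sum can land on either result depending on evaluation order. That part is only an illustration of the general effect, not a confirmed cause of this failure, and the variable names are made up for the example.

```python
import math
import sys

# The two values from the diff above.
inf_value = float("inf")
finite_value = 1.7976931348623157e+308

# The finite value is exactly the largest finite IEEE 754 double.
print(finite_value == sys.float_info.max)

# math.isclose treats infinity as close only to itself, whatever the tolerance.
print(math.isclose(inf_value, finite_value, rel_tol=1e-3))

# Illustration only: floating-point addition is not associative, so the
# order in which terms are combined can decide between overflow and a
# finite result.
big = sys.float_info.max
left_to_right = (big + big) - big   # overflows to inf at the first addition
regrouped = big + (big - big)       # stays finite: big + 0.0

print(left_to_right, regrouped)
```

If the two sides accumulate a running sum in different orders, a value near the top of the double range could overflow on one side and not the other. Whether that is what happened here is not established by this issue.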

@mythrocks mythrocks assigned mythrocks and unassigned mythrocks Feb 21, 2024
@mythrocks
Collaborator Author

I should mention here that this test failed once a couple of weeks ago, and hasn't been reproducible since. :/

@mythrocks
Collaborator Author

Closing this as not reproducible. We'll reopen if this occurs again.

@mythrocks mythrocks closed this as not planned Won't fix, can't repro, duplicate, stale Mar 14, 2024