🐛 Expression.minmax() not returning proper value for datetimes #1469
base: master
tests/agg_test.py
Outdated
@@ -509,3 +509,11 @@ def test_agg_count_with_custom_name():
df_grouped = df.groupby(df.x, sort=True).agg({'mycounts': vaex.agg.count(), 'mycounts2': 'count'})
assert df_grouped['mycounts'].tolist() == [3, 2, 1]
assert df_grouped['mycounts2'].tolist() == [3, 2, 1]


def test_minmax_ts(df):
I'd prefer that, in addition to this, the test asserts the min and max against the actual min and max values. It could happen that minmax is consistent with min and max but still objectively wrong.
Good point. Not sure why I didn't have that check in the first commit.
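To illustrate the review comment above, here is a minimal standard-library sketch, not vaex code. The three `shared_bug_*` helpers are hypothetical stand-ins for implementations that share one defect (they ignore the last element), and the sample timestamps are made up. It shows how a consistency-only check can pass while a check against hand-written expected values fails.

```python
from datetime import datetime

# Hypothetical sample data; the expected extremes are written out by hand.
timestamps = [
    datetime(2020, 1, 3, 12, 0),
    datetime(2019, 6, 1, 8, 30),
    datetime(2021, 2, 14, 23, 59),
]
expected_min = datetime(2019, 6, 1, 8, 30)
expected_max = datetime(2021, 2, 14, 23, 59)


def shared_bug_min(values):
    # Deliberately wrong: ignores the last element.
    return min(values[:-1])


def shared_bug_max(values):
    # Same deliberate defect as shared_bug_min.
    return max(values[:-1])


def shared_bug_minmax(values):
    # Built on the two buggy helpers, so it inherits their defect.
    return shared_bug_min(values), shared_bug_max(values)


def consistency_check(values):
    # Only compares minmax against min and max from the same code path.
    return shared_bug_minmax(values) == (shared_bug_min(values), shared_bug_max(values))


def ground_truth_check(values):
    # Compares minmax against independently known values.
    return shared_bug_minmax(values) == (expected_min, expected_max)
```

With this data the consistency check passes even though the maximum is wrong, while the ground-truth check catches the defect.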
Somehow missed all of those tests failing. Will check again on this PR later tonight.
Yeah, seems odd. You could try a rebase, although I don't remember seeing these issues before.
Force-pushed: d2cb499 to 02464a2
The latest commit should fix the issue. I was returning a list instead of the np array like before. Also, my previous implementation didn't respect delayed execution, so I changed the code to use the previous structure and just swapped out the [...]. This may suggest that the root cause bug in [...]
Not sure why it's failing on non-Windows builds... The errors in the Linux and Mac builds are notebook timeout errors.
Ah don't worry about it. Those errors are unrelated. I hope we can clean them up soon. Looks good to me! But let's wait for @maartenbreddels to also give his blessing :) Thanks!
Thanks for this PR!
Oops, I completely forgot about this one :( Well, indeed it would be great to do min and max in one pass over the data. If one is interested in only one of them, is there any performance benefit to doing only min, for example? @maartenbreddels
Yes, we have min, max, and minmax for that reason.
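As a rough illustration of the "one pass over the data" idea discussed above, here is a minimal pure-Python sketch. It is not how vaex implements minmax; the function name `minmax_single_pass` is hypothetical, and it works on any comparable values, including datetimes.

```python
from datetime import datetime


def minmax_single_pass(values):
    """Return (min, max) while iterating over the values only once."""
    it = iter(values)
    try:
        lo = hi = next(it)
    except StopIteration:
        raise ValueError("minmax of an empty sequence") from None
    for v in it:
        if v < lo:
            lo = v
        elif v > hi:
            # Safe as elif: a value below lo cannot also be above hi.
            hi = v
    return lo, hi


# Usage with datetimes (made-up sample values).
stamps = [datetime(2020, 5, 1), datetime(2018, 1, 1), datetime(2022, 9, 30)]
earliest, latest = minmax_single_pass(stamps)
```

Calling `min(values)` and `max(values)` separately walks the data twice, which is the cost a combined minmax avoids. When only one extreme is needed, a dedicated min or max does less work per element.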
Force-pushed: 369423b to 6927b35
Force-pushed: 02464a2 to d75f2a9
Fixes #1456

I left some comments in the #development Slack channel, as it looks like min(), max() and minmax() were using different approaches. I'm not sure which is more efficient/robust long-term, but I opted for the simpler solution.

Note: In this solution, StatOpMinMax is no longer used anywhere in the code base.