Support complex moment #10009
Can one of the admins verify this patch? Admins can comment
@jrbourbeau would you have some time to review this? Maybe @charlesbluca might have some interest in reviewing too?
Thanks @wkrasnicki! Don't have a deep understanding of the array internals but some comments around the warning filtering
    if name.startswith("var") or name.startswith("moment"):
        warnings.simplefilter("ignore", np.ComplexWarning)
Does it make sense to only filter warnings here if `meta` is a complex dtype?
This would suppress a `ComplexWarning` on all operations where downcasting occurs, wouldn't it? I wanted to replicate the behaviour of numpy, which raises such a warning when we call `var` on a complex array, but specify a real `dtype` as output:
- `np.var(a)` - no complex warning in numpy, but without this if clause we would get a warning in dask, due to the output `dtype` being a float
- `np.var(a, dtype='c8')` - no warning in numpy and dask
- `np.var(a, dtype='f4')` - we should get a warning in both numpy and dask here
All of these cases are covered by unit tests. I know this block is not elegant, but a better one would probably require going deeper into Dask internals and I want to keep the changes minimal, considering it is a minor issue and I'm new to Dask.
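The three NumPy cases listed in this comment can be checked with a small sketch like the one below. It uses only NumPy (not dask), and the helper `emits_complex_warning` and the sample array `a` are illustrative names that do not come from the PR. The `ComplexWarning` lookup is written to work whether the class is exposed under `np.exceptions` or at the top level of the `numpy` namespace.

```python
import warnings

import numpy as np

# ComplexWarning lives in np.exceptions on newer NumPy releases and at the
# top level of the numpy namespace on older ones; this lookup handles both.
ComplexWarning = getattr(np, "exceptions", np).ComplexWarning


def emits_complex_warning(func):
    """Return True if calling func() emits a ComplexWarning (hypothetical helper)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
    return any(issubclass(w.category, ComplexWarning) for w in caught)


# Illustrative complex input, not taken from the PR's test suite.
a = np.array([1 + 2j, 3 - 1j, -2 + 0.5j])

default_warns = emits_complex_warning(lambda: np.var(a))
complex_warns = emits_complex_warning(lambda: np.var(a, dtype="c8"))
real_warns = emits_complex_warning(lambda: np.var(a, dtype="f4"))

print(default_warns, complex_warns, real_warns)
```

Following the description in the comment, only the call that requests a real output `dtype` is expected to warn.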
I was thinking of keeping the check on the aggregation `name` and including a check for `meta` being complex, so something like

    -if name.startswith("var") or name.startswith("moment"):
    -    warnings.simplefilter("ignore", np.ComplexWarning)
    +if (name.startswith("var") or name.startswith("moment")) and np.iscomplexobj(meta):
    +    warnings.simplefilter("ignore", np.ComplexWarning)

With the thought being that we would want this warning suppressed only in the case where we know downcasting would occur during a `var` or `moment` operation - does that make sense, or am I missing something?
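The grouping of a combined condition like this matters, because Python's `and` binds more tightly than `or`: without parentheses around the two name checks, the `np.iscomplexobj(meta)` test would apply only to the `moment` branch. A minimal sketch of the difference follows; the two function names and the sample `meta` arrays are hypothetical and only serve the illustration.

```python
import numpy as np


def filter_ungrouped(name, meta):
    # Parsed as: startswith("var") or (startswith("moment") and iscomplexobj(meta))
    return name.startswith("var") or name.startswith("moment") and np.iscomplexobj(meta)


def filter_grouped(name, meta):
    # The complex check applies to both aggregation names.
    return (name.startswith("var") or name.startswith("moment")) and np.iscomplexobj(meta)


# Hypothetical stand-ins for the `meta` object discussed in the thread.
real_meta = np.empty((0,), dtype="f8")
complex_meta = np.empty((0,), dtype="c16")

print(filter_ungrouped("var", real_meta), filter_grouped("var", real_meta))
print(filter_ungrouped("moment", complex_meta), filter_grouped("moment", complex_meta))
```

The two versions agree for complex `meta` and disagree only for a `var` aggregation on real `meta`, where the ungrouped form would still install the filter.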
It does, but I don't see the point. For the complex warning to be raised, we would need to have a complex `dtype`, unless there is an edge case I'm missing? This logic just seems redundant to me, but I'm not that experienced in low-level numpy stuff
Fair point, mostly wasn't sure if there was any other case where we would potentially want to emit the `ComplexWarning` - okay with leaving as is if there isn't a particular edge case you can think of
@charlesbluca I've applied your suggestions in the unit tests, but think the situation on filtering in
@charlesbluca This needs another look, if you're able. Thank you!
Thanks @wkrasnicki!
I only see approvals here, is there anything preventing a merge? (I haven't reviewed the code myself)
@martindurant things looked good when I last checked, though as I'm not the most familiar w/ Dask array internals might make sense for someone to take a second look
Pinging to move this fix forward
I have finally had a look at the code myself. I appreciate the effort you have taken to cover all the possible cases in tests.
It would be good to describe what `implicit_complex_dtype` does for the user - this isn't an argument derived from numpy, so should be covered by our docs, I think.
For the tests, there are a bunch of warning ignore blocks, but I think we want to assert a warning IS raised for these cases, with `pytest.warns`.
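As a rough illustration of the `pytest.warns` pattern suggested here, the sketch below asserts that a warning is raised instead of ignoring it. It exercises NumPy rather than dask so that it stays self-contained, and the function name `check_real_dtype_warns` and the sample array are hypothetical rather than taken from the PR's tests.

```python
import numpy as np
import pytest

# ComplexWarning lives in np.exceptions on newer NumPy releases and at the
# top level of the numpy namespace on older ones; this lookup handles both.
ComplexWarning = getattr(np, "exceptions", np).ComplexWarning


def check_real_dtype_warns():
    """Hypothetical check: a real output dtype on complex input should warn."""
    a = np.array([1 + 2j, 3 - 1j, -2 + 0.5j])
    # pytest.warns fails the check if no ComplexWarning is emitted in the block.
    with pytest.warns(ComplexWarning):
        result = np.var(a, dtype="f4")
    return result


result = check_real_dtype_warns()
```

Compared with a `warnings.simplefilter("ignore", ...)` block, this form fails loudly if the expected warning ever stops being emitted.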
@martindurant I've added the Regarding the
If this is not something the user might ever see, it's fine. Please run the pre-commit checks to fix the linting.
@martindurant done
I can't see any reason anything in this PR would cause the docbuild to fail.
@wkrasnicki , unfortunately, I can't push to your branch.
@martindurant I've added you as collaborator
@dask/Maintenance someone merge this?
Thanks for all the work here @wkrasnicki 🙂
pre-commit run --all-files
Fixes an inconsistency with `numpy` where the `std` and `var` methods do not apply the absolute value on complex arrays. Some redundant `dtype` castings were removed from the `_moment_combine` and `_moment_agg` functions and some are conditional on the fact that the user passed a complex array without explicitly specifying a real `dtype`. This last information is passed to the chunk callback as an argument, while the other callbacks simply check if the calculated chunks have complex values. Casting is utilized in later steps.
Additional test cases were added for complex array reductions, together with suppressing some unnecessary warnings.
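For reference, the NumPy behaviour this description refers to can be sketched as follows: for complex input, `np.var` takes the absolute value of the deviations before squaring, so the result is real. The array and variable names are illustrative only, and the snippet exercises NumPy alone, not the dask code changed in the PR.

```python
import numpy as np

# Illustrative complex input.
a = np.array([1 + 2j, 3 - 1j, -2 + 0.5j])

deviations = a - a.mean()

# Variance with the absolute value applied before squaring (real result).
manual_var = np.mean(np.abs(deviations) ** 2)

# Squaring the complex deviations directly keeps an imaginary part instead.
naive_var = np.mean(deviations ** 2)

numpy_var = np.var(a)
numpy_std = np.std(a)

print(manual_var, numpy_var, numpy_std, naive_var)
```

Here `numpy_var` should agree with `manual_var`, while `naive_var` remains complex, which is the kind of discrepancy the description refers to.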