Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support complex moment #10009

Merged
merged 10 commits into from Sep 25, 2023
Merged

Support complex moment #10009

merged 10 commits into from Sep 25, 2023

Conversation

krokosik
Copy link
Contributor

@krokosik krokosik commented Mar 1, 2023

Fixes an inconsistency with numpy where the std and var methods do not apply the absolute value on complex arrays. Some redundant dtype castings were removed from the _moment_combine and _moment_agg functions and some are conditional on the fact that the user passed a complex array without explicitly specifying a real dtype.

This last information is passed to the chunk callback as an argument, while the other callbacks simply check if the calculated chunks have complex values. Casting is utilized in later steps.

Additional test cases were added for complex array reductions, together with supressing some unnecessary warnings.

@GPUtester
Copy link
Collaborator

Can one of the admins verify this patch?

Admins can comment ok to test to allow this one PR to run or add to allowlist to allow all future PRs from the same author to run.

@github-actions github-actions bot added the array label Mar 1, 2023
@krokosik krokosik marked this pull request as draft March 1, 2023 12:32
@krokosik krokosik marked this pull request as ready for review March 1, 2023 12:40
@github-actions github-actions bot added the needs attention It's been a while since this was pushed on. Needs attention from the owner or a maintainer. label Apr 3, 2023
@jacobtomlinson
Copy link
Member

jacobtomlinson commented Apr 15, 2023

@jrbourbeau would you have some time to review this?

Maybe @charlesbluca might have some interest in reviewing too?

Copy link
Member

@charlesbluca charlesbluca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wkrasnicki! Don't have a deep understanding of the array internals but some comments around the warning filtering

Comment on lines +387 to +388
if name.startswith("var") or name.startswith("moment"):
warnings.simplefilter("ignore", np.ComplexWarning)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this it make sense to only filter warnings here if meta is a complex dtype?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would suppress a ComplexWarning on all operations where downcasting occurs, wouldn't it? I wanted to replicate the behaviour of numpy, which raises such a warning when we call var on a complex array, but specify a real dtype as output:

  • np.var(a) - no complex warning in numpy, but without this if clause we would get a warning in dask, due to output dtype being a float
  • np.var(a, dtype='c8') - no warning in numpy and dask
  • np.var(a, dtype='f4') - we should get a warning in both numpy and dask here

All of these cases are covered by unit tests. I know this block is not elegant, but a better one would probably require going deeper into Dask internals and I want to keep the changes minimal, considering it is a minor issue and I'm new to Dask.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of keeping the check on the aggregation name and including a check for meta being complex, so something like

Suggested change
if name.startswith("var") or name.startswith("moment"):
warnings.simplefilter("ignore", np.ComplexWarning)
if name.startswith("var") or name.startswith("moment") and np.iscomplexobj(meta):
warnings.simplefilter("ignore", np.ComplexWarning)

With the thought being that we would want this warning suppressed only in the case where we know downcasting would occur during a var or moment operation - does that make sense, or am I missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does, but I don't see the point. For the complex warning to be raised, we would need to have a complex dtype, unless there is an edge case I'm missing? This logic just seems redundant to me, but I'm not that experienced in low-level numpy stuff

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point, mostly wasn't sure if there was any other case where we would potential want to emit the ComplexWarning - okay with leaving as is if there isn't a particular edge case you can think of

dask/array/tests/test_reductions.py Outdated Show resolved Hide resolved
dask/array/tests/test_reductions.py Outdated Show resolved Hide resolved
dask/array/tests/test_reductions.py Outdated Show resolved Hide resolved
@krokosik
Copy link
Contributor Author

krokosik commented May 2, 2023

@charlesbluca I've applied your suggestions in the unit tests, but think the situation on filtering in reductions.py is a little bit more nuanced, as I explained in a comment there. Thank you so much for your review ❤️

@j-bennet
Copy link
Contributor

@charlesbluca This needs another look, if you're able. Thank you!

@hendrikmakait hendrikmakait added the needs review Needs review from a contributor. label May 22, 2023
Copy link
Member

@charlesbluca charlesbluca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wkrasnicki!

@martindurant
Copy link
Member

I only see approvals here, is there anything preventing a merger?

(I haven't reviewed the code myself)

@charlesbluca
Copy link
Member

@martindurant things looked good when I last checked, though as I'm not the most familiar w/ Dask array internals might make sense for someone to take a second look

@apatlpo
Copy link
Contributor

apatlpo commented Aug 28, 2023

Pinging to move this fix forward

Copy link
Member

@martindurant martindurant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have finally had a look at the code myself. I appreciate the effort you have taken to cover all the possible cases in tests.

It would be good to describe what implicit_complex_dtype does for the user - this isn't an argument derived from numpy, so should be covered by our docs, I think.

For the tests, there are a bunch of warning ignore blocks, but I think we want to assert a warning IS raised for these cases, with pytest.warns.

@krokosik
Copy link
Contributor Author

krokosik commented Sep 8, 2023

@martindurant I've added the pytest.warns context, didn't know about this feature, very cool!

Regarding hte implicit_complex_dtype I can provide the documentation, but I'm not really sure where. It only exists on the internal moment_chunk and moment_agg helper functions. Should I introduce docstrings for them?

@martindurant
Copy link
Member

Regarding hte implicit_complex_dtype I can provide the documentation, but I'm not really sure where.

If this is not something the user might ever see, it's fine.

Please run the pre-commit checks to fix the linting.

@krokosik
Copy link
Contributor Author

krokosik commented Sep 8, 2023

@martindurant done

@martindurant
Copy link
Member

I can't see any reason anything in this PR would cause the docbuild to fail.
Otherwise, I'm happy to merge here. @jrbourbeau , are we seeing docbuild failures elsewhere, perhaps we should just run it again?

@martindurant
Copy link
Member

@wkrasnicki , unfortunately, I can't push to your branch.

@krokosik
Copy link
Contributor Author

@martindurant I've added you as collaborator

@martindurant
Copy link
Member

@dask/Maintenance someone merge this?

@charlesbluca charlesbluca merged commit 9928562 into dask:main Sep 25, 2023
23 checks passed
@charlesbluca
Copy link
Member

Thanks for all the work here @wkrasnicki 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
array needs attention It's been a while since this was pushed on. Needs attention from the owner or a maintainer. needs review Needs review from a contributor.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

da.std and da.var handle complex values incorrectly
8 participants