Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster and simpler iris.util.array_equal #5610

Merged
merged 1 commit into from Dec 8, 2023

Conversation

bouweandela
Copy link
Member

🚀 Pull Request

Description

The function iris.util.array_equal realizes the data multiple times, instead of just once. This means that data is pulled from disk multiple times, which is slower than necessary. This pull request simplifies the code so it lets Dask figure out what the best way to handle the computation is.


Consult Iris pull request check list

@bouweandela bouweandela marked this pull request as ready for review December 5, 2023 14:50
Copy link

codecov bot commented Dec 5, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (db43fde) 89.68% compared to head (6bcc078) 89.69%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #5610   +/-   ##
=======================================
  Coverage   89.68%   89.69%           
=======================================
  Files          90       90           
  Lines       22818    22807   -11     
  Branches     5446     5441    -5     
=======================================
- Hits        20465    20456    -9     
+ Misses       1619     1618    -1     
+ Partials      734      733    -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@pp-mo pp-mo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, this is a really useful simplification

  • I must say I found the old code hard to grasp, though I think I originally wrote it !

@pp-mo pp-mo merged commit 80ac9d4 into SciTools:main Dec 8, 2023
17 checks passed
@pp-mo
Copy link
Member

pp-mo commented Dec 8, 2023

Thanks @bouweandela a definite improvement !

And this really shows how the array API dispatch improves things -- so the new code is no longer concerned at all whether data is lazy or not.

However, it does rely on the odd fact that bool() forces computation of lazy arrays.
Out of interest, do you know why that is ?
I have never quite understood this, since I never found any explanation of it in the Dask docs.

@bouweandela
Copy link
Member Author

Thanks for reviewing!

However, it does rely on the odd fact that bool() forces computation of lazy arrays.
Out of interest, do you know why that is ?
I have never quite understood this, since I never found any explanation of it in the Dask docs.

The answer is simpler than I expected, have a look at the dask.array.Array.__bool__ method:

https://github.com/dask/dask/blob/191d39177009d2cce25b818878118e35329b6db3/dask/array/core.py#L1860-L1867

@bouweandela bouweandela deleted the faster-array-equal branch December 8, 2023 11:19
@pp-mo
Copy link
Member

pp-mo commented Dec 8, 2023

The answer is simpler than I expected, have a look at the dask.array.Array.__bool__ method:
https://github.com/dask/dask/blob/191d39177009d2cce25b818878118e35329b6db3/dask/array/core.py#L1860-L1867

Thanks, I see !

I still think it's a shame this is not highlighted in the user docs somewhere, though.
I was surprised when I found out !

@bouweandela
Copy link
Member Author

Good point. I opened a pull request to document it: dask/dask#10686

tkknight added a commit to tkknight/iris that referenced this pull request Dec 13, 2023
* main:
  Adopt inital noop ruff linter (SciTools#5623)
  Update links to https (SciTools#5621)
  Bump actions/stale from 8 to 9 (SciTools#5616)
  Faster and simpler iris.util.array_equal (SciTools#5610)
  Updated environment lockfiles (SciTools#5608)
  Bump actions/setup-python from 4 to 5 (SciTools#5614)
  Possible citation updates -- not clear whether appropriate. (SciTools#5453)
  Whats new updates for v3.7.0 . (SciTools#5451)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants