Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accelerate import - prevent numba warnings #1378

Merged
merged 3 commits into from May 26, 2023
Merged

Accelerate import - prevent numba warnings #1378

merged 3 commits into from May 26, 2023

Conversation

aulemahal
Copy link
Collaborator

@aulemahal aulemahal commented May 26, 2023

Pull Request Checklist:

  • This PR addresses an already opened issue (for bug fixes / features)
    • This PR fixes an issue raised verbally by xscen developers.
  • Tests for the changes have been added (for bug fixes / features)
    • (If applicable) Documentation has been added / updated (for bug fixes / features)
  • CHANGES.rst has been updated (with summary of main changes)
    • Link to issue (:issue:number) and pull request (:pull:number) has been added

What kind of change does this PR introduce?

PR #1135 removed all signatures where possible, so that @jit-decorated function would be compiled at run time (you know "just in time"...). However, @guvectorize -decorated functions must have signatures and thus must be compiled at compile time, which means upon import.

This PR adds cache=True to these function, so numba compiles them once and than saves the compilation to a __pycache__ folder if possible. The first import will be slow as before, but subsequent imports should be much faster.
I'm sure there are edgecases where the user of xclim doesn't have write access to the source folder, which might mean this caching will be unavailable. But for most users, this should be a nice performance boost.

On my machine I went from 13s to 2.5s.

This PR also forces nopython=True to these @guvectorize functions. Numba 0.59 will change the default value of this parameter and it was raising warning complaining about our lack of precision.

Does this PR introduce a breaking change?

It shouldn't. But I do not know all the ramifications of caching these functions...

Other information:

@RondeauG, fyi.

@aulemahal aulemahal requested a review from Zeitsperre May 26, 2023 19:54
@github-actions github-actions bot added indicators Climate indices and indicators sdba Issues concerning the sdba submodule. labels May 26, 2023
@Zeitsperre
Copy link
Collaborator

On my machine I went from 13s to 2.5s.

Wow!

Copy link
Collaborator

@Zeitsperre Zeitsperre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@github-actions github-actions bot added the approved Approved for additional tests label May 26, 2023
@aulemahal
Copy link
Collaborator Author

aulemahal commented May 26, 2023

In details.

Previous behaviour

image
Notice the 3 main culprits : "sdba.nbutils" (2 guvectorize funcs), "indices.fire._ffdi" (2 guvectorize funcs) and "core.units".

After this change, first import

image
Same as above.

After this change, second import

image
Only core.units left as significant. Around 50% of what is left comes from our dependencies. I guess 50% of import time is still large...

I am assuming the its the creation of the units registry that takes all this time in core.units, but I can't really test that easily with a simple python -X importtime .

@Zeitsperre
Copy link
Collaborator

Would it be possible to deal with some of the numba-related warnings being raised right now? For instance:

$ xclim show_version_info

INSTALLED VERSIONS
------------------
/home/runner/work/xclim/xclim/xclim/indices/fire/_cffwis.py:207: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
python: 3.8.16
  def _day_length(lat: int | float, mth: int):  # pragma: no cover
boltons: installed
/home/runner/work/xclim/xclim/xclim/indices/fire/_cffwis.py:227: NumbaDeprecationWarning: The 'nopython' keyword argument was not supplied to the 'numba.jit' decorator. The implicit default value for this argument is currently False, but it will be changed to True in Numba 0.59.0. See https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-object-mode-fall-back-behaviour-when-using-jit for details.
  def _day_length_factor(lat: float, mth: int):  # pragma: no cover
bottleneck: 1.3.7
cf_xarray: 0.8.1
cftime: 1.6.2
click: 8.1.3
clisops: None
dask: 2023.5.0
flox: None
jsonpickle: 3.0.1
lmoments3: 1.0.5
numba: 0.57.0
pandas: 1.5.3
pint: 0.21.1
scipy: 1.8.1
sklearn: 1.2.2
statsmodels: 0.14.0
xarray: 2023.1.0
xclim: 0.43.2-beta
Anaconda-based environment: no

@aulemahal
Copy link
Collaborator Author

Of course! I forgot to look into this module (cffwis) as it had no pre-compiled numba funcs.

@aulemahal aulemahal merged commit 21ffe85 into master May 26, 2023
13 checks passed
@aulemahal aulemahal deleted the acc-nb-import branch May 26, 2023 21:06
aulemahal added a commit that referenced this pull request Jun 8, 2023
<!--Please ensure the PR fulfills the following requirements! -->
<!-- If this is your first PR, make sure to add your details to the
AUTHORS.rst! -->
### Pull Request Checklist:
- [ ] This PR addresses an already opened issue (for bug fixes /
features)
    - This PR fixes #xyz
- [ ] Tests for the changes have been added (for bug fixes / features)
- [ ] (If applicable) Documentation has been added / updated (for bug
fixes / features)
- [ ] CHANGES.rst has been updated (with summary of main changes)
- [ ] Link to issue (:issue:`number`) and pull request (:pull:`number`)
has been added

### What kind of change does this PR introduce?

PR #1378 introduced caching for the `guvectorize` functions. Turns out
that this caching interferes with the development of these function as
the cached version doesn't get updated when the code is modified. (As
tested by @coxipi )

However, when installing a new version of xclim, cached copies are
removed and thus updated. The bug mentionned above only affects locally
installed versions (`pip install -e`). A note was added to the
contributing notes.

### Does this PR introduce a breaking change?
No

### Other information:
No
@aulemahal aulemahal mentioned this pull request Oct 31, 2023
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Approved for additional tests indicators Climate indices and indicators sdba Issues concerning the sdba submodule.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants