Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Faster rl.{first|last}_run implementation #1405

Closed
1 task done
coxipi opened this issue Jun 28, 2023 · 0 comments · Fixed by #1530
Closed
1 task done

Faster rl.{first|last}_run implementation #1405

coxipi opened this issue Jun 28, 2023 · 0 comments · Fixed by #1530
Assignees
Labels
information For development/intsructional purposes

Comments

@coxipi
Copy link
Contributor

coxipi commented Jun 28, 2023

Generic Issue

Sub-function _boundary_run uses rle, but it could use _cumsum_reset_on_zero, effectively avoiding these two steps in rle:

cs_s = cs_s.where(da.shift({dim: -1}, fill_value=0) == 0)
out = cs_s.where(da == 1, 0)  # Reinsert 0's at their original place

These steps are probably much less demanding than the cumsum operations, but still, it's worth considering. I think there could be merit in keeping things as they are if we feel it makes the code more understandable, but I'm not sure about this, and also the run_length functions are somewhat at a low level in the code, so I have the feeling that performance should be prioritized here.

Side note about return type of first_run

I currently have to convert the result of first_run to be able to use it for time selection. There is no reason really for the output not to be already a int (of course, in the case where coord is False or None)

end_season_index = rl.first_run(sim==0,window=14).astype(int)
mask = xr.where(sim>0, 0,np.NaN)
mask[{"time":end_season_index}] = 1

This could be fixed by adding a else statement in coord_transform:

if coord:
    crd = da[dim]
    if isinstance(coord, str):
        crd = getattr(crd.dt, coord)

    out = lazy_indexing(crd, out)
else:
    out = out.astype(int)

Code of Conduct

  • I agree to follow this project's Code of Conduct
@coxipi coxipi changed the title Faster rl.first_run implementation Faster rl.{first|last}_run implementation Jun 28, 2023
@Zeitsperre Zeitsperre added the information For development/intsructional purposes label Jul 25, 2023
@Zeitsperre Zeitsperre added this to the xclim Long Term Goals milestone Jul 25, 2023
coxipi added a commit that referenced this issue Nov 30, 2023
<!--Please ensure the PR fulfills the following requirements! -->
<!-- If this is your first PR, make sure to add your details to the
AUTHORS.rst! -->
### Pull Request Checklist:
- [x] This PR addresses an already opened issue (for bug fixes /
features)
    - This PR fixes #1405
- [ ] Tests for the changes have been added (for bug fixes / features)
- [ ] (If applicable) Documentation has been added / updated (for bug
fixes / features)
- [x] CHANGES.rst has been updated (with summary of main changes)
- [x] Link to issue (:issue:`number`) and pull request (:pull:`number`)
has been added

### What kind of change does this PR introduce?

* Use an intermediate step instead of `rle` in first/last run
computations

### Does this PR introduce a breaking change?
No

### Other information:
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
information For development/intsructional purposes
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants