Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid double-loop and branching for leap years #430

Merged
merged 3 commits into from
Nov 15, 2024

Conversation

MichaelChirico
Copy link
Contributor

Supersedes #429.

Actually, we can avoid the variable-length array issue altogether. The old implementation uses leap to:

  1. Run a full nyear loop to populate whether each year is leap
  2. Run a second loop to read from this array to count the number of days. It also has a branch in each iteration which will be pretty volatile since branch prediction will be wrong ~half the time.

It's better to just count the number of days directly in a single loop. This will be slightly more memory-efficient, and should be more computationally efficient too.

@MichaelChirico
Copy link
Contributor Author

Locally I added a fast parameter to the routine and put the master code under !fast, this branch under fast to compare:

library(microbenchmark)
soy=xts:::startOfYear

# Comparing defaults: ~10% faster
microbenchmark(times=1e6, fast=soy(fast=TRUE), slow=soy(fast=FALSE))
# Unit: microseconds
#  expr   min    lq     mean median    uq      max neval
#  fast 2.411 4.142 5.498975  4.251 4.421 55194.07 1e+06
#  slow 2.571 4.621 5.851015  4.731 4.911 13580.37 1e+06

# Comparing huge edge case: ~40% faster
microbenchmark(times=1e4, fast=soy(from=0,to=1e4, fast=TRUE), slow=soy(from=0, to=1e4, fast=FALSE))
# Unit: microseconds
#  expr   min    lq     mean median     uq      max neval
#  fast 35.92 42.36 50.34491 43.301 44.471 12413.64 10000
#  slow 48.62 57.82 72.48831 59.040 60.631 12848.52 10000

This internal routine is only called once, here, so the default comparison is the fairest:

xts/R/index.R

Line 373 in ae39d2b

((startOfYear() * 86400 + (3 * 86400)) %/% 86400 %/% 7)[.indexyear(x) + 1]

@joshuaulrich
Copy link
Owner

joshuaulrich commented Nov 13, 2024

Thanks for this and the prior PR! I agree that this one is better, mostly because it's simpler.

Also, I wouldn't worry about the 'fast' parameter. The routine is only used in one function to find the week of the year, and the improvement is only a handful of microseconds.

@MichaelChirico
Copy link
Contributor Author

Also, I wouldn't worry about the 'fast' parameter.

Sorry, to clarify, I only did that to make the benchmark simpler, no plan to add it to the PR.

@joshuaulrich
Copy link
Owner

Oh, I misunderstood! Now I understand that your PR is ~40% faster than the current code. That's a nice added bonus to the simpler code!

@MichaelChirico
Copy link
Contributor Author

I think 10% is a fairer comparison since the function is only ever called with defaults as of now :)

@joshuaulrich joshuaulrich merged commit bfec4d1 into joshuaulrich:main Nov 15, 2024
@joshuaulrich
Copy link
Owner

Thanks for your work on this!

@joshuaulrich joshuaulrich added this to the [release candidate] milestone Nov 15, 2024
@MichaelChirico MichaelChirico deleted the patch-3 branch November 15, 2024 15:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants