Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move static local variable out of templated function #299

Merged
merged 4 commits into from May 17, 2023

Conversation

DavisVaughan
Copy link
Member

@DavisVaughan DavisVaughan commented Nov 11, 2022

Closes #298

@jimhester would you mind reviewing this for me? You are the last one to have touched this code, and probably understand it better than anyone else. The #298 issue provides the backstory.

A rerun of the benchmark in #298 gives me:

# Main
bench::mark(test(x), test2(x), iterations = 20)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 test(x)    715.35ms 751.55ms      1.31    7.63MB     1.96
#> 2 test2(x)     7.65ms   9.11ms    100.      7.63MB    35.1

# This PR
bench::mark(test(x), test2(x), iterations = 20)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 test(x)     69.13ms  79.75ms      11.7    7.63MB     17.5
#> 2 test2(x)     7.47ms   9.34ms      98.1    7.63MB     34.3

Which better matches the performance we had in 0.4.0

# cpp11 0.4.0
bench::mark(test(x), test2(x), iterations = 20)
#> Warning: Some expressions had a GC in every iteration; so filtering is disabled.
#> # A tibble: 2 × 6
#>   expression      min   median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl>
#> 1 test(x)     63.56ms   76.4ms      12.1    7.63MB     18.1
#> 2 test2(x)     7.19ms    8.3ms     103.     7.63MB     36.1

I also confirmed that I can't reproduce the original issue in #244 that forced us to move away from a global static variable.

FWIW the pattern I've used here is fairly similar to what is done in <date> to determine the tzdb file path
https://github.com/HowardHinnant/date/blob/22ceabf205d8d678710a43154da5a06b701c5830/src/tz.cpp#L394-L429

Comment on lines -66 to -68
Rboolean* should_unwind_protect =
reinterpret_cast<Rboolean*>(LOGICAL(should_unwind_protect_sexp));
should_unwind_protect[0] = TRUE;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was previously overwriting the global option to TRUE every time a new compilation unit needed to initialize its static local variable. I don't think we want to do that. I think we only want to initialize to TRUE the very first time this is created, i.e. in the if branch above.

Comment on lines +73 to +77
inline Rboolean* access_should_unwind_protect() {
// Setup is run once per compilation unit, but all compilation units
// share the same global option, so each compilation unit's static pointer
// will point to the same object.
static Rboolean* p_should_unwind_protect = setup_should_unwind_protect();
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function holding the static local variable is no longer templated, so each compilation unit will only have to run setup_should_unwind_protect() 1 time

@romainfrancois
Copy link
Collaborator

LGTM. Not strictly related to this PR's business, but should we be concerned about the option being set to something that is not a logical ?

@romainfrancois
Copy link
Collaborator

I can confirm the performance gain:

main:

> bench::mark(test(x), test2(x), iterations = 20)
# A tibble: 2 × 13
  expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result            memory               time            gc               
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>            <list>               <list>          <list>           
1 test(x)    232.65ms 236.53ms      4.09    7.63MB     7.77    20    38      4.89s <chr [1,000,000]> <Rprofmem [149 × 3]> <bench_tm [20]> <tibble [20 × 3]>
2 test2(x)     4.39ms   4.53ms    191.      7.63MB    38.2     20     4   104.82ms <chr [1,000,000]> <Rprofmem [1 × 3]>   <bench_tm [20]> <tibble [20 × 3]>

#299 :

> bench::mark(test(x), test2(x), iterations = 20)
# A tibble: 2 × 13
  expression      min   median `itr/sec` mem_alloc `gc/sec` n_itr  n_gc total_time result            memory               time            gc               
  <bch:expr> <bch:tm> <bch:tm>     <dbl> <bch:byt>    <dbl> <int> <dbl>   <bch:tm> <list>            <list>               <list>          <list>           
1 test(x)     32.17ms  34.15ms      25.9    7.63MB     38.9    20    30      772ms <chr [1,000,000]> <Rprofmem [225 × 3]> <bench_tm [20]> <tibble [20 × 3]>
2 test2(x)     4.43ms   4.54ms     144.     7.63MB     36.0    20     5      139ms <chr [1,000,000]> <Rprofmem [1 × 3]>   <bench_tm [20]> <tibble [20 × 3]>

@romainfrancois romainfrancois merged commit 3558055 into r-lib:main May 17, 2023
14 checks passed
@DavisVaughan
Copy link
Member Author

@romainfrancois we should also try looking at this comment from Jim, possibly as a (nicer?) alternative to this PR 9a62c3a#r90124412

DavisVaughan added a commit to DavisVaughan/cpp11 that referenced this pull request Jul 26, 2023
DavisVaughan added a commit that referenced this pull request Jul 28, 2023
* Revert #299

* Remove unwind protect global variable altogether

* Add a descriptive test

* Add FAQ bullets about `unwind_protect()`

* Tweak internals vignette advice

* NEWS bullet

* Tweaks from code review
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Nested unwind_protect() optimization isn't working correctly
2 participants