Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optional profiling #41

Merged
merged 11 commits into from Apr 12, 2022
Merged

Optional profiling #41

merged 11 commits into from Apr 12, 2022

Conversation

adrian-lison
Copy link
Collaborator

@adrian-lison adrian-lison commented Apr 7, 2022

This PR implements optional profiling of nowcasting models. Profiling of stan models as offered by cmdstan is useful for debugging and performance optimization (see https://mc-stan.org/cmdstanr/articles/profiling.html). However, it also comes with a slight overhead, making it suboptimal to have profiling activated by default. Stan does not offer an option to turn profiling on and off at the moment. This PR implements a workaround which allows to keep the profiling statements in the same model that will be used for deployment by adding a corresponding option to enw_model. If profile=FALSE, all profiling statements are removed from the model code before compilation.

Some basic profiling statements are added to epinowcast.stan for illustration.

Further details:

  • The implementation assumes that the model is syntactically correct. If the model syntax is incorrect, compilation would fail anyway, but might fail in unforeseen ways when profile=FALSE.
  • Nested profile statements are also supported. This is useful for profiling a larger section but having individual profiling for certain subsections of interest.
  • Profiling statements in model code that is included from additional files in the include path are currently not supported. That is, they will not be removed even if profile=FALSE. Extending to these include files at a later stage should not be too difficult however.
  • A further extension for later would be to allow specific profiling statements to be toggled on/off, e.g. by providing a list of profiling identifiers that should be kept, and the rest will be removed.

If param 'profile' is FALSE, all profiling statements are removed from the stan model code before compilation
Removal of profiling statements in other files in include paths not yet supported
@adrian-lison adrian-lison linked an issue Apr 7, 2022 that may be closed by this pull request
@adrian-lison adrian-lison added the enhancement New feature or request label Apr 7, 2022
@codecov
Copy link

codecov bot commented Apr 7, 2022

Codecov Report

Merging #41 (1870025) into main (20274a8) will not change coverage.
The diff coverage is 0.00%.

❗ Current head 1870025 differs from pull request most recent head 2369829. Consider uploading reports for the commit 2369829 to get more accurate results

@@          Coverage Diff          @@
##            main     #41   +/-   ##
=====================================
  Coverage   0.00%   0.00%           
=====================================
  Files         13      13           
  Lines        641     648    +7     
=====================================
- Misses       641     648    +7     
Impacted Files Coverage Δ
R/model.R 0.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 20274a8...2369829. Read the comment docs.

@adrian-lison
Copy link
Collaborator Author

One more interesting thing to note: Generally, switching from profiling to no profiling requires a recompilation of the stan model. However, once the model has been compiled with profile=FALSE, one can always switch between profiling and no profiling without triggering a recompilation. I have verified that this still has an actual effect on whether the model will be profiled or not.

I lack a deeper understanding of what goes on during profiling under the hood, but it seems that the profiling statements are handled separately from the main stan C code.

If we don't want to confuse users with the recompilation, we could hence change the model instantiation function such that if profile=TRUE, it always calls itself with profile=FALSE first.

@adrian-lison adrian-lison self-assigned this Apr 7, 2022
@seabbs seabbs self-requested a review April 7, 2022 07:06
Copy link
Collaborator

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really great and will be super useful going forward.

Not surprised that the likelihood is so dominant compute wise (though I would expect that to change with more flexible models where more delay distributions need to be calculated). #40 has reinforced my view that optimisation wise most of run-time in the likelihood (that we can do something about) is linked to the coversion from hazard to probability but you are correct being able to profile within functions would be useful to confirm this. Do you think (and can you in fact do this) it is worth inserting some profiling into the generated quantities to see if optimisation is worth doing with that code. Part of me wants it to remain relatively clear but if that comes at a big speed cost then it isn't worth it.

Left some very minor comments on this PR. There are also some issues that the automated linting is flagging that would be good to correct to keep a consistent style. We should maybe talk about setting up automated PR tools to allow styling, redocing etc using github actions.

R/model.R Outdated Show resolved Hide resolved
R/model.R Show resolved Hide resolved
R/model.R Outdated Show resolved Hide resolved
inst/stan/epinowcast.stan Outdated Show resolved Hide resolved
inst/stan/epinowcast.stan Outdated Show resolved Hide resolved
@seabbs
Copy link
Collaborator

seabbs commented Apr 7, 2022

/document

@seabbs
Copy link
Collaborator

seabbs commented Apr 7, 2022

/style

seabbs and others added 5 commits April 7, 2022 17:03
Changed names of the profiling sections. There is always a prefix now that states the corresponding stan block: "transformed", "model", "generated".
Moreover, changed the names of the sections to be more concise and like variables.
@adrian-lison
Copy link
Collaborator Author

adrian-lison commented Apr 11, 2022

@seabbs I added profiling statements to the generated quantities section. If you think everything is fine, you can already merge into main, but let's keep this branch still - I still want to add support for included files. I can submit an additional pull request then.

NEWS.md Show resolved Hide resolved
Copy link
Collaborator

@seabbs seabbs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to go in main to me. Will merge and can do another PR for in function profiling (which will be very nice to have).

@seabbs seabbs merged commit 34f2093 into main Apr 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Profile stan code to increase efficiency
2 participants