Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add OpenCL device selection at runtime and a OpenCL vignette #439

Merged
merged 29 commits into from Apr 15, 2021

Conversation

rok-cesnovar
Copy link
Member

Summary

Add OpenCL device selection at runtime and a OpenCL vignette

This isnt quite ready yet, but wanted to get comments.

Copyright and Licensing

Please list the copyright holder for the work you are submitting
(this will be you or your assignee, such as a university or company):
Rok Češnovar, Uni. of Ljubljana

By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the following licenses:

R/args.R Outdated Show resolved Hide resolved
@codecov-io
Copy link

codecov-io commented Feb 3, 2021

Codecov Report

Merging #439 (8608e67) into master (9994f7f) will decrease coverage by 1.92%.
The diff coverage is 83.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #439      +/-   ##
==========================================
- Coverage   93.41%   91.48%   -1.93%     
==========================================
  Files          12       12              
  Lines        2930     2948      +18     
==========================================
- Hits         2737     2697      -40     
- Misses        193      251      +58     
Impacted Files Coverage Δ
R/run.R 95.46% <ø> (-1.36%) ⬇️
R/args.R 98.51% <42.85%> (-0.85%) ⬇️
R/model.R 91.47% <100.00%> (-0.90%) ⬇️
R/install.R 61.56% <0.00%> (-10.63%) ⬇️
R/knitr.R 91.66% <0.00%> (-4.17%) ⬇️
R/utils.R 91.22% <0.00%> (-1.17%) ⬇️
R/csv.R 97.29% <0.00%> (-0.99%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9994f7f...8608e67. Read the comment docs.

@rok-cesnovar
Copy link
Member Author

Ready for review.

@rok-cesnovar rok-cesnovar linked an issue Mar 16, 2021 that may be closed by this pull request
4 tasks
Copy link
Member

@bbbales2 bbbales2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some comments. I am going to read/edit the vignette now so more comments incoming.

R/run.R Show resolved Hide resolved
man-roxygen/model-common-args.R Outdated Show resolved Hide resolved
tests/testthat/test-opencl.R Show resolved Hide resolved
R/args.R Outdated Show resolved Hide resolved
Copy link
Member

@bbbales2 bbbales2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made some comments and some edits. Hopefully they're not too confusing. I didn't feel super great about the edits I made so if you want to revert anything that's fine.

vignettes/opencl.Rmd Outdated Show resolved Hide resolved
vignettes/opencl.Rmd Outdated Show resolved Hide resolved
vignettes/opencl.Rmd Show resolved Hide resolved
vignettes/opencl.Rmd Outdated Show resolved Hide resolved
vignettes/opencl.Rmd Show resolved Hide resolved
vignettes/opencl.Rmd Show resolved Hide resolved
vignettes/opencl.Rmd Outdated Show resolved Hide resolved
Copy link
Member

@jgabry jgabry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rok-cesnovar This looks really good. I haven't gone through the vignette yet, but I made a few minor comments on the code.

R/model.R Outdated Show resolved Hide resolved
R/model.R Outdated Show resolved Hide resolved
R/model.R Outdated Show resolved Hide resolved
man-roxygen/model-common-args.R Outdated Show resolved Hide resolved
@jgabry
Copy link
Member

jgabry commented Mar 23, 2021

@rok-cesnovar Are there still more review comments to address or other changes to make to this PR? Do you need me to do another review of this?

@rok-cesnovar
Copy link
Member Author

There are a few comments that I need to address. I have put this one on the backburner for a day or two to think it through. Will tag when ready.

@jgabry
Copy link
Member

jgabry commented Mar 23, 2021 via email

@rok-cesnovar
Copy link
Member Author

Should be ready for another look. Not sure how to exclude the vignette from coverage tests.

Comment on lines 21 to 28
As of version 2.26.1, users can expect speedups with OpenCL when using vectorized
probability distribution/mass functions (functions with the `_lpdf` or `_lpmf`
suffix). You can expect speedups when the input variables contain 20.000 or more elements.

The actual speedup for a model will depend on whether the `lpdf/lpmf` functions
are the bottlenecks of the model and the `lpdf/lpmf` function used.
The more computationally complex the function is, the larger the expected speedup.
The biggest speedups are expected when using the GLM functions.
Copy link
Sponsor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Man this is great!

Copy link
Sponsor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one clarification, what are 20.000 elements? Is that data or input parameters? I just want to make sure that the vignette has no ambiguities.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 20k elements here are meant as N of the input vectors/row_vectors/arrays for the worst-case.

If most of them are data, the speedups will be bigger and typically a GPU will be faster than a CPU at smaller N.
The worst-case is all of the inputs are vector parameters as there is a lot more copies to be done there.

See for example:
image

The best case is the non-data inputs are scalars:

image

The theoretical best-case is all inputs are data, but an all-data lpdf/lpmf call in the model/transformed parameters blocks are just inefficient anyways. Its still shown on the above charts as that will be achievable once we turn on support for non-lpdf/lpmf functions as well (the inputs are calculated on the GPU -> less copies).

This is a "ballpark" number, as mentioned it will depend on the exact CPU/GPU/lpdf/lpmf.

Copy link
Sponsor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@rok-cesnovar
Copy link
Member Author

Ready for review.

@jgabry
Copy link
Member

jgabry commented Apr 12, 2021

Code changes look good. I'll take a look at the vignette soon.

Copy link
Member

@jgabry jgabry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rok-cesnovar This is really good. I pushed a few very minor edits and made just a few minor review comments, but this looks pretty much ready to go.

vignettes/opencl.Rmd Outdated Show resolved Hide resolved
vignettes/opencl.Rmd Outdated Show resolved Hide resolved
vignettes/opencl.Rmd Outdated Show resolved Hide resolved
vignettes/opencl.Rmd Show resolved Hide resolved
@jgabry
Copy link
Member

jgabry commented Apr 15, 2021

crap I just noticed one more thing. can you regenerate it again but set refresh=0 in the fit_cl line? Right now it's printing all the iteration updates for that one but has refresh=0 for the cpu one. Sorry I didn't notice that before! Otherwise I think it's ready to go.

@rok-cesnovar
Copy link
Member Author

Good call. Fixed. Thanks!

@jgabry
Copy link
Member

jgabry commented Apr 15, 2021

Ok cool, thanks. Should we go ahead and merge this or is there anything else you wanted to add/change?

@rok-cesnovar
Copy link
Member Author

I am good. nothing else to add.

@jgabry
Copy link
Member

jgabry commented Apr 15, 2021

ok great, i'll merge now

@jgabry jgabry merged commit 2e7c45e into master Apr 15, 2021
@jgabry jgabry deleted the opencl_runtime_args branch April 15, 2021 16:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support cmdstan 2.26 interface changes
5 participants