GPU Lower Triangular Inverse #1030

SteveBronder · 2018-09-13T02:48:56Z

Summary

Adds the kernels and functions for performing inversions of lower triangular matrices on the GPU.

At the Stan Math user level, this PR adds the function lower_triangular_inverse(A), which accepts a matrix_gpu. Internally, we add three kernels that break the inversion into three separate steps, following the steps described on page 2 of the Stan OpenCL paper. step1 performs an unblocked matrix inversion on the top lower triangular part, while step2 and step3 calculate the intermediate and final products needed in the parallel blocked version of the inverse.
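As a rough CPU-side illustration of the three steps, the sketch below uses the block identity for a lower triangular matrix split as [[A, 0], [C, B]], whose inverse is [[A^-1, 0], [-B^-1 C A^-1, B^-1]]. It is only a sketch under stated assumptions: the `Matrix` type, the function names, the recursive halving, and the `block` parameter are all hypothetical and are not the kernels or the API added in this PR.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical dense, row-major square matrix used only for this sketch;
// it is not Stan's matrix_gpu.
struct Matrix {
  std::size_t n;
  std::vector<double> a;
  explicit Matrix(std::size_t size) : n(size), a(size * size, 0.0) {}
  double& operator()(std::size_t i, std::size_t j) { return a[i * n + j]; }
  double operator()(std::size_t i, std::size_t j) const { return a[i * n + j]; }
};

// Step 1 analogue: unblocked inversion of the b x b lower triangular block
// of L whose top-left corner is (off, off), by forward substitution.
void invert_diag_block(const Matrix& L, Matrix& inv, std::size_t off,
                       std::size_t b) {
  for (std::size_t j = 0; j < b; ++j) {
    for (std::size_t i = j; i < b; ++i) {
      double sum = (i == j) ? 1.0 : 0.0;
      for (std::size_t k = j; k < i; ++k) {
        sum -= L(off + i, off + k) * inv(off + k, off + j);
      }
      inv(off + i, off + j) = sum / L(off + i, off + i);
    }
  }
}

// Inverts the n x n lower triangular block of L starting at (off, off).
void blocked_inverse(const Matrix& L, Matrix& inv, std::size_t off,
                     std::size_t n, std::size_t block) {
  if (n <= block) {
    invert_diag_block(L, inv, off, n);
    return;
  }
  const std::size_t m = n / 2;  // A is m x m
  const std::size_t r = n - m;  // B is r x r, C is r x m
  blocked_inverse(L, inv, off, m, block);      // A^-1
  blocked_inverse(L, inv, off + m, r, block);  // B^-1
  // Step 2 analogue: intermediate product T = C * A^-1.
  std::vector<double> T(r * m, 0.0);
  for (std::size_t i = 0; i < r; ++i) {
    for (std::size_t j = 0; j < m; ++j) {
      for (std::size_t k = j; k < m; ++k) {  // A^-1(k, j) is zero for k < j
        T[i * m + j] += L(off + m + i, off + k) * inv(off + k, off + j);
      }
    }
  }
  // Step 3 analogue: final product, bottom-left block = -B^-1 * T.
  for (std::size_t i = 0; i < r; ++i) {
    for (std::size_t j = 0; j < m; ++j) {
      double sum = 0.0;
      for (std::size_t k = 0; k <= i; ++k) {  // B^-1(i, k) is zero for k > i
        sum += inv(off + m + i, off + m + k) * T[k * m + j];
      }
      inv(off + m + i, off + j) = -sum;
    }
  }
}

// Hypothetical entry point; `block` is the size below which a block is
// inverted directly.
Matrix lower_tri_inverse_sketch(const Matrix& L, std::size_t block = 2) {
  if (block == 0) throw std::invalid_argument("block must be at least 1");
  for (std::size_t i = 0; i < L.n; ++i) {
    if (L(i, i) == 0.0) throw std::domain_error("zero on the diagonal");
  }
  Matrix inv(L.n);
  if (L.n > 0) blocked_inverse(L, inv, 0, L.n, block);
  return inv;
}
```

Calling `lower_tri_inverse_sketch(L, 2)` on a matrix whose lower triangle is filled returns a matrix X for which L * X is the identity up to rounding. The two diagonal blocks and the independent entries of each product are where a GPU version can work in parallel; the sketch runs them sequentially.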

Tests

Tests are available in lower_triangular_inverse_test.cpp

inverse_gpu_exception

Checks for a std::invalid_argument on a non-square matrix_gpu

inverse_gpu_small

Checks whether the CPU and GPU versions of the matrix inverse agree to nearly the same level of precision for a small matrix.

inverse_gpu_big

Checks whether the CPU and GPU versions of the matrix inverse agree to nearly the same level of precision for a large matrix.
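The two kinds of checks described above can be sketched in plain C++ as follows. This is a minimal illustration, assuming results are available as flat arrays of doubles; `check_square_sketch`, `max_abs_diff`, and `near_equal` are hypothetical helpers, and neither they nor the tolerances used with them come from lower_triangular_inverse_test.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the argument check exercised by
// inverse_gpu_exception: reject anything that is not square.
void check_square_sketch(std::size_t rows, std::size_t cols) {
  if (rows != cols) {
    throw std::invalid_argument("matrix must be square");
  }
}

// Largest elementwise absolute difference between two results of equal size.
double max_abs_diff(const std::vector<double>& cpu,
                    const std::vector<double>& gpu) {
  if (cpu.size() != gpu.size()) {
    throw std::invalid_argument("size mismatch");
  }
  double worst = 0.0;
  for (std::size_t i = 0; i < cpu.size(); ++i) {
    worst = std::max(worst, std::fabs(cpu[i] - gpu[i]));
  }
  return worst;
}

// True when every entry of the two results agrees to within tol.
bool near_equal(const std::vector<double>& cpu,
                const std::vector<double>& gpu, double tol) {
  return max_abs_diff(cpu, gpu) <= tol;
}
```

A comparison test would compute the inverse once on each device, copy both results to host memory, and then require `near_equal(cpu_result, gpu_result, tol)` for a tolerance chosen by the test author.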

Checklist

Math issue: Add GPU Inverse of a lower triangular matrix #1028
Copyright holder: Rok Češnovar and Erik Štrumbelj (Faculty of Computer and Information Science, University of Ljubljana); Steve Bronder

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:
- Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)
- Documentation: CC-BY 4.0 (https://creativecommons.org/licenses/by/4.0/)

The basic tests are passing:
- unit tests pass (to run, use: ./runTests.py test/unit)
- header checks pass (make test-headers)
- docs build (make doxygen)
- code passes the built-in C++ standards checks (make cpplint)

The code is written in idiomatic C++ and changes are documented in the doxygen.
The new changes are tested.

SteveBronder · 2018-10-14T00:55:44Z

I added a few things from the paper and changed the names. I think these names are pretty fine. Rok if you have disagreements on those then I'm very open to changing them, though I think they are an okay mix of shorthand and 'getting the point across'. We should make a wiki page out of the google doc and link to it in each kernel.

SteveBronder · 2018-10-14T00:56:32Z

Either Jenkins is down or I goofed up p bad

rok-cesnovar · 2018-10-14T06:19:19Z

I am getting a 404 error on the Jenkins Details link, so I guess it's down.

seantalts · 2018-10-14T16:32:18Z

I installed a security update and that has made a lot of the logged out links not work for some reason. Can you try logging in on the home page and then clicking the link?

rok-cesnovar · 2018-10-14T17:22:48Z

@seantalts It works! Thanks

bob-carpenter · 2018-10-16T03:04:51Z

The math library has been inconsistent in how verbose the names are, but we've been pretty consistent with

- diagonal -> diag
- inverse -> inv
- negative -> neg
- rectangular -> rect
- triangular -> tri

which would yield the much less unwieldy

- diag_inv
- inv_lower_tri_multiply
- neg_rect_lower_tri_multiply

SteveBronder · 2018-10-16T03:19:09Z

I like those and agree, changing now. Also, I'm cleaning up the developer docs and placing them in the stancon2018 GPU repo as a pdf. I think including a link to it in the code docs will be the easiest way to give people who want a deeper understanding of the implementation more background. Is that alright with everyone?

seantalts · 2018-10-16T03:25:15Z

I think we can and should incorporate that doc into the code, I just haven't had time to work on it yet. Hopefully tomorrow. I think we can doc the high level overview under stan/math/gpu/lower_tri_inverse.hpp and put the rest of the step specific doc in with the steps - there are even parts of the code copied into the doc, so just doc'ing the code where it lies should be more efficient. Plus doc that lives closer to the code is less likely to become out-of-date.

rok-cesnovar · 2018-10-16T03:58:27Z

I will take a crack at this. Should I link to the images or just ignore them?

SteveBronder · 2018-10-16T03:58:58Z

Oh nice I agree with that. I spent some time tonight cleaning up the google doc a bit, if you have time tmrw to work on this that would be rad!

seantalts · 2018-10-16T04:01:11Z

Good question - I think they're really helpful, and it seems easier to link than to include them in the repo. I'd throw them up on the Stan Math wiki somewhere and link to them from there.

rok-cesnovar · 2018-10-16T20:22:22Z

I have added everything on the wiki, together with the images that will be linked in the code docs. Link: https://github.com/stan-dev/math/wiki/GPU-Kernels

seantalts · 2018-10-18T15:30:47Z

Sorry, I meant that I wanted to bring the doc into the code as doc strings. So I'm going to try to move all of the text into the code files that already exist and hopefully generate good Doxygen from that. And I'll have to link to images on the wiki. Sorry I've been so busy - back to back conferences and just gave a talk yesterday, but I'll try to set aside some time tonight.

SteveBronder · 2018-10-20T17:37:57Z

Alright so I changed the names to what Bob suggested and added the images and a lot of comments to the kernel code. @seantalts lmk how you feel about this

seantalts · 2018-10-25T04:17:37Z

Looks good! I went through and polished the doc a little; I think I probably gave you guys the impression I was looking for more detailed doc than I needed. I was hoping you guys could look at this diff and 1) make sure I didn't mess anything up and 2) get a feel for the level of doc that we're looking for - which stuff I deleted, which stuff I added from the wiki. @SteveBronder already did a pass adding in basically all the relevant stuff from the wiki, so that was great. In general we'd prefer doc to live with the code when possible, we usually don't ask people to doc code that is doing fairly standard stuff (but that is obviously a relative question especially when introducing a new framework to an existing project), and if you can give something a descriptive name and delete a comment that's usually a win. I did this in one place (seemed like 'factor' was less helpful as a name than the comment above it that it was the diagonal element) but I need to double check I did that right.

Here's the diff from my two commits:
https://github.com/stan-dev/math/pull/1030/files/34bc01405cf8e0997ca52d2029255bdd6e84b871..416b0970a5becda77c6550d0ded2d3a371793b8e

stan/math/gpu/kernels/diag_inv.hpp

SteveBronder · 2018-10-25T16:21:32Z

Looks good besides one little extra star I saw. Thanks!

rok-cesnovar and others added 30 commits June 6, 2018 21:36

added the lower_tri_inverse

657787b

added ifdef for STAN_OPENCL in new files

20c0114

Merge branch 'gpu_matrix_multiply' into gpu_lower_tri_inverse

f437d17

fixed check headers

ab92f34

Merge branch 'gpu_basic_matrix_algebra' into gpu_lower_tri_inverse

5268baa

Switching 1D integrator to use Boost. Incorporating changes from pull s…

36b22ab

…tan-dev#759 (Issue stan-dev#210) Differences: 1. Technically support integrals with infinite limits (doesn't work great) 2. No more relative/absolute tolerance split

Added missing file and and removed forward mode autodiff code (Issue …

40e73c5

…210)

Updated tests for 1d integrator (Issue stan-dev#210)

dc715f7

Added xc argument to avoid loss of precision for Boost integrators (I…

7493cac

…ssue 210)

Added test to make sure xc/nan behavior is correct (Issue 210)

b67abad

Added missing header (Issue stan-dev#210)

b3c42ca

add more hard tests

829aa92

add Gaussian integral with var sigma

09335d8

[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…

2941496

…stable/2017-11-14)

Allow endpoints of integration to be vars (Issue stan-dev#210)

f9940db

Fixed cpplint issues n' such (Issue 210)

9dfe356

catch gradient being NaN

7976287

change to 0 if function == 0

[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…

1720055

…stable/2017-11-14)

add tests that PDFs integrate to 1

0f46b9a

this fails for a couple of commented out tests

[Jenkins] auto-formatting by clang-format version 5.0.1 (tags/RELEASE…

c53996d

…_501/final)

Made tolerance match Boost recommentations (Issue 210)

56cb203

Rearranged some headers (Issue 210)

d9d8e0c

fix weird test error with mpirun

987b24c

weird error was not so weird

6da272a

More granular test stages for better replay on Jenkins

8c05071

Rename tests and files to remove opencl where not necessary, clarify …

f427242

…the docs

added copy on return for size 0 and 1 in copy_triangular

58622f5

Merge branch 'gpu_matrix_multiply' into gpu_lower_tri_inverse

b358925

# Conflicts: # stan/math/gpu/opencl_context.hpp

removing the basic_matrix_algebra header

47412d6

Merge branch 'gpu_matrix_multiply' into gpu_lower_tri_inverse

4ad98f2

# Conflicts: # stan/math/gpu/opencl_context.hpp # test/unit/math/gpu/multiply_test.cpp

[Jenkins] auto-formatting by clang-format version 5.0.0-3~16.04.1 (ta…

153ff18

…gs/RELEASE_500/final)

Shorten names so formatting is nicer

f747102

[Jenkins] auto-formatting by clang-format version 6.0.0 (tags/google/…

b326a7c

…stable/2017-11-14)

SteveBronder and others added 3 commits October 20, 2018 12:21

Merge remote-tracking branch 'upstream/develop' into gpu_lower_tri_in…

d6195af

…verse

Fixes up names of functions and adds inline documentation

dcc0800

[Jenkins] auto-formatting by clang-format version 5.0.0-3~16.04.1 (ta…

c9fe692

…gs/RELEASE_500/final)

SteveBronder and others added 4 commits October 22, 2018 21:57

Merge remote-tracking branch 'upstream/develop' into gpu_lower_tri_in…

0b4a7da

…verse

Merge branch 'gpu_lower_tri_inverse' of https://github.com/bstatcomp/…

34bc014

…math into gpu_lower_tri_inverse

Add link to report

c60d0e9

Polish doc a little

416b097

seantalts approved these changes Oct 25, 2018

seantalts merged commit 00e943d into stan-dev:develop Oct 25, 2018

SteveBronder deleted the gpu_lower_tri_inverse branch May 22, 2019 03:44

8 participants