Rcpp implementation of final size with risk groups #66

pratikunterwegs · 2022-10-07T14:46:45Z

This PR implements an Rcpp function final_size_grps_cpp, using the Eigen C++ library, and fixes #40.

The Rcpp code ports C++ code from https://gitlab.com/epidemics-r/code_snippets/-/blob/feature/newton_solver/include/finalsize.hpp.

The PR also fixes #69 by correcting the preparation of the contact matrix; the corrected version in each *_solver.R file follows its *_solver.h counterpart. Language equivalence tests are included in 3c6ea00.

src/iterative_solver.h

Bisaloo

The C++ implementation is triggering segfaults on my computer. Please leave me some time to try and figure out what's going on.

Bisaloo · 2022-10-12T14:58:53Z

R/iterative_solver.R

@@ -38,7 +38,7 @@ solve_final_size_iterative <- function(contact_matrix,
  epi_final_size[i_here] <- 0.0

  # matrix filled by columns
-  contact_matrix_ <- contact_matrix * demography_vector %o% susceptibility


Is it normal that the fix here differs from the fix in R/newton_solver.R?

That's a great question, which I don't have the theoretical background to answer. I was also surprised to find that the contact matrix filling differs between the Newton and iterative solvers in Edwin's C++ code, see

Iterative solver (only multiplying cols by demography): https://gitlab.com/epidemics-r/code_snippets/-/blob/feature/newton_solver/include/finalsize.hpp#L51

Newton solver (multiplying rows by susceptibility, and cols by demography): https://gitlab.com/epidemics-r/code_snippets/-/blob/feature/newton_solver/include/finalsize.hpp#L144

Initially, I thought that this must be a mistake, so I implemented both to be the same (taking the Newton solver as 'correct'). While fixing #69 I decided to go with what Edwin had written in case I was wrong (about him making a mistake), and returned it to his implementation. I tried both implementations, and they both pass the solver equivalence tests in both R and C++, even with differences among susceptibility groups, which should make some difference. I have no idea why.
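To make the difference between the two preparations concrete, here is a minimal C++ sketch using plain std::vector instead of Eigen. It assumes that "multiplying cols by demography" means entry (i, j) is multiplied by demography[j], and that "multiplying rows by susceptibility" means entry (i, j) is multiplied by susceptibility[i]. The names scale_cols and scale_rows_and_cols are hypothetical and do not appear in the package or in Edwin's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Assumed reading of the iterative-solver preparation:
// entry (i, j) is multiplied by demography[j] only.
Matrix scale_cols(const Matrix &contacts,
                  const std::vector<double> &demography) {
  Matrix out = contacts;
  for (std::size_t i = 0; i < out.size(); ++i) {
    assert(out[i].size() == demography.size());
    for (std::size_t j = 0; j < out[i].size(); ++j) {
      out[i][j] *= demography[j];
    }
  }
  return out;
}

// Assumed reading of the Newton-solver preparation:
// entry (i, j) is multiplied by susceptibility[i] and by demography[j].
Matrix scale_rows_and_cols(const Matrix &contacts,
                           const std::vector<double> &demography,
                           const std::vector<double> &susceptibility) {
  assert(contacts.size() == susceptibility.size());
  Matrix out = scale_cols(contacts, demography);
  for (std::size_t i = 0; i < out.size(); ++i) {
    for (double &value : out[i]) {
      value *= susceptibility[i];
    }
  }
  return out;
}
```

With contacts {{1, 2}, {3, 4}}, demography {10, 100} and susceptibility {0.5, 2}, the first function gives {{10, 200}, {30, 400}} and the second gives {{5, 100}, {60, 800}}. The prepared matrices differ whenever some susceptibility value is not 1; the sketch says nothing about whether the solver output differs.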

Bisaloo · 2022-10-12T15:02:12Z

R/newton_solver.R

-    # partial pivoting LU decomposition
-    cache_m_pivlu <- Matrix::lu(cache_m)
-    cache_m_pivlu <- Matrix::expand(cache_m_pivlu)
-    cache_m_pivlu <- cache_m_pivlu$L %*% cache_m_pivlu$U


Could you comment on why you removed this? Was it a mistake? Or just unnecessary?

It would probably be good to re-run the benchmarks now as this looks like a pretty expensive step performance-wise (and it was called inside the loop).

I initially added this step when the Newton solver was failing at particular values of r0 (~2, 4, 12), because (partial?) pivoting LU decomposition is used in Edwin's code (https://gitlab.com/epidemics-r/code_snippets/-/blob/feature/newton_solver/include/finalsize.hpp#L166), and using Matrix::lu appeared to be the correct way to do this in R.

After finding that the implementation of contact matrix filling (your comment above) did not impact the results, I looked to see what other code might be redundant. This chunk seemed suspicious, and the full test suite (including language and solver equivalence) passed after removing it, so I did. (I've also completed a Julia implementation, which doesn't require this pivoting LU decomp either). Overall the evidence points to this step being unnecessary except for Eigen's solve. Once again, I don't have the theoretical background to guess why this is the case.
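For readers unfamiliar with the term, the following is a rough sketch of what a linear solve with partial (row) pivoting does, written with the C++ standard library only. It is not the code from the package or from Edwin's snippet; the name solve_partial_pivot is hypothetical, and singular matrices are deliberately out of scope.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// Solve a * x = b by Gaussian elimination with partial (row) pivoting.
// a and b are taken by value because elimination overwrites them.
Vector solve_partial_pivot(Matrix a, Vector b) {
  const std::size_t n = a.size();
  assert(b.size() == n);
  for (const auto &row : a) {
    assert(row.size() == n);  // the matrix must be square
  }

  for (std::size_t k = 0; k < n; ++k) {
    // Pivot: the row at or below k with the largest |entry| in column k.
    std::size_t pivot = k;
    for (std::size_t i = k + 1; i < n; ++i) {
      if (std::fabs(a[i][k]) > std::fabs(a[pivot][k])) {
        pivot = i;
      }
    }
    assert(a[pivot][k] != 0.0);  // singular input is not handled here
    std::swap(a[k], a[pivot]);
    std::swap(b[k], b[pivot]);

    // Eliminate column k from every row below the pivot row.
    for (std::size_t i = k + 1; i < n; ++i) {
      const double factor = a[i][k] / a[k][k];
      for (std::size_t j = k; j < n; ++j) {
        a[i][j] -= factor * a[k][j];
      }
      b[i] -= factor * b[k];
    }
  }

  // Back substitution on the resulting upper-triangular system.
  Vector x(n);
  for (std::size_t k = n; k-- > 0;) {
    double sum = b[k];
    for (std::size_t j = k + 1; j < n; ++j) {
      sum -= a[k][j] * x[j];
    }
    x[k] = sum / a[k][k];
  }
  return x;
}
```

The row swap is what lets the solve proceed when the leading entry is zero or very small, for example with a = {{0, 2}, {1, 1}} and b = {4, 3}.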

You're correct about the speed gain, of course; see this benchmark from this morning, from a conversation with Tim. The Newton solver now takes about 2x the time of the iterative solver in R, whereas it took about 5x before this step was removed.

library(finalsize)

# prepare arguments
contact_matrix <- c(
  5.329620e-08, 1.321156e-08, 1.832293e-08, 7.743492e-09, 5.888440e-09,
  2.267918e-09, 1.321156e-08, 4.662496e-08, 1.574182e-08, 1.510582e-08,
  7.943038e-09, 3.324235e-09, 1.832293e-08, 1.574182e-08, 2.331416e-08,
  1.586565e-08, 1.146566e-08, 5.993247e-09, 7.743492e-09, 1.510582e-08,
  1.586565e-08, 2.038011e-08, 1.221124e-08, 9.049331e-09, 5.888440e-09,
  7.943038e-09, 1.146566e-08, 1.221124e-08, 1.545822e-08, 8.106812e-09,
  2.267918e-09, 3.324235e-09, 5.993247e-09, 9.049331e-09, 8.106812e-09,
  1.572736e-08
) |> matrix(6, 6)

# make a demography vector
demography_vector <- c(
  10831795, 11612456, 13511496, 11499398, 8167102, 4587765
)

# get an example r0
r0 <- 1.3
contact_matrix = r0 * contact_matrix

# a p_susceptibility matrix
p_susceptibility <- matrix(0.5, nrow(contact_matrix), 2)
susceptibility <- matrix(c(0.1, 0.7), nrow(contact_matrix), 2, byrow = T)

# benchmark implementations
microbenchmark::microbenchmark(
  times = 1000,
  "iterative_solver_r" = final_size_grps(
    contact_matrix = contact_matrix,
    demography_vector = demography_vector,
    susceptibility = susceptibility,
    p_susceptibility = p_susceptibility,
    solver = "iterative",
    control = list(
      iterations = 10000,
      tolerance = 1e-6
    )
  ),
  "newton_solver_r" = final_size_grps(
    contact_matrix = contact_matrix,
    demography_vector = demography_vector,
    susceptibility = susceptibility,
    p_susceptibility = p_susceptibility,
    solver = "newton",
    control = list(
      iterations = 10000,
      tolerance = 1e-6
    )
  ),
  "iterative_solver_cpp" = final_size_grps_cpp(
    contact_matrix = contact_matrix,
    demography_vector = demography_vector,
    susceptibility = susceptibility,
    p_susceptibility = p_susceptibility,
    solver = "iterative",
    control = list(
      iterations = 10000,
      tolerance = 1e-6,
      step_rate = 1.9,
      adapt_step = TRUE
    )
  ),
  "newton_solver_cpp" = final_size_grps_cpp(
    contact_matrix = contact_matrix,
    demography_vector = demography_vector,
    susceptibility = susceptibility,
    p_susceptibility = p_susceptibility,
    solver = "newton",
    control = list(
      iterations = 10000,
      tolerance = 1e-6
    )
  )
)
#> Warning in microbenchmark::microbenchmark(times = 1000, iterative_solver_r =
#> final_size_grps(contact_matrix = contact_matrix, : less accurate nanosecond
#> times to avoid potential integer overflows
#> Unit: microseconds
#>                  expr     min       lq       mean  median       uq      max neval
#>    iterative_solver_r  63.509  67.7730  75.760374  69.577  74.3535 4012.588  1000
#>       newton_solver_r 136.653 145.2425 166.597596 148.789 156.4970 5910.601  1000
#>  iterative_solver_cpp   5.371   5.8630   6.413671   6.273   6.6420   15.170  1000
#>     newton_solver_cpp  12.628  13.3660  14.171486  13.858  14.5140   53.628  1000

Created on 2022-10-12 by the reprex package (v2.0.1)

Bisaloo · 2022-10-12T15:20:52Z

tests/testthat/test-language_equivalence.R

+    7.943038e-09, 1.146566e-08, 1.221124e-08, 1.545822e-08, 8.106812e-09,
+    2.267918e-09, 3.324235e-09, 5.993247e-09, 9.049331e-09, 8.106812e-09,
+    1.572736e-08
+  ) |> matrix(6, 6)


I was thinking of opening an issue about this topic later, but please refrain from using the native pipe and lambda functions anywhere in the package, including the examples and the tests, as we want:

- all users to be able to run examples
- R CMD check to pass on all supported R versions

I'm converting this to an issue, as there are multiple instances of pipes and lambdas in the tests; I will remove them.

f0de8d7 fixes #70

tests/testthat/test-finalsize_grps_cpp_solver_equivalence.R

Bisaloo · 2022-10-12T15:26:43Z

tests/testthat/test-finalsize_grps_cpp_newton.R

+# check for correct final size calculation in complex data case
+# using newton solver


Do we have a case, e.g. a published paper, with known values of final size? This would be even better than checking the range.

You have now added tests to ensure all implementations return the same result, but what if the result were wrong in all cases? Do we have a reference implementation / set of values?

I'll look around, and also ask one of the PIs. My understanding is that the tests for 'correct answers' in the files test-*_solver.R, as well as those using the upper_limit function in the files test-*_solver_vary_r0.R ensure that the answers are exactly correct in some simple cases at least.
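One possible kind of reference value, sketched here under the assumption of a single homogeneous, fully susceptible group, is the classical final size relation z = 1 - exp(-r0 * z). The C++ sketch below solves it by fixed-point iteration. The function final_size_single_group is hypothetical and is not part of the package, and this is not necessarily what the existing tests check.

```cpp
#include <cassert>
#include <cmath>

// Fixed-point iteration for z = 1 - exp(-r0 * z), the final size relation
// for one homogeneous group (an assumption of this sketch).
double final_size_single_group(double r0, int max_iter = 10000,
                               double tol = 1e-12) {
  assert(r0 >= 0.0);
  double z = 0.5;  // start away from the trivial solution z = 0
  for (int i = 0; i < max_iter; ++i) {
    const double z_next = 1.0 - std::exp(-r0 * z);
    if (std::fabs(z_next - z) < tol) {
      return z_next;
    }
    z = z_next;
  }
  return z;  // not converged within max_iter; the caller should check
}
```

A test built on this idea could substitute the returned value back into the relation and check that the residual is close to zero, rather than hard-coding a number.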

tests/testthat/test-finalsize_grps_cpp_iterative.R

Co-authored-by: Hugo Gruson <Bisaloo@users.noreply.github.com>

pratikunterwegs · 2022-10-13T09:09:30Z

Thanks @Bisaloo, hope the C++ code works for you - if the errors persist I can also look into them.

Bisaloo · 2022-10-13T09:29:22Z

Turns out it's not a segfault. I assumed wrongly when RStudio crashed. When running from the terminal, I get the following error report. Not entirely sure what to get from this 🤔:

R: /home/hugo/.local/share/R/x86_64-pc-linux-gnu-library/4.2/RcppEigen/include/Eigen/src/Core/CwiseBinaryOp.h:110: Eigen::CwiseBinaryOp<BinaryOp, Lhs, Rhs>::CwiseBinaryOp(const Lhs&, const Rhs&, const BinaryOp&) [with BinaryOp = Eigen::internal::scalar_product_op<double, double>; LhsType = const Eigen::ArrayWrapper<Eigen::Map<Eigen::Matrix<double, -1, -1>, 0, Eigen::Stride<0, 0> > >; RhsType = const Eigen::ArrayWrapper<Eigen::Map<Eigen::Matrix<double, -1, -1>, 0, Eigen::Stride<0, 0> > >; Eigen::CwiseBinaryOp<BinaryOp, Lhs, Rhs>::Lhs = Eigen::ArrayWrapper<Eigen::Map<Eigen::Matrix<double, -1, -1>, 0, Eigen::Stride<0, 0> > >; Eigen::CwiseBinaryOp<BinaryOp, Lhs, Rhs>::Rhs = Eigen::ArrayWrapper<Eigen::Map<Eigen::Matrix<double, -1, -1>, 0, Eigen::Stride<0, 0> > >]: Assertion `aLhs.rows() == aRhs.rows() && aLhs.cols() == aRhs.cols()' failed.

pratikunterwegs · 2022-10-13T10:05:14Z

Thanks. I see that Eigen::Map is involved, and I think it can be safely removed in epi_spread.h. I will do so and convert to vectors instead; maybe that will help.

pratikunterwegs · 2022-10-13T13:46:12Z

I've changed how Eigen::Map is used, and cut down on its use where possible. I also edited the members of the epi_spread function to be VectorXd rather than MatrixXd (although this shouldn't be an issue). Hopefully this will resolve the error.
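The assertion in the error report above is Eigen checking that both operands of a coefficient-wise product have the same number of rows and columns. As an illustration only, the sketch below performs the same check by hand on a plain column-major matrix and throws an exception instead of aborting. The Mat struct and cwise_product function are hypothetical stand-ins and are not taken from the package.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Minimal column-major matrix, standing in for an Eigen matrix.
struct Mat {
  std::size_t rows;
  std::size_t cols;
  std::vector<double> data;  // length rows * cols
};

// Coefficient-wise product that reports a shape mismatch as an exception,
// mirroring the rows()/cols() condition in the assertion message.
Mat cwise_product(const Mat &lhs, const Mat &rhs) {
  assert(lhs.data.size() == lhs.rows * lhs.cols);
  assert(rhs.data.size() == rhs.rows * rhs.cols);
  if (lhs.rows != rhs.rows || lhs.cols != rhs.cols) {
    throw std::invalid_argument("cwise_product: dimension mismatch");
  }
  Mat out = lhs;
  for (std::size_t i = 0; i < out.data.size(); ++i) {
    out.data[i] *= rhs.data[i];
  }
  return out;
}
```

Multiplying a 2 x 2 matrix by a 4 x 1 matrix fails this check even though both hold four values, which is one way a vector passed where a matrix is expected could trigger such an assertion.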

pratikunterwegs self-assigned this Oct 7, 2022

pratikunterwegs added the New feature label Oct 7, 2022

pratikunterwegs added 10 commits October 11, 2022 14:50

Implement final_size_grps_cpp, iterative solver, epi spread

38afc6a

Add basic test finalsize grps with iterative solver

c278e58

Rcpp procedural exports

06d0ba9

Add working Newton solver

453e31f

Final size with risk groups, two solver options

354de0c

Add basic tests for finalsize_grps_cpp

5b3ac7a

Documentation and NAMESPACE for finalsize_grps_cpp

a83d569

Add comments, procedural Rcpp exports

6d01908

Remove p_susc from iterative solver and epi_spread_data

da1935d

Check cpp solver equivalence with test

0c3332a

pratikunterwegs force-pushed the feature/finalsize_grps_cpp branch from dd8697d to 0c3332a Compare October 11, 2022 12:50

TimTaylor reviewed Oct 11, 2022

src/iterative_solver.h (outdated)

pratikunterwegs added 10 commits October 11, 2022 16:52

Compact code and pass solver settings as control

ca2809c

Update to pass control list

34ab1f1

Update control documentation

91f4db8

Correct contact_matrix prep, both langs now identical

4cd4d33

Rm pivLU in R, correct contact_matrix prep

0019285

Test lang answers equivalent for single and mult risk grps

3c6ea00

Minor styler run

0840d73

Add message for error/tolerance, add msg tests

3fe7157

Update docu

3d74dba

Update DESC, style code, update Rcpp exports

61a4572

pratikunterwegs added the Cleanup and Bug labels Oct 12, 2022

pratikunterwegs marked this pull request as ready for review October 12, 2022 12:31

Bisaloo requested changes Oct 12, 2022

pratikunterwegs mentioned this pull request Oct 12, 2022

Remove pipes and lambdas in all R code #70

Closed

Simplify expect_equal use

4857b22

Co-authored-by: Hugo Gruson <Bisaloo@users.noreply.github.com>

pratikunterwegs and others added 2 commits October 12, 2022 19:02

Fewer solver iterations for tests

782ef5d

Co-authored-by: Hugo Gruson <Bisaloo@users.noreply.github.com>

Remove pipes and lambdas, fixes #70

f0de8d7

pratikunterwegs mentioned this pull request Oct 13, 2022

Possible dimension mismatch error #71

Closed

pratikunterwegs added 4 commits October 13, 2022 14:33

Better use of Eigen Map, return p_susc, WIP #71

e4d440f

Reduce Map use, WIP #71

e7bf505

Clang formatting

12d4835

Update documentation

edfc264

Bisaloo approved these changes Oct 18, 2022

pratikunterwegs merged commit 7390f86 into main Oct 18, 2022

pratikunterwegs deleted the feature/finalsize_grps_cpp branch October 18, 2022 08:46

This was referenced Oct 25, 2022

final_size using Cpp Newton solver #4

Closed

Uniform argument names final_size and final_size_cpp #6

Closed

Compare output accuracy: final_size v final_size_cpp #7

Closed
