Skip to content

More unit tests and streamlined matrix multiplication#68

Merged
hpages merged 3 commits intoBioconductor:masterfrom
LTLA:master
Sep 25, 2020
Merged

More unit tests and streamlined matrix multiplication#68
hpages merged 3 commits intoBioconductor:masterfrom
LTLA:master

Conversation

@LTLA
Copy link
Contributor

@LTLA LTLA commented Sep 23, 2020

The general problem of capturing error messages in a cbind has been solved by switching from bpiterate to bplapply2, so now errors inside the matrix multiplication lead to errors in the parent process. This has the additional bonus of not requiring BiocParallel to be installed to do matrix multiplication.

I also greatly streamlined the decision-making process of choosing the iteration scheme in the matrix multiplication. Now it just prioritizes the larger matrix as the one to split up for distributed computation across workers. This is not really any worse than the previous approach, which relied on the presence of a non-NULL return from chunkGrid as a heuristic to identify which matrix was more likely to be file-backed... and then added an arbitrary penalty of 10 to the access cost of the chunks of that matrix. The new approach is very simple and easy to reason about.

The streamlining was also assisted by the realization that getAutoMultParallelAgnostic() was never actually exported. No users should have been able to use the non-agnostic matrix multiplication scheme, so I just got rid of it in the general case to make maintenance easier. I retained the non-agnostic option for self-cross-products due to its clear performance advantage though the default is still to use the agnostic mode for consistency with %*%.

(I don't know what the future plans for the matrix multiplication algorithm will be, but being able to achieve consistent results that are agnostic to the number of workers, block size and chunk dimensions is very logistically useful. I know that there will be situations where an agnostic algorithm just won't cut it, e.g., due to chunk or block size constraints, and we may need a different algorithm in such cases; but having agnosticism as the default, where possible, really makes it easy to scale things up and down while still getting the same results.)

Finally, I enhanced the battery of unit tests.

@hpages
Copy link
Contributor

hpages commented Sep 25, 2020

Looks good. Thanks!

@hpages hpages closed this Sep 25, 2020
@hpages hpages reopened this Sep 25, 2020
@hpages hpages merged commit b0d2baf into Bioconductor:master Sep 25, 2020
@hpages
Copy link
Contributor

hpages commented Sep 25, 2020

of course I need to merge first, doh

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants