Future Improvements


Nick Georgakopoulos edited this page Oct 3, 2021 · 5 revisions

Short Term

transferdifferential and rankmult are the only functions that use neither preallocation nor vectorization. The problem is that computing the sizes needed for preallocation/vectorization appears to cost more than the speed it gains. I have not confirmed this, and there may be a better way to do it. Either way, the code in these two functions could be improved.
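To make the trade-off concrete, here is a minimal sketch in C++ (assumed to be the project's language) of the two strategies. It is not the actual code of transferdifferential or rankmult. The names output_count, build_growing and build_preallocated are hypothetical, and output_count stands in for whatever determines how many entries each input contributes.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical: number of output entries contributed by one input item.
std::size_t output_count(unsigned x) { return x % 3; }

// Strategy A: no preallocation; the vector grows as entries arrive.
std::vector<unsigned> build_growing(const std::vector<unsigned>& input) {
    std::vector<unsigned> out;
    for (unsigned x : input)
        for (std::size_t k = 0; k < output_count(x); ++k)
            out.push_back(x);
    return out;
}

// Strategy B: a first pass computes the exact size, a second pass fills.
std::vector<unsigned> build_preallocated(const std::vector<unsigned>& input) {
    std::size_t total = 0;
    for (unsigned x : input)
        total += output_count(x);  // extra traversal just to learn the size
    std::vector<unsigned> out;
    out.reserve(total);            // single allocation, no regrowth
    for (unsigned x : input)
        for (std::size_t k = 0; k < output_count(x); ++k)
            out.push_back(x);
    return out;
}
```

Both functions return the same result. Strategy B avoids reallocations but pays for a second traversal, so when the size computation is itself expensive, the sizing pass can cost more than the reallocations it saves. Only benchmarking on realistic inputs would settle which is faster.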

Long Term

Investigate GPU computing more thoroughly.
