Future Improvements

Nick Georgakopoulos edited this page Oct 3, 2021 · 5 revisions

Short Term

  • transferdifferential and rankmult are the only functions that use neither preallocation nor vectorization. The problem is that computing the sizes needed to preallocate or vectorize seems to cost more than the speedup it buys. That is my current understanding, and there may be a better way to do it. Either way, the code in these two functions could be improved.
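
To make the tradeoff above concrete, here is a minimal, generic sketch in Python. It is not the code of transferdifferential or rankmult, and every name in it (`collect_growing`, `collect_preallocated`, `divisors`) is hypothetical. It assumes the output size is only known after doing the per-item work, so the preallocating version has to do that work twice.

```python
from typing import Callable, Iterable, List


def collect_growing(items: Iterable[int],
                    expand: Callable[[int], List[int]]) -> List[int]:
    """Single pass: grow the output as results are produced."""
    out: List[int] = []
    for item in items:
        out.extend(expand(item))
    return out


def collect_preallocated(items: List[int],
                         expand: Callable[[int], List[int]]) -> List[int]:
    """Two passes: count the outputs first, then fill a preallocated buffer."""
    # Pass 1: compute the total size. In this sketch the only way to learn
    # the size is to call expand() on every item, so its cost is paid twice.
    total = sum(len(expand(item)) for item in items)
    out = [0] * total

    # Pass 2: write each chunk into its slot in the buffer.
    pos = 0
    for item in items:
        chunk = expand(item)
        out[pos:pos + len(chunk)] = chunk
        pos += len(chunk)
    return out


def divisors(n: int) -> List[int]:
    """Hypothetical stand-in for per-item work with a variable-sized result."""
    return [d for d in range(1, n + 1) if n % d == 0]
```

Both functions return the same list. Preallocation only pays off if the sizes can be obtained more cheaply than by repeating the work, for example from a closed formula or from counts cached during an earlier step.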

Long Term

  • Investigate GPU computing more thoroughly.