Compute Gaussian kernel performance improvements. #9
`densratio.RuLSIF.compute_kernel_Gaussian` has been updated with a performance-improved implementation. The sheet comparing the baseline (original) and performance-improved implementations is also available at https://bit.ly/3X7asIm; I hope it is fairly self-explanatory.

`densratio.RuLSIF.set_compute_kernel_target` (also importable directly from `densratio`) accepts one of the following string arguments and sets the underlying engine that carries out the calculations:

- `numpy` - optimized with numpy broadcasting. Note that the underlying BLAS library (e.g. Intel's MKL) can take advantage of a multi-threading model.
- `cpu` - numba generalized universal function, optimized for a single thread.
- `parallel` - numba generalized universal function, optimized for multiple threads. Please be advised that all threading layer specifics apply.

Because of the aforementioned multi-threading technicalities, the engine defaults to `cpu` when `numba` is available, or `numpy` otherwise. I do not think adding the `numba` requirement is the best idea, as it could break backward compatibility for existing projects that already depend on `densratio`.

The performance-improved `densratio.RuLSIF.set_compute_kernel_target` implementation returns a `numpy.matrix` if either of the first two arguments is of the `numpy.matrix` type. Otherwise it expects and returns a `numpy.ndarray`, in case future commits replace the deprecated `numpy.matrix` with plain `numpy.ndarray`.
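
For reviewers who want a feel for the ideas above without reading the diff, here is a minimal sketch of the broadcasting approach, the return-type rule, and the engine default. It is not the code in this PR: the names `gaussian_kernel_broadcast` and `default_target` are hypothetical. The kernel formula `exp(-||x - y||^2 / (2 * sigma^2))` and the `(n, d)` input layout are assumptions based on the usual Gaussian kernel definition.

```python
import importlib.util

import numpy as np


def default_target():
    """Hypothetical helper: pick "cpu" if numba is importable, else "numpy"."""
    return "cpu" if importlib.util.find_spec("numba") is not None else "numpy"


def gaussian_kernel_broadcast(x_list, y_list, sigma):
    """Return K with K[i, j] = exp(-||x_i - y_j||^2 / (2 * sigma^2)).

    Assumes x_list has shape (n, d) and y_list has shape (m, d).
    """
    # Remember whether either input was a numpy.matrix before converting.
    want_matrix = isinstance(x_list, np.matrix) or isinstance(y_list, np.matrix)
    x = np.asarray(x_list, dtype=float)
    y = np.asarray(y_list, dtype=float)

    # (n, 1, d) - (1, m, d) broadcasts to (n, m, d): all pairwise differences.
    diff = x[:, None, :] - y[None, :, :]
    sq_dist = np.sum(diff * diff, axis=2)
    kernel = np.exp(-sq_dist / (2.0 * sigma * sigma))

    # Mirror the input type: numpy.matrix in, numpy.matrix out.
    return np.asmatrix(kernel) if want_matrix else kernel
```

Calling `gaussian_kernel_broadcast` with two plain arrays yields a `numpy.ndarray` of shape `(n, m)`, and wrapping either input in `numpy.matrix` switches the result to `numpy.matrix`. The broadcast builds an `(n, m, d)` intermediate, so memory use grows with all three dimensions.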