
Compute Gaussian kernel performance improvements. #9

Merged
merged 3 commits into from
Oct 8, 2022

Conversation

mierzejk
Contributor

@mierzejk mierzejk commented Aug 3, 2020

densratio.RuLSIF.compute_kernel_Gaussian has been updated with a performance-improved implementation. A sheet comparing the baseline (original) and performance-improved implementations is also available at https://bit.ly/3X7asIm; I hope it is self-explanatory.

The densratio.RuLSIF.set_compute_kernel_target function (also available to be imported directly from densratio) accepts a string argument that selects the underlying engine used to carry out the calculations:

Because of multithreading technicalities, the engine defaults to cpu when numba is available, and to numpy otherwise. I do not think adding numba as a hard requirement is the best idea, as it may not be backward compatible with existing projects that already depend on densratio.
The performance-improved densratio.RuLSIF.compute_kernel_Gaussian implementation returns a numpy.matrix if either of its first two arguments is of the numpy.matrix type. Otherwise it expects and returns a numpy.ndarray, so that future commits can replace the deprecated numpy.matrix with plain numpy.ndarray.
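For reference, the vectorized kernel computation can be sketched in plain NumPy as below. This is an illustrative broadcasting-based implementation of a Gaussian kernel matrix, not the exact code submitted in this pull request:

```python
import numpy as np

def compute_kernel_gaussian_sketch(x, centers, sigma):
    """Gaussian kernel matrix K[i, j] = exp(-||x_i - c_j||^2 / (2 sigma^2)).

    Illustrative NumPy sketch: the squared Euclidean distances are
    computed in one shot via the identity
        ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b
    instead of a Python-level double loop over rows.
    """
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(centers ** 2, axis=1)[None, :]
        - 2.0 * (x @ centers.T)
    )
    # Guard against tiny negative values from floating-point cancellation.
    sq_dists = np.maximum(sq_dists, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))
```

The sketch always returns a numpy.ndarray; mirroring the pull request's behavior of returning numpy.matrix when the inputs are matrices would be a small isinstance check layered on top.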

@mierzejk
Contributor Author

mierzejk commented Aug 3, 2020

The pull request may, at least partially, resolve the following issues: #6 estimate density ratio of large training set and test set and #8 density ratio estimation of high dimension data. According to my tests, both the numpy and numba targets can deal with x_list and y_list matrices that consume over 20 GB altogether, provided enough virtual memory is available.
The pull request also offers the prospect of even greater performance improvements for large data sets by taking advantage of the numba cuda target. Yet that would require some extra work, not fully aligned with the currently implemented numba.guvectorize approach.

@mierzejk mierzejk changed the title Compute kernel Gaussian performance improvement. Compute kernel Gaussian performance improvements. Aug 3, 2020
@hoxo-m hoxo-m self-requested a review August 4, 2020 00:11
@hoxo-m hoxo-m self-assigned this Aug 4, 2020
@mierzejk
Contributor Author

A side note regarding the performance results: I recently ran the benchmark with the same densratio_py codebase I submitted, in the following two environments:

  1. My over-6-year-old Dell Precision M4800 (Intel Core i9, 8 cores, 32 GB RAM), running Ubuntu 18.04.4 LTS.
  2. A virtualized Windows Server 2016 (32 cores, 128 GB RAM).

To my surprise, despite all 32 cores being utilized in the Windows environment, the process executed a few times faster on my reportedly less powerful laptop. I am not really sure of the real cause. It might be the operating system itself, but perhaps it is because my laptop is tuned for PyTorch performance: I built numpy, numba, Cython and mkl from source myself, whereas on Windows all packages were delivered pre-built, either by Anaconda or pip.
The original benchmark results I attached to the first pull-request post were measured in the first environment, i.e. my Dell Precision M4800 running Ubuntu 18.04.4 LTS.

@mierzejk mierzejk mentioned this pull request Aug 18, 2021
@hoxo-m
Owner

hoxo-m commented Oct 8, 2022

It is the greatest contribution!

@hoxo-m hoxo-m merged commit 1c3229a into hoxo-m:master Oct 8, 2022
@mierzejk mierzejk changed the title Compute kernel Gaussian performance improvements. Compute Gaussian kernel performance improvements. Aug 27, 2023