[MRG] New ot.gpu with cupy #67
Merged
This PR is a cupy implementation of the functions currently implemented in ot.gpu. I also removed all the classes that were deprecated anyway. It still needs properly updated tests, but I like this solution since it stays mostly compatible with the old ot.gpu.
I have received a large number of queries about ot.gpu, but cudamat is no longer maintained and the problem will only grow, so we need to do something before release 0.5.
This solution is far less elegant than PR #32 by @toto6 with all the decorators, but having a cupy-specific implementation leaves more room for code optimization than a generic implementation, IMHO. This means we can improve it in the future without compromising the numpy implementation.
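To make the idea of a dedicated GPU implementation concrete, here is a minimal sketch of what such a function might look like. It is not the code of this PR: the names `to_gpu`, `to_np`, `sinkhorn_sketch` and the `to_numpy` flag are illustrative assumptions, and the fallback to numpy when cupy or a CUDA device is unavailable exists only so the sketch can run anywhere.

```python
import numpy as np

try:
    import cupy as cp
    cp.zeros(1)  # raises if no usable CUDA device is present
except Exception:
    cp = np  # assumption for this sketch only: fall back to NumPy


def to_gpu(x):
    """Move an array to the backend (no copy if it is already there)."""
    return cp.asarray(x)


def to_np(x):
    """Return a numpy.ndarray whatever the backend of x is."""
    if cp is not np and isinstance(x, cp.ndarray):
        return cp.asnumpy(x)
    return np.asarray(x)


def sinkhorn_sketch(a, b, M, reg, n_iter=200, to_numpy=True):
    """Plain Sinkhorn iterations written directly against the backend.

    a, b, M may be numpy or cupy arrays; the result is converted back to
    numpy unless to_numpy=False (hypothetical flag for this sketch).
    """
    a, b, M = to_gpu(a), to_gpu(b), to_gpu(M)
    K = cp.exp(-M / reg)          # Gibbs kernel, computed on the backend
    u = cp.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)         # scale columns to match b
        u = a / (K @ v)           # scale rows to match a
    G = u[:, None] * K * v[None, :]
    return to_np(G) if to_numpy else G
```

Because the whole loop stays on the backend and the only transfers happen at entry and exit, a function written this way can later be tuned for the GPU without touching the numpy code path.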
I give an example of use of the ot.gpu functions below with different formats for input/output, i.e. whether they are numpy.array or cupy.array. The output was obtained on my Titan X GPU after two runs of the script in IPython.
The output I have is the following: