Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CuDNN dropout support #5501

Merged
merged 7 commits into from Jun 8, 2018

Conversation

@AlexDBlack
Copy link
Contributor

commented Jun 7, 2018

Addresses: #5486

@AlexDBlack AlexDBlack requested a review from saudet Jun 7, 2018

//Reserve space
SizeTPointer stateSizeBytesPtr = new SizeTPointer(); //TODO correct???
stateSizeBytes = stateSizeBytesPtr.get();
checkCudnn(cudnnDropoutGetReserveSpaceSize(cudnnContext.xTensorDesc, stateSizeBytesPtr));

This comment has been minimized.

Copy link
@saudet

saudet Jun 7, 2018

Member

We'd need to call stateSizeBytesPtr.get() after cudnnDropoutGetReserveSpaceSize() for this to make sense.

This comment has been minimized.

Copy link
@AlexDBlack

AlexDBlack Jun 7, 2018

Author Contributor

Ah, of course. Fixed :)

if(statesPtr == null || statesPtr.capacity() != stateSizeBytes){
//TODO: Is this correct? "Pointer to user-allocated GPU memory that will hold random number generator states."
statesPtr = new BytePointer(stateSizeBytes);
}

This comment has been minimized.

Copy link
@saudet

saudet Jun 7, 2018

Member

It probably needs memory allocated on the device. Let's try to use DataCache available in the parent and as used by CudnnConvolutionHelper, CudnnBatchNormalizationHelper, and CudnnLSTMHelper.

This comment has been minimized.

Copy link
@AlexDBlack

AlexDBlack Jun 7, 2018

Author Contributor

Fixed

This comment has been minimized.

Copy link
@saudet

saudet Jun 7, 2018

Member

We should also call states.deallocate() just before new DataCache() to free up precious device memory right away.

@AlexDBlack AlexDBlack force-pushed the ab_cudnn_dropout branch from 66a46fe to 6afb764 Jun 7, 2018

@saudet

This comment has been minimized.

Copy link
Member

commented Jun 8, 2018

Using the code from Caffe2/PyTorch as reference
https://github.com/pytorch/pytorch/blob/master/caffe2/operators/dropout_op_cudnn.cc
it looks like the buffer size returned by cudnnDropoutGetReserveSpaceSize() is for cudnnDropoutForward(), while the buffer size that cudnnSetDropoutDescriptor() requires we can get that from cudnnDropoutGetStatesSize().

@AlexDBlack AlexDBlack force-pushed the ab_cudnn_dropout branch from 9f80129 to 032c91e Jun 8, 2018

@AlexDBlack AlexDBlack requested a review from saudet Jun 8, 2018

@saudet
saudet approved these changes Jun 8, 2018

@AlexDBlack AlexDBlack merged commit 9aef595 into master Jun 8, 2018

0 of 2 checks passed

codeclimate 7 issues to fix
Details
continuous-integration/jenkins/pr-merge This commit is being built
Details

@AlexDBlack AlexDBlack deleted the ab_cudnn_dropout branch Jun 8, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.