Cuda9 updates #2263
Conversation
CPU implementation of L_p feature pooling
GPU implementation of L_p feature pooling
Work around bug in msvc compiler in win32 mode
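For context on the first two commits: L_p pooling raises each element in a window to the power p, sums, and takes the p-th root (p = 1 behaves like sum/average pooling; large p approaches max pooling). A rough pure-Python sketch of the 1-D case — not code from this PR, and `lp_pool_1d` is a hypothetical name:

```python
def lp_pool_1d(xs, p, kernel_size):
    # L_p pooling over non-overlapping windows:
    # each output is (sum of x**p over the window) ** (1/p)
    out = []
    for start in range(0, len(xs) - kernel_size + 1, kernel_size):
        window = xs[start:start + kernel_size]
        out.append(sum(v ** p for v in window) ** (1.0 / p))
    return out

# p=2 over a single window [3, 4] gives sqrt(9 + 16) = 5.0
print(lp_pool_1d([3.0, 4.0], 2, 2))
```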
I reviewed the parts I understood. I'll need to skim through the cuDNN 7 and nccl2 docs to check the rest.
test/test_nn.py (outdated)
output.backward(grad_output)
types = (torch.FloatTensor,)
if TEST_CUDA:
    types += (torch.cuda.FloatTensor,)
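The snippet above builds a tuple of tensor types so the same test body runs on CPU and, when CUDA is available, on GPU. A minimal stand-alone sketch of that pattern, using placeholder types instead of the real torch classes (`HAVE_GPU` and `run_for_all_types` are illustrative names, not from the PR):

```python
# stand-ins for TEST_CUDA and torch.FloatTensor / torch.cuda.FloatTensor
HAVE_GPU = False

types = (float,)
if HAVE_GPU:
    types += (complex,)  # placeholder for a GPU tensor type

def run_for_all_types(fn):
    # run the same check once per available tensor type
    return [fn(t) for t in types]

results = run_for_all_types(lambda t: t(0))
```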
@@ -238,11 +242,11 @@ struct algorithm_search<cudnnConvolutionBwdFilterAlgo_t> {
    CUDNN_CONVOLUTION_BWD_FILTER_ALGO_0,
    CUDNN_CONVOLUTION_BWD_FILTER_ALGO_1,
    CUDNN_CONVOLUTION_BWD_FILTER_ALGO_FFT,
    CUDNN_CONVOLUTION_BWD_FILTER_ALGO_3,
    CUDNN_CONVOLUTION_BWD_FILTER_ALGO_WINOGRAD_NONFUSED,
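The hunk above extends the candidate list that the backward-filter algorithm search considers. The idea behind such an exhaustive search is simple: run every candidate once, time it, and keep the fastest. A hedged Python sketch of that selection loop (`pick_fastest` is a hypothetical helper, not cuDNN's API):

```python
import time

def pick_fastest(algos, run):
    # exhaustive algorithm search: time each candidate once
    # and keep the one with the lowest wall-clock time
    best, best_t = None, float("inf")
    for algo in algos:
        t0 = time.perf_counter()
        run(algo)
        dt = time.perf_counter() - t0
        if dt < best_t:
            best, best_t = algo, dt
    return best
```

Real cuDNN does this through its "find" entry points and also checks workspace limits; this sketch shows only the timing-based selection.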
torch/csrc/cudnn/Conv.cpp (outdated)
  if (groupIdx > 0) {
    long size = 1;
    for (int i = dim; i < tensor->nDimension; ++i) {
      size *= tensor->size[i];
    }
    ptr += elementSize * size * groupIdx / groups;
  }
}
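The C++ above advances a data pointer to the start of one group's slice: it multiplies the sizes of the trailing dimensions, scales by the element size, and divides the per-group share by `groups`. The same arithmetic in Python for clarity (function and parameter names are mine, not from the PR):

```python
def group_offset_bytes(shape, dim, group_idx, groups, element_size):
    # byte offset of group `group_idx` within a tensor slice:
    # product of trailing dims from `dim`, scaled by element size,
    # with each group owning a 1/groups share
    size = 1
    for s in shape[dim:]:
        size *= s
    return element_size * size * group_idx // groups

# e.g. float32 tensor of shape (2, 4, 3), slicing from dim 1,
# second of two groups: 4 * (4*3) * 1 // 2 = 24 bytes
print(group_offset_bytes((2, 4, 3), 1, 1, 2, 4))
```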
This comment was marked as off-topic.
This comment was marked as off-topic.
Sorry, something went wrong.
I got an error while compiling your pull request. It was at the very end, something about not finding HalfTensor. I can rerun it and get the full error message if this isn't a known issue.
@tpankaj Please run
The error persists. Here it is, compiled on an Ubuntu 16.04 system with CUDA 9 and cuDNN 7.
Hi, not sure if this is the same problem as the one mentioned above, but I attempted to build from source using CUDA 9 and cuDNN 7. It seems to get pretty far in the build process, and then towards the end I am getting...
Can someone kindly confirm that this is also an nccl subtree issue?
@TeslasGhost That looks like the exact error I got.
@ngimel I'm continuing to get an error now that I've installed the nccl and nccl dev packages for CUDA 9:
Is there a source directory for that version of nccl that I need to point it to?
Apparently Findnccl.cmake in the gloo subtree is not finding your install of nccl: https://github.com/pytorch/pytorch/blob/master/torch/lib/gloo/cmake/Modules/Findnccl.cmake
I opened up an issue mentioning the steps required to use a user-installed nccl, please see #2375
When will this be updated and merged? cuDNN 7 grouped convolution support resolves the high-priority issue #1708 and is independent of the CUDA 9 updates. If the two feature sets were separated, both would be easier to merge.
Unfortunately it does not. For depthwise-separable convolutions, https://github.com/szagoruyko/pyinn is much better, and for other grouped convolutions the current cudnn version provides only modest improvements.
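The parameter-count arithmetic behind that comparison: a standard k×k convolution uses cin·cout·k² weights, while a depthwise-separable one uses cin·k² (depthwise) plus cin·cout (pointwise 1×1). A small illustrative calculation — the channel counts are an example, not from the PR:

```python
def conv_params(cin, cout, k):
    # standard convolution: one k*k filter per (input, output) channel pair
    return cin * cout * k * k

def depthwise_separable_params(cin, cout, k):
    # depthwise: one k*k filter per input channel,
    # plus a pointwise 1x1 convolution mixing channels
    return cin * k * k + cin * cout

standard = conv_params(32, 64, 3)                   # 32*64*9  = 18432
separable = depthwise_separable_params(32, 64, 3)   # 288+2048 = 2336
print(standard, separable)
```

That roughly 8x reduction in weights (and a similar reduction in FLOPs) is why a dedicated depthwise kernel can beat a generic grouped-convolution path.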
Force-pushed from cbf60dd to e024f41.
Regrouped the cuda9 fixes and cleaned the branch so it wouldn't show other commits in the PR.
Also added cudnn7 grouped convolution support, and an hgemm fix that was needed for cuda9 on pre-Maxwell hardware.
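For readers unfamiliar with grouped convolution: the input channels are split into `groups` independent chunks, each chunk is convolved with its own filters, and the per-group outputs are concatenated. A toy pointwise (1×1) Python sketch of those semantics — hypothetical names, not the PR's implementation:

```python
def grouped_pointwise(x_channels, weights, groups):
    # x_channels: one scalar per input channel (a 1x1 "image" per channel)
    # weights[g]: matrix mapping this group's input channels to its outputs
    # each group sees only its own cin/groups slice of the input
    cin = len(x_channels)
    per_in = cin // groups
    out = []
    for g in range(groups):
        chunk = x_channels[g * per_in:(g + 1) * per_in]
        for row in weights[g]:
            out.append(sum(w * v for w, v in zip(row, chunk)))
    return out

# 4 input channels, 2 groups: each group mixes only its own 2 channels
print(grouped_pointwise([1, 2, 3, 4],
                        [[[1, 0], [0, 1]],
                         [[1, 1], [2, 0]]],
                        groups=2))
```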