
DCN on Jetson TX2 #3041

Open
MauroPfister opened this issue Jun 16, 2020 · 4 comments
Labels: community help wanted

Comments

@MauroPfister

Hi

I am trying to use the deformable convolutions from this repo on a Jetson TX2. Compilation was successful and I can also run them from Python. However, for every call of the DCN I get the following error:
error in deformable_im2col: too many resources requested for launch

I was wondering if there are any settings in the .cu files that I can change to fix this error?

Minimal reproducible example

# Execute from parent directory of ops folder

import torch
from ops.dcn import DeformConvPack

device = torch.device('cuda')
dcn = DeformConvPack(in_channels=256,
                     out_channels=256,
                     kernel_size=(3, 3),
                     padding=1).to(device)
input = torch.Tensor(16, 256, 26, 20).to(device)  # shape (N, C, H, W)
output = dcn(input)

Environment

  • Jetson TX2 with JetPack 4.3
  • Python 3.6.9
  • PyTorch 1.4
  • Torchvision 0.5

Since I only wanted to install the DCN ops rather than the whole repo, I used a reduced setup.py (copied from this repo):

from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name='mmdet',
    ext_modules=[
        CUDAExtension('deform_conv_cuda', [
            'src/deform_conv_cuda.cpp',
            'src/deform_conv_cuda_kernel.cu'
        ]),
        CUDAExtension('deform_pool_cuda', [
            'src/deform_pool_cuda.cpp',
            'src/deform_pool_cuda_kernel.cu'
        ])
    ],
    cmdclass={
        'build_ext': BuildExtension
    })

Bug fix
After a quick search on Google I found this PyTorch issue, which seems related. Unfortunately I have no experience with CUDA at all, so I am not sure whether it helps.

@MauroPfister
Author

I was able to solve the issue by changing CUDA_NUM_THREADS from 1024 to 512 in the kernel source and recompiling:

const int CUDA_NUM_THREADS = 512;  // was 1024

The regular PyTorch convolutions do not seem to have this problem. Maybe the CUDA_NUM_THREADS constant could be set depending on which architecture the DCNs are built for?
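
For example, something like the CUDA occupancy API might be able to pick a per-kernel block size automatically instead of relying on a hard-coded constant. A rough, untested sketch of the idea (example_kernel and launch_example are placeholder names, not code from this repo):

#include <cuda_runtime.h>

// Hypothetical stand-in for a kernel such as deformable_im2col.
__global__ void example_kernel(const float* in, float* out, int n) {
  int i = blockIdx.x * blockDim.x + threadIdx.x;
  if (i < n) out[i] = in[i] * 2.0f;
}

void launch_example(const float* in, float* out, int n) {
  int min_grid_size = 0;
  int block_size = 0;
  // Ask the runtime for the largest block size this particular kernel can be
  // launched with on the current device, given its register and shared memory usage.
  cudaOccupancyMaxPotentialBlockSize(&min_grid_size, &block_size,
                                     example_kernel, 0, 0);
  int grid_size = (n + block_size - 1) / block_size;
  example_kernel<<<grid_size, block_size>>>(in, out, n);
}

Since the occupancy query accounts for the kernel's actual resource usage, it should return something below 1024 on devices like the TX2 where the full block size cannot be launched.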

@hellock
Member

hellock commented Jun 19, 2020

Thanks for the report! It is a known issue that setting CUDA_NUM_THREADS to 1024 causes this failure on some older or resource-limited devices. We have not found a good way to set it according to the GPU architecture. PRs are welcome if you have any ideas.

@MauroPfister
Author

I don't have any experience with PyTorch CUDA extensions, so unfortunately I can't help with a PR. But maybe this could be mentioned in a README somewhere? That way people could easily fix the issue themselves.

@hhaAndroid added the community help wanted label on Apr 19, 2021
@jshilong
Collaborator

Thanks for the report! I will add it to the FAQ to help people locate this problem faster.
