About inference #1

Open
thunanguyen opened this issue Jun 9, 2021 · 8 comments

Comments

@thunanguyen

Hi! I've run into a bug. Training works fine, but when I run inference it fails with:
GPUassert: too many resources requested for launch network_eval.cu 292

My GPU is RTX Titan
My CUDA version is 11.1
My CuDNN version is 8
My OS is Ubuntu 18.04

@creiser
Owner

creiser commented Jun 16, 2021

Hi! Yeah, we experienced the same problem on an RTX 2080 Ti, which is probably quite close to your Titan RTX. Despite these cards being newer than the GTX 1080 Ti we used, there seems to be a shortage of some resource. It might be that the compiler uses more registers per thread for these newer cards (more than are available). I will look into it.

@thunanguyen
Author

Thanks for your response. I have been looking into this for weeks now and still don't know what's happening. Could it be because of the ray-tracing hardware on RTX GPUs?

@creiser
Owner

creiser commented Jun 16, 2021

An easy fix should be to run the network_eval kernel with fewer threads per block, e.g. 512 instead of 640, but then performance suffers. It should also be possible to run it on a newer GPU with the original block size. Maybe they cut some resources to make room for the ray-tracing capabilities, but I checked the specification and these cards still have the same register count and shared memory.

@creiser
Owner

creiser commented Jun 16, 2021

I fixed the problem and it now runs on an RTX 2080 Ti, so it should also work for you. Even with this suboptimal fix, I measured 17 ms on the Lego scene with the RTX 2080 Ti.

e79af85

@bruinxiong

@creiser It's weird. I checked my NVIDIA 1080 Ti, the same card as yours, with compute capability 6.1. Even when I set fewer threads per block, such as 512 instead of 640, I still get GPUassert: too many resources requested for launch network_eval.cu 292. I have to go down to 256 and take the performance hit.

@creiser
Owner

creiser commented Jul 27, 2021

@bruinxiong Yeah, that does not make sense with a GTX 1080 Ti. Did you use the pre-compiled CUDA extension, or did you compile the code yourself? If you compiled the extension yourself, this problem might be caused by an old version of the CUDA Toolkit. Just to be safe, make sure to use a recent driver version as well (though this shouldn't be the cause).
If your PC has multiple GPUs, make sure the right one is being used; the program prints out the GPU it uses at startup.

@bruinxiong

@creiser Hi, I compiled the code myself. I use the latest CUDA Toolkit, 11.2.0. I have 3 GTX 1080 Ti GPUs in my PC and use the default one (GPU 0) to render in the interactive viewer mode. I can only set 256 threads per block; as soon as I increase it beyond 256, the above error appears.

@creiser
Owner

creiser commented Jul 30, 2021 via email
