Only one GPU usable on multi-GPU system #63

Closed
moyix opened this issue Oct 17, 2020 · 5 comments

moyix commented Oct 17, 2020

When attempting to use the second GPU on a multi-GPU system with the current git version (f6c0495), I get:

moyix@isabella:~/git/pbrt-v4-scenes/barcelona-pavilion$ ~/git/pbrt-v4/build/pbrt --gpu --gpu-device 1 pavilion-night.pbrt 
pbrt version 4 (built Oct 16 2020 at 19:02:18)
Copyright (c)1998-2020 Matt Pharr, Wenzel Jakob, and Greg Humphreys.
The source code to pbrt (but *not* the book contents) is covered by the Apache 2.0 License.
See the file LICENSE.txt for the conditions of the license.
[ 181564.000 20201016.194016 /home/moyix/git/pbrt-v4/src/pbrt/textures.cpp:1141 ] FATAL CUDA error: operation not supported
(/home/moyix/git/pbrt-v4/build/pbrt      )	0x0x55d25b29632a - pbrt::PrintStackTrace() + 0x3a
(/home/moyix/git/pbrt-v4/build/pbrt      )	0x0x55d25b2965a6 - pbrt::CheckCallbackScope::Fail() + 0x26
(/home/moyix/git/pbrt-v4/build/pbrt      )	0x0x55d25b2f9648 - pbrt::LogFatal(pbrt::LogLevel, char const*, int, char const*) + 0xe8
(/home/moyix/git/pbrt-v4/build/pbrt      )	0x0x55d25b245a6c - void pbrt::LogFatal<char const*>(pbrt::LogLevel, char const*, int, char const*, char const*&&) + 0x5c
(/home/moyix/git/pbrt-v4/build/pbrt      )	0x0x55d25b278140 - pbrt::GPUFloatImageTexture::Create(pbrt::Transform const&, pbrt::TextureParameterDictionary const&, pbrt::FileLoc const*, pstd::pmr::polymorphic_allocator<std::byte>) + 0x1170
(/home/moyix/git/pbrt-v4/build/pbrt      )	0x0x55d25b27849b - pbrt::FloatTextureHandle::Create(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, pbrt::Transform const&, pbrt::TextureParameterDictionary const&, pbrt::FileLoc const*, pstd::pmr::polymorphic_allocator<std::byte>, bool) + 0x1fb
(/home/moyix/git/pbrt-v4/build/pbrt      )	0x0x55d25b1d5e22 - pbrt::ParsedScene::CreateTextures(pstd::pmr::polymorphic_allocator<std::byte>, bool) const + 0x932
(/home/moyix/git/pbrt-v4/build/pbrt      )	0x0x55d25b5d56c0 - pbrt::GPUAccel::GPUAccel(pbrt::ParsedScene const&, pstd::pmr::polymorphic_allocator<std::byte>, CUstream_st*, std::map<int, pstd::vector<pbrt::LightHandle, pstd::pmr::polymorphic_allocator<pbrt::LightHandle> >*, std::less<int>, std::allocator<std::pair<int const, pstd::vector<pbrt::LightHandle, pstd::pmr::polymorphic_allocator<pbrt::LightHandle> >*> > > const&, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, pbrt::MediumHandle, std::less<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, pbrt::MediumHandle> > > const&, pstd::array<bool, 12>*, pstd::array<bool, 12>*, bool*) + 0x11f0
(/home/moyix/git/pbrt-v4/build/pbrt      )	0x0x55d25b3b018a - pbrt::GPUPathIntegrator::GPUPathIntegrator(pstd::pmr::polymorphic_allocator<std::byte>, pbrt::ParsedScene const&) + 0xeca
(/home/moyix/git/pbrt-v4/build/pbrt      )	0x0x55d25b3b2467 - pbrt::GPURender(pbrt::ParsedScene&) + 0x57
(/home/moyix/git/pbrt-v4/build/pbrt      )	0x0x55d25b1b7476 - main + 0x1326
(/lib/x86_64-linux-gnu/libc.so.6         )	0x0x7f9a90e5d0b3 - __libc_start_main + 0xf3
(/home/moyix/git/pbrt-v4/build/pbrt      )	0x0x55d25b1be9de - _start + 0x2e


Aborted

Setup is CUDA 11.1 on Linux:

Fri Oct 16 20:19:41 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 455.23.05    Driver Version: 455.23.05    CUDA Version: 11.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 3090    On   | 00000000:21:00.0  On |                  N/A |
| 75%   75C    P2   348W / 350W |   5195MiB / 24265MiB |     99%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 3090    On   | 00000000:4A:00.0 Off |                  N/A |
|  0%   37C    P8    28W / 350W |     13MiB / 24268MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      6558      G   /usr/lib/xorg/Xorg                102MiB |
|    0   N/A  N/A      9083      G   /usr/lib/xorg/Xorg                537MiB |
|    0   N/A  N/A      9268      G   /usr/bin/gnome-shell               89MiB |
|    0   N/A  N/A   1062811      G   ...oken=16993796885969011938       29MiB |
|    0   N/A  N/A   1539332      C   ...ix/git/pbrt-v4/build/pbrt     1679MiB |
|    0   N/A  N/A   3735375      G   ./tev                             368MiB |
|    0   N/A  N/A   4063098      G   ...AAAAAAAAA= --shared-files       66MiB |
|    1   N/A  N/A      6558      G   /usr/lib/xorg/Xorg                  4MiB |
|    1   N/A  N/A      9083      G   /usr/lib/xorg/Xorg                  4MiB |
|    1   N/A  N/A      9268      G   /usr/bin/gnome-shell                0MiB |
|    1   N/A  N/A   1062811      G   ...oken=16993796885969011938        0MiB |
|    1   N/A  N/A   1539332      C   ...ix/git/pbrt-v4/build/pbrt        0MiB |
|    1   N/A  N/A   3735375      G   ./tev                               0MiB |
|    1   N/A  N/A   4063098      G   ...AAAAAAAAA= --shared-files        0MiB |
+-----------------------------------------------------------------------------+

And, of course, support for using multiple GPUs at once would be great as well :)

moyix (Author) commented Oct 17, 2020

Verbose log: pbrt_2gpu_verbose.txt

moyix (Author) commented Oct 17, 2020

Setting the CUDA_VISIBLE_DEVICES environment variable to 1 allows pbrt to use the second GPU. Perhaps some piece of the GPU initialization code has GPU 0 hardcoded?
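
For reference, the workaround can be sketched as a small shell helper. This is only a sketch, not the exact command used in this thread: `run_on_gpu` is a hypothetical name, and the pbrt invocation in the trailing comment assumes the paths from the original report. CUDA renumbers the visible devices starting from 0, so with CUDA_VISIBLE_DEVICES=1 the second GPU shows up as device 0 inside the process and `--gpu-device` can be left out.

```shell
# run_on_gpu is a hypothetical helper (not part of pbrt): it runs a
# command with only one physical GPU exposed to the CUDA runtime.
run_on_gpu() {
    gpu_index="$1"
    shift
    # env sets the variable for the child process only, so the calling
    # shell's environment is left untouched.
    env CUDA_VISIBLE_DEVICES="$gpu_index" "$@"
}

# Assumed usage, with the paths from the report above:
#   run_on_gpu 1 ~/git/pbrt-v4/build/pbrt --gpu pavilion-night.pbrt
```

Because the variable is applied per invocation, a render already running on GPU 0 is not affected.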

mmp added a commit that referenced this issue Oct 21, 2020

mmp (Owner) commented Oct 21, 2020

Sorry for not looking at this sooner, and thanks for all of that information and the tidbit about CUDA_VISIBLE_DEVICES.

I've pushed a change that might fix this, but I'm not sure since I can't reproduce it locally. When you have a chance, can you let me know if that helps?

If it doesn't, then if you could try a run with a debug build, that'd be fantastic. It would also be interesting to know whether it reproduces when you run with --nthreads 1. (But my hope is that it's now fixed!)
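
The diagnostic run requested above might look like the following sketch. The assumptions are: `repro_single_thread` is a hypothetical helper name, the `build-debug` directory is made up for illustration, and `CMAKE_BUILD_TYPE=Debug` is the standard CMake way to request a debug build rather than anything specific to pbrt. The flags `--gpu`, `--gpu-device`, and `--nthreads` are the ones already mentioned in this thread.

```shell
# Debug build in a separate, assumed directory (needs CMake 3.13+ for -S/-B):
#   cmake -S ~/git/pbrt-v4 -B ~/git/pbrt-v4/build-debug -DCMAKE_BUILD_TYPE=Debug
#   cmake --build ~/git/pbrt-v4/build-debug

# repro_single_thread is a hypothetical helper: it reruns the failing
# command from the report with --nthreads 1 added.
repro_single_thread() {
    pbrt_bin="$1"   # path to the pbrt binary to test
    scene="$2"      # scene file, e.g. pavilion-night.pbrt
    "$pbrt_bin" --gpu --gpu-device 1 --nthreads 1 "$scene"
}

# Assumed usage:
#   repro_single_thread ~/git/pbrt-v4/build-debug/pbrt pavilion-night.pbrt
```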

moyix (Author) commented Oct 22, 2020

Yep, that fixed it!

moyix closed this as completed Oct 22, 2020

mmp (Owner) commented Oct 22, 2020

Yaay!

Dolkar pushed a commit to Dolkar/pbrt-v4-myod-integration that referenced this issue May 8, 2023