CUDA Thrust cannot access Device #24

isohl · 2015-08-03T20:13:49Z

CUDA Thrust calls such as the following:

thrust::exclusive_scan(thrust::device_ptr<uint2>(input), thrust::device_ptr<uint2>(input + numElements), thrust::device_ptr<uint2>(output), zero);

give the error (generated with -DREALM_BACKTRACE):

terminate called after throwing an instance of 'thrust::system::system_error'
  what():  function_attributes(): after cudaFuncGetAttributes: invalid device function
BACKTRACE (0, 7f3c38567700)
----------
./composite() [0xb21c75]
  /lib/x86_64-linux-gnu/libc.so.6 : ()+0x36d40
  /lib/x86_64-linux-gnu/libc.so.6 : gsignal()+0x39
  /lib/x86_64-linux-gnu/libc.so.6 : abort()+0x148
  /usr/lib/x86_64-linux-gnu/libstdc++.so.6 : __gnu_cxx::__verbose_terminate_handler()+0x155
  /usr/lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0x5e836
  /usr/lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0x5e863
  /usr/lib/x86_64-linux-gnu/libstdc++.so.6 : ()+0x5eaa2
  ./composite : thrust::system::cuda::detail::bulk_::detail::throw_on_error(cudaError, char const*)+0x50
  ./composite : thrust::system::cuda::detail::bulk_::detail::function_attributes_t thrust::system::cuda::detail::bulk_::detail::function_attributes<void (*)(thrust::system::cuda::detail::bulk_::detail::cuda_task<thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<9ul>, 128ul>, 0ul>, thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::scan_detail::accumulate_tiles, thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<1u>, thrust::device_ptr<uint2>, thrust::system::cuda::detail::aligned_decomposition<long>, thrust::detail::normal_iterator<thrust::pointer<uint2, thrust::system::cuda::detail::tag, thrust::use_default, thrust::use_default> >, thrust::plus<uint2>, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type, thrust::null_type> > >)>(void (*)(thrust::system::cuda::detail::bulk_::detail::cuda_task<thrust::system::cuda::detail::bulk_::parallel_group<thrust::system::cuda::detail::bulk_::concurrent_group<thrust::system::cuda::detail::bulk_::agent<9ul>, 128ul>, 0ul>, thrust::system::cuda::detail::bulk_::detail::closure<thrust::system::cuda::detail::scan_detail::accumulate_tiles, thrust::tuple<thrust::system::cuda::detail::bulk_::detail::cursor<1u>, thrust::device_ptr<uint2>, thrust::system::cuda::detail::aligned_decomposition<long>, thrust::detail::normal_iterator<thrust::pointer<uint2, thrust::system::cuda::detail::tag, thrust
----------
BACKTRACE (0, 7f3d429f9700)
----------
./composite() [0xb21c75]
  /lib/x86_64-linux-gnu/libc.so.6 : ()+0x36d40
  /lib/x86_64-linux-gnu/libglib-2.0.so.0 : g_main_context_check()+0x134
  /lib/x86_64-linux-gnu/libglib-2.0.so.0 : ()+0x48f7b
  /lib/x86_64-linux-gnu/libglib-2.0.so.0 : g_main_context_iteration()+0x2c
  /usr/local/Qt/5.4/gcc_64/lib/libQt5Core.so.5 : QEventDispatcherGlib::processEvents(QFlags<QEventLoop::ProcessEventsFlag>)+0xc3
  /usr/local/Qt/5.4/gcc_64/lib/libQt5Core.so.5 : QEventLoop::exec(QFlags<QEventLoop::ProcessEventsFlag>)+0xcb
  /usr/local/Qt/5.4/gcc_64/lib/libQt5Core.so.5 : QCoreApplication::exec()+0x85
  /home/xin/gitlab/legioncomposite/QtViewer/libQtViewer.so.1 : interactThread(void*)+0x6d
  /lib/x86_64-linux-gnu/libpthread.so.0 : ()+0x8182
  /lib/x86_64-linux-gnu/libc.so.6 : clone()+0x6d

----------

A simple test of this failure is:

thrust::host_vector<int> H(4);
H[0] = 14;
H[1] = 20;
H[2] = 38;
H[3] = 46;
thrust::device_vector<int> D = H;
thrust::inclusive_scan(D.begin(), D.end(), D.begin());

The text was updated successfully, but these errors were encountered:

lightsighter · 2015-08-03T21:26:45Z

In Legion, we are very particular about data is managed. In particular, we want all data to be stored in logical regions and not in any third-party data structures like thrust vectors. Once it's out of our control, we can't reason about it and that can lead to problems. In this specific case, there is another issue: we actually don't allow Legion GPU tasks to do anything other than launch kernels (all data movement to and from the GPU should be done via the placement of physical instances in GPU memories through the mapping interface). You may have noticed that the Legion build system only links against '-lcuda' and not '-lcudart'. We do this intentionally, so we can scope exactly the set of CUDA functions that users are permitted to invoke without interfering with the Legion programming model. I'm actually not sure how you managed to get thrust to link with our Legion build system since thrust requires '-lcudart'.

isohl · 2015-08-03T21:30:03Z

Does this mean that you do not plan to support Thrust and other similar libraries?

lightsighter · 2015-08-03T22:19:51Z

We will support a specific subset of thrust that plays nice with Legion. For example, once you've got a physical instance of a logical region in framebuffer memory, Legion can provide you a raw device pointer to that instance. You can then wrap this device pointer in a thrust vector and use thrust calls to launch kernels on the vector (be sure the vector doesn't have an allocator associated with it so thrust doesn't try to reclaim the memory after the handle goes out of scope or bad things will happen). Legion will actually modify the kernel launches underneath of the CUDA runtime API to properly defer them on the right streams so you don't have to synchronize before exiting your GPU task. There is a good example of that here: https://github.com/StanfordLegion/legion/blob/stable/examples/full_circuit/circuit_gpu.cu#L390-L401 Notice we don't need to synchronize with the GPU after the kernel launch; Legion handles that automatically while allowing the GPU processor to continue executing and launching more kernels.

lightsighter · 2015-09-27T04:03:41Z

I've added some missing CUDA functions that thrust needs for most of its kernel calls. There is now an example program showing how to use thrust with Legion.

https://github.com/StanfordLegion/legion/tree/master/examples/thrust_interop

gshipman mentioned this issue Sep 1, 2015

Assertion failure: in lowlevel_dma.cc:2775 - Assertion `dist.find(dst_mem) != dist.end()' failed. #51

Closed

streichler self-assigned this Sep 9, 2015

streichler closed this as completed Oct 1, 2015

jiazhihao mentioned this issue Oct 17, 2015

Assertion failure in legion_analysis.cc #73

Closed

magnatelee mentioned this issue Oct 21, 2015

Invalid remove version state bug #80

Closed

jiazhihao mentioned this issue Oct 1, 2017

Assertion `finder != proc_spaces.end()' failed. #305

Closed

qyz96 mentioned this issue Jun 27, 2020

Regent: Scalability Issue of Cholesky.rg in legion/language/examples #663

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CUDA Thrust cannot access Device #24

CUDA Thrust cannot access Device #24

isohl commented Aug 3, 2015

lightsighter commented Aug 3, 2015

isohl commented Aug 3, 2015

lightsighter commented Aug 3, 2015

lightsighter commented Sep 27, 2015

CUDA Thrust cannot access Device #24

CUDA Thrust cannot access Device #24

Comments

isohl commented Aug 3, 2015

lightsighter commented Aug 3, 2015

isohl commented Aug 3, 2015

lightsighter commented Aug 3, 2015

lightsighter commented Sep 27, 2015