
Build fails with Thrust 2.1: pinned_allocator.h removed #202

Open
torrance opened this issue Apr 26, 2023 · 7 comments

@torrance

pinned_allocator.h was removed as part of pull request NVIDIA/thrust#1611, and a commit referenced from there mentions "Remove thrust::system::cuda::experimental::pinned_allocator.h, which has been deprecated for a long time."

I have no idea what it has been deprecated in favour of. An old issue suggests universal_host_pinned_allocator, but that doesn't appear to actually exist anywhere.

What should it be replaced with?

@benbarsdell
Collaborator

Apparently this is the replacement:

#include <thrust/system/cuda/memory.h>

using pinned_allocator = thrust::mr::stateless_resource_allocator<
    T, thrust::system::cuda::universal_host_pinned_memory_resource>;
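For example, the alias drops in as the allocator parameter of a host_vector. A minimal sketch (requires nvcc and the CUDA Thrust headers; the element type and size here are placeholders):

```cuda
#include <thrust/host_vector.h>
#include <thrust/mr/allocator.h>
#include <thrust/system/cuda/memory.h>

// Replacement for the removed thrust::system::cuda::experimental::pinned_allocator
template <typename T>
using pinned_allocator = thrust::mr::stateless_resource_allocator<
    T, thrust::system::cuda::universal_host_pinned_memory_resource>;

int main() {
    // Backed by page-locked (pinned) host memory, so asynchronous
    // host<->device copies can overlap with kernel execution.
    thrust::host_vector<float, pinned_allocator<float>> hv(1024, 0.0f);
    return 0;
}
```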

@torrance
Author

torrance commented Apr 27, 2023

At your suggestion, I've made the following change:

diff --git a/src/fft.cu b/src/fft.cu
index eeace96..26cd458 100644
--- a/src/fft.cu
+++ b/src/fft.cu
@@ -44,7 +44,7 @@
 #include "ArrayIndexer.cuh"
 #include <thrust/device_vector.h>
 #include <thrust/host_vector.h>
-#include <thrust/system/cuda/experimental/pinned_allocator.h>
+#include <thrust/system/cuda/memory.h>

 #include <cufft.h>
 #include <cufftXt.h>
@@ -63,9 +63,9 @@ class BFfft_impl {
        bool             _using_load_callback;
        thrust::device_vector<char> _dv_tmp_storage;
        thrust::device_vector<CallbackData> _dv_callback_data;
-       typedef thrust::cuda::experimental::pinned_allocator<CallbackData> pinned_allocator_type;
+       using pinned_allocator_type = thrust::mr::stateless_resource_allocator<CallbackData, thrust::universal_host_pinned_memory_resource>;
        thrust::host_vector<CallbackData, pinned_allocator_type> _hv_callback_data;

And that builds.

However, all FFT-related tests currently fail, specifically those using fftshift, which appears to be exactly where this host_vector is used. In these cases, the odata array is all zeros, suggesting something failed to transfer from host to device, or vice versa.

I can't be certain that's the cause, since this is my first time trying to build bifrost, but seems likely.

@torrance
Author

torrance commented May 3, 2023

A little more investigation shows that the entire callback that performs the fftshift isn't running.

What's incredible is that if I add an empty print statement to post_fftshift the callback works and is called:

diff --git a/src/fft_kernels.cu b/src/fft_kernels.cu
index 9aefa89..7ec352c 100644
--- a/src/fft_kernels.cu
+++ b/src/fft_kernels.cu
@@ -28,6 +28,7 @@
 
 #include "fft_kernels.h"
 #include "cuda.hpp"
+#include "stdio.h"
 
 __device__
 inline size_t pre_fftshift(size_t        offset,
@@ -56,6 +57,8 @@ inline Complex post_fftshift(size_t        offset,
        // For forward transforms with apply_fftshift=true, we cyclically shift
        //   the output data by phase-rotating the input data here.
        if( cb->do_fftshift && !cb->inverse ) {
+               if (offset == 0) printf("");
+
                for( int d=0; d<cb->ndim; ++d ) {
                        // Compute the index of this element along dimension d
                        // **TODO: 64-bit indexing support

What's more incredible is that if I add this print statement only to the parent function, in this case callback_load_cf32, it doesn't work and nothing is printed. Only if the print is added to post_fftshift do both print statements print anything at all.

I have no idea what's going on here.
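For context, the standard cuFFT pattern for wiring up a load callback looks roughly like this (a sketch, not bifrost's exact code; the names other than callback_load_cf32 are illustrative). It may be relevant because the host fetches the device-side function pointer through a __device__ symbol, so if the compiler discards or mishandles the device function, the callback can silently never run; a printf keeping the symbol alive would be consistent with that:

```cuda
#include <cufft.h>
#include <cufftXt.h>

// callback_load_cf32 is assumed to be declared elsewhere (e.g. fft_kernels.h).
// The host cannot take the address of a __device__ function directly, so the
// pointer is published through a device symbol and copied back to the host.
__device__ cufftCallbackLoadC d_load_ptr = callback_load_cf32;

void set_load_callback(cufftHandle plan, void* cb_data) {
    cufftCallbackLoadC h_load_ptr;
    cudaMemcpyFromSymbol(&h_load_ptr, d_load_ptr, sizeof(h_load_ptr));
    cufftXtSetCallback(plan, (void**)&h_load_ptr, CUFFT_CB_LD_COMPLEX, &cb_data);
}
```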

@jaycedowell
Collaborator

I've updated the self-hosted runner to Ubuntu 20.04 and CUDA 12.0, and I'm now seeing this in the CI. I'm also getting a 'cuda/stream.hpp(85): error: namespace "cuda::std" has no member "runtime_error"' error there as well. Working through those locally, I can get Bifrost to build, but all of the test_fft tests are failing with a lot of zero-filled results.

I played around with this a little bit and ended up with fewer errors after changing the declaration of CallbackData in fft_kernels.h to struct __attribute__((packed)) CallbackData. I'm not sure why this would matter, but I now only get errors on the complex-to-real transform tests.

@jaycedowell
Collaborator

I think my complex-to-real errors are from an older version of the test suite (I've been testing on "ibverb-support"). As of abee49a CI looks to be ok.

@torrance
Author

> I'm also getting a 'cuda/stream.hpp(85): error: namespace "cuda::std" has no member "runtime_error"' error there as well

Yes, I got that too and had to make it an absolute import.

> I played around with this a little bit and ended up with fewer errors if I changed the declaration of CallbackData in fft_kernels.h to be a struct __attribute__((packed)) CallbackData. I'm not sure why this would matter but I now only get errors on the complex-to-real transform tests.

I can confirm this works for me too, however the compiler complains:

fft_kernels.h:109:13: warning: ignoring packed attribute because of unpacked non-POD field ‘int_fastdiv CallbackData::istrides [3]’
  109 |  int_fastdiv istrides[3]; // Note: Elements, not bytes

...so I'm not sure why it works, especially since the compiler is telling me it's being ignored (!).

@jaycedowell
Collaborator

I also got that compiler warning. I'm hesitant to call this "solved" since it's not clear why this change makes any difference. But it does seem to yield the correct FFT results and it doesn't appear to cause any problems with earlier versions of CUDA. Maybe this is a "works for me".
