
fix usage of Idx to alpaka::Idx #2265

Merged

ichinii
Contributor

@ichinii ichinii commented May 6, 2024

On Windows, an unqualified Idx resolves to something other than alpaka::Idx, so alpaka::Idx should be used explicitly.

Compile log:

[build]   Compiling CUDA source file ..\..\..\..\example\parallelLoopPatterns\src\parallelLoopPatterns.cpp...
[build]   
[build]   C:\Users\ich\Desktop\hzb\forkalpaka\build\gpu-cuda-nvcc\example\parallelLoopPatterns>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\bin\HostX64\x64" -x cu   -IC:\Users\ich\Desktop\hzb\forkalpaka\include -IC:\local\boost_1_84_0 -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\include"  -G   --keep-dir parallel.1B88E8CE\x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static -std=c++17 --generate-code=arch=compute_52,code=[compute_52,sm_52] --extended-lambda --expt-relaxed-constexpr --display-error-number -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -D_USE_MATH_DEFINES -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED -DALPAKA_ACC_GPU_CUDA_ENABLED -DALPAKA_DEBUG=0 -DALPAKA_BLOCK_SHARED_DYN_MEMBER_ALLOC_KIB=47 -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W1 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/FdparallelLoopPatterns.dir\Debug\vc143.pdb" -o parallelLoopPatterns.dir\Debug\parallelLoopPatterns.obj "C:\Users\ich\Desktop\hzb\forkalpaka\example\parallelLoopPatterns\src\parallelLoopPatterns.cpp" 
[build] C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/buf/BufUniformCudaHipRt.hpp(81): error : Idx is not a template [C:\Users\ich\Desktop\hzb\forkalpaka\build\gpu-cuda-nvcc\example\parallelLoopPatterns\parallelLoopPatterns.vcxproj]
[build]                     std::is_same_v<TIdx, Idx<TExtent>>,
[build]                                          ^
[build]             detected during:
[build]               instantiation of "alpaka::BufUniformCudaHipRt<TApi, TElem, TDim, TIdx>::BufUniformCudaHipRt(const alpaka::DevUniformCudaHipRt<TApi> &, TElem *, Deleter, const TExtent &, size_t) [with TApi=alpaka::ApiCudaRt, TElem=float, TDim=std::integral_constant<size_t, 1ULL>, TIdx=uint32_t, TExtent=uint32_t, Deleter=lambda [](float *)->void]" at line 268
[build]               instantiation of "auto alpaka::trait::BufAlloc<TElem, Dim, TIdx, alpaka::DevUniformCudaHipRt<TApi>, void>::allocBuf(const alpaka::DevUniformCudaHipRt<TApi> &, const TExtent &)->alpaka::BufUniformCudaHipRt<TApi, TElem, Dim, TIdx> [with TApi=alpaka::ApiCudaRt, TElem=float, Dim=std::integral_constant<size_t, 1ULL>, TIdx=uint32_t, TExtent=uint32_t]" at line 66 of C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/buf/Traits.hpp
[build]   
[build] C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/buf/BufUniformCudaHipRt.hpp(80): error : static assertion failed with "The idx type of TExtent and the TIdx template parameter have to be identical!" [C:\Users\ich\Desktop\hzb\forkalpaka\build\gpu-cuda-nvcc\example\parallelLoopPatterns\parallelLoopPatterns.vcxproj]
[build]                 static_assert(
[build]                 ^
[build]             detected during:
[build]               instantiation of "alpaka::BufUniformCudaHipRt<TApi, TElem, TDim, TIdx>::BufUniformCudaHipRt(const alpaka::DevUniformCudaHipRt<TApi> &, TElem *, Deleter, const TExtent &, size_t) [with TApi=alpaka::ApiCudaRt, TElem=float, TDim=std::integral_constant<size_t, 1ULL>, TIdx=uint32_t, TExtent=uint32_t, Deleter=lambda [](float *)->void]" at line 268
[build]               instantiation of "auto alpaka::trait::BufAlloc<TElem, Dim, TIdx, alpaka::DevUniformCudaHipRt<TApi>, void>::allocBuf(const alpaka::DevUniformCudaHipRt<TApi> &, const TExtent &)->alpaka::BufUniformCudaHipRt<TApi, TElem, Dim, TIdx> [with TApi=alpaka::ApiCudaRt, TElem=float, Dim=std::integral_constant<size_t, 1ULL>, TIdx=uint32_t, TExtent=uint32_t]" at line 66 of C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/buf/Traits.hpp
[build]   
[build]   2 errors detected in the compilation of "C:/Users/ich/Desktop/hzb/forkalpaka/example/parallelLoopPatterns/src/parallelLoopPatterns.cpp".
[build]   parallelLoopPatterns.cpp
[build] C:\Program Files\Microsoft Visual Studio\2022\Community\MSBuild\Microsoft\VC\v170\BuildCustomizations\CUDA 12.4.targets(799,9): error MSB3721: The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\bin\nvcc.exe"  --use-local-env -ccbin "C:\Program Files\Microsoft Visual Studio\2022\Community\VC\Tools\MSVC\14.39.33519\bin\HostX64\x64" -x cu   -IC:\Users\ich\Desktop\hzb\forkalpaka\include -IC:\local\boost_1_84_0 -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\include"  -G   --keep-dir parallel.1B88E8CE\x64\Debug  -maxrregcount=0   --machine 64 --compile -cudart static -std=c++17 --generate-code=arch=compute_52,code=[compute_52,sm_52] --extended-lambda --expt-relaxed-constexpr --display-error-number -Xcompiler="/EHsc -Zi -Ob0" -g  -D_WINDOWS -D_USE_MATH_DEFINES -DALPAKA_ACC_CPU_B_SEQ_T_SEQ_ENABLED -DALPAKA_ACC_GPU_CUDA_ENABLED -DALPAKA_DEBUG=0 -DALPAKA_BLOCK_SHARED_DYN_MEMBER_ALLOC_KIB=47 -D"CMAKE_INTDIR=\"Debug\"" -D_MBCS -D"CMAKE_INTDIR=\"Debug\"" -Xcompiler "/EHsc /W1 /nologo /Od /FS /Zi /RTC1 /MDd " -Xcompiler "/FdparallelLoopPatterns.dir\Debug\vc143.pdb" -o parallelLoopPatterns.dir\Debug\parallelLoopPatterns.obj "C:\Users\ich\Desktop\hzb\forkalpaka\example\parallelLoopPatterns\src\parallelLoopPatterns.cpp"" exited with code 1. [C:\Users\ich\Desktop\hzb\forkalpaka\build\gpu-cuda-nvcc\example\parallelLoopPatterns\parallelLoopPatterns.vcxproj]

edit:
msvc: Microsoft (R) C/C++ Optimizing Compiler Version 19.39.33523 for x64, installed with Visual Studio 2022
VSCode: 1.89.0

on windows, the use of Idx does not resolve to alpaka::Idx,
thus alpaka::Idx should be used directly
@fwyzard
Contributor

fwyzard commented May 6, 2024

What does the compiler resolve Idx to?

@mehmetyusufoglu
Contributor

mehmetyusufoglu commented May 6, 2024

TExtent=uint32_t

Strange issue... In the idx/Traits.hpp file, the Idx template alias is defined before the specialisation. Namely, template<typename T> using Idx = typename trait::IdxType<T>::type; is defined before the trait::IdxType specialisation for the is_arithmetic case. Could moving the template alias after the specialisation help?

@psychocoderHPC
Member

Could moving the template alias after the specialisation help?

No, this should not help. The type is forward declared, so all is fine. If a non-arithmetic type is used, compilation fails with a missing-definition or incomplete-type error.
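
The point about declaration order can be sketched in isolation. The following is a minimal, self-contained illustration and not the actual contents of idx/Traits.hpp: the namespace `alias_sketch` and the type `NotAnIndex` are made up for this example, and only the names `Idx` and `trait::IdxType` mirror the ones quoted in the thread.

```cpp
#include <cstdint>
#include <type_traits>

namespace alias_sketch // hypothetical namespace, not alpaka itself
{
    namespace trait
    {
        // Declaration only: there is no generic definition.
        template<typename T, typename TSfinae = void>
        struct IdxType;
    } // namespace trait

    // The alias only names trait::IdxType<T>::type. Nothing is looked up
    // inside IdxType until the alias is used with a concrete T.
    template<typename T>
    using Idx = typename trait::IdxType<T>::type;

    namespace trait
    {
        // Specialization for arithmetic types, defined after the alias.
        template<typename T>
        struct IdxType<T, std::enable_if_t<std::is_arithmetic_v<T>>>
        {
            using type = std::decay_t<T>;
        };
    } // namespace trait

    // Both uses compile although the specialization follows the alias.
    static_assert(std::is_same_v<Idx<std::uint32_t>, std::uint32_t>);
    static_assert(std::is_same_v<Idx<std::int64_t>, std::int64_t>);

    // Hypothetical non-arithmetic type: Idx<NotAnIndex> would be rejected
    // because trait::IdxType<NotAnIndex> is an incomplete type.
    struct NotAnIndex
    {
    };
} // namespace alias_sketch
```

If this translation unit compiles, the static assertions have passed, which is consistent with the statement that reordering the alias and the specialization should not change the outcome.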

Member

@psychocoderHPC psychocoderHPC left a comment

I do not see any reason why the compiler should fail if alpaka:: is missing, but if this fixes the Windows issue, it does not hurt to merge it.

@psychocoderHPC
Member

@ichinii Could you please add the nvcc and Visual Studio versions you used where the issue showed up?

@psychocoderHPC
Member

Oh wait, @ichinii, could you please check whether adding #include "alpaka/idx/Traits.hpp" to BufUniformCudaHipRt.hpp and ViewSubView.hpp, instead of adding alpaka::, solves the issue too?
I think the problem is that under Linux we pull in the Idx trait transitively and, for some reason, this is not the case on Windows.

Member

@psychocoderHPC psychocoderHPC left a comment

We should first check whether #2265 (comment) helps; it looks like we are missing the include for the Idx traits.

@@ -53,23 +53,23 @@ namespace alpaka
"The dev type of TView and the Dev template parameter have to be identical!");

static_assert(
- std::is_same_v<TIdx, Idx<View>>,
+ std::is_same_v<TIdx, alpaka::Idx<View>>,
Member

@ichinii Is this change required because you got the same issue as in BufUniformCudaHipRt.hpp, or did you change it because it is the same assert as in BufUniformCudaHipRt.hpp?

I am asking because the include for the Idx trait is already present in this file, and this file did not appear in the error message you posted.

Contributor Author

It is the same issue

[build] C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/view/ViewSubView.hpp(56): error : Idx is not a template [C:\Users\ich\Desktop\hzb\forkalpaka\build\gpu-cuda-nvcc\test\unit\mem\copy\bufSlicingTest.vcxproj]
[build]                     std::is_same_v<TIdx, Idx<View>>,
[build]                                          ^
[build]             detected during:
[build]               instantiation of "alpaka::ViewSubView<TDev, TElem, TDim, TIdx>::ViewSubView(TQualifiedView &, const TExtent &, const TOffsets &) [with TDev=alpaka::DevCpu, TElem=int32_t, TDim=std::integral_constant<size_t, 1ULL>, TIdx=int64_t, TQualifiedView=alpaka::BufCpu<int32_t, std::integral_constant<size_t, 1ULL>, int64_t>, TOffsets=alpaka::Vec<std::integral_constant<size_t, 1ULL>, int64_t>, TExtent=alpaka::Vec<std::integral_constant<size_t, 1ULL>, int64_t>]" at line 76 of C:\Users\ich\Desktop\hzb\forkalpaka\test\unit\mem\copy\src\BufSlicing.cpp
[build]               instantiation of "auto TestContainer<TDim, TIdx, TAcc, TData, Vec>::copySliceOnDevice(TestContainer<TDim, TIdx, TAcc, TData, Vec>::BufDevice, Vec, Vec)->TestContainer<TDim, TIdx, TAcc, TData, Vec>::BufDevice [with TDim=std::integral_constant<size_t, 1ULL>, TIdx=int64_t, TAcc=alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int64_t>, TData=int32_t, Vec=alpaka::Vec<std::integral_constant<size_t, 1ULL>, int64_t>]" at line 138 of C:\Users\ich\Desktop\hzb\forkalpaka\test\unit\mem\copy\src\BufSlicing.cpp
[build]               instantiation of "void CATCH2_INTERNAL_TEMPLATE_TEST_5<TestType>() [with TestType=std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int64_t>, int32_t>]" at line 109 of C:\Users\ich\Desktop\hzb\forkalpaka\test\unit\mem\copy\src\BufSlicing.cpp
[build]               instantiation of "void <unnamed>::ns_CATCH2_INTERNAL_TEMPLATE_TEST_4::CATCH2_INTERNAL_TEMPLATE_TEST_4<Types...>::reg_tests() [with Types=<std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int64_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, int64_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, int64_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, int64_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, int64_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, int64_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, uint64_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, uint64_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, uint64_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, uint64_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, uint64_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, uint64_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int32_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, int32_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, int32_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, int32_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, int32_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, 
std::integral_constant<size_t, 3ULL>, int32_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, uint32_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, uint32_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, uint32_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, uint32_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, uint32_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, uint32_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int64_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, int64_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, int64_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, int64_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, int64_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, int64_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, uint64_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, uint64_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, uint64_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, uint64_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, uint64_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, uint64_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, 
int32_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, int32_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, int32_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, int32_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, int32_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, int32_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, uint32_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, uint32_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, uint32_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, uint32_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, uint32_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, uint32_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int64_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, int64_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, int64_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, int64_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, int64_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, int64_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, uint64_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, uint64_t>, double>, 
std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, uint64_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, uint64_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, uint64_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, uint64_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int32_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, int32_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, int32_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, int32_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, int32_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, int32_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, uint32_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, uint32_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, uint32_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, uint32_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, uint32_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, uint32_t>, double>>]" at line 109 of C:\Users\ich\Desktop\hzb\forkalpaka\test\unit\mem\copy\src\BufSlicing.cpp
[build]   
[build] C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/view/ViewSubView.hpp(55): error : static assertion failed with "The idx type of TView and the TIdx template parameter have to be identical!" [C:\Users\ich\Desktop\hzb\forkalpaka\build\gpu-cuda-nvcc\test\unit\mem\copy\bufSlicingTest.vcxproj]
[build]                 static_assert(
[build]                 ^
[build]             detected during:
[build]               instantiation of "alpaka::ViewSubView<TDev, TElem, TDim, TIdx>::ViewSubView(TQualifiedView &, const TExtent &, const TOffsets &) [with TDev=alpaka::DevCpu, TElem=int32_t, TDim=std::integral_constant<size_t, 1ULL>, TIdx=int64_t, TQualifiedView=alpaka::BufCpu<int32_t, std::integral_constant<size_t, 1ULL>, int64_t>, TOffsets=alpaka::Vec<std::integral_constant<size_t, 1ULL>, int64_t>, TExtent=alpaka::Vec<std::integral_constant<size_t, 1ULL>, int64_t>]" at line 76 of C:\Users\ich\Desktop\hzb\forkalpaka\test\unit\mem\copy\src\BufSlicing.cpp
[build]               instantiation of "auto TestContainer<TDim, TIdx, TAcc, TData, Vec>::copySliceOnDevice(TestContainer<TDim, TIdx, TAcc, TData, Vec>::BufDevice, Vec, Vec)->TestContainer<TDim, TIdx, TAcc, TData, Vec>::BufDevice [with TDim=std::integral_constant<size_t, 1ULL>, TIdx=int64_t, TAcc=alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int64_t>, TData=int32_t, Vec=alpaka::Vec<std::integral_constant<size_t, 1ULL>, int64_t>]" at line 138 of C:\Users\ich\Desktop\hzb\forkalpaka\test\unit\mem\copy\src\BufSlicing.cpp
[build]               instantiation of "void CATCH2_INTERNAL_TEMPLATE_TEST_5<TestType>() [with TestType=std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int64_t>, int32_t>]" at line 109 of C:\Users\ich\Desktop\hzb\forkalpaka\test\unit\mem\copy\src\BufSlicing.cpp
[build]               instantiation of "void <unnamed>::ns_CATCH2_INTERNAL_TEMPLATE_TEST_4::CATCH2_INTERNAL_TEMPLATE_TEST_4<Types...>::reg_tests() [with Types=<std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int64_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, int64_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, int64_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, int64_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, int64_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, int64_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, uint64_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, uint64_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, uint64_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, uint64_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, uint64_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, uint64_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int32_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, int32_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, int32_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, int32_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, int32_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, 
std::integral_constant<size_t, 3ULL>, int32_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, uint32_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, uint32_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, uint32_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, uint32_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, uint32_t>, int32_t>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, uint32_t>, int32_t>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int64_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, int64_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, int64_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, int64_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, int64_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, int64_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, uint64_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, uint64_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, uint64_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, uint64_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, uint64_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, uint64_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, 
int32_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, int32_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, int32_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, int32_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, int32_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, int32_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, uint32_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, uint32_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, uint32_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, uint32_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, uint32_t>, float>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, uint32_t>, float>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int64_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, int64_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, int64_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, int64_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, int64_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, int64_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, uint64_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, uint64_t>, double>, 
std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, uint64_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, uint64_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, uint64_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, uint64_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, int32_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, int32_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, int32_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, int32_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, int32_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, int32_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 1ULL>, uint32_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 1ULL>, uint32_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 2ULL>, uint32_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 2ULL>, uint32_t>, double>, std::tuple<alpaka::AccCpuSerial<std::integral_constant<size_t, 3ULL>, uint32_t>, double>, std::tuple<alpaka::AccGpuUniformCudaHipRt<alpaka::ApiCudaRt, std::integral_constant<size_t, 3ULL>, uint32_t>, double>>]" at line 109 of C:\Users\ich\Desktop\hzb\forkalpaka\test\unit\mem\copy\src\BufSlicing.cpp

@ichinii
Contributor Author

ichinii commented May 12, 2024

Thanks for your quick responses. I got back to my Windows machine today. Let me try to answer your questions.

What does the compiler resolve Idx to?

Just declaring a variable of type Idx<TExtent> raises an error. It looks like it resolves to alpaka::internal::ViewAccessOps<TView>::Idx.
It is kind of strange that it resolves to ViewAccessOps<TView>::Idx even though we are talking about BufUniformCudaHipRt here.

[build] C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/buf/BufUniformCudaHipRt.hpp(77): error : type "alpaka::internal::ViewAccessOps<TView>::Idx [with TView=alpaka::BufUniformCudaHipRt<alpaka::ApiCudaRt, int32_t, std::integral_constant<size_t, 1ULL>, uint64_t>]" (declared at line 45 of C:\Users\ich\Desktop\hzb\forkalpaka\include\alpaka/mem/view/ViewAccessOps.hpp) is inaccessible [C:\Users\ich\Desktop\hzb\forkalpaka\build\gpu-cuda-nvcc\test\unit\mem\copy\bufSlicingTest.vcxproj]
[build]                 Idx<TExtent> test;
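
One plausible reading of this error, which the thread does not confirm, is that the buffer class derives from a helper base that privately declares a non-template member alias named `Idx`, and that this toolchain searches the base when it sees the unqualified name. The sketch below reproduces only the shape of that situation with made-up names (`lookup_sketch`, `AccessOps`, `Buf`); the simplified `Idx` alias and the base class are assumptions for illustration, not alpaka's real definitions.

```cpp
#include <cstdint>
#include <type_traits>

namespace lookup_sketch // hypothetical namespace, not alpaka itself
{
    // Stand-in for the namespace-scope alias template (alpaka::Idx in the
    // thread). Simplifying assumption: the index type of T is T itself.
    template<typename T>
    using Idx = std::decay_t<T>;

    namespace internal
    {
        // Stand-in for a CRTP helper base that privately reuses the name
        // Idx as a plain (non-template) member alias.
        template<typename TView>
        struct AccessOps
        {
        private:
            using Idx = std::int64_t;
        };
    } // namespace internal

    template<typename TIdx>
    struct Buf : internal::AccessOps<Buf<TIdx>>
    {
        template<typename TExtent>
        explicit Buf(TExtent const& /*extent*/)
        {
            // The qualified name always refers to the namespace-scope alias
            // template. An unqualified Idx<TExtent> depends on the compiler
            // not searching the dependent base class; a compiler that does
            // search it finds the private, non-template member instead,
            // which matches "Idx is not a template" and "is inaccessible".
            static_assert(
                std::is_same_v<TIdx, lookup_sketch::Idx<TExtent>>,
                "The idx type of TExtent and the TIdx template parameter have to be identical!");
        }
    };
} // namespace lookup_sketch
```

Constructing `lookup_sketch::Buf<std::uint32_t>` from a `std::uint32_t` extent instantiates the constructor and its assertion. Because the name is qualified, the result does not depend on whether the compiler looks into the base class.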

@ichinii Could you please add the nvcc and Visual Studio versions you used where the issue showed up?

msvc: Microsoft (R) C/C++ Optimizing Compiler Version 19.39.33523 for x64, installed with Visual Studio 2022
VSCode: 1.89.0

Oh wait, @ichinii, could you please check whether adding #include "alpaka/idx/Traits.hpp" to BufUniformCudaHipRt.hpp and ViewSubView.hpp, instead of adding alpaka::, solves the issue too? I think the problem is that under Linux we pull in the Idx trait transitively and, for some reason, this is not the case on Windows.

I added the include to BufUniformCudaHipRt.hpp. It is already present in ViewSubView.hpp. As far as I can tell, it results in the same behaviour.

@psychocoderHPC psychocoderHPC merged commit 35ce4a6 into alpaka-group:develop May 13, 2024
22 checks passed
@psychocoderHPC
Member

thanks for fixing the issue!
