non-degenerate twisted clover #1121

kostrzewa
Member

This draft PR tracks our feature/ndeg-twisted-clover branch for the purpose of build testing and discussion.

Currently being worked on by @urbach and @kostrzewa.

urbach and others added 30 commits March 1, 2021 08:59
…stedClover. Doesn't work yet, but at least the reference operator is called from within dslash_test. There is a CUDA_ERROR_ILLEGAL_ADDRESS error in the GPU kernel call, not sure if it's actually calling the right kernel there...
…ing tm_ndeg_mat and tm_ndeg_matpc tests pass)
Marcogarofalo and others added 3 commits August 6, 2021 22:25
add 'tm_rho' parameter for Hasenbusch mass preconditioning for twiste…
@kostrzewa
Member Author

@pittlerf The merge of the stuff from #1136 seems to have broken the test builds here, can you take a look?

@pittlerf
Contributor

> @pittlerf The merge of the stuff from #1136 seems to have broken the test builds here, can you take a look?

yeah, of course.

@kostrzewa
Member Author

> yeah, of course.

Thanks. I guess it's probably a matter of merging with the latest GK. If that's the case, let me know and I'll take care of it.

@pittlerf
Contributor

> > yeah, of course.
>
> Thanks. I guess it's probably a matter of merging with the latest GK. If that's the case, let me know and I'll take care of it.

I tried to look at it, but I got the following when I clicked on "Details":

HTTP ERROR 404 Not Found

URI: /job/qudabase_ubuntu2004_cuda114_gnu11_clang12_cmake321/COMMS=-DQUDA_MPI=ON,GPUARCH=sm_70/10/display/redirect
404
Not Found
Stapler

It seems to me that the test has been deleted.

@mathiaswagner
Member

mathiaswagner commented Sep 22, 2021

Yes, qudabase_ubuntu2004_cuda114_gnu11_clang12_cmake321 has been removed from Jenkins, as Clang 12 does not like the GCC 11 headers when using clang's CUDA compiler. Sorry about the confusion.
The qudabase_ubuntu2004_cuda114_gnu11_cmake321 errors have been fixed in generic_kernel and merging that should fix the build here as well.

This is not related to anything you have done; it comes from us updating Jenkins to use recent compilers.

@maddyscientist deleted the branch lattice:feature/generic_kernel October 6, 2021 00:31
@maddyscientist
Member

Looks like this got closed since GK is merged. You should be able to reopen if you retarget it at develop.

@kostrzewa
Member Author

kostrzewa commented Oct 21, 2021

@maddyscientist Thanks for the heads-up. Just now finding some time to take care of this. I'm having some issues building develop on my machine (as well as on my dev system).

[...]
[  4%] Building CXX object lib/CMakeFiles/quda_cpp.dir/targets/cuda/quda_api.cpp.o
In file included from /home/bartek/code/quda-develop/lib/../include/quda_internal.h:4,
                 from /home/bartek/code/quda-develop/lib/../include/tune_quda.h:14,
                 from /home/bartek/code/quda-develop/lib/targets/cuda/quda_api.cpp:2:
/home/bartek/code/quda-develop/lib/../include/quda_api.h:253:5: error: ‘quda’ in ‘cudaError_t’ {aka ‘enum cudaError’} does not name a type
  253 |   ::quda::qudaEventQuery_(event, __func__, quda::file_name(__FILE__), __STRINGIFY__(__LINE__))
      |     ^~~~
/home/bartek/build/quda-develop/include/quda_cuda_api.h:126:15: note: in expansion of macro ‘qudaEventQuery’
  126 |   cudaError_t qudaEventQuery(cudaEvent_t &event);
      |               ^~~~~~~~~~~~~~
/home/bartek/code/quda-develop/lib/../include/quda_api.h:256:5: error: ‘quda’ in ‘cudaError_t’ {aka ‘enum cudaError’} does not name a type
  256 |   ::quda::qudaEventRecord_(event, stream, __func__, quda::file_name(__FILE__), __STRINGIFY__(__LINE__))
[...]

I have gcc 9.3.0, cuda 11.5 as well as cmake 3.16.3 directly available. Am I right in assuming that I need a newer compiler and/or CMake version?

@kostrzewa
Member Author

This was with -DQUDA_CXX_STANDARD=14 set.

@kostrzewa
Member Author

Having upgraded CMake, I can now also configure the build without QUDA_CXX_STANDARD=14, but the errors seem to persist:

[ 35%] Building CXX object lib/CMakeFiles/quda_cpp.dir/interface/blas_interface.cpp.o
In file included from /home/bartek/code/quda-develop/lib/../include/comm_quda.h:4,
                 from /home/bartek/code/quda-develop/lib/targets/cuda/comm_target.cpp:1:
/home/bartek/code/quda-develop/lib/../include/quda_api.h:253:5: error: ‘quda’ in ‘cudaError_t’ {aka ‘enum cudaError’} does not name a type
  253 |   ::quda::qudaEventQuery_(event, __func__, quda::file_name(__FILE__), __STRINGIFY__(__LINE__))
      |     ^~~~
/home/bartek/build/quda-develop/include/quda_cuda_api.h:126:15: note: in expansion of macro ‘qudaEventQuery’
  126 |   cudaError_t qudaEventQuery(cudaEvent_t &event);
      |               ^~~~~~~~~~~~~~
/home/bartek/code/quda-develop/lib/../include/quda_api.h:256:5: error: ‘quda’ in ‘cudaError_t’ {aka ‘enum cudaError’} does not name a type
  256 |   ::quda::qudaEventRecord_(event, stream, __func__, quda::file_name(__FILE__), __STRINGIFY__(__LINE__))
      |     ^~~~
/home/bartek/build/quda-develop/include/quda_cuda_api.h:133:15: note: in expansion of macro ‘qudaEventRecord’
  133 |   cudaError_t qudaEventRecord(cudaEvent_t &event, qudaStream_t stream = 0);
      |               ^~~~~~~~~~~~~~~
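
Both `note:` lines in the log above refer to `quda_cuda_api.h` under the build tree (`/home/bartek/build/quda-develop/include`), while the macros themselves come from `quda_api.h` in the source tree. A rough sketch for spotting headers that exist only in the build tree follows. The function name `list_build_only_headers` is made up for illustration, and it assumes both trees keep their headers in a top-level `include/` directory. Generated headers legitimately live in the build tree too, so the output is a list of candidates to inspect, not a verdict.

```sh
# List header files present in BUILD_DIR/include but absent from SRC_DIR/include.
# Hypothetical helper: usage is list_build_only_headers SRC_DIR BUILD_DIR
list_build_only_headers() (
  src_inc="$1/include"
  build_inc="$2/include"
  # Nothing to report if the build tree has no include directory.
  [ -d "$build_inc" ] || exit 0
  for f in "$build_inc"/*.h; do
    [ -e "$f" ] || continue          # the glob matched nothing
    name=$(basename "$f")
    # Print only names that have no counterpart in the source tree.
    [ -e "$src_inc/$name" ] || echo "$name"
  done
)
```

With the paths from the log, this would be called as `list_build_only_headers "$HOME/code/quda-develop" "$HOME/build/quda-develop"`.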

@maddyscientist
Member

@kostrzewa can you share with me your compiler versions and cmake command?

@kostrzewa
Member Author

cmake \
-DCMAKE_INSTALL_PREFIX="$(pwd)/install_dir" \
-DCMAKE_BUILD_TYPE=RELEASE \
-DQUDA_BUILD_SHAREDLIB=ON \
-DQUDA_FAST_COMPILE_REDUCE=ON \
-DQUDA_FAST_COMPILE_DSLASH=ON \
-DQUDA_BUILD_ALL_TESTS=OFF \
-DQUDA_GPU_ARCH=sm_61 \
-DQUDA_INTERFACE_QDP=ON \
-DQUDA_INTERFACE_MILC=OFF \
-DQUDA_DIRAC_WILSON=ON \
-DQUDA_DIRAC_TWISTED_MASS=ON \
-DQUDA_DIRAC_TWISTED_CLOVER=ON \
-DQUDA_DIRAC_NDEG_TWISTED_MASS=ON \
-DQUDA_DIRAC_CLOVER=ON \
-DQUDA_DYNAMIC_CLOVER=ON \
-DQUDA_DIRAC_STAGGERED=OFF \
-DQUDA_DIRAC_DOMAIN_WALL=OFF \
-DQUDA_MULTIGRID=OFF \
-DQUDA_QMP=OFF \
-DQUDA_QIO=OFF \
-DQUDA_MPI=ON \
-DQUDA_DOWNLOAD_USQCD=ON \
${HOME}/code/quda-develop

GCC 9.3.0, now CMake 3.21.3, nvcc V11.0.194 (on my machine)
GCC 9.3.0, CMake 3.21.3, nvcc V11.5.50 (on my dev machine)
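
For reference, a small sketch for collecting this kind of version information in one go. The helper names `tool_version` and `report_toolchain` are made up here, and the sketch assumes each tool accepts `--version` (which gcc, g++, nvcc and cmake do). Every output line is prefixed with the tool name, since nvcc prints its release number on a later line rather than the first.

```sh
# Print "<tool> --version" output with each line prefixed by the tool name,
# or a "not found" placeholder if the tool is not on PATH.
tool_version() (
  tool="$1"
  if command -v "$tool" >/dev/null 2>&1; then
    "$tool" --version 2>&1 | sed "s/^/[$tool] /"
  else
    echo "[$tool] not found"
  fi
)

# Report the tools relevant to a CUDA build of this kind.
report_toolchain() {
  for t in gcc g++ nvcc cmake; do
    tool_version "$t"
  done
}
```

Running `report_toolchain` on each machine and pasting the result alongside the cmake command gives maintainers the whole picture in one comment.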

@kostrzewa
Member Author

on my dev machine:

cmake \
-DCMAKE_INSTALL_PREFIX="$(pwd)/install_dir" \
-DCMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc \
-DCMAKE_BUILD_TYPE=RELEASE \
-DQUDA_BUILD_ALL_TESTS=OFF \
-DQUDA_GPU_ARCH=sm_80 \
-DQUDA_INTERFACE_QDP=ON \
-DQUDA_INTERFACE_MILC=OFF \
-DQUDA_MPI=ON \
-DQUDA_DIRAC_WILSON=ON \
-DQUDA_DIRAC_TWISTED_MASS=ON \
-DQUDA_DIRAC_TWISTED_CLOVER=ON \
-DQUDA_DIRAC_NDEG_TWISTED_CLOVER=ON \
-DQUDA_DIRAC_NDEG_TWISTED_MASS=ON \
-DQUDA_DIRAC_CLOVER=ON \
-DQUDA_DYNAMIC_CLOVER=ON \
-DQUDA_DIRAC_DOMAIN_WALL=OFF \
-DQUDA_DIRAC_STAGGERED=OFF \
-DQUDA_MULTIGRID=ON \
-DQUDA_FORCE_GAUGE=ON \
-DQUDA_QMP=OFF \
-DQUDA_QIO=OFF \
-DQUDA_DOWNLOAD_USQCD=ON \

@kostrzewa
Member Author

@maddyscientist it might be a PEBKAC thing to do with not fully cleaning out the build directory...

@maddyscientist
Member

Ha, that was my next question, whether the problem persists with a fresh build dir.

@kostrzewa
Member Author

> Ha, that was my next question, whether the problem persists with a fresh build dir.

Yeah, I think we can safely say that I've wasted your time ;)
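
A minimal sketch of starting over with a clean build directory, assuming an out-of-source build like the ones in the cmake commands above. The helper `fresh_build_dir` is hypothetical and not part of QUDA or CMake. As a safety check it only wipes a directory that is empty, missing, or contains a `CMakeCache.txt`.

```sh
# Recreate BUILD_DIR from scratch so that no cached CMake state or
# previously generated files survive into the next configure step.
# Hypothetical helper: usage is fresh_build_dir BUILD_DIR
fresh_build_dir() (
  build_dir="$1"
  case "$build_dir" in
    ""|"/") echo "refusing to remove '$build_dir'" >&2; exit 1 ;;
  esac
  # Only touch directories that look like CMake build trees (or are empty).
  if [ -d "$build_dir" ] && [ ! -f "$build_dir/CMakeCache.txt" ] \
     && [ -n "$(ls -A "$build_dir")" ]; then
    echo "refusing: $build_dir is non-empty and has no CMakeCache.txt" >&2
    exit 1
  fi
  rm -rf "$build_dir" && mkdir -p "$build_dir"
)
```

After `fresh_build_dir "$HOME/build/quda-develop"`, change into that directory and rerun the full cmake command. Note that with `-DCMAKE_INSTALL_PREFIX="$(pwd)/install_dir"` the install tree sits inside the build directory and is removed as well.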

@kostrzewa
Member Author

Superseded by #1196
