
Get rid of VectorSpaceVector in the Cuda Vector class. #15622

Merged: 1 commit, merged Jul 6, 2023

Conversation

@bangerth (Member) commented Jul 3, 2023

First simple step towards the resolution of #3073.

```diff
 {
   // Check that casting will work.
   Assert(dynamic_cast<const Vector<Number> *>(&scaling_factors) != nullptr,
          ExcVectorTypeNotCompatible());

   // Downcast V. If fails, throw an exception.
-  const Vector<Number> &down_scaling_factors =
+  const Vector<Number> &scaling_factors =
```
Review comment by a Member:

We don't need this line anymore

@bangerth (Member, Author) replied:

Yes. I also learned in #15623 that there are likely many more errors still around the corner. I shouldn't have picked this class, since I can't compile it locally :-(

@Rombur (Member) commented Jul 3, 2023

There are still a bunch of `override` specifiers that should be removed in cuda_vector.h.

@bangerth (Member, Author) commented Jul 4, 2023

Not just that, but we should also stop marking these functions as `virtual`. Let's see how this looks now!

@masterleinad (Member) left a review comment:

Let's see what the CI says.

@kronbichler (Member) commented:

There are unrelated failures in the serial CI. If you rebase, I believe we should be OK, but let's play it safe.

@bangerth (Member, Author) commented Jul 5, 2023

I have to admit that I don't know what is happening with the CUDA check here. I understand what the error message says, but I can't figure out where we call these functions from. Do any of you CUDA people have suggestions?

@masterleinad (Member) commented Jul 5, 2023

The error message is

```
/home/runner/work/dealii/dealii/source/lac/cuda_vector.cc: In instantiation of ‘void dealii::LinearAlgebra::CUDAWrappers::Vector<Number>::scale(const dealii::LinearAlgebra::CUDAWrappers::Vector<Number>&) [with Number = float]’:
/home/runner/work/dealii/dealii/source/lac/cuda_vector.cc:618:16:   required from here
/home/runner/work/dealii/dealii/source/lac/cuda_vector.cc:378:76: error: the compiler can assume that the address of ‘scaling_factors’ will never be NULL [-Werror=address]
  378 |       Assert(dynamic_cast<const Vector<Number> *>(&scaling_factors) != nullptr,
      |                         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~~~
```

The check in https://github.com/dealii/dealii/pull/15622/files#diff-7344c9e5015cc2cc54ce87fbce41edd9031caa89324129878e28e1ffbd80397aR377-R379 has just become pointless and should be removed, right?

@Rombur (Member) commented Jul 5, 2023

The error is at line 378 in cuda_vector.cc. You forgot to remove a downcasting check.

@bangerth (Member, Author) commented Jul 5, 2023

Ah, that too. But I was really referring to these messages (https://github.com/dealii/dealii/actions/runs/5459249763/jobs/9935051876?pr=15622):

```
CMakeFiles/object_numerics_debug.dir/vector_tools_project.cc.o -c /home/runner/work/dealii/dealii/source/numerics/vector_tools_project.cc
/home/runner/work/dealii/dealii/include/deal.II/base/numbers.h(722): warning #20014-D: calling a __host__ function from a __host__ __device__ function is not allowed

/home/runner/work/dealii/dealii/include/deal.II/base/tensor.h(1177): warning #20014-D: calling a __host__ function from a __host__ __device__ function is not allowed

/home/runner/work/dealii/dealii/include/deal.II/base/numbers.h(722): warning #20011-D: calling a __host__ function("dealii::VectorizedArray<double, (unsigned long)1ul> ::VectorizedArray(double)") from a __host__ __device__ function("dealii::internal::NumberType< ::dealii::VectorizedArray<double, (unsigned long)1ul> > ::value<double> ") is not allowed

/home/runner/work/dealii/dealii/include/deal.II/base/numbers.h(722): warning #20011-D: calling a __host__ function("dealii::VectorizedArray<float, (unsigned long)1ul> ::VectorizedArray(float)") from a __host__ __device__ function("dealii::internal::NumberType< ::dealii::VectorizedArray<float, (unsigned long)1ul> > ::value<double> ") is not allowed
```

I now recognize that these just happened to be at the bottom of the (parallel build) log, and I mistook them for the cause of the failure; they are only warnings. Let's see what happens after fixing the left-over dynamic cast.

@Rombur (Member) commented Jul 5, 2023

You can ignore these warnings. The issue is that we have many functions in Tensor that we want to use on the GPU, so they are annotated with `__host__ __device__`, but you cannot call `std::complex` functions on the GPU. We therefore get a warning when we instantiate them, but it is not a problem unless you actually try to call the `std::complex` functions in device code.
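A hypothetical sketch of the situation Rombur describes (the function and names here are invented; in deal.II the affected code paths are in tensor.h and numbers.h): a template annotated `__host__ __device__` so it can run on the GPU for simple number types, but which, when instantiated for `std::complex`, calls host-only code and triggers nvcc's #20014-D diagnostic at instantiation time.

```cuda
#include <complex>

// Annotated for both host and device so that instantiations for plain
// arithmetic types can run in kernels. For Number = std::complex<double>,
// the multiplication below calls host-only std::complex code, so nvcc
// emits "warning #20014-D: calling a __host__ function from a
// __host__ __device__ function is not allowed" when this template is
// instantiated -- harmless as long as the complex instantiation is never
// actually executed in device code.
template <typename Number>
__host__ __device__ Number twice(const Number &x)
{
  return Number(2) * x;
}

int main()
{
  // Host-side use of the complex instantiation is fine:
  const std::complex<double> z(1.0, 2.0);
  const auto w = twice(z);
  (void)w;
  return 0;
}
```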

@bangerth (Member, Author) commented Jul 5, 2023

I believe that the remaining test failures are breakage on current master. The CUDA CI succeeds, and that is the only one that should see these changes.

@masterleinad merged commit 1af048d into dealii:master on Jul 6, 2023 (14 of 15 checks passed).
@bangerth deleted the vsv-cuda branch on July 6, 2023 at 12:40.