Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

matrix-free FEEvaluation: initialize with nan #14152

Merged
merged 2 commits into from Jul 21, 2022
Merged

Conversation

tjhei
Copy link
Member

@tjhei tjhei commented Jul 19, 2022

Initialize the internal scratch_data of FEEvaluation with signaling_nans
in debug mode.

@tjhei
Copy link
Member Author

tjhei commented Jul 19, 2022

This is for trying to fix a bug I am seeing in ASPECT and @bangerth gave me the idea to initialize the internal data structures with NaNs to be able to more reliably see when things go wrong.

FYI @kronbichler

@tjhei
Copy link
Member Author

tjhei commented Jul 20, 2022

And the following tests fail with a FPE:

The following tests FAILED:

	5711 - matrix_free/matrix_vector_hessians_cells.debug (Failed)

	5753 - matrix_free/matrix_vector_15.debug (Failed)

	5804 - matrix_free/matrix_vector_faces_02.debug (Failed)

	5878 - matrix_free/matrix_vector_hessians_faces.debug (Failed)

😃

Copy link
Member

@bangerth bangerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad this was useful. Let's hope that someone can see what the problem is -- that's exactly why we invented the signaling_nan() infrastructure.

# ifdef DEBUG
scratch_data_array->clear();
scratch_data_array->resize(allocated_size,
Number(numbers::signaling_nan<double>()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be the following instead?

Suggested change
Number(numbers::signaling_nan<double>()));
numbers::signaling_nan<Number>());

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't work because this will be instantiated for Number=VectorizedArray and .
I was too lazy to extend signaling_nan.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, but then what are the semantics of casting a signaling nan double to a VectorizedArray?

Leave it as is for the moment -- the patch isn't going to get merged for now while there are failures, but if you don't mind, add the specializations of signaling_nan for VectorizedArray objects.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is calling the constructor that puts this value into all values:

VectorizedArray(const Number scalar)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But you should be able to use

Suggested change
Number(numbers::signaling_nan<double>()));
Number(numbers::signaling_nan<ScalarNumber>()));

which is defined here:

using ScalarNumber =
typename internal::VectorizedArrayTrait<Number>::value_type;

@kronbichler
Copy link
Member

This approach currently breaks the use case in

AlignedVector<VectorizedDouble> evaluation_data;
eval_int.set_data_pointers(&evaluation_data, dim);
eval_ext.set_data_pointers(&evaluation_data, dim);
where we attach two different evaluator objects to the same storage array, so the second one frees the first item's memory. Let me try to fix it locally and see what happens.

@tjhei
Copy link
Member Author

tjhei commented Jul 20, 2022

Is that the reason for the test failures?

@kronbichler
Copy link
Member

Indeed, with this fixed I see the tests passing on my local machine. Let me push this fix, it is better coding style to attach separate arrays anyway, irrespective of this PR.

@tjhei
Copy link
Member Author

tjhei commented Jul 20, 2022

adjusted and rebased.

@bangerth
Copy link
Member

Let's hope this works now!

@bangerth
Copy link
Member

The Windows compiler has some trouble:

D:\a\dealii\dealii\include\deal.II/base/numbers.h(926,34): error C2039: 'NumberType': is not a member of 'dealii::numbers::internal' [D:\a\dealii\dealii\build\source\fe\obj_fe_debug.vcxproj]
D:\a\dealii\dealii\include\deal.II/base/signaling_nan.h(33): message : see declaration of 'dealii::numbers::internal' [D:\a\dealii\dealii\build\source\fe\obj_fe_debug.vcxproj]
D:\a\dealii\dealii\include\deal.II/base/numbers.h(942): message : see reference to function template instantiation 'bool dealii::numbers::values_are_equal<Number,double>(const Number1 &,const Number2 &)' being compiled [D:\a\dealii\dealii\build\source\fe\obj_fe_debug.vcxproj]
          with
          [
              Number=double,
              Number1=double,
              Number2=double
          ]
D:\a\dealii\dealii\include\deal.II/base/tensor.h(1548): message : see reference to function template instantiation 'bool dealii::numbers::value_is_zero<Number>(const Number &)' being compiled [D:\a\dealii\dealii\build\source\fe\obj_fe_debug.vcxproj]
          with
          [
              Number=double
          ]
D:\a\dealii\dealii\include\deal.II/base/tensor.h(1547): message : while compiling class template member function 'dealii::Tensor<1,1,double> &dealii::Tensor<1,1,double>::operator =(const Number &)' [D:\a\dealii\dealii\build\source\fe\obj_fe_debug.vcxproj]
          with
          [
              Number=double
          ]
D:\a\dealii\dealii\include\deal.II/base/tensor.h(949): message : see reference to function template instantiation 'dealii::Tensor<1,1,double> &dealii::Tensor<1,1,double>::operator =(const Number &)' being compiled [D:\a\dealii\dealii\build\source\fe\obj_fe_debug.vcxproj]
          with
          [
              Number=double
          ]
D:\a\dealii\dealii\include\deal.II/base/tensor.h(2172): message : see reference to function template instantiation 'dealii::Tensor<1,1,double> dealii::internal::TensorImplementation::division_operator<1,1,double,OtherNumber,0>(const dealii::Tensor<1,1,double> &,const OtherNumber &)' being compiled [D:\a\dealii\dealii\build\source\fe\obj_fe_debug.vcxproj]
          with
          [
              OtherNumber=double
          ]
D:\a\dealii\dealii\include\deal.II/fe/mapping_q_internal.h(2013): message : see reference to function template instantiation 'dealii::Tensor<1,1,double> dealii::operator /<1,1,double,double>(const dealii::Tensor<1,1,double> &,const OtherNumber &)' being compiled [D:\a\dealii\dealii\build\source\fe\obj_fe_debug.vcxproj]
          with
          [
              OtherNumber=double
          ]
D:\a\dealii\dealii\include\deal.II/fe/mapping_q_internal.h(2098): message : see reference to function template instantiation 'void dealii::internal::MappingQImplementation::maybe_compute_face_data<1,1>(const dealii::MappingQ<1,1> &,const dealii::TriaIterator<dealii::CellAccessor<1,1>> &,const unsigned int,const unsigned int,const unsigned int,const std::vector<double,std::allocator<double>> &,const dealii::MappingQ<1,1>::InternalData &,dealii::internal::FEValuesImplementation::MappingRelatedData<1,1> &)' being compiled [D:\a\dealii\dealii\build\source\fe\obj_fe_debug.vcxproj]

@tjhei
Copy link
Member Author

tjhei commented Jul 21, 2022

I am not sure how to make MSVC happy, so I am disabling the NaNs there. 🤷

Initialize the internal scratch_data of FEEvaluation with signaling_nans
in debug mode.
otherwise MSVC tries to use
numbers::internal from signaling_nan.h
@masterleinad masterleinad merged commit 9407279 into dealii:master Jul 21, 2022
@tjhei tjhei deleted the mf_nans branch July 21, 2022 20:00
@@ -892,7 +892,7 @@ namespace numbers
{
// Use the specialized definition for two ADOL-C taped types
return value_is_less_than(value_1,
internal::NumberType<adouble>::value(value_2));
::internal::NumberType<adouble>::value(value_2));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(oops My bad - the regression tester had been out of commission for a week. I missed a reboot.)

The global namespace qualifier ::internal is a typo. Fixed in #14162

mkghadban pushed a commit to OpenFCST/dealii that referenced this pull request Sep 8, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants