The following code tests AffineConstraints::condense(const VectorType &vec_ghosted, VectorType &output) for Trilinos MPI vectors. The test compares two solution vectors u_1 and u_2 obtained from the following simple equations:
M * u_1 = v_1   (eq. 1)
M * u_2 = v_2   (eq. 2)
where M is the mass matrix, v_1 is a r.h.s. vector assembled from the cell contributions with distribute_local_to_global, and v_2 is a second r.h.s. vector defined as v_2 = M * u_1, computed after solving eq. 1.
Before solving eq. 2, v_2 has to be condensed, which is done with AffineConstraints::condense(const VectorType &vec_ghosted, VectorType &output).
In principle, u_1 and u_2 must be identical to machine accuracy (about 15 digits). However, the sample code gives wrong results for some MPI configurations.
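For context, the comparison performed by the test is essentially the following (a rough sketch only; the function, variable, and solver choices here are illustrative and not copied from the sample code):

```cpp
#include <deal.II/base/logstream.h>
#include <deal.II/lac/affine_constraints.h>
#include <deal.II/lac/precondition.h>
#include <deal.II/lac/solver_cg.h>
#include <deal.II/lac/solver_control.h>
#include <deal.II/lac/trilinos_sparse_matrix.h>
#include <deal.II/lac/trilinos_vector.h>

using namespace dealii;

// Sketch of the comparison; assumes M, v_1, constraints, and the index set
// have already been set up elsewhere.
void run_comparison(const TrilinosWrappers::SparseMatrix &M,
                    const TrilinosWrappers::MPI::Vector  &v_1,
                    const AffineConstraints<double>      &constraints,
                    const IndexSet                       &locally_owned_dofs,
                    const MPI_Comm                        mpi_communicator)
{
  TrilinosWrappers::MPI::Vector u_1(locally_owned_dofs, mpi_communicator);
  TrilinosWrappers::MPI::Vector u_2(locally_owned_dofs, mpi_communicator);
  TrilinosWrappers::MPI::Vector v_2(locally_owned_dofs, mpi_communicator);

  SolverControl                           control(1000, 1e-14 * v_1.l2_norm());
  SolverCG<TrilinosWrappers::MPI::Vector> solver(control);

  // eq. 1: M u_1 = v_1
  solver.solve(M, u_1, v_1, PreconditionIdentity());
  constraints.distribute(u_1);

  // v_2 = M u_1, then condense it -- this condense step is what the issue
  // is about
  M.vmult(v_2, u_1);
  // ... condense v_2 via AffineConstraints::condense(ghosted, output) ...

  // eq. 2: M u_2 = v_2
  solver.solve(M, u_2, v_2, PreconditionIdentity());
  constraints.distribute(u_2);

  // u_1 and u_2 should agree to machine accuracy
  u_2 -= u_1;
  deallog << "error norm between u_1 and u_2: " << u_2.l2_norm() << std::endl;
}
```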
With 1 or 2 MPI processes, the results are perfect, as shown below.
With 3 MPI processes, the results differ: v_2 norm (after condense): 16.0877 and u_2 norm: 21.4271, which leads to an error norm between u_1 and u_2 of 0.0262412.
With 4 and 5 MPI processes, the same differences appear.
Since the v_1 norm (before condense), v_2 norm (before condense), v_1 norm (after condense), and u_1 norm are identical regardless of the number of MPI processes, the problem must be in constraints.condense(ghosted_vec, nonghost_output);, which is used to condense v_2 inside the helper void condense(...) defined at the top of the sample code.
Moreover, with 6 MPI processes the code hangs inside that same call, constraints.condense(ghosted_vec, nonghost_output);.
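For reference, a minimal sketch of what such a helper looks like (the parameter and variable names here are assumptions for illustration, not copied from the sample code):

```cpp
#include <deal.II/lac/affine_constraints.h>
#include <deal.II/lac/trilinos_vector.h>

using namespace dealii;

// Condense a non-ghosted vector in place: the condense() overload under
// discussion needs a ghosted input and a non-ghosted output.
void condense(const AffineConstraints<double> &constraints,
              TrilinosWrappers::MPI::Vector   &vec,
              const IndexSet                  &locally_owned_dofs,
              const IndexSet                  &locally_relevant_dofs,
              const MPI_Comm                   mpi_communicator)
{
  // ghosted copy of the input (reading ghost entries is required)
  TrilinosWrappers::MPI::Vector ghosted_vec(locally_owned_dofs,
                                            locally_relevant_dofs,
                                            mpi_communicator);
  ghosted_vec = vec; // imports the ghost values

  // non-ghosted output vector
  TrilinosWrappers::MPI::Vector nonghost_output(locally_owned_dofs,
                                                mpi_communicator);

  constraints.condense(ghosted_vec, nonghost_output);
  vec = nonghost_output;
}
```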
When I lower the finite element degree from 2 to 1 (set via fe_degree at the top of the main function), the code hangs already with 2 or more MPI processes; for fe_degree = 1 it only works with a single MPI process.
I cannot find the reason why AffineConstraints::condense(const VectorType &vec_ghosted, VectorType &output) does not condense properly.
Regarding the hang in AffineConstraints::condense(const VectorType &vec_ghosted, VectorType &output), I have taken a look at the implementation of this function in include/deal.II/lac/affine_constraints.templates.h, which I paste below:
```cpp
template <typename number>
template <class VectorType>
void
AffineConstraints<number>::condense(const VectorType &vec_ghosted,
                                    VectorType &      vec) const
{
  Assert(sorted == true, ExcMatrixNotClosed());

  // if this is called with different arguments, we need to copy the data
  // over:
  if (&vec != &vec_ghosted)
    vec = vec_ghosted;

  // distribute all entries, and set them to zero. do so in two loops
  // because in the first one we need to add to elements and in the second
  // one we need to set elements to zero. for parallel vectors, this can
  // only work if we can put a compress() in between, but we don't want to
  // call compress() twice per entry
  for (const ConstraintLine &line : lines)
    {
      // in case the constraint is inhomogeneous, this function is not
      // appropriate. Throw an exception.
      Assert(line.inhomogeneity == number(0.),
             ExcMessage("Inhomogeneous constraint cannot be condensed "
                        "without any matrix specified."));

      const typename VectorType::value_type old_value =
        vec_ghosted(line.index);
      for (const std::pair<size_type, number> &entry : line.entries)
        if (vec.in_local_range(entry.first) == true)
          vec(entry.first) +=
            (static_cast<typename VectorType::value_type>(old_value) *
             entry.second);
    }
  vec.compress(VectorOperation::add);

  for (const ConstraintLine &line : lines)
    if (vec.in_local_range(line.index) == true)
      vec(line.index) = 0.;
  vec.compress(VectorOperation::insert);
}
```
I am not sure about the reason, but I found that the deadlock can be avoided by inserting vec.compress(dealii::VectorOperation::add) or vec.compress(dealii::VectorOperation::insert) right after vec = vec_ghosted;. I hope this information helps.
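In other words, the workaround amounts to something like this in the implementation above (only a sketch of where the extra call would go, not a vetted fix):

```cpp
  // if this is called with different arguments, we need to copy the data
  // over:
  if (&vec != &vec_ghosted)
    {
      vec = vec_ghosted;
      // Workaround: an extra compress() right after the copy avoids the
      // observed hang (VectorOperation::insert works as well).
      vec.compress(VectorOperation::add);
    }
```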
Finally, I would like to ask a question (though it may be unrelated to this issue).
For AffineConstraints::condense(const VectorType &vec_ghosted, VectorType &output) with a Trilinos MPI vector, the input vector vec_ghosted must be a ghosted vector, which can be initialized as shown below.
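The initialization from the sample code is not reproduced here; a minimal sketch of this kind of initialization would be (the index set and communicator names are assumptions):

```cpp
// Ghosted Trilinos vector: owned entries plus ghost (locally relevant)
// entries; the last argument is the vector_writable flag in question.
TrilinosWrappers::MPI::Vector ghosted_vec;
ghosted_vec.reinit(locally_owned_dofs,
                   locally_relevant_dofs,
                   mpi_communicator,
                   /*vector_writable=*/false);
```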
With this initialization, it somehow seems that vector_writable must be false for the vector to be usable with AffineConstraints::condense(const VectorType &vec_ghosted, VectorType &output).
I have read the deal.II documentation carefully, but I cannot find the reason: in my understanding, vector_writable only controls whether multiple threads can write into the vector at the same time.
I would appreciate a brief explanation of this.