Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix array vars for SIDE_HIERARCHIC #27639

Open
wants to merge 1 commit into
base: next
Choose a base branch
from

Conversation

maxnezdyur
Copy link
Contributor

@maxnezdyur maxnezdyur commented May 15, 2024

closes #27610
Still a work in progress.
Need to cache the map so it only created once or whenever the dof map changes.

@moosebuild
Copy link
Contributor

moosebuild commented May 15, 2024

Job Documentation on 24bf550 wanted to post the following:

View the site here

This comment will be updated on new commits.

@moosebuild
Copy link
Contributor

Job Precheck on 16d6ab7 wanted to post the following:

Your code requires style changes.

A patch was auto generated and copied here
You can directly apply the patch by running, in the top level of your repository:

curl -s https://mooseframework.inl.gov/docs/PRs/27639/clang_format/style.patch | git apply -v

Alternatively, with your repository up to date and in the top level of your repository:

git clang-format c531dc9fff65967b07cd79c4d990f5b889efa328

@moosebuild
Copy link
Contributor

moosebuild commented May 17, 2024

Job Coverage on 24bf550 wanted to post the following:

Framework coverage

0b390f #27639 24bf55
Total Total +/- New
Rate 85.10% 85.11% +0.01% 88.33%
Hits 103327 103357 +30 53
Misses 18088 18086 -2 7

Diff coverage report

Full coverage report

Modules coverage

Coverage did not change

Full coverage reports

Reports

Warnings

  • framework new line coverage rate 88.33% is less than the suggested 90.0%

This comment will be updated on new commits.

@maxnezdyur
Copy link
Contributor Author

@roystgnr
Can you look at this and let me know if the fix I added is worth pursuing? I added a test that shows the problem we had before was fixed. Just not sure if the fix will be efficient enough or if there is a better way to do this. Thanks.

Copy link
Contributor

@roystgnr roystgnr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Roy: In my comment I mentioned the Giant Reverse-Lookup Table as a cautionary tale

PR: At long last, we have created the Giant Reverse Lookup Table from classic Roy comment, "Don't Create the Giant Reverse-Lookup Table"

I could perhaps be persuaded that my idea of the "right way" to do this (DofObject APIs to get stride lengths in-loop) is somehow even worse than trying to maintain a giant new index via a ton of new code, but I'm going to want to see benchmarks first, and even then I bet there are more local optimizations possible.

@@ -244,6 +250,9 @@ class MooseVariableBase : public MooseObject,

/// Whether this variable lives on lower dimensional blocks
bool _is_lower_d;

/// Determines the stride from one component to another.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I prefer comments about any kind of array or map to be more explicit about the meaning of the inputs and outputs, especially when their types are some type of int that lacks inherent context. Here it seems clear that "stride" is a difference between two global DoF indices, but that might only be because I recently read the relevant issue, and it doesn't seem clear (a priori, without thinking about the underlying mechanics) whether in _stride[i] we need i to be a variable number, or a local DoF number for the current Elem type, or what.

framework/src/variables/MooseVariableBase.C Outdated Show resolved Hide resolved
}
}
for (auto & id : new_dof_indices)
id += component * _stride[id];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so much for my guesses about indexing _stride by variable number vs local DoF index - you're feeding global DoF indices in here? Either I'm misunderstanding something, or that's the "inverse lookup table that in the best case would double the memory we use for indexing ... in the thousands for some of your use cases" disaster I was talking about in #27610 - surely we can do better!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like the only place we use _stride, in this little local loop. Did I miss someplace elsewhere?

If not, I'd have just used the local APIs to get local information. That's what dof_indices is doing in the first place anyway. On each DofObject the stride between component 0 and component c for the array variable with base variable number v is dof_object->n_comp(sys_num,v)*c. No global O(N_nodes) vectors or parallel communication should be necessary here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So is it assumed that componentDofIndices will only be called one element/dof object at a time?
For example, if someone has the dof index i, and just needs to get the dof index j which is the component c of that array variable would your method work? I was having trouble because I was trying to make this general enough that given a single dof or any vector of dofs in any order of the component 0 of the array variable, componentDofIndices would return the correct component c dofs that correspond.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good point. I looked into the use cases in the framework and it's always in the context of a single DenseMatrix jacobian block, but in the context of DG and FV schemes that doesn't have to mean we're on a single element, does it?

I'm going to have to harass @lindsayad to double-check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think the local method works, if when we populate _dof_indices in prepare(), reinitAux(), reinitNodes(), and reinitNode(), we also create the _strides vector using the correct dof_object. But it just requires that if a user needs a componentDoF, they need to give the correct dof_indices vector. To force that componentDofIndices should only request the component (no const std::vector<dof_id_type> & dof_indices ) and always just use the local _dof_indices and the user has no way to arbitrarily ask for a specific component dof.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If need be we can just use the stride method for anything that is not already correct? Since the old method works perfect for Lagrange and monomial families I keep that part of the code. Any family which that isn't true, use strides?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will have to look at this in detail to answer you questions. If @roystgnr beats me to it, then more power to him

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more idea, what if we just change how libmesh adds the dofs in for Hiecharic + other families?
We make sure that the dofs added follow the logic expected before for isNodal() It would just change the order we put the dofs into the midside nodes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no way to change how libMesh adds DoFs for multiple variables in a group without breaking existing optimizations, optimizations that are saving us factors of n_components for array variables in particular. I'm not even sure there's a way to change things with breaking existing optimizations - the logic for isNodal() is basically assuming that we have one DoF per node, but as soon as you e.g. try to put 16 cubic DoFs onto a Quad9, 16>9 really bites you there.

We could add array variables to libMesh, instead of the hack here to emulate them via multiple variables in a group, but that wouldn't be a quick fix and I'm not sure how many APIs would need changing.

It looks like, although we're not always dealing with DoF indices that are local to a single element at a time, we're always dealing with index vectors that are local to a single element or one of its neighbors? That would still be just fine for figuring out strides on a local level, if we could pass along which element was associated with each indices vector.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the largest "stencil" we support for finite elements is mortar, which has a higher-d "element", higher-d "neighbor", and lower-d "side" whose interior parent is the "element". So this suggestion

That would still be just fine for figuring out strides on a local level, if we could pass along which element was associated with each indices vector.

seems good to me

@maxnezdyur
Copy link
Contributor Author

In cacheJacobianBlockNonzero below, jvar->allDofIndices(), is the vector of interest. At some point we will call jvar.componentDofIndices(jdof_indices, j); on it and the vector of dofs given might not be local to the element. So that's why I made stride "global".

void
Assembly::cacheJacobianNonlocal(GlobalDataKey)
{
  for (const auto & it : _cm_nonlocal_entry)
  {
    auto ivar = it.first;
    auto jvar = it.second;
    auto i = ivar->number();
    auto j = jvar->number();
    for (MooseIndex(_jacobian_block_nonlocal_used) tag = 0;
         tag < _jacobian_block_nonlocal_used.size();
         tag++)
      if (jacobianBlockNonlocalUsed(tag, i, j) && _sys.hasMatrix(tag))
        cacheJacobianBlockNonzero(jacobianBlockNonlocal(i, j, LocalDataKey{}, tag),
                                  *ivar,
                                  *jvar,
                                  ivar->dofIndices(),
                                  jvar->allDofIndices(),
                                  tag);
  }
} 

@YaqiWang
Copy link
Contributor

YaqiWang commented May 23, 2024

If the issue does not affect elemental shape functions, I would like a short-cut for elemental shape functions to avoid potential performance penalty because Griffin uses them quite often in a main functionality. If the change here does not affect elemental shape functions, please discard my comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SIDE_HIERARCHIC doesn't work with array variables
5 participants