Single phase flow solver enhancement #75
Conversation
…n, some changes in assembly
@klevzoff There are some merge conflicts with the PR that I just put in. I messed with some interfaces. Sorry about that. The good news is that I didn't run uncrustify, which means that the conflicts are rather minor. If you haven't done any more work past this one commit, I can resolve them for you. It is probably best in this case.
@klevzoff Never mind about the merge conflicts. My merge tool was making it difficult, but there were only trivial merge conflicts.
@rrsettgast thanks for taking care of this. I knew you were working on interface changes and was planning to rebase on top of them later.
Still work in progress, but looking for reviews on what's there. Most changes in
Couple notes:
@@ -49,11 +49,11 @@
<BasisFunctions>
Suggest adding a TODO to remove all the non-essential blocks from this file (like the FE and failure criterion stuff).
I tried, but it turns out it's not so easy :) ElementRegion and BoundaryConditionManager rely on an FE numerical method being defined in the input. I'll create a separate issue to deal with it, where we can also think in general about how to extend the NumericalMethods section to support other discretization techniques.
I removed what I could for now.
Created #88
@@ -387,7 +410,7 @@ void SinglePhaseFlow_TPFA::ImplicitStepSetup( real64 const& time_n,
EOS->EquationOfStateDensityUpdate( dPressure[er][esr][k],
Recommend we migrate EOS calls to using absolute fields (P,T) not incremental fields (dP, dT).
This is now done
We can support both. I always work with deltas if I can.
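A minimal sketch of what "supporting both" could look like, assuming a simple linear/exponential EOS; the names (LinearEOS, density, densityFromIncrement) are illustrative and not the actual GEOSX API:

```cpp
// Hypothetical sketch: an EOS density update driven by the absolute pressure,
// with a thin wrapper that accepts an increment, so both call styles coexist.
#include <cmath>

struct LinearEOS
{
  double referenceDensity  = 1000.0; // rho_0 at reference pressure
  double referencePressure = 0.0;    // p_0
  double compressibility   = 1e-9;   // c

  // Absolute-field form: rho(p) = rho_0 * exp( c * (p - p_0) )
  double density( double pressure ) const
  {
    return referenceDensity * std::exp( compressibility * ( pressure - referencePressure ) );
  }

  // Incremental form kept as a wrapper around the absolute one.
  double densityFromIncrement( double pressure_n, double dPressure ) const
  {
    return density( pressure_n + dPressure );
  }
};
```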
Suggest we add an integrated test to compare with an analytical solution. Otherwise, I think the physics now looks reasonable and we should start working on better modularizing stuff.
/**
 * @struct A structure containing a single cell (element) identifier triplet
 */
struct CellDescriptor
Are these structs (CellDescriptor and Connection) essentially temporary copies of the data that they are comprised of? Did you have the intent to try to get better cache performance by grouping the data like this? @artv3 do you have an opinion on how something like this might end up performing in a cuda kernel?
This is essentially an R1Tensor right? Are we planning to create an array of structs with them? If so I think it would be good for vectorization. In particular if we can store them as int4's then we can get the best performance. This is something I want to try in the mini-app.
CellConnection (maybe not the best name) describes a general multi-point stencil. In addition to the identifiers of the two connected cells (connectedCellIndices), which yes are a convenience copy of face manager data, it stores a variable-length array (stencilCellIndices) of identifiers of cells that participate in the stencil (i.e. whose pressure DOFs are included in the flux computation), along with weights (transmissibilities). This array is constructed by the discretization scheme and cannot be recovered from FaceManager.
The triplet in CellDescriptor is pretty much always accessed and used together, so I didn't see the point of splitting it into separate scalar arrays. The whole reason for its existence is the fact that we don't have a contiguous numbering of elements across subregions. On the other hand, the CellConnection struct could be split if need be, and we could get rid of connectedCellIndices, since faceIndex is also stored.
Any input is welcome on how best to reorganize storage for maximum efficiency. These things will eventually be migrated to a separate FVDiscretization object.
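A rough sketch of the layout described above, for reference; container and index types are simplified stand-ins (std::vector, std::int_fast32_t) rather than the actual GEOSX array types:

```cpp
// CellDescriptor holds the (region, subregion, element) identifier triplet;
// CellConnection holds the two connected cells plus a variable-length stencil
// with one transmissibility weight per stencil cell.
#include <cstdint>
#include <vector>

using localIndex = std::int_fast32_t;

struct CellDescriptor
{
  localIndex region;
  localIndex subRegion;
  localIndex index;
};

struct CellConnection
{
  localIndex faceIndex;                           // owning face in the FaceManager
  CellDescriptor connectedCellIndices[2];         // convenience copy of face-to-element maps
  std::vector<CellDescriptor> stencilCellIndices; // cells whose pressure DOFs enter the flux
  std::vector<double> stencilWeights;             // transmissibilities, one per stencil cell
};
```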
Will there be a lot of variation in the number of cells in the stencils? If we have different threads carry out different stencils, that could be bad on the GPU. If there is not a lot of variation between stencils, could we batch them so kernel 1 carries out stencil 1, then kernel 2 carries out stencil 2?
@artv3 Are you asking if there will be a fixed number of elements associated with a face for a given stencil? If that is the question, then the answer is no for an unstructured mesh, and yes for a structured mesh.
I think that the best thing to do would be to do as you imply by batching the processing of faces. However, the access patterns will be all over the place, as the element data that supports the stencil will not be ordered in an unstructured mesh. Again, if it is a structured mesh, there are all sorts of things that we can do. @klevzoff would you agree with that statement?
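A small sketch of the batching idea being discussed, assuming the CellConnection layout sketched earlier; this is illustrative host-side grouping only, not the actual kernel launch machinery:

```cpp
// Group connection indices by stencil size so that each launch (e.g. a RAJA
// forall over a list segment per batch) processes stencils of a single size,
// avoiding thread divergence on unstructured meshes.
#include <cstddef>
#include <map>
#include <vector>

template< typename CONNECTIONS, typename KERNEL >
void forAllConnectionsBatched( CONNECTIONS const & connections, KERNEL && kernel )
{
  // stencil size -> list of connection indices with that size
  std::map< std::size_t, std::vector< std::size_t > > batches;
  for( std::size_t ic = 0; ic < connections.size(); ++ic )
  {
    batches[ connections[ic].stencilCellIndices.size() ].push_back( ic );
  }

  // One pass (one hypothetical launch) per stencil size.
  for( auto const & [ stencilSize, indices ] : batches )
  {
    for( std::size_t ic : indices )
    {
      kernel( connections[ic], stencilSize );
    }
  }
}
```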
Would someone be able to email me (vargas45@llnl.gov) some literature for a crash course? I can't say I'm too familiar with the application and it would help me figure out how to better map the method to RAJA.
How does GEOSX generate meshes for this solver?
@artv3 Same as any other solver: the internal generator makes a Cartesian grid and slices boxes into elements of a given type, plus possible perturbation of nodal coordinates (I think? that's just from a quick look at the code). But we also support externally generated meshes, which could be anything (within the range of supported element types).
@klevzoff thanks, the solver is set up to work with any type of mesh element though, right? If we do have situations where we can make assumptions about cell types, that would be advantageous from a performance point of view.
@artv3 yes, in principle the solver can work with arbitrary polygonal cells. The assembly is purely connectivity-based; the only step that actually deals with geometrical shapes is the precomputation of transmissibilities/volumes (speaking of which, volumes are currently only computed correctly for hexes, I will improve that in a separate branch).
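For reference, the standard TPFA transmissibility that such a precomputation typically produces (the exact expressions used in this branch may differ) is the harmonic average of two half-transmissibilities:

```latex
T_{i,f} = k_i \, A_f \, \frac{\mathbf{c}_{if} \cdot \mathbf{n}_f}{\lVert \mathbf{c}_{if} \rVert^{2}},
\qquad
T_f = \left( T_{1,f}^{-1} + T_{2,f}^{-1} \right)^{-1}
```

where k_i is the cell permeability, A_f the face area, n_f the unit face normal, and c_{if} the vector from the centroid of cell i to the centroid of face f.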
…evzoff/single-phase-flow
(using Epetra_LongLongSerialDenseVector throws exceptions)
@@ -757,7 +757,7 @@ void SinglePhaseFlow_TPFA::AssembleSystem ( DomainPartition * const domain,
const real64_array & trans = faceManager->getReference<real64_array>(viewKeyStruct::transmissibilityString);

//***** Loop over all elements and assemble the change in volume/density terms *****
- Epetra_LongLongSerialDenseVector elemDOF(1);
+ Epetra_IntSerialDenseVector elemDOF(1);
I had changed this to be consistent with the integral type we are using to store the DOF, although I didn't know if there was a drawback to that. Just for clarity, the localIndex type is an alias of "std::int_fast32_t", which is defined as the fastest integer type with at least 32 bits (https://stackoverflow.com/questions/9239558/what-is-the-difference-between-intxx-t-and-int-fastxx-t). On Linux systems this appears to be a 64-bit integer. I think that it is fine to use Epetra_IntSerialDenseVector and do an integer_cast, but we really should wrap the Epetra_IntSerialDenseVector type in the linear solver interface such that we are always consistent.
As a side note, I know you aren't really planning on doing 2 billion degree-of-freedom problems, but we will need to for the ECP project, and in general to claim that you are a next-gen code you need to be looking at problem sizes in this range. @artv3 maybe we can look at the performance gains between using int32_t and int_fast32_t on systems where int_fast32_t is an int64. It seems like the compiler optimizations that you get from int64 might be offset by the doubling of the amount of data you have to move to get the indices around.
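A minimal sketch of the kind of checked narrowing cast mentioned above; the name integer_cast and the error handling are illustrative, not the actual GEOSX utility:

```cpp
// Convert a (possibly 64-bit) localIndex into the 32-bit int expected by
// Epetra_IntSerialDenseVector, failing loudly if the value does not fit.
#include <cstdint>
#include <limits>
#include <stdexcept>

using localIndex = std::int_fast32_t; // 64-bit on typical Linux/x86-64

inline int integer_cast( localIndex const value )
{
  if( value < std::numeric_limits<int>::min() ||
      value > std::numeric_limits<int>::max() )
  {
    throw std::overflow_error( "localIndex value does not fit in a 32-bit Epetra index" );
  }
  return static_cast<int>( value );
}
```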
@klevzoff Oh wait. I just read the subtext of the commit. Epetra_LongLongSerialDenseVector was throwing an exception? When was the exception thrown?
@rrsettgast this is what I had at the very first assembly call:
Running simulation:
Time: 0s, dt:10s, Cycle: 0
terminate called after throwing an instance of 'int'
executing stackTrace.cpp::handler(6,1,1)
attempting unmangled trace:
0 1 2 3 4 5 6 7 8 9 :
0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789:
../../../../../build-default-debug/bin/geosx(1) : cxx_utilities::handler1(int)+0x1f [0x55dce6c9ce83]
/lib/x86_64-linux-gnu/libc.so.6(2) : +0x3ef20 [0x7f790e840f20]
/lib/x86_64-linux-gnu/libc.so.6(3) : gsignal+0xc7 [0x7f790e840e97]
/lib/x86_64-linux-gnu/libc.so.6(4) : abort+0x141 [0x7f790e842801]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(5) : +0x8c8fb [0x7f790f0b68fb]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(6) : +0x92d3a [0x7f790f0bcd3a]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(7) : +0x92d95 [0x7f790f0bcd95]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(8) : +0x92fe8 [0x7f790f0bcfe8]
/home/klevtsov/work/geosx/thirdPartyLibs/install-default-release/trilinos/lib/libepetra.so.12(9) : int Epetra_MultiVector::ChangeGlobalValue<long long>(long long, int, int, double, bool)+0x25b [0x7f790f72c76b]
/home/klevtsov/work/geosx/thirdPartyLibs/install-default-release/trilinos/lib/libepetra.so.12(10) : Epetra_MultiVector::SumIntoGlobalValue(long long, int, int, double)+0xf [0x7f790f72c13f]
/home/klevtsov/work/geosx/thirdPartyLibs/install-default-release/trilinos/lib/libepetra.so.12(11) : int Epetra_FEVector::inputValues<long long>(int, long long const*, double const*, bool, int)+0x77 [0x7f790f6fe637]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(12) : +0x6a0ecd [0x7f79108f0ecd]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(13) : +0x6a989c [0x7f79108f989c]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(14) : +0x6aacea [0x7f79108facea]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(15) : +0x6aafb9 [0x7f79108fafb9]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(16) : +0x6a9a58 [0x7f79108f9a58]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(17) : geosx::SinglePhaseFlow_TPFA::AssembleSystem(geosx::DomainPartition*, geosx::systemSolverInterface::EpetraBlockSystem*, double, double)+0xa59 [0x7f79108f196d]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(18) : geosx::SolverBase::NonlinearImplicitStep(double const&, double const&, int, geosx::DomainPartition*, geosx::systemSolverInterface::EpetraBlockSystem*)+0x13a [0x7f79108cb65a]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(19) : geosx::SinglePhaseFlow_TPFA::SolverStep(double const&, double const&, int, geosx::dataRepository::ManagedGroup*)+0x89 [0x7f79108ed955]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(20) : geosx::ProblemManager::RunSimulation()+0x670 [0x7f7910867660]
../../../../../build-default-debug/bin/geosx(21) : +0x71ba2 [0x55dce6c97ba2]
/lib/x86_64-linux-gnu/libc.so.6(22) : __libc_start_main+0xe7 [0x7f790e823b97]
../../../../../build-default-debug/bin/geosx(23) : +0x7175a [0x55dce6c9775a]
I checked in gdb and there was nothing fishy going on on our side (i.e. values passed as indices were good), but the debugger does not see into Trilinos code, which is built without debug info. Just replacing the Epetra vector type solves the problem. Did you not have the same issue on your system?
I just reverted it back temporarily so I could continue developing and testing other things. I agree the index data type should be 64 bit.
I found the cause, will push a fix soon.
- Physics: slightly compressible solid (uniaxial compressibility), pressure-dependent viscosity.
- LinearEOS: add viscosity and porosity updates using (linearized) exponential formulas, extract a generic exponential relation compute class.
- Other: implement pore volume scaling in residual norm calculation (residual is now relative).
- All quantities allocated on the mesh instead of constitutive data (temporarily).
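An illustrative sketch of the "generic exponential relation compute class" mentioned in the commit message, covering both the full and the linearized exponential forms; names and interface are assumptions, not the actual GEOSX class:

```cpp
// x(y) = x0 * exp( c * (y - y0) ), or its linearization x0 * (1 + c * (y - y0)).
// Usable for density, viscosity, or porosity updates as functions of pressure.
#include <cmath>

class ExponentialRelation
{
public:
  ExponentialRelation( double x0, double y0, double c, bool linearized )
    : m_x0( x0 ), m_y0( y0 ), m_c( c ), m_linearized( linearized ) {}

  // Value and derivative with respect to y.
  void compute( double y, double & x, double & dx_dy ) const
  {
    if( m_linearized )
    {
      x     = m_x0 * ( 1.0 + m_c * ( y - m_y0 ) );
      dx_dy = m_x0 * m_c;
    }
    else
    {
      x     = m_x0 * std::exp( m_c * ( y - m_y0 ) );
      dx_dy = m_c * x;
    }
  }

private:
  double m_x0, m_y0, m_c;
  bool m_linearized;
};
```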
…scaled residual computation without copying the vector
NUMBER sum_in_set(localIndex const * const indexList, const localIndex len, LAMBDA && body)
{
  RAJA::ReduceSum<REDUCE_POLICY, NUMBER> sum(NUMBER(0));
  RAJA::TypedListSegment<localIndex> listSeg(indexList, len);
Perhaps make listSeg(indexList, len) -> listSeg(indexList, len, Unowned), to avoid a deep copy of the data.
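The suggested change, roughly (the exact constructor signature depends on the RAJA version in use):

```cpp
// Passing RAJA::Unowned makes the list segment alias the existing index
// array instead of deep-copying it.
RAJA::TypedListSegment<localIndex> listSeg(indexList, len, RAJA::Unowned);
```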
@rrsettgast @joshua-white @castelletto1 @artv3 requesting more review comments or approval. I would rather like to merge this sooner than later (before some big refactoring hits). Some unattended points remaining:
Other than that, I think the solver is in relatively decent shape for now. Things may (and will) be revisited as we work on the multi-phase solver and also whenever other components are refactored (constitutive, linear algebra, etc.)
Further work on single phase TPFA flow solver will be focused here