
Single phase flow solver enhancement #75

Merged
merged 21 commits from feature/klevzoff/single-phase-flow into develop on Jul 20, 2018

Conversation

klevzoff
Contributor

@klevzoff klevzoff commented Jul 7, 2018

Further work on single phase TPFA flow solver will be focused here

@rrsettgast
Member

@klevzoff There are some merge conflicts with the PR that I just put in. I messed with some interfaces. Sorry about that. The good news is that I didn't run uncrustify, which means that the conflicts are rather minor. If you haven't done any more work past this one commit, I can resolve them for you. It is probably best in this case.

@rrsettgast
Member

@klevzoff Never mind about the merge conflicts. My merge tool was making it difficult, but there were only trivial conflicts.

@klevzoff
Contributor Author

klevzoff commented Jul 8, 2018

@rrsettgast thanks for taking care of this. I knew you were working on interface changes and was planning to rebase on top of them later.

@klevzoff
Contributor Author

klevzoff commented Jul 11, 2018

Still a work in progress, but I'm looking for reviews on what's there. Most changes are in PrecomputeData() and flux assembly. Sorry for the formatting-related changes scattered elsewhere; my IDE does that with any code I touch and it's pretty hard to revert. The main goals so far were:

  • Incorporate diagonal tensor permeability and align TPFA transmissibility computations with what's typically used in the FV literature (see the sketch after the notes below)
  • Have MPFA-style flux assembly with general stencil and upstream weighting of fluid properties

A couple of notes:

  • PrecomputeData() is now called from SolverStep() as opposed to FinalInitialization(), but there is a flag to make sure it's only done once. The reason is that during FinalInitialization() we don't have access to required data like permeability, which is treated as initial conditions and applied later.
  • I had to comment out registration of the gravityFlag input due to a weird error (something about the "gravityFlag" key being inserted twice) that does not reproduce in develop, although the related code seems identical. I will figure it out later.
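
For context, here is a minimal sketch of the standard TPFA half-transmissibility with a diagonal permeability tensor, followed by harmonic averaging over a face; names and signatures are illustrative only, not the actual PrecomputeData() code:

#include <array>

using R1 = std::array< double, 3 >;

// Half-transmissibility of cell i through face f:
//   t_i = A_f * (K_i c_i) . n_f / (c_i . c_i)
// with c_i the vector from cell centroid to face centroid, n_f the unit face
// normal oriented outward from cell i, A_f the face area, and K_i = diag(kx, ky, kz).
double halfTransmissibility( R1 const & cellToFace,   // c_i
                             R1 const & faceNormal,   // n_f
                             double const faceArea,   // A_f
                             R1 const & permDiag )    // diag(K_i)
{
  double num = 0.0, den = 0.0;
  for( int d = 0; d < 3; ++d )
  {
    num += permDiag[d] * cellToFace[d] * faceNormal[d];
    den += cellToFace[d] * cellToFace[d];
  }
  return faceArea * num / den;
}

// Face transmissibility: harmonic average of the two half-transmissibilities.
double faceTransmissibility( double const t1, double const t2 )
{
  return ( t1 * t2 ) / ( t1 + t2 );
}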

@@ -49,11 +49,11 @@
<BasisFunctions>
Contributor

Suggest adding a TODO to remove all the non-essential blocks from this file (like the FE and failure criterion stuff).

Contributor Author
@klevzoff klevzoff Jul 13, 2018

I tried, but it turns out it's not so easy :) ElementRegion and BoundaryConditionManager rely on an FE numerical method being defined in the input. I'll create a separate issue to deal with it, where we can also think in general about how to extend the NumericalMethods section to support other discretization techniques.

I removed what I could for now.

Contributor Author

Created #88

@@ -387,7 +410,7 @@ void SinglePhaseFlow_TPFA::ImplicitStepSetup( real64 const& time_n,
EOS->EquationOfStateDensityUpdate( dPressure[er][esr][k],
Contributor

Recommend we migrate EOS calls to using absolute fields (P,T) not incremental fields (dP, dT).

Contributor Author

This is now done.

Member

We can support both. I always work with deltas if I can.
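
As an illustration of supporting both call styles, here is a minimal sketch (hypothetical free functions, not the actual constitutive interface) of a slightly compressible density relation expressed once through absolute pressure and once through an increment:

#include <cmath>

// Absolute form: rho(P) = rho_ref * exp( c * (P - P_ref) ).
double densityFromPressure( double const P, double const P_ref,
                            double const rho_ref, double const compressibility )
{
  return rho_ref * std::exp( compressibility * ( P - P_ref ) );
}

// Incremental form of the same relation, for callers that carry dP:
//   rho_new = rho_old * exp( c * dP ).
double densityFromIncrement( double const rho_old, double const dP,
                             double const compressibility )
{
  return rho_old * std::exp( compressibility * dP );
}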

Contributor
@joshua-white joshua-white left a comment

Suggest we add an integrated test to compare with an analytical solution. Otherwise, I think the physics now looks reasonable and we should start working on better modularizing stuff.

@klevzoff klevzoff self-assigned this Jul 13, 2018
@klevzoff klevzoff added the type: feature New feature or request label Jul 13, 2018
/**
* @struct A structure containing a single cell (element) identifier triplet
*/
struct CellDescriptor
Member

Are these structs (CellDescriptor and Connection) essentially temporary copies of the data they are composed of? Did you have the intent to try to get better cache performance by grouping the data like this? @artv3 do you have an opinion on how something like this might end up performing in a CUDA kernel?

Contributor

This is essentially an R1Tensor, right? Are we planning to create an array of structs with them? If so, I think it would be good for vectorization. In particular, if we can store them as int4's then we can get the best performance. This is something I want to try in the mini-app.

Contributor Author
@klevzoff klevzoff Jul 13, 2018

CellConnection (maybe not the best name) describes a general multi-point stencil, so, in addition to identifiers of the two connected cells (connectedCellIndices), which yes are a convenience copy of face manager data, it stores a variable-length array (stencilCellIndices) of identifiers of cells that participate in the stencil (i.e. whose pressure DOFs are included in flux computation), along with weights (transmissibilities). This array is constructed by the discretization scheme and cannot be recovered from FaceManager.

The triplet in CellDescriptor is pretty much always accessed and used together, so I didn't see the point of splitting it into separate scalar arrays. The whole reason for its existence is the fact that we don't have a contiguous numbering of elements across subregions. On the other hand, the CellConnection struct could be split if need be, and we could get rid of connectedCellIndices, since faceIndex is also stored.

Any input is welcome on how best to reorganize storage for maximum efficiency. These things will eventually be migrated to a separate FVDiscretization object.
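
A sketch of the layout described above; the member names mostly follow the comment (stencilWeights is a guess), while the container types and typedefs are assumptions for self-containment:

#include <cstdint>
#include <vector>

using localIndex = std::int_fast32_t;  // GEOSX-style typedefs, stubbed here for illustration
using real64 = double;

// Identifier triplet for one cell: (region, subregion, element index).
struct CellDescriptor
{
  localIndex region;
  localIndex subRegion;
  localIndex index;
};

// One face-based connection with its (possibly multi-point) stencil.
struct CellConnection
{
  localIndex faceIndex;                               // face the connection is built on
  CellDescriptor connectedCellIndices[2];             // the two cells sharing the face
  std::vector< CellDescriptor > stencilCellIndices;   // cells whose pressure DOFs enter the flux
  std::vector< real64 > stencilWeights;               // transmissibility weight per stencil cell
};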

Contributor

Will there be a lot of variation in the number of cells in the stencils? If different threads carry out different stencils, that could be bad on the GPU. If there is not a lot of variation between stencils, could we batch them so kernel 1 carries out stencil 1, then kernel 2 carries out stencil 2?

Member

@artv3
Are you asking if there will be a fixed number of elements associated with a face for a given stencil? If that is the question, then the answer is no for an unstructured mesh, and yes for a structured mesh.

I think that the best thing to do would be to do as you imply by batching the processing of faces. However, the access patterns will be all over the place, as the element data that supports the stencil will not be ordered in an unstructured mesh. Again, if it is a structured mesh, there are all sorts of things that we can do. @klevzoff would you agree with that statement?
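
A minimal sketch of the batching idea (names illustrative): group connections by stencil size up front so that each group can be processed by a uniform-size kernel launch. On a structured mesh this collapses to a single group; on an unstructured mesh a handful of groups typically covers all faces.

#include <cstddef>
#include <map>
#include <vector>

// stencilSizes[ic] is the number of cells in the stencil of connection ic.
// Returns, for each distinct stencil size, the list of connection indices with that size.
std::map< std::size_t, std::vector< std::size_t > >
batchConnectionsByStencilSize( std::vector< std::size_t > const & stencilSizes )
{
  std::map< std::size_t, std::vector< std::size_t > > batches;
  for( std::size_t ic = 0; ic < stencilSizes.size(); ++ic )
  {
    batches[ stencilSizes[ic] ].push_back( ic );
  }
  return batches;
}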

Contributor

Would someone be able to email me (vargas45@llnl.gov) some literature for a crash course? I can't say I'm too familiar with the application and it would help me figure out how to better map the method to RAJA.

Contributor

How does GEOSX generate meshes for this solver?

Contributor Author

@artv3 Same as any other solver: the internal generator makes a Cartesian grid and slices boxes into elements of a given type, plus a possible perturbation of nodal coordinates (I think? that's just from a quick look at the code). But we also support externally generated meshes, which could be anything (within the range of supported element types).

Contributor

@klevzoff thanks, the solver is set up to work with any type of mesh element though, right? If we do have situations where we can make assumptions about cell types, that would be advantageous from a performance point of view.

Contributor Author

@artv3 yes, in principle the solver can work with arbitrary polygonal cells. The assembly is purely connectivity-based; the only step that actually deals with geometrical shapes is the precomputation of transmissibilities/volumes (speaking of which, volumes are currently only computed correctly for hexes; I will improve that in a separate branch).
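
For reference, a minimal sketch (not the project's code; the node ordering is an assumption) of one common way to compute the volume of a convex, planar-faced hex by splitting it into five tetrahedra:

#include <array>
#include <cmath>

using R1 = std::array< double, 3 >;

// Signed volume of tetrahedron (a, b, c, d): det([b-a, c-a, d-a]) / 6.
static double tetVolume( R1 const & a, R1 const & b, R1 const & c, R1 const & d )
{
  R1 const u{ b[0] - a[0], b[1] - a[1], b[2] - a[2] };
  R1 const v{ c[0] - a[0], c[1] - a[1], c[2] - a[2] };
  R1 const w{ d[0] - a[0], d[1] - a[1], d[2] - a[2] };
  return ( u[0] * ( v[1] * w[2] - v[2] * w[1] )
         - u[1] * ( v[0] * w[2] - v[2] * w[0] )
         + u[2] * ( v[0] * w[1] - v[1] * w[0] ) ) / 6.0;
}

// Hex with nodes 0-3 on the bottom face and 4-7 directly above them.
// Exact only for planar faces; warped hexes need a different treatment.
double hexVolume( std::array< R1, 8 > const & x )
{
  return std::fabs( tetVolume( x[0], x[1], x[2], x[5] ) )
       + std::fabs( tetVolume( x[0], x[2], x[3], x[7] ) )
       + std::fabs( tetVolume( x[0], x[5], x[7], x[4] ) )
       + std::fabs( tetVolume( x[2], x[6], x[7], x[5] ) )
       + std::fabs( tetVolume( x[0], x[2], x[5], x[7] ) );
}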

@@ -757,7 +757,7 @@ void SinglePhaseFlow_TPFA::AssembleSystem ( DomainPartition * const domain,
const real64_array & trans = faceManager->getReference<real64_array>(viewKeyStruct::transmissibilityString);

//***** Loop over all elements and assemble the change in volume/density terms *****
Epetra_LongLongSerialDenseVector elemDOF(1);
Epetra_IntSerialDenseVector elemDOF(1);
Member

I had changed this to be consistent with the integral type we are using to store the DOF, although I didn't know if there was a drawback to that. Just for clarity, the localIndex type is an alias of "std::int_fast32_t", which is defined as the fastest integer type with at least 32 bits (https://stackoverflow.com/questions/9239558/what-is-the-difference-between-intxx-t-and-int-fastxx-t). On Linux systems this appears to be a 64-bit integer. I think that it is fine to use the Epetra_IntSerialDenseVector and do an integer_cast, but we really should wrap the Epetra_IntSerialDenseVector type in the linear solver interface so that we are always consistent. As a side note, I know you aren't really planning on doing 2 billion degree-of-freedom problems, but we will need to for the ECP project, and in general, to claim that you are a next-gen code you should be looking at problem sizes in this range. @artv3 maybe we can look at the performance difference between using int32_t and int_fast32_t on systems where int_fast32_t is a 64-bit integer. It seems like the compiler optimizations that you get from int64 might be offset by the doubling of the amount of data you have to move to get the indices around.
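
A small self-contained illustration of the width point: on common 64-bit Linux toolchains std::int_fast32_t occupies 8 bytes, so passing a localIndex-typed DOF index into the int-based Epetra vector involves a narrowing cast that should be range-checked. The helper name below is hypothetical:

#include <cassert>
#include <cstdint>
#include <cstdio>
#include <limits>

using localIndex = std::int_fast32_t;  // 8 bytes on typical 64-bit Linux toolchains

// Hypothetical helper: narrow a localIndex to the 32-bit int expected by
// Epetra_IntSerialDenseVector, asserting that the value actually fits.
inline int toEpetraIndex( localIndex const value )
{
  assert( value >= 0 && value <= std::numeric_limits< int >::max() );
  return static_cast< int >( value );
}

int main()
{
  std::printf( "sizeof(localIndex) = %zu, sizeof(int) = %zu\n",
               sizeof( localIndex ), sizeof( int ) );
  return toEpetraIndex( 42 ) == 42 ? 0 : 1;
}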

Member

@klevzoff Oh wait. I just read the subtext of the commit. Epetra_LongLongSerialDenseVector was throwing an exception? When was the exception thrown?

Contributor Author
@klevzoff klevzoff Jul 15, 2018

@rrsettgast this is what I had at the very first assembly call:

Running simulation:
Time: 0s, dt:10s, Cycle: 0
terminate called after throwing an instance of 'int'
executing stackTrace.cpp::handler(6,1,1)
attempting unmangled trace: 
0         1         2         3         4         5         6         7         8         9         : 
0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789: 
../../../../../build-default-debug/bin/geosx(1)  : cxx_utilities::handler1(int)+0x1f [0x55dce6c9ce83]
/lib/x86_64-linux-gnu/libc.so.6(2)  : +0x3ef20 [0x7f790e840f20]
/lib/x86_64-linux-gnu/libc.so.6(3)  : gsignal+0xc7 [0x7f790e840e97]
/lib/x86_64-linux-gnu/libc.so.6(4)  : abort+0x141 [0x7f790e842801]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(5)  : +0x8c8fb [0x7f790f0b68fb]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(6)  : +0x92d3a [0x7f790f0bcd3a]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(7)  : +0x92d95 [0x7f790f0bcd95]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(8)  : +0x92fe8 [0x7f790f0bcfe8]
/home/klevtsov/work/geosx/thirdPartyLibs/install-default-release/trilinos/lib/libepetra.so.12(9)  : int Epetra_MultiVector::ChangeGlobalValue<long long>(long long, int, int, double, bool)+0x25b [0x7f790f72c76b]
/home/klevtsov/work/geosx/thirdPartyLibs/install-default-release/trilinos/lib/libepetra.so.12(10)  : Epetra_MultiVector::SumIntoGlobalValue(long long, int, int, double)+0xf [0x7f790f72c13f]
/home/klevtsov/work/geosx/thirdPartyLibs/install-default-release/trilinos/lib/libepetra.so.12(11)  : int Epetra_FEVector::inputValues<long long>(int, long long const*, double const*, bool, int)+0x77 [0x7f790f6fe637]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(12)  : +0x6a0ecd [0x7f79108f0ecd]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(13)  : +0x6a989c [0x7f79108f989c]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(14)  : +0x6aacea [0x7f79108facea]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(15)  : +0x6aafb9 [0x7f79108fafb9]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(16)  : +0x6a9a58 [0x7f79108f9a58]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(17)  : geosx::SinglePhaseFlow_TPFA::AssembleSystem(geosx::DomainPartition*, geosx::systemSolverInterface::EpetraBlockSystem*, double, double)+0xa59 [0x7f79108f196d]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(18)  : geosx::SolverBase::NonlinearImplicitStep(double const&, double const&, int, geosx::DomainPartition*, geosx::systemSolverInterface::EpetraBlockSystem*)+0x13a [0x7f79108cb65a]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(19)  : geosx::SinglePhaseFlow_TPFA::SolverStep(double const&, double const&, int, geosx::dataRepository::ManagedGroup*)+0x89 [0x7f79108ed955]
/home/klevtsov/work/geosx/GEOSX/build-default-debug/lib/libgeosx_core.so(20)  : geosx::ProblemManager::RunSimulation()+0x670 [0x7f7910867660]
../../../../../build-default-debug/bin/geosx(21)  : +0x71ba2 [0x55dce6c97ba2]
/lib/x86_64-linux-gnu/libc.so.6(22)  : __libc_start_main+0xe7 [0x7f790e823b97]
../../../../../build-default-debug/bin/geosx(23)  : +0x7175a [0x55dce6c9775a]

I checked in gdb and there was nothing fishy going on on our side (i.e. values passed as indices were good), but the debugger does not see into Trilinos code, which is built without debug info. Just replacing the Epetra vector type solves the problem. Did you not have the same issue on your system?

I just reverted it temporarily so I could continue developing and testing other things. I agree the index data type should be 64-bit.

Contributor Author
@klevzoff klevzoff Jul 15, 2018

I found the cause, will push a fix soon.

- Physics: slightly compressible solid (uniaxial compressibility), pressure dependent viscosity.
- LinearEOS: add viscosity and porosity updates using (linearized) exponential formulas, extract a generic exponential relation compute class.
- Other: implement pore volume scaling in residual norm calculation (residual is now relative)
- All quantities allocated on the mesh instead of constitutive data (temporarily)
…scaled residual computation without copying the vector
NUMBER sum_in_set(localIndex const * const indexList, const localIndex len, LAMBDA && body)
{
  RAJA::ReduceSum<REDUCE_POLICY, NUMBER> sum(NUMBER(0));
  RAJA::TypedListSegment<localIndex> listSeg(indexList, len);
Contributor

Perhaps make listSeg(indexList, len) -> listSeg(indexList, len, Unowned), to avoid a deep copy of the data.
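
A sketch of how the complete helper might look with the suggested flag, assuming a RAJA version whose TypedListSegment constructor accepts an IndexOwnership argument; the execution-policy parameter and the forall call are assumptions about how the surrounding code is set up:

template< typename POLICY, typename REDUCE_POLICY, typename NUMBER, typename LAMBDA >
NUMBER sum_in_set( localIndex const * const indexList, localIndex const len, LAMBDA && body )
{
  RAJA::ReduceSum< REDUCE_POLICY, NUMBER > sum( NUMBER( 0 ) );

  // RAJA::Unowned: the segment references indexList rather than deep-copying it,
  // so the caller keeps ownership of the index array.
  RAJA::TypedListSegment< localIndex > listSeg( indexList, len, RAJA::Unowned );

  RAJA::forall< POLICY >( listSeg, [=] ( localIndex const k )
  {
    sum += body( k );  // accumulate the user-supplied per-index contribution
  } );

  return sum.get();
}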

@klevzoff
Contributor Author

klevzoff commented Jul 20, 2018

@rrsettgast @joshua-white @castelletto1 @artv3 requesting more review comments or approval. I would like to merge this sooner rather than later (before some big refactoring hits develop and we have to re-merge all the changes all over again).

Some points remain unaddressed:

  • representation of stencils and efficient flux assembly kernels: this is what I plan to work on next, but it will be part of a new component (FVDiscretizationManager or some such) and come in a separate PR
  • analytical solution tests: I have committed very simple incompressible tests with a linear pressure distribution in 2D; more complex functions like cosines turned out to be trickier, as we need to implement proper pressure boundary conditions first (imposed on boundary faces, not cell centers). Also, we're not running tests in Travis yet, so I don't want to spend too much time on this.

Other than that, I think the solver is in relatively decent shape for now. Things may (and will) be revisited as we work on the multi-phase solver and whenever other components are refactored (constitutive, linear algebra, etc.).

@klevzoff klevzoff merged commit b6b2dac into develop Jul 20, 2018
@klevzoff klevzoff deleted the feature/klevzoff/single-phase-flow branch July 20, 2018 20:06