Petsc interface #1805

cmacmackin · 2019-10-14T10:48:31Z

This is the PETSc interface which I implemented as part of #1803, which @ZedThree requested I split into a separate PR. It addresses issue #1771 and allows vectors and arrays to be set using the local BOUT++ index objects for that processor. The wrapper will automatically handle conversions to global indexing for PETSc. It also handles the interpolation weights needed when calculating along-field values in the y-direction. It has been extensively unit tested. The unit tests for the index conversion ability required some modest refactoring of the Mesh class, making some methods virtual which weren't previously and adding methods which wrap some MPI calls (so that they can be overridden for testing purposes). This allows the parallel behaviour of the indexing to be tested while running the unit tests in serial.

There were a number of specific comments Peter made about my code on the old PR. I will address those and then push the changes to this branch.

The basic call-sequence of my PETSc interface can be seen below:

Basically, I need the indexing operator to return something to which matrix/vector values can be assigned. When assigning to that object, it would then call the PETSc setter routines. While something like this could have been accomplished with a single method on PetscMatrix along the lines of

  setElement(ind_type& index1, ind_type& index2, BoutReal val);

the approach I took allowed something which feels more natural and is thus more readable.

Peter had said

Also, historically Mesh has been a bit of dumping ground for new behaviour, and we'd quite like to avoid adding new things to it if at all possible. Do you think there are alternative ways of achieving the same functionality?

It should be possible to create a a new class which provides access to the MPI methods (or add them to BoutComm). The reason I took the approach I did was because it for minimal changes to existing calls to MPI routines within the code by ensuring the wrapper methods were in the same class/namespace as the Mesh routines which call them. I don't have any objections to placing the wrappers on a separate class, i just didn't feel confident enough to make such a disruptive change on my own.

* include/bout/petsc_interface.hxx: C

I have added an object-oriented interface between BOUT and PETSc which will allow for easier conversion between Bout Fields and PETSc vectors and matrices. In particular, the interface uses BOUT indexing and then internally handles the conversion to PETSc global indexing. I have written unit tests for this interface to ensure it behaves as expected. In particular, I have checked that it can handle indexing properly for parallel simulations. As all tests are run in serial, this required quite a lot of work. I created a new fake mesh class which acts as though it is on just one processor of many. By creating a group of these meshes they can pass information to each other, emulating communication between processors. I also needed to fake MPI calls, which required creating virtual methods for them on the mesh class. These are then overridden in the fake mesh implementation. I also structured the classes in my PETSc interface so that fake classes could override certain methods where it is necessary to make the pretend-parallelism work. This PETSc interface will make implementation of a 3D Laplace solver much simpler.

I've written implementations for the setters (tests passing), indexers, and vectors. For the latter two, the tests currently fail. I've also fixed the makefile so that gmock is now linked.

This required me to make a number of changes to the fake parallel mesh class I'd added for testing. I had made numerous errors in the implementation of that and had overlooked a few problems. However, the current state of the code will now compile and all indexer tests pass.

All unit tests now pass for this type. PETSc prints some error messages to the screen, but this is expected; it occurs when testing that erroneous use-cases are caught correctly.

While writing a 3D Laplace it I caught a few bugs here and there in my PETSc interface, particularly around its handling of corner cells.

These regions are now created during setup of the mesh. Creating them myself in the tests is unnecessary and leads to an error because they already exist.

@ZedThree

Following suggestions made by @ZedThree on PR #1803, I have made a number of changes to the PETSc interface. The most significant of these are: - Changing the template parameters in this interface from F to T - Moving the PetscVectorElement and PetscMatrixElement classes inside PetscVector and PetscMatrix, respectively. Then are now PetscVector::Element and PetscMatrix::Element - Changing the Element classes to return values rather than pointers - Adding conversion routines to turn PetscVector::Element into BoutReal - Removing the copy-and-swap idiom from the PetscVector copy-assignment operator - Making some more variables const and some return-types non-const (more still to be done on this) - Converting from C-style casting to C++ static_cast<> - Various other minor fixes and style changes

`FakeMesh::createBoundaryRegions()` is now called when constructing `FakeMeshFixture`, so attempts to call it a second time from within this unit test result in an error. As such, I removed the second call.

Converted these into PascalCase.

cmacmackin · 2019-10-15T14:24:54Z

Most of the specific issues raised on PR #1803 have now been resolved on this branch. The main ones that remain are

adding an operator BoutReal() method to PetscMatrix::Element, if possible
placing MPI calls onto their own class (e.g., BoutComm), rather than BoutMesh
placing the protected BoutMesh::BoutMesh() constructor elsewhere?
writing unit tests for operator BoutReal() on the Element classes
deciding how to handle assignment of Element objects to each other
anything else which gets flagged up during review

ZedThree · 2019-10-15T16:16:32Z

Currently failing to compile because it turns out std::less<void> in FakeParallelMesh is C++14 only, and we currently only support C++11 (this will be changing after the next release!).
Luckily, I don't think std::less<void> is required, so you should just be able to delete it.

I pass CXXFLAGS=-std=c++14 to ./configure to force C++11 compatibility.

There's a strange double-free bug coming from the unit tests. I'm not sure I completely understand, but it looks like it's something to do with the way createFakeProcessors is setting bout::globals::mesh. I fixed it by adding a destructor to the test that resets the global mesh to nullptr:

  virtual ~ParallelIndexerTest() {
    bout::globals::mesh = nullptr;
  }

Google recommend adding a virtual destructor to all test fixtures, btw. It's never been a problem for me, but it's probably wise.

The ParallelIndexerTest is quite noisy due to some of the functions its calling. There's a helper class, WithQuietOutput which can be used to silence particular output streams for the duration of a test. Just add some member variables to the fixture:

WithQuietOutput info{output_info}, warn{output_warn}, progress{output_progress}, all{output};

TestMixedSetting and TestDestroy for the vector and matrix tests is also noisy, but this is from PETSc. If you set PetscErrorPrintf = PetscErrorPrintfNone in the fixture constructor, and PetscErrorPrintf = PetscErrorPrintfDefault in the destructor, that should silence them.
I'm guessing these are supposed to raise PETSc errors?

This required both disabling the output streams used by BOUT++ (to suppress the information it prints to the screen) and setting `PetscErrorPrintf = PetscErrorPrintfNone;` so that PETSc error messages (which are expected on tests checking erroneous behaviour) are no longer visible.

cmacmackin · 2019-10-16T08:25:20Z

Currently failing to compile because it turns out std::less<void> in FakeParallelMesh is C++14 only, and we currently only support C++11 (this will be changing after the next release!).
Luckily, I don't think std::less<void> is required, so you should just be able to delete it.

I pass CXXFLAGS=-std=c++14 to ./configure to force C++11 compatibility.

Yes, I noticed this problem on the CI server. Good to see that it's an easy fix.

There's a strange double-free bug coming from the unit tests. I'm not sure I completely understand, but it looks like it's something to do with the way createFakeProcessors is setting bout::globals::mesh. I fixed it by adding a destructor to the test that resets the global mesh to nullptr:
  virtual ~ParallelIndexerTest() {
    bout::globals::mesh = nullptr;
  }
Google recommend adding a virtual destructor to all test fixtures, btw. It's never been a problem for me, but it's probably wise.

Where do you see this bug? I'm not getting it when I run on my system. Also, do you have a reference for the Google recommendation? I'd be interested to read why.

The ParallelIndexerTest is quite noisy due to some of the functions its calling. There's a helper class, WithQuietOutput which can be used to silence particular output streams for the duration of a test. Just add some member variables to the fixture:
WithQuietOutput info{output_info}, warn{output_warn}, progress{output_progress}, all{output};
TestMixedSetting and TestDestroy for the vector and matrix tests is also noisy, but this is from PETSc. If you set PetscErrorPrintf = PetscErrorPrintfNone in the fixture constructor, and PetscErrorPrintf = PetscErrorPrintfDefault in the destructor, that should silence them.
I'm guessing these are supposed to raise PETSc errors?

Thanks for mentioning these. The noise was bothering me quite a lot, especially where it consists of PETSc error messages (which, as you say, are supposed to be raised, as I'm checking that improper use results in an exception being thrown). I had tried to figure out how to suppress the noise from PETSc but didn't find anything when I Googled, so not sure where PetscErrorPrintf is hiding in the documentation.

ZedThree · 2019-10-16T16:27:55Z

Where do you see this bug? I'm not getting it when I run on my system.

It was a bit tricky to reproduce, as it involved multiple tests interacting so it could be due to tests being run in a different order. Perhaps try running ./serial_tests --gtest_shuffle --gtest_repeat=100 to see if that uncovers it? You could also use valgrind, but it doesn't look like the unit tests are even close to being valgrind clean at the minute. You would also need a suppression file to quiet the warnings from MPI as well.

Also, do you have a reference for the Google recommendation? I'd be interested to read why.

I can't seem to find an explicit recommendation specifically for googletest now, so I'm probably misremembering, but the gist of it is that base classes should have virtual destructors to ensure they get called by derived classes. Each TEST_F in googletest gets turned into a subclass of the fixture. If the destructor is empty, it probably doesn't hurt not having one.

Can the BoutMesh constructor you've added just go in FakeParallelMesh? Or does it need access to private members of BoutMesh?

These data races were causing tests to fail in a non-deterministic manner. One was caused by a variable being shared which should have been private. The other was due to the chance of simultaneous writes to `std::set`.

cmacmackin · 2019-10-24T15:09:52Z

I've finally gotten the CI test all to pass, having eliminated the data races and other thread-unsafe behaviour within the PETSc interface and tests. I've run clang-format on the files I wrote and will push those changes shortly. However, before I do, I wondered whether I should apply clang-format to all files in the repository. However, before I do, I'd like to know whether there are any other changes which remain to be made before this branch can be merged into next.

d7919 · 2019-10-24T15:13:53Z

I think generally we'd ask that you clang format just the regions that you've touched rather than whole files but @ZedThree can probably provide exact guidance.

ZedThree · 2019-10-24T15:15:17Z

I wouldn't run clang-format over anything except the changes in this PR. There will be a lot of other changes otherwise! I'm pretty sure there's a way to limit it to just particular commits (via git clang-format), but I'm happy to investigate that myself if it proves non-obvious.

I'm in the final stages of prepping the v4.3.0 release, so I hope to be able to give this another read through either tomorrow or Monday.

cmacmackin · 2019-10-24T15:27:59Z

Thanks for the heads-up about git clang-format. I wasn't aware that existed but it made life very easy. I just ran git clang-format --extension cxx,hxx next (i.e., applying it to all changes in .cxx and .hxx files relative to the next branch). I'll push those changes now so you can take a look at them tomorrow or Monday.

ZedThree

Looks good! A few points to go through, but I think basically ready to go in.

I've not been through the tests yet, I'll do that tomorrow

include/bout/petsc_interface.hxx

ZedThree · 2019-10-28T15:56:32Z

include/bout/petsc_interface.hxx

+public:
+  /// If \p localmesh is the same as the global one, return a pointer
+  /// to the global instance. Otherwise create a new one.
+  static IndexerPtr getInstance(Mesh* localmesh);


Given that this is tied to a particular mesh, maybe we should think about moving it into there eventually. That would likely require this working for all boundary sizes, so we'll hold off on that for now

include/bout/petsc_interface.hxx

src/mesh/mesh.cxx

src/sys/petsc_interface.cxx

tests/unit/fake_parallel_mesh.hxx

@ZedThree

These changes are at the request of @ZedThree in PR #1805. As yet they have not been tested.

I've added additional boundary regions with the suffix _THIN which differ from the exinst regions by 1. Including corner values (in y boundaries) 2. Only being one cell deep In addition, the GlobalIndexer class creates a region from these and RGN_NOBNDRY, called RGN_ALL_THIN. This allows far more compact iteration over the grid cells to be stored in PETSc vectors. The new regions will also be useful when setting boundary condition in various solvers. While doing this refactoring I caught an error in the routine on BoutMesh which created the existing boundary regions. There was a call to `firstX()` which should have been `lastX()` when determining what indices to place in RGN_OUTER_X. I can only presume that this hasn't been caught yet because no one has used these region when there is more than one processor in the x-direction. This refactor revealed that sometimes the GlobalIndexer class ended up returning a stale instance between unit tests, due to a quirk of where new values of `localmesh` were being allocated in memory. To avoid this I have added an explicit `recreateGlobalInstance()` method, which marks that the indexer corresponding to the global mesh should be recreated the next time it is requested.

Turned out I was getting the interpolation weights incorrectly, as I shouldn't have used the y-index for the position I wanted but instead for where I want to extend along the field-line from.

cmacmackin · 2019-11-13T09:56:45Z

I've committed the changes requested above, as well as a few bug-fixes I made while running integration tests for the 3D AMG Laplace solver. I've also merged in the latest changes from the next branch. Once the CI is finished running everything should be ready to merge.

cmacmackin · 2019-11-13T15:47:55Z

2 of the CI runs failed, but I'm not sure why. In both cases the log says

$ travis_apt_get_update

No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.

Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received

The build has been terminated

It looks like something may be wrong with Travis itself.

bendudson · 2019-11-13T15:50:26Z

Yes, agreed. It looks like Travis had a hiccup. I've restarted the jobs so hopefully they work this time.

ZedThree · 2019-11-14T10:34:52Z

For reason the submodules are reverting to the previous commit we had. But I can see next was merged in very recently and merging it in again doesn't update the submodules. I'm going to sort it out manually and push to this branch.

ZedThree

Just two things need fixing, then we should get this in

include/bout/petsc_interface.hxx

cmacmackin · 2019-11-14T14:05:36Z

Fixed those to issues with assignment operators.

It seems I accidentally introduced a commit reverting the updates to the submodules. I had thought that I was bringing the submodules commits in my repo up-to-date with the ones in next but I actually overwrote those changes with the old commits by mistake.

ZedThree · 2019-11-14T16:31:59Z

Great work, thanks @cmacmackin!

cmacmackin added 8 commits October 14, 2019 11:19

Added initial PETSc interface.

d52af5f

* include/bout/petsc_interface.hxx: C

Added unit tests for PETSc interface.

2751e8b

Progress on petsc_interface implementation

509e1bb

I've written implementations for the setters (tests passing), indexers, and vectors. For the latter two, the tests currently fail. I've also fixed the makefile so that gmock is now linked.

Finished implemented PetscVector type

63d628f

All unit tests now pass for this type. PETSc prints some error messages to the screen, but this is expected; it occurs when testing that erroneous use-cases are caught correctly.

Implemented PETSc matrix interface, all unit tests passing

3f14f2a

Changes to PETSc interface made while implementing 3D Laplace solver

2ab65b5

While writing a 3D Laplace it I caught a few bugs here and there in my PETSc interface, particularly around its handling of corner cells.

cmacmackin added tests feature labels Oct 14, 2019

cmacmackin requested a review from ZedThree October 14, 2019 10:48

cmacmackin mentioned this pull request Oct 14, 2019

Algebraic multigrid #1803

Merged

cmacmackin and others added 12 commits October 14, 2019 12:42

Removed std:: from global namespace in petsc_interface.hxx

515f24a

Move getters for interpolation weights to ParallelTransforms

9cbc474

Fix unused parameter warnings from getWeightsForYApproximation

b055753

Fixed PETSc interface so std:: not loaded in global namespace

14a4d81

No longer create RGN_X/YGUARDS in PETSc indexer tests

4f26d79

These regions are now created during setup of the mesh. Creating them myself in the tests is unnecessary and leads to an error because they already exist.

Fixed PETSc interface to only include boundaries to a depth of 1

6e07412

Converted positionsAndWeights class name to PascalCase

0936647

Switched to using unique_ptr to hold Vec in PetscVector class

d953ee5

Converted local variables within PETSc interface methods to const

e10069c

Fixed regression in SolverTest.GetLocalN()

8cf7f81

`FakeMesh::createBoundaryRegions()` is now called when constructing `FakeMeshFixture`, so attempts to call it a second time from within this unit test result in an error. As such, I removed the second call.

Fixed remaining referece to positionsAndWeights

b3c4818

Converted these into PascalCase.

Removed unnecessary C++14 feature std::less

b879ae2

Eliminated some data races from PETSc interface tests

5d2729f

These data races were causing tests to fail in a non-deterministic manner. One was caused by a variable being shared which should have been private. The other was due to the chance of simultaneous writes to `std::set`.

Applied clang-format to the new files added for the PETSc interface

f2d2c72

Applied clang-format to changes made on the petsc_interface branch

ebd509e

ZedThree reviewed Oct 28, 2019

View reviewed changes

cmacmackin and others added 8 commits November 12, 2019 14:32

Small changes and refactoring to PETSc interface

512d2f8

These changes are at the request of @ZedThree in PR #1805. As yet they have not been tested.

Added PETSc interface method to permit partial assembly of matrices

1cc08c9

Automatied initialising of GlobalIndexer objects in PETSc interface

2f50bc0

Fixed bug setting yup/ydown values in PetscMatrix

ff7ecea

Turned out I was getting the interpolation weights incorrectly, as I shouldn't have used the y-index for the position I wanted but instead for where I want to extend along the field-line from.

Made initialiseTest() method public in PETSc indexer unit tests

d2bab1c

Merge branch 'next' into petsc_interface

b9e44d7

Applied clang-format to changes made in petsc_interface branch

97d682d

Update googletest and mpark.variant submodules

b249358

ZedThree requested changes Nov 14, 2019

View reviewed changes

include/bout/petsc_interface.hxx Show resolved Hide resolved

include/bout/petsc_interface.hxx Outdated Show resolved Hide resolved

Fixed assignment operators in PETSc interface

a202ef0

ZedThree approved these changes Nov 14, 2019

View reviewed changes

ZedThree merged commit a31a78e into next Nov 14, 2019

ZedThree deleted the petsc_interface branch November 14, 2019 16:31

ZedThree mentioned this pull request Dec 3, 2019

Interface to use BOUT++ indices with PETSc objects #1771

Closed

cmacmackin mentioned this pull request Dec 13, 2019

Generalising the PETSc interface #1873

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Petsc interface #1805

Petsc interface #1805

cmacmackin commented Oct 14, 2019

cmacmackin commented Oct 15, 2019 •

edited

ZedThree commented Oct 15, 2019

cmacmackin commented Oct 16, 2019

ZedThree commented Oct 16, 2019

cmacmackin commented Oct 24, 2019 •

edited

d7919 commented Oct 24, 2019

ZedThree commented Oct 24, 2019

cmacmackin commented Oct 24, 2019

ZedThree left a comment

ZedThree Oct 28, 2019

cmacmackin commented Nov 13, 2019

cmacmackin commented Nov 13, 2019

bendudson commented Nov 13, 2019

ZedThree commented Nov 14, 2019

ZedThree left a comment

cmacmackin commented Nov 14, 2019 •

edited

ZedThree commented Nov 14, 2019

Petsc interface #1805

Petsc interface #1805

Conversation

cmacmackin commented Oct 14, 2019

cmacmackin commented Oct 15, 2019 • edited

ZedThree commented Oct 15, 2019

cmacmackin commented Oct 16, 2019

ZedThree commented Oct 16, 2019

cmacmackin commented Oct 24, 2019 • edited

d7919 commented Oct 24, 2019

ZedThree commented Oct 24, 2019

cmacmackin commented Oct 24, 2019

ZedThree left a comment

Choose a reason for hiding this comment

ZedThree Oct 28, 2019

Choose a reason for hiding this comment

cmacmackin commented Nov 13, 2019

cmacmackin commented Nov 13, 2019

bendudson commented Nov 13, 2019

ZedThree commented Nov 14, 2019

ZedThree left a comment

Choose a reason for hiding this comment

cmacmackin commented Nov 14, 2019 • edited

ZedThree commented Nov 14, 2019

cmacmackin commented Oct 15, 2019 •

edited

cmacmackin commented Oct 24, 2019 •

edited

cmacmackin commented Nov 14, 2019 •

edited