Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make input Options case sensitive and error on unused Options #2210

Merged
merged 110 commits into from
May 28, 2021

Conversation

ZedThree
Copy link
Member

As discussed in #2209, this makes Options case sensitive, which is a breaking change.

It also introduces another breaking change: by default, any unused options after an initial call to rhs is an error. This can be toggled with input:error_on_unused_options (I used this instead of input:strict as it's more explicit: strict sounds like turning it off might allow case-insensitive options).

I had to fix some other bits for this to work: some Solver unit tests were a little fragile, and constructing Options from a braced init list turned out to be broken in a fun way. Writing Options::getUnused also turned out to be way more difficult than I was expecting. Options is actually a little more difficult to use than I thought.

I'm not overly happy with this check being done in Solver::solve -- it's the obvious place, but it doesn't really feel like Solver 's responsibility. I want to also add input:help which prints out all the seen options, and again, Solver::solve is the obvious place, but it feels like it should live somewhere "higher up".

Also to be done, add Levenshtein distance checking to point out likely typos

Also, I've added toString(Options) which ignores the doc strings and source. It's very close to just completely replacing OptionsINI, so I'd also like to work out how to merge all that.

I figure all these changes are very likely to break some tests, so I'm looking forward to seeing what needs fixing!

Constructing `Options` using an initializer_list like:

    Options option {{"section1", {"value1", 42}}};

would not initialise the section names. This mostly down to the value
being constructed from the very bottom of the tree. When we construct
the bottom-most value, we don't know its parent's name, so we can't
give it its full_name "section1:value1".

Instead, we need to do this recursively, as we construct each parent.
Need all elements of Options::ValueType to have operator== in order to
check equality of Options itself
Avoids segfault if run_rhs is called in Solver::solve
This is a breaking change! Set input:error_on_unused_options = false
for old behaviour
This is a breaking change: input options are now case sensitive
@ZedThree ZedThree changed the title Make input Options case sensitive Make input Options case sensitive and error on unused Options Jan 15, 2021
@@ -661,6 +666,13 @@ public:
return *this;
}

friend bool operator==(const Options& lhs, const Options& rhs) {
if (lhs.is_value and rhs.is_value) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth having a shortcut here to check if they are the same object (&lhs == &rhs)?

};
append_impl(children, section_name, append_impl);
};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth instead making full_name a function, which iterates through its parents? Of course that raises the problem of setting pointers to parents... Another possibility is to not store the full name or parent, but that then puts it on wherever the full name is needed to work it out.

// `(*this)[value.first].full_name` in the copy constructor, so we
// need to explicitly set it again
(*this)[value.first].full_name = value.first;
append_section_name((*this)[value.first].children, value.first);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry! I missed this before. Maybe worth checking where full_name is actually needed, and if it could be worked out when needed e.g. as the tree of options is traversed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I needed it for printing out subsections correctly, i.e.

[section:subsection]
value = thing

but, yes, maybe this could be done when traversing the tree

src/sys/options.cxx Outdated Show resolved Hide resolved
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

src/solver/solver.cxx Show resolved Hide resolved
tests/unit/solver/test_solver.cxx Outdated Show resolved Hide resolved
tests/unit/solver/test_solver.cxx Outdated Show resolved Hide resolved
tests/unit/solver/test_solver.cxx Outdated Show resolved Hide resolved
tests/unit/solver/test_solver.cxx Outdated Show resolved Hide resolved
tests/unit/solver/test_solver.cxx Outdated Show resolved Hide resolved
tests/unit/sys/test_options.cxx Show resolved Hide resolved
tests/unit/sys/test_options.cxx Show resolved Hide resolved
tests/unit/sys/test_options.cxx Show resolved Hide resolved
tests/unit/sys/test_options.cxx Show resolved Hide resolved
@ZedThree
Copy link
Member Author

As predicted, lots of test failures. It does look like a lot, if not all, are due to the case sensitivity:

The following tests FAILED:
	  5 - test-communications (Failed)
	  6 - test-coordinates-initialization (Failed)
	  8 - test-delp2 (Failed)
	  9 - test-drift-instability (Failed)
	 10 - test-drift-instability-staggered (Failed)
	 11 - test-fieldgroupComm (Failed)
	 12 - test-griddata (Failed)
	 16 - test-interchange-instability (Failed)
	 18 - test-interpolate-z (Failed)
	 19 - test-invertable-operator (Failed)
	 20 - test-invpar (Failed)
	 24 - test-laplace-petsc3d (Failed)
	 28 - test-naulin-laplace (Failed)
	 29 - test-petsc-laplace (Failed)
	 30 - test-petsc-laplace-MAST-grid (Failed)
	 31 - test-restart-io (Failed)
	 32 - test-restart-io-hdf5 (Failed)
	 33 - test-restarting (Failed)
	 34 - test-slepc-solver (Failed)
	 37 - test-solver (Failed)
	 38 - test-squash (Failed)
	 40 - test-stopCheck-file (Failed)
	 41 - test-twistshift (Failed)
	 42 - test-twistshift-staggered (Failed)
	 43 - test-vec (Failed)
	 44 - test-yupdown (Failed)
	 45 - test-yupdown-weights (Failed)

@ZedThree
Copy link
Member Author

I've started looking through the failures. If we're going case-sensitive, we're going to need to decide on whether we want to keep existing spellings/cases or change them to match common usage. For example, the NOUT option is currently all caps, but very split in the tests and examples whether it is all caps or not. On the other hand, the mesh:dd*:First option is capital F but all the input files spell it first.

If we provide a tool for correcting case in input files, this should have minimal impact on users so I'm inclined to make most/all options in the library lowercase.

@ZedThree
Copy link
Member Author

ZedThree commented Jan 19, 2021

Still WIP, and expecting tests to fail, as I've not yet fixed the input names, but errors should include suggestions

TODO:

  • fix input options to prefer lowercase by default (maybe except where the variable itself is ALL CAPS? but maybe also fix that too?)
  • add fixes for input options to bin/bout-v5-input-upgrader.py and apply to all input files
  • remove other unused options from suggested alternatives
  • remove unused options with a default source from the set of unused options (we don't want to error because run:version = v5.0.0-alpha wasn't used -- this is really an "output" option, not an input)
  • add input:help to display inputs and docstrings
  • add custom fmt formatter for Options to enable printing either as INI-style or as flattened keys, with/without docstrings/type/source etc?

include/options.hxx Outdated Show resolved Hide resolved

if (error_on_unused_options) {
Options unused = globaloptions.getUnused();
if (not unused.getChildren().empty()) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this whole chunk should be broken out into a function somewhere else? I'm still not keen on this living in Solver::solve, but moving it out requires something like BOUTMAIN calling PhysicsModel::init/run_rhs to get all the options.

Also the input:help functionality needs to essentially be called in the same place, and then we may have to be careful that nothing has tried to write to any files, or we'll end up clobbering stuff. I was wondering if we could somehow run in a "fake" location somewhere to avoid this. We'd still need access to the input file and grid file, etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to have the init and first RHS call be done in a separate call, rather than inside Solver. I remember a long time ago (2013?) we had to change how Solver initialised and ran, because of the weird interface PETSc had then. Perhaps its worth revisiting?

@@ -109,6 +137,53 @@ const Options &Options::operator[](const std::string &name) const {
return it->second;
}

std::multiset<Options::FuzzyMatch>
Options::fuzzyFind(const std::string& name, std::string::size_type distance) const {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are various choices we could make about how this actually functions. I've chosen a default max distance of 4, so an option with more than 4 edits from the search string aren't counted, for example: some_string won't match another_string (by default).

It also matches an "unqualified name" (e.g. first) against a "fully qualified" name (e.g. mesh:ddx:first) with a distance of 1. But it won't try to strip the parent sections and match that. So mesh:ddx:first won't match some:other:first.

Another check is matching different cases: THIS matches this matches ThIs, all distance 1.

These choices were a little arbitrary, so definitely happy to hear suggestions on other choices

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

tests/unit/sys/test_options.cxx Show resolved Hide resolved
tests/unit/sys/test_options.cxx Show resolved Hide resolved
tests/unit/sys/test_options.cxx Show resolved Hide resolved
tests/unit/sys/test_options.cxx Show resolved Hide resolved
tests/unit/sys/test_utils.cxx Show resolved Hide resolved
tests/unit/sys/test_utils.cxx Show resolved Hide resolved
tests/unit/sys/test_utils.cxx Outdated Show resolved Hide resolved
tests/unit/sys/test_utils.cxx Outdated Show resolved Hide resolved
@ZedThree
Copy link
Member Author

Might need to turn off clang-tidy-review on all the googletest stuff...

* next:
  Note removing support for HDF5 in changelog
  Better notation for indexing zShift
  Update note on zShift in grid-files section of manual
  Print error for HDF5 macros when running bout-v5-macro-upgrader.py
  Support macros being removed in bout-v5-macro-upgrader.py
  Remove HDF5 support
@ZedThree
Copy link
Member Author

ZedThree commented Feb 5, 2021

Failing on unused options has a "fun" case where we have something like this:

    if (!estatic && !ZeroElMass) {
      apar_solver = Laplacian::create(globalOptions->getSection("aparsolver"));
    }

Here, everything in the aparsolver section may be unused. I can see two ways of dealing with this:

  1. just comment out the section in the input file
  2. unconditionally create apar_solver

More generically, do we expect the user to comment out or otherwise remove inputs when some other option means they go unused? Or to restructure their code to always read the options at least? Or perhaps to just confirm they've spelt everything correctly and then set input:error_on_unused=false?

@bendudson
Copy link
Contributor

I think an error message suggesting that the user checks, and then sets input:error_on_unused=false would be fine. We could perhaps have a function which says that an option is not currently used, but may be e.g.

if (!estatic && !ZeroElMass) {
   apar_solver = Laplacian::create(globalOptions->getSection("aparsolver"));
} else {
   globalOptions->getSection("aparsolver").setConditionallyUnused();
}

but with a better name. Perhaps an attribute in the Option so that the check for unused options skips that option or section.

@ZedThree
Copy link
Member Author

ZedThree commented Apr 8, 2021 via email

* next: (159 commits)
  CMake: Add 'docs' synonym for sphinx-html target
  CMake: Add PYTHONPATH env var to docs command
  CMake: Use correct latex builder invocation
  CMake: Fix missing 'echo' in docs command
  Docs: Fix use of deprecated/removed Sphinx method
  Remove extraneous `this->` when setting member variables
  Fix a couple of typos in tests
  Only check local y points if we have multiple processors in Y
  Docs: Fix sphinx conf.py for breathe 4.30
  CMake: Print link to generated docs after completion
  CMake: Fix typo in docs comment
  [skip ci] Apply black changes
  Print possible boundary regions during startup
  Add Mesh method to get the set of all possible boundaries
  Move constraint creation to utility method
  CMake: Default to not building the docs
  CMake: Be explicit that docs aren't built in `all`
  CMake: Fix typo
  Make method parameter names consistent (clang-tidy fix)
  Check version of SUNDIALS is new enough for constraints
  ...
mxg = 1
myg = 1
MXG = 1
MYG = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these top-level options be removed? There are MXG, MYG settings in the mesh section too

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One for @dschwoerer ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the top-level options are removed, the mesh section ones are used, and the simulation fails.

I'll push a fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lovely, thanks @dschwoerer 👍

bendudson
bendudson previously approved these changes May 16, 2021
Copy link
Contributor

@bendudson bendudson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ZedThree ! This is quite a large PR. Probably good to merge soon before it conflicts with everything

@ZedThree
Copy link
Member Author

I'd just like to implement the fix for the boundary condition checking before we merge this. Hopefully won't take too long!

Also, it's not necessary for this PR, but merging #2277 would also allow a small simplification here.

* next: (58 commits)
  Report enabled sanitizers in final config summary
  Fix clang-tidy suggestions: move-assignment should be noexcept
  Don't std::move trivially-movable objects
  Avoid calls to virtual functions in constructors
  Handle self-assignment in copy-assignment
  Make move-assignment operator noexcept
  Use explicit nullptr check in conditional
  Remove else-after-return
  Fix use-after-move in Field3D move-assignment
  Add move-assignment for Field2D
  Fix for clobbering boundary operators in field assignment
  Fix for `setBoundary` being called before `fieldmesh` has been set
  Use local mesh in `FieldData` methods
  Don't use reserved identifiers as header guards in fields
  Unconditionally include string header for `Field`
  Update docstrings for `Field` and `FieldData` classes
  Make most `FieldData` members private with getters, not protected
  Swap `FieldData` members in `FieldData`
  Make `setDirectionY` virtual
  Make `Field::directions` private
  ...
We now (soft) error if there are any unused input parameters, but
because the set of boundary regions can be different on different
processors, we might not "use" (that is, read the value of) the
corresponding input parameters on the other processors!

This function is a bit of a workaround for the above: essentially
marks all possible boundary condition inputs as having been used on
each processor. This way, we should still catch invalid inputs and
misspellings (for example, `bdry_oll`) while still allowing valid
boundaries on other processors.
These aren't needed as boundaries are set manually on `f`
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

include/bout/mesh.hxx Show resolved Hide resolved
@ZedThree
Copy link
Member Author

Should've just merged this earlier, instead of trying to fix it all in this PR!

It looks like the failing tests are because Mesh::getPossibleBoundaries is much slower than I thought it was. I'm going to try a couple of things, but I might just revert the use of that to get this in.

Turns out creating `Region`s on production-sized meshes is
expensive. We don't need them in this function, so don't bother
computing them!
@ZedThree
Copy link
Member Author

It turns out the biggest cost of getPossibleBoundaries was in creating Regions on the copied meshes, which turns out to be quite expensive for production-sized meshes. Not constructing the regions reduces the cost of getPossibleBoundaries by 10^4!

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

include/output.hxx Show resolved Hide resolved
@ZedThree
Copy link
Member Author

* next: (63 commits)
  Use better variable names
  Add downloading dependencies to CMake install doc
  CMake: Add option to download SUNDIALS at configure time
  Delete some commented out code
  Remove some unused private members
  Fix clang-tidy warning about uninitialised members
  Use override on overridden virtual functions
  Make some LaplacePCR methods const
  Reduce scope of some local variables and immediately initialise them
  Fix some clang-tidy warnings
  Fix use of reserved identifier for header guard
  Keep const in cxx
  Remove const int input
  Add PCR case to the test-laplace integrated test
  clang format
  Allow NYPE != 1
  Inline the setup function
  Use GlobalNxNoBoundaries
  Tidy comment
  Run clang tidy
  ...
Copy link
Contributor

@johnomotani johnomotani left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@ZedThree ZedThree merged commit 48a0f1d into next May 28, 2021
@ZedThree ZedThree deleted the case-sensitive-options branch May 28, 2021 08:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants