Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate Parameter Parsing: introducing ParameterAccessor #341

Merged
merged 27 commits into from
Mar 6, 2024

Conversation

mabruzzo
Copy link
Contributor

@mabruzzo mabruzzo commented Aug 7, 2023

This PR is the first in a series of that alters the way in which classes (like Method or Physics) are configured from values taken from the parameter file.

Plan to roll out these changes

We need to introduce these changes in discrete steps because (i) it would take a lot of work to create (and review) a PR migrating everything at once, and (ii) additional work needs to be done every time a new parameter gets added via the old approach (to convert it so that it uses the new approach).

With that in mind, this PR takes a large first step to begin overhauling parameter parsing. It starts by shifting the vast majority of Method config-parsing from Cello's Config object into the Method subclasses (the remaining stuff needs to be addressed after we finish the conversion in the Enzo-layer). It also overhauls the process of configuring some Method objects from the Enzo-layer

Additional PRs will follow that deal with migrating other Method classes in the Enzo-layer. The changes in many of these cases are slightly more involved/less straightforward. In the future I also plan to extend this to other parts of the codebase (like Initial subclasses or Physics subclasses).

Motivation

Since I'm not rolling out this transition all at once, I have added quite a bit of website documentation for this new approach. This documentation explains the differences in the approaches and motivates why we are making this change:

  • you can find this documentation in doc/source/devel/param.rst
  • Alternatively, you can look at the preview website pages (built as part of this PR). In the sidebar, go to Developer Guide -> Parameter Parsing and Configuration

With that said I will try to highlight some weaknesses (that emerged for historical reasons) of the old approach:

  • Adding a new parameter is fairly laborious. To "properly" add a new parameter for a given Method subclass in the Cello layer (Enzo layer), one must store the parsed the value as an attribute of the global Config (EnzoConfig), forward it to the subclass, where the value is also stored as an attribute. This involves ~7 steps (see documentation).
  • Because this is laborious, there is a strong temptation to cut corners. The common shortcut is to simply store the parsed value in the global Config (or EnzoConfig) object and always access the value from there. This is equivalent to storing the parsed values as global variables (which is obviously undesirable for a number of reasons).
  • the process of configuring multiple instances of a single Method subclass could be made cleaner

The change made to the parameter-parsing by this PR is fairly simple:

The constructors (or static-factory-methods) of Method subclasses directly retrieve the parameter values from the parameter file and store the values as attributes. We completely no longer store attributes them as attributes within Config or EnzoConfig (in fact, all steps involving these classes are skipped).

This change reduces the number of steps needed to introduce a new parameter from 7 to 3, and strongly discourages the usage of global variables. This takes inspiration from Athena++

The ParameterAccessor class

While one can imagine implementing our new approach by passing a reference to the Parameters class to the constructors of various Method classes, we instead choose to pass the newly defined ParameterAccessor class.

To understand ``ParameterAccessor``, I need to define a "parameter-path" (within the spoiler tag)

Parameters in Enzo-E parameter files for are organized into hierarchical groups. This mirrors in which files are organized within a directory hierarchy. The following block shows a snippet from a hypothetical parameter file that configures EnzoMethodHeat.

Method{
  list = ["heat"];
  heat {
    alpha = 0.5;
  }
}

To unambiguously refer to a parameter in the file, we would refer to the parameters in this snippet of Method:list and Method:heat:alpha. This shorthand is similar to file paths on a unix-like operating system (e.g. /home/matthew/my_file.txt). For that reason, I will use term "parameter-path" to refer to these "full names of these parameters".


Essentially, the ParameterAccessor has an associated root-parameter path specified during its construction:

  • All parameter paths passed to methods of this class are considered to be relative-paths to this root-location
  • The documentation highlights how this class helps simplify code in the case where we want to construct multiple instances of a given Method subclass.
  • Technically, the Parameters class supports a similar feature, but it's less obvious when code is relying upon that feature (when you see ParameterAccessor, you know that the feature is being used).

Notes for the reviewers

  • Please carefully read the documentation. I found it fairly difficult to write. So any feedback would.
  • Most of the changes outside of ParameterAccessor.[ch]pp just consist of moving parsing out of Config and EnzoConfig
  • I want your feedback about the following point (discussed in the following spoiler tag):
Addressing drawback of ``ParameterAccessor``

The primary drawback of ParameterAccessor is that we no longer specify the full parameter-paths of every parameter. While this is fundamentally necessary for gracefully handling the case where we want to configure multiple instances of a given Method class, it's still less obvious for new users.

My thought is: what it we extend the analogy between parameter-paths and file-paths?
- We could make a leading ":" signify an absolute parameter-path (much in the same way that a leading "/" signifies an absolute file path).
- For example, passing ":Method:heat:alpha" to an instance method of ParameterAccessor would always refer to the same parameter regardless of the root-path currently associated with the instance of ParameterAccessor
- This could be used in cases where we don't expect more than 1 instance of the Method subclass to need to be configured in a given simulation (which is a lot of them).
- We could also make a leading ".:" signify that a parameter-path is explicitly a relative path (much in the same way that a leading "./" explicitly signifies that a file-path is relative to the current directory).

Do you have any thoughts about this?


Future Plans:

  • In the future, I would like intention to remove the code where Cello/Enzo-E automatically set the courant factor of each Method subclass. Instead, I want each Method subclass to parse and set their own courant factor, especially since the default courant value may differ from case to case (for example, the VL+CT solver requires a value at or below 0.5 while lots of other methods require a value at or below 1.0).
  • Once we finish transitioning the Method objects, I want to start raising warnings (or optionally raising errors), when users specify parameters that aren't used for anything in a parameter file (usually this means there is some kind of typo - I've done this a lot).

@mabruzzo
Copy link
Contributor Author

@jobordner - I know a while back that you mentioned you had concerns about safe-parsing in SMP-mode. I can/should explicitly check that.

But, don't let this stop you from reviewing this PR. I'm very confident that there really shouldn't be any problems at all since I changed any of the underlying parsing code (or when the parsing is done)

@gregbryan gregbryan self-requested a review February 19, 2024 17:11
Copy link
Contributor

@gregbryan gregbryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me -- the ParameterAccessor seems good to me. Requested changes are just cosmetic. Thanks @mabruzzo!

doc/source/devel/param.rst Outdated Show resolved Hide resolved
doc/source/devel/param.rst Outdated Show resolved Hide resolved
doc/source/devel/param.rst Outdated Show resolved Hide resolved
doc/source/devel/param.rst Outdated Show resolved Hide resolved
doc/source/devel/param.rst Outdated Show resolved Hide resolved
doc/source/devel/param.rst Outdated Show resolved Hide resolved
doc/source/devel/param.rst Outdated Show resolved Hide resolved
doc/source/devel/param.rst Outdated Show resolved Hide resolved
doc/source/devel/param.rst Outdated Show resolved Hide resolved
src/Cello/problem_Problem.cpp Outdated Show resolved Hide resolved
@gregbryan
Copy link
Contributor

Commenting briefly on the questions raised: (1) I think the ability to specify an absolute and relative parameter path seems useful (but if this introduces too much more complexity in the code base, it is less appealing). The parameter-path analogy is immediately clear (at least to me). (2) I very much like the idea of eventually introducing checking for parameter use -- as you note, typos in parameter files are an ongoing issue and helping to catch them would be useful. Again, the issue is how much machinery is required to implement.

Co-authored-by: Greg Bryan <gbryan@astro.columbia.edu>
@jobordner jobordner self-requested a review February 22, 2024 18:28
@jobordner
Copy link
Contributor

Here's a link to my possibly excessive PR Review Nothing really critical--I will accept the PR after you've made whichever of the changes you decide to make. Let me know if you have any questions, and hope you don't mind the unconventional review approach. As the first in a series of PR's, I wanted to err on the side of thoroughness.

@mabruzzo
Copy link
Contributor Author

Thanks James! I'll make a point of addressing these suggestions (they're all really helpful!)

The changes include:
- renaming `ParameterAccessor` -> `ParameterGroup`
- removing some unnecessary comments
- renaming `ParameterGroup::get_root_parpath` -> `ParameterGroup::get_group_path`
- deleting `ParameterGroup::wrapped_Parameters_ref`
…stently ParameterGroup instances by value when constructing Method objects.
@mabruzzo
Copy link
Contributor Author

Commenting briefly on the questions raised: (1) I think the ability to specify an absolute and relative parameter path seems useful (but if this introduces too much more complexity in the code base, it is less appealing). The parameter-path analogy is immediately clear (at least to me). (2) I very much like the idea of eventually introducing checking for parameter use -- as you note, typos in parameter files are an ongoing issue and helping to catch them would be useful. Again, the issue is how much machinery is required to implement.

Thanks for responding to these points. It's my plan to definitely to eventually implement parameter-checking (a lot of the necessary infrastructure is already in place from a long time ago). I think I will probably implement the absolute parameter-path stuff at some (assuming it doesn't involve introducing too much machinery). But I probably want to wait on that until after we can remove some functionality from the Parameters class (which requires us to migrate a bunch of stuff first).

@mabruzzo
Copy link
Contributor Author

mabruzzo commented Feb 26, 2024

I've finished revising the code. @gregbryan and @jobordner, I'm ready for you to take one more look (I think this about ready to be merged at this point). I haven't changed anything too dramatic (it's mostly just systematic simple changes)

@jobordner in particular, I just wanted to briefly go down your list of suggestions:

  1. rename ParameterAccessor
    • Per your recommendation, I changed the name to ParameterGroup. The new name is more informative and easier to understand.
    • The name is a little misleading/confusing. Because Param and Parameters instances "own" the associated data, I'd naively assume a ParameterGroup instance owns all data in a parameter-group. Instead, it's really a lightweight reference/proxy for a parameter-group.
    • Some alternative names might be ParamGroupProxy or ParamGroupRef (I don't strongly prefer these alternatives over ParameterGroup, since they are a little verbose).
    • I feel like we could spend a long time bikeshedding on this topic. So I'm just going to go ahead and use ParameterGroup unless you (or Greg) tell me you prefer one of these alternatives (or a different alternative).
  2. prefer a consistent Method creation approach
    • I modified the creation of all Methods in this PR, so that they are all created by passing ParameterAccessor to the constructor
    • With that said, I don't think it should be a hard and fast rule that we can't use a factory method.
      • In cases like EnzoMethodStarMaker, EnzoMethodAccretion, and EnzoMethodFeedback where different configurations correspond to different to subclasses it makes a lot of sense to use a factory. (As an aside, it's my intention to eventually refactor those classes - I don't think we should be using 2 levels of inheritance in those cases... I think it produces very confusing code)
      • I can easily imagine a factory-method helping in certain cases where the internal representation of a Method object does not have a one-to-one mapping with the parameters in a parameter file.
  3. I agree - this seems like a great idea for the future.
  4. rename get_root_parpath()
    • Good idea - I did that.
  5. remove or rename wrapped_Parameters_ref()
    • Good point. I removed it.
  6. remove ParameterAccessor comments
    • I removed the highlighted comment
  7. pass ParameterAccessor by value or const reference
    • Good point. We now pass by value everywhere. It might be better to pass by const-reference, but we would need to declare all of value methods of Parameters to be const-qualified. But that's a little beyond the scope of this PR.
  8. That's a good point. I edited the documentation quite a bit
    - I fleshed out the first half of the documentation a lot more (and gave more examples). I think it should generally be more approachable
    - There's still a lot of room for improvement! In particular, I did not split design docs off from the developer docs. I definitely agree with your point that some of this stuff is better off as design docs, but I don't have it in me to do that right now.
    - It's my intention to continue improving the docs in the future, but I think it's probably good enough for right now

Copy link
Contributor

@gregbryan gregbryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Modified version looks good to me -- left a few minor comments (but approved because this doesn't need me to look at it again). Thanks!

doc/source/devel/param.rst Outdated Show resolved Hide resolved
doc/source/devel/param.rst Outdated Show resolved Hide resolved
doc/source/devel/param.rst Outdated Show resolved Hide resolved
doc/source/devel/param.rst Outdated Show resolved Hide resolved
mabruzzo and others added 2 commits February 28, 2024 09:05
Co-authored-by: Greg Bryan <gbryan@astro.columbia.edu>
@mabruzzo mabruzzo merged commit 5321ff0 into enzo-project:main Mar 6, 2024
7 checks passed
@mabruzzo mabruzzo deleted the param-prototype-courant branch March 6, 2024 20:42
mabruzzo added a commit to mabruzzo/enzo-e that referenced this pull request Mar 18, 2024
…odMHDVlct` from `EnzoConfig` into constructor (and helper functions).
@mabruzzo mabruzzo mentioned this pull request Apr 18, 2024
22 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants