Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python update script to read, reformat and write .prm files #5381

Merged
merged 5 commits into from Sep 14, 2023

Conversation

gassmoeller
Copy link
Member

@gassmoeller gassmoeller commented Sep 6, 2023

In preparation for ASPECT 3.0 we decided to move some parameters around and break compatibility of input files in a similar way as for ASPECT 2.0. To easily do this we will need an update script, and building on the experience during 2.0 it will be far easier to do this in a modern scripting language like python than in sed or awk like we did last time (perl would have been another alternative, but I and probably most of our users are more familiar with python).

So for this PR I took the functions for reading prm files we had in contrib/python/scripts/aspect_data.py and improved and extended them to maintain all relevant information. I also added functions for writing the parameters back to a file and added a first reformatting step, which is to systematically remove the warning that our update scripts for 2.0 introduced in many files.

The script uses the following style principles (and we can discuss them, but I think this is most closely following what most of us use anyway):

  • indent by two spaces for each subsection level, and indent comments as well as content
  • ensure an empty line before a subsection or a comment, unless it is a subsection/comment following directly another subsection, or it is at the start of the file
  • combine content of subsections that are duplicated
  • if values are duplicated, use the last one in the file (as ASPECT does when it reads the file)
  • retain as much user formatting in comments and parameter values as possible, e.g. spaces for padding values to align between adjacent lines. This is not always perfectly possible, I will have to adjust a few files in the benchmarks manually (e.g. if the padded comment only aligns properly with the values if it is not indented)
  • retain broken lines (\) in values of parameters and comments. remove them from subsection or parameter names. This means I had to manually exclude tests/prmbackslash*.prm from the update. I think it is ok to support breaking subsection or parameter names, but we should not encourage it, it makes the files hard to read

The script currently has the following limitations I am aware of (not sure if these matter, they didnt seem to apply to any input files we have):

  • it cannot handle more than one include directive in one file (it will only use the last one) - this is handled correctly now

For now this PR contains two commits, the first introduces the scripts, the second applies it to all .prms in the tests/ folder. Once an initial review is done I will apply it to benchmarks/, cookbooks/, etc. (I have done this locally already and checked that there are no fundamental problems, but because some files require manual changes I want to make sure we agree on the principles before including them here).

@MFraters, @jdannberg we discussed this yesterday, can one of you take a look? Additional reviews are of course also appreciated.

Copy link
Member

@MFraters MFraters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is great and useful to have. It cleans up and unifies a lot of the prm files, so that is great, besides that it can be used to port prm files for future breaking changes.

I looked through the code and it looks clean and well documented to me. The only comment I have on that aspect is that the main update script documentation, which is the one a user would most likely read, could be expanded a bit to state what it actually does and some warnings.

As for the choices of the guidelines themselves, I think the are oke. I am not entirely sure how dealii handles multiple includes, but if they overwrite it as well, then there is no issue. If they allow multiple include statements (which looks like it is the case if I take a quick look at https://github.com/dealii/dealii/blob/b60d3b4302a2d98695f83c6e573333a1b6c58e92/source/base/parameter_handler.cc#L2025), then it might be something we could think about in the future. Since we are not using it anywhere, I don't think this script needs to account for that now.

if __name__ == '__main__':
parser = argparse.ArgumentParser(
prog='ASPECT .prm file reformatter',
description='Reformats ASPECT .prm files to follow our general formatting guidelines.')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to add the caveat of includes here and mention that double entries will be merged.

# -*- coding: utf-8 -*-

""" This script reformats the given .prm files to follow our
general formatting guidelines.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be good to add the caveat of includes here and mention that double entries will be merged. And maybe just list the features you mentioned in you opening comment of this pull request.

@gassmoeller
Copy link
Member Author

Thanks for the review

I am not entirely sure how dealii handles multiple includes, but if they overwrite it as well, then there is no issue.

Turns out some of our tests have multiple includes so I had to extend the script. It should handle that correctly now. I am still working on getting all tests to run and extending the documentation, but the script itself hasnt changed much. I will finish the documentation tomorrow and let you know when the PR is ready for another look.

@gassmoeller gassmoeller force-pushed the python_update_script branch 2 times, most recently from d38758a to 62333aa Compare September 7, 2023 19:36
@gassmoeller
Copy link
Member Author

Ok, I think I found the last problems and fixed them. I also updated the cookbooks and benchmarks prm files. So if we agree this is the right path forward I think this is ready for another review. One more thing for a follow up PR would be to integrated this script into our testers to maybe check new prm files for style, but we can discuss that separately.

Copy link
Member

@MFraters MFraters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you forgot to address my inline comments. Otherwise, this looks good and I think the approach is great. It looks like a lot of double information and extra newlines where removed :)

@gassmoeller
Copy link
Member Author

Oops you are right. I updated the documentation as requested. I only added the full explanation once at the beginning of the file to not duplicate it.

Copy link
Member

@MFraters MFraters left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the extra documentation. Unfortunately you will need to rebase, but I can merge it afterwards.

@gassmoeller
Copy link
Member Author

Thanks, rebased.

@MFraters MFraters merged commit 6713edf into geodynamics:main Sep 14, 2023
7 checks passed
@gassmoeller gassmoeller deleted the python_update_script branch September 15, 2023 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants