Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge feature/capgen into main 2021/08/12 #391

Merged
merged 172 commits into from
Aug 18, 2021

Conversation

climbfuji
Copy link
Collaborator

@climbfuji climbfuji commented Aug 12, 2021

Update main with the latest development in feature/capgen.

Where incompatibilities exist between capgen and ccpp_prebuild, a workaround was added on the ccpp_prebuild side. Exception: in parse_checkers.py, a temporary workaround is added to internally convert unit none to 1 to match the ccpp-physics/UFS/SCM units for dimensionless variables (none) with what capgen uses (1). A discussion has been started to eliminate those inconsistencies in the actual metadata (see ESCOMP/CCPPStandardNames#18).

Update 08/18/2021: one additional routine had to be added to parse_checkers.py that I forgot to mention previously: registered_fortran_ddt_names - this routine is required so that ccpp_prebuild.py can get access to all the registered Fortran DDTs, because it utilizes capgen's metadata parser on a file-by-file basis.

Associated PRs:

#391
earth-system-radiation/rte-rrtmgp#133
NCAR/ccpp-physics#718
NOAA-EMC/fv3atm#367
ufs-community/ufs-weather-model#745

For regression testing, see ufs-community/ufs-weather-model#745.

@climbfuji
Copy link
Collaborator Author

A careful review of this PR is probably beyond my expertise right now since I'm not very familiar with capgen. To the extent that ccpp_prebuild.py calls capgen's parsers, as long as it continues to work, I don't have a problem approving the PR though. Looking at the associated fv3atm PR, it doesn't look there are a bunch of changes to understand there, perhaps with the exception of adding ccpp_types.F90 to VARIABLE_DEFINITION_FILES in the config file. I'm assuming this is now necessary due to updates in the Fortran parser or something? Also, if the order of files in that list should have DDT definitions first, why did it not error out before this PR? Is this a new requirement (I didn't think so -- I thought this is mentioned in the docs)?

Thanks for these comments. A few answers to your questions:

  1. The order of DDTs didn't matter in the past, because we were collecting data from the capgen.py metadata parser, which didn't check if a DDT was defined when encountering a reference to it. The consistency check happened later in ccpp_prebuild.py, after all data was collected and merged. Now, capgen.py's metadata parser does check if the DDTs are known and therefore enforces the requirement of having the correct order of the variable definition files.
  2. ccpp_prebuild.py had the ccpp_t type (in ccpp_types.F90, ccpp_types.meta) hardcoded into the scripts instead of reading it from the files. Because of 1., this data must now be read by the capgen.py metadata parser.

In summary, these PRs correct several inconsistencies i the metadata w.r.t. derived data types that ccpp_prebuild.py simply didn't care about, but capgen.py does (for good reasons). This is not only a good thing to do, but also necessary for switching to capgen.py later.

Copy link
Collaborator

@gold2718 gold2718 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I do not understand the changes to parse_checkers.py. The command:
git diff NCAR/feature/capgen dom/merge_feature_capgen_into_main_20210812 -- scripts/parse_tools/parse_checkers.py

shows more changes than the unit conversion from none to 1, however, these changes are not showing up in the diffs in this PR. Please explain.
Note: NCAR stands for github.com:/NCAR and dom stands for github.com/climbfuji (git command assumes that both remotes are loaded in a sandbox).

  • convert_metadata.py is in feature/capgen but not in this PR. Did you decide to remove it?
  • metavar.py does not match feature/capgen. Why?

language: python

python:
- "2.7"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When do we get to drop Python 2 support?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't use Python 2 anymore. I think this can be upgraded safely to Python 3 in a follow-up PR.

# end if
self.__table_type = value
elif key == 'dependencies':
if value.lower() == "none" or value.strip() == '':
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is a blank dependencies line an error here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

capgen's metadata parser was creating a dependencies list that contained an item [ ' ' ] - this produced an error.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will not be happening once you merge #395 (and should not have been happening with the change you made).

emsg += "already been declared as None"
self.__pobj.add_syntax_err(emsg)
else:
depends = [x.strip() for x in value.split(',') if x.strip()]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change should be made in feature/capgen first, then merged into prebuild. Doing it this way will cause issues later on.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a PR to feature/capgen that I can pull in that is identical to this change - and nothing else - then I can pull it in, fix the merge conflict, assert that nothing has changed in the codebase, and don't have to rerun the tests. I don't see however why this change is causing a problem. It can be merged back from main to feature/capgen without issues.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #395

Copy link
Collaborator

@gold2718 gold2718 Aug 18, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see however why this change is causing a problem.

For one thing, it took me 20 times longer to process this PR due to this change and undocumented changes to the capgen code.

# end if
if self.header_type == "ddt":
known_ddts.append(self.title)
# end if
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be code here which is missing. What happened?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure, there were several merge conflicts that I had to resolve. What is missing here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do a diff against the feature/capgen version. After you merge #395, there should be no difference.
For this PR to be okay, there needs to be only one change between capgen code in feature/capgen and the code in this PR. Currently, that is not true.

>>> VariableProperty('local_name', str, check_fn_in=check_local_name).valid_value('foo(bar)')
'foo(bar)'
>>> VariableProperty('local_name', str, check_fn_in=check_local_name).valid_value('q(:,:,index_of_water_vapor_specific_humidity)')
'q(:,:,index_of_water_vapor_specific_humidity)'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variables below have been removed in feature/capgen. Why are they back?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

git seems to have added this code back in apparently when I did the merge

@climbfuji
Copy link
Collaborator Author

  • I do not understand the changes to parse_checkers.py. The command:
git diff NCAR/feature/capgen dom/merge_feature_capgen_into_main_20210812 -- scripts/parse_tools/parse_checkers.py

shows more changes than the unit conversion from none to 1, however, these changes are not showing up in the diffs in this PR. Please explain.
Note: NCAR stands for github.com:/NCAR and dom stands for github.com/climbfuji (git command assumes that both remotes are loaded in a sandbox).

  • convert_metadata.py is in feature/capgen but not in this PR. Did you decide to remove it?
  • metavar.py does not match feature/capgen. Why?
  • convert_metadata.py is no longer needed, there is no model that uses the old metadata standard
  • I will need to look at the diffs of metavar.py when I get back to my work laptop to see why they don't match

climbfuji and others added 3 commits August 18, 2021 06:43
Fix parsing of dependencies
… into merge_feature_capgen_into_main_20210812
…ckers.py based on code review: remove old code in main, add missing code from feature/capgen
@climbfuji
Copy link
Collaborator Author

@gold2718 Thank you for your thorough code review. I would appreciate if you could keep these conversation at a professional level and refrain from comments like #391 (comment).

Several of the changes that you complained about were made in main (or master at that time) in agreement and with your approval. They are not the result of a sloppy merge.

  1. Why is metavar.py different? Because feature/capgen still does not support the active attribute. In PR Update master from gsd/develop 2020/06/30 #307, the capability to process the active attribute was added to main's metavar.py, and it was review and approved. This needs to stay to keep the system working.
  2. Similar, in metavar.py, the update to true and false attributes were made in PR Migrate to new metadata format step 1 #205, which was reviewed and approved. This needs to stay to keep the system working (ccpp_prebuild uses it).
  3. metadata_table.py: the changes to the documentation (example metadata table, lines 61-96) were made in PRs Add dependencies to metadata #317 and Migrate to new metadata format step 1 #205, which were reviewed and approved. They are still valid and should stay.

I made the following corrections based on your code review in commit 53aa9dc:

  1. metadata_table.py: The change in line 377 is no longer required. I think this was needed to get past a parsing issue before line 392 was fixed in my PR (and then in your PR Fix parsing of dependencies #395, which I merged into my PR to clean up the commit history).
  2. metadata_table.py: The additional lines 736-741 are not required. They were a leftover of the original "dependencies" implementation in the main branch, which is replaced with the feature/capgen implementation as part of this PR. I removed them
  3. metadata_table.py: I am not sure why the lines if self.module is None didn't get merged into my branch from feature/capgen, but I added them back (now lines 778-781).
  4. parse_checkers.py: I forgot to mention in the PR description that I needed to add one additional routine to parse_checkers.py: registered_fortran_ddt_names. This routine is required so that ccpp_prebuild.py can get access to all the registered Fortran DDTs, because it utilizes capgen's metadata parser on a file-by-file basis. I updated the PR description accordingly.
  5. parse_checkers.py: reinstantiated the missing comment line 10 "# CCPP framework imports" and removed LITERAL_INT (line 206-ish).

The fact that we are seeing more changes that slowed down your code review is not only because of the oversights listed in 4-8, but also because feature/capgen has not been updated from main (master) for years. I am unsure why this didn't happen. In all other projects I have been working, the maintainers of feature branches are usually keeping their branches up to date with the authoritative code to avoid exactly these kind of problems, especially when they are long-term developments like feature/capgen. But maybe there was a misunderstanding and we (DTC) should have created those PRs instead of you. You would still have had to review and accept them, though.

After fixing the few merge problems described above (items 4, 5, 6, 8) and updating the PR description (item 7), the current PR must be merged as-is to keep the system capgen-metadata-parser + ccpp_prebuild working. We do not have the time and I do not see why I should be making the extra efforts to make these changes separately in feature/capgen first at this point, then merge them into my PR and fix the merge conflicts. It is natural in distributed development that changes are made in the same files in different branches, but as I said above they should have been merged back to feature/capgen immediately afterwards instead of years later.

To avoid such problems in the future, I will create a follow-up PR to bring the changes/differences described above back to feature/capgen, and I strongly encourage that we do so on a regular basis going forward.

Fortunately, after making all of the abovee changes, the auto-generated code is 100% identical to before, which means we do not have to rerun the regression tests on all nine platforms (we will rerun them on just one with two different compilers).

@gold2718
Copy link
Collaborator

Why is metavar.py different? Because feature/capgen still does not support the active attribute. In PR Update master from gsd/develop 2020/06/30 #307, the capability to process the active attribute was added to main's metavar.py, and it was review and approved. This needs to stay to keep the system working.

Then, as part of the capgen unification project, this change should be made in the feature/capgen branch. Continually updating the main branch without contributing to the feature/capgen branch is slowing down the unification effort.

@gold2718
Copy link
Collaborator

Similar, in metavar.py, the update to true and false attributes were made in PR Migrate to new metadata format step 1 #205, which was reviewed and approved. This needs to stay to keep the system working (ccpp_prebuild uses it).

If prebuild is using a feature not used by feature/capgen, that code should be moved to the prebuild codebase.

@gold2718
Copy link
Collaborator

metadata_table.py: the changes to the documentation (example metadata table, lines 61-96) were made in PRs Add dependencies to metadata #317 and Migrate to new metadata format step 1 #205, which were reviewed and approved. They are still valid and should stay.

Please move these changes to feature/capgen, then merge from there.

@climbfuji
Copy link
Collaborator Author

Similar, in metavar.py, the update to true and false attributes were made in PR Migrate to new metadata format step 1 #205, which was reviewed and approved. This needs to stay to keep the system working (ccpp_prebuild uses it).

If prebuild is using a feature not used by feature/capgen, that code should be moved to the prebuild codebase.

No. We discussed this in the ccpp-framework meeting at that time, it was decided to generalize the false/true logic and I followed your recommendations at that time to do. This is an improvement for capgen that we agreed upon, which prebuild has adopted since then. I am not going to revert or move it.

@gold2718
Copy link
Collaborator

The fact that we are seeing more changes that slowed down your code review is not only because of the oversights listed in 4-8, but also because feature/capgen has not been updated from main (master) for years. I am unsure why this didn't happen. In all other projects I have been working, the maintainers of feature branches are usually keeping their branches up to date with the authoritative code to avoid exactly these kind of problems, especially when they are long-term developments like feature/capgen.

I agree, when are you going to do this? I argue that these changes should have been made first in feature/capgen in order to make progress towards capgen unification.

@climbfuji
Copy link
Collaborator Author

Why is metavar.py different? Because feature/capgen still does not support the active attribute. In PR Update master from gsd/develop 2020/06/30 #307, the capability to process the active attribute was added to main's metavar.py, and it was review and approved. This needs to stay to keep the system working.

Then, as part of the capgen unification project, this change should be made in the feature/capgen branch. Continually updating the main branch without contributing to the feature/capgen branch is slowing down the unification effort.

I will create a PR with the active attribute changes, the true/false (valid vals) changes, and the documentation changes for feature/capgen, merge it, then formally update this PR (which must result in no changes to the code), then this should be ok.

@climbfuji
Copy link
Collaborator Author

Why is metavar.py different? Because feature/capgen still does not support the active attribute. In PR Update master from gsd/develop 2020/06/30 #307, the capability to process the active attribute was added to main's metavar.py, and it was review and approved. This needs to stay to keep the system working.

Then, as part of the capgen unification project, this change should be made in the feature/capgen branch. Continually updating the main branch without contributing to the feature/capgen branch is slowing down the unification effort.

I will create a PR with the active attribute changes, the true/false (valid vals) changes, and the documentation changes for feature/capgen, merge it, then formally update this PR (which must result in no changes to the code), then this should be ok.

The PR is here: #396 - please merge and I will pull it into this PR.

@climbfuji
Copy link
Collaborator Author

The fact that we are seeing more changes that slowed down your code review is not only because of the oversights listed in 4-8, but also because feature/capgen has not been updated from main (master) for years. I am unsure why this didn't happen. In all other projects I have been working, the maintainers of feature branches are usually keeping their branches up to date with the authoritative code to avoid exactly these kind of problems, especially when they are long-term developments like feature/capgen.

I agree, when are you going to do this? I argue that these changes should have been made first in feature/capgen in order to make progress towards capgen unification.

Immediately after #391 is merged.

climbfuji and others added 2 commits August 18, 2021 10:07
…e_metavar_doc_metadata_table

Add active attribute and update true/false logic in metavar.py, update example metadata table in metadata_table.py
… into merge_feature_capgen_into_main_20210812
@climbfuji
Copy link
Collaborator Author

@gold2718 after merging #396 into feature/capgen and merging feature/capgen into this branch again, the files metadata_table.py and metavar.py are identical between the head of feature/capgen and this branch/PR. I am going tor re-request your review now.

Copy link
Collaborator

@gold2718 gold2718 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good now, thanks!

@climbfuji climbfuji merged commit 922fe44 into NCAR:main Aug 18, 2021
@climbfuji climbfuji deleted the merge_feature_capgen_into_main_20210812 branch June 27, 2022 03:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants