sort the history file by vertical dim #2114
- Merge the two partial sorts into one; this is partly for efficiency but also because the original was not guaranteed to leave the fields in alphabetical sort order within a given size of level dimension (e.g., in test SMS_D_Ld1_P8x1.f10_f10_mg37.I2000Clm50BgcCropQianRs.green_gnu.clm-default, the last three fields on the h0 file were TSOI_ICE, SOILN_vr, PCT_CFT before this change)
- Remove the check for duplicate field names: with this refactor, I think this check will no longer be guaranteed to detect duplicate field names, and this duplicate detection was already done elsewhere so is unnecessary here
- Add a comment describing the rationale for the sort order
@jedwards4b thanks a lot for working on this and for opening this PR to speed up our history writes! Based on discussion with @jedwards4b I'm going to rework the sort. I'm recording our conversation here: From me:
From Jim:
From me:
From Jim:
For now I'm going to change the sort order so that fields with the same level dimension are grouped together. So fields will be sorted first by level dimension (with all non-leveled fields appearing first), then alphabetically by field name. I looked a bit into reordering the fields so that all of the time-varying fields appear after the time-constant fields, but it looked like this would take more rework than I was up for right now, so I'm going to skip that piece. The time-varying fields that appear before time-constant fields are written in htape_timeconst. It's weird to have time-varying fields written in a subroutine with "timeconst" in its name; moving those writes to their own subroutine, called after htape_timeconst, would probably clean up the logic as well as give some performance improvement, as Jim mentioned.
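As a rough illustration of the ordering described in this comment, here is a minimal Python sketch (not the Fortran code in this PR). The `Field` type, the `sort_key` helper, and the field-to-dimension assignments are hypothetical and only for illustration; ordering the level dimensions among themselves by name is also an assumption, since the comment only says that fields with the same level dimension are grouped together.

```python
from typing import NamedTuple, Optional


class Field(NamedTuple):
    name: str
    level_dim: Optional[str]  # None for a field with no level dimension


def sort_key(field: Field):
    # False sorts before True, so non-leveled fields come first.
    # Leveled fields are then grouped by level dimension (by name here,
    # which is an assumption) and ordered alphabetically by field name.
    has_level = field.level_dim is not None
    return (has_level, field.level_dim or "", field.name)


# Illustrative fields only; the dimension assignments are assumptions.
fields = [
    Field("TSOI", "levgrnd"),
    Field("RAIN", None),
    Field("H2OSOI", "levsoi"),
    Field("ALBD", "numrad"),
    Field("FSDS", None),
    Field("SOILICE", "levsoi"),
]

ordered = sorted(fields, key=sort_key)
names = [f.name for f in ordered]
print(names)
```

With a single composite key, one sort pass is enough, and the alphabetical order within each level-dimension group follows directly from the key.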
This prevents the interleaving of fields that have the same size of their level dimension despite having different level dimensions; this is a more intuitive ordering.
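The interleaving mentioned in this commit message can be shown with a small Python sketch. Everything here is hypothetical (field names, dimension names, sizes); it only contrasts a key based on the size of the level dimension with a key that also includes the dimension itself.

```python
# (field name, level dimension name, level dimension size); all hypothetical.
fields = [
    ("B_FIELD", "dim_x", 10),
    ("A_FIELD", "dim_y", 10),
    ("D_FIELD", "dim_x", 10),
    ("C_FIELD", "dim_y", 10),
]

# Keyed on size only: two different dimensions of equal size interleave.
by_size = sorted(fields, key=lambda f: (f[2], f[0]))

# Keyed on size and dimension name: each dimension's fields stay together.
by_dim = sorted(fields, key=lambda f: (f[2], f[1], f[0]))

dims_by_size = [f[1] for f in by_size]
dims_by_dim = [f[1] for f in by_dim]
print(dims_by_size)
print(dims_by_dim)
```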
billsacks
left a comment
With the changes I just pushed, I approve this PR... I'll integrate it as soon as Sam Levis's tag is done.
I have run ERP_D_P36x2_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.cheyenne_intel.clm-default. It passes, including baseline comparisons. I'll do final testing once #2106 is merged.
Description of changes
Sort the history file by vertical dimension and then variable name.
Specific notes
This gives a significant performance improvement, especially notable on Lustre file systems such as the one on Derecho.
Contributors other than yourself, if any:
CTSM Issues Fixed (include github issue #):
Are answers expected to change (and if so in what way)?
NO
Any User Interface Changes (namelist or namelist defaults changes)?
NO
Testing performed, if any:
(List what testing you did to show your changes worked as expected)
(This can be manual testing or running of the different test suites)
(Documentation on system testing is here: https://github.com/ESCOMP/ctsm/wiki/System-Testing-Guide)
(aux_clm on cheyenne for intel/gnu and izumi for intel/gnu/nag/pgi is the standard for tags on master)
I ran two cases:
PFS_Ld5.ne30pg3_ne30pg3_mg17.FLTHIST_v0d.derecho_intel.beforeiochanges
PFS_Ld5.ne30pg3_ne30pg3_mg17.FLTHIST_v0d.derecho_intel.afteriochanges
with
hist_nhtfrq = -24
hist_mfilt = 1
Before:
After:
Note that this difference increases with increased resolution and/or number of MPI tasks.