sort the history file by vertical dim by jedwards4b · Pull Request #2114 · ESCOMP/CTSM

jedwards4b · 2023-08-18T13:37:47Z

Description of changes

Sort the history file by vertical dimension and then variable name.

Specific notes

This presents a significant performance improvement especially notable on lustre file systems such as on derecho.

Contributors other than yourself, if any:

CTSM Issues Fixed (include github issue #):

Are answers expected to change (and if so in what way)?
NO
Any User Interface Changes (namelist or namelist defaults changes)?
NO
Testing performed, if any:
(List what testing you did to show your changes worked as expected)
(This can be manual testing or running of the different test suites)
(Documentation on system testing is here: https://github.com/ESCOMP/ctsm/wiki/System-Testing-Guide)
(aux_clm on cheyenne for intel/gnu and izumi for intel/gnu/nag/pgi is the standard for tags on master)

I ran two cases:
PFS_Ld5.ne30pg3_ne30pg3_mg17.FLTHIST_v0d.derecho_intel.beforeiochanges
PFS_Ld5.ne30pg3_ne30pg3_mg17.FLTHIST_v0d.derecho_intel.afteriochanges

with
hist_nhtfrq = -24
hist_mfilt = 1

Before:

hist_htapes_wrapup_define                                 1024   1024   6        0.8841      0.4071      16      0.9985      739    
hist_htapes_wrapup_write                                  1024   1024   6        0.5030      0.5029      0       0.5031      425    
hist_htapes_wrapup_tconst                                 1024   1024   6        0.1001      0.0601      641     0.1422      0

After


   hist_htapes_wrapup_write                                  1024   1024   6        0.5039      0.5038      0       0.5039      202    
   hist_htapes_wrapup_define                                 1024   1024   6        0.4677      0.4037      14      0.5223      739    
   hist_htapes_wrapup_tconst                                 1024   1024   6        0.0851      0.0460      513     0.1256      0

Note that this difference increases with increased resolution and/or number of mpi tasks.

- Merge the two partial sorts into one; this is partly for efficiency but also because the original was not guaranteed to leave the fields in alphabetical sort order within a given size of level dimension (e.g., in test SMS_D_Ld1_P8x1.f10_f10_mg37.I2000Clm50BgcCropQianRs.green_gnu.clm-default, the last three fields on the h0 file were TSOI_ICE, SOILN_vr, PCT_CFT before this change) - Remove the check for duplicate field names: with this refactor, I think this check will no longer be guaranteed to detect duplicate field names, and this duplicate detection was already done elsewhere so is unnecessary here - Add a comment describing the rationale for the sort order

billsacks · 2023-08-18T23:45:52Z

@jedwards4b thanks a lot for working on this and for opening this PR to speed up our history writes!

Based on discussion with @jedwards4b I'm going to rework the sort. I'm recording our conversation here:

From me:

I was wondering if it's important (for this performance reason) for fields with more levels to appear after fields with fewer levels - e.g., for PCT_CFT to appear last because the cft dimension is the biggest level dimension. Or was your use of num2d more a proxy for something else - e.g., the important thing is just for all 3-d fields to appear after 2-d fields, or the important thing is to group fields with the same level dimension together. The reason I ask is because I'm finding the new sorted order a bit unintuitive and I'm wondering if there's some flexibility to change it, or if this sort order is important.

From Jim:

the important thing is to group fields with the same level dimension and put time dependent fields after non time dependent together.

From me:

Okay, thanks a lot. So if I understand right, I think I should change the sort so instead of being based on num2d, it is based on type2d (the name of the level dimension): currently, for example, fields with level dimension of levgrnd are interleaved with those with level dimension of levdcmp because those both happen to be 25; do you agree with that change? Also, at least in the current ncdump sort order, I see a few time-varying fields (time_bounds, date_written, time_written) appear before some time-constant fields (area, landfrac, etc.); is that a problem?

From Jim:

Sure I agree with your suggestion and it may improve things a bit more to rearrange those time-varying variables.

For now I'm going to change the sort order so that fields with the same level dimension are grouped together. So fields will be sorted first by level dimension (with all non-leveled fields appearing first) then alphabetically by field name.

I looked a bit into reordering the fields to make all of the time-varying fields appear after time-constant fields, but it looked like this would take more rework than I was up for right now, so I'm going to skip that piece. The time-varying fields that appear before time-constant fields are written in htape_timeconst... it's weird to have time-varying fields written in a subroutine with name "timeconst" and moving those writes to their own subroutine - called after htape_timeconst - would probably clean up the logic as well as having some performance improvement as Jim mentioned.

This prevents the interleaving of fields that have the same size of their level dimension despite having different level dimensions; this is a more intuitive ordering.

billsacks

With the changes I just pushed, I approve this PR... I'll integrate it as soon as Sam Levis's tag is done.

billsacks · 2023-08-21T20:33:51Z

I have run ERP_D_P36x2_Ld3.f10_f10_mg37.I1850Clm50BgcCrop.cheyenne_intel.clm-default. It passes, including baseline comparisons. I'll do final testing once #2106 is merged.

sort the history file by vertical dim

0006cf2

jedwards4b mentioned this pull request Aug 18, 2023

history write optimization #2115

Merged

ekluzek assigned billsacks and ekluzek Aug 19, 2023

ekluzek added priority: high High priority to fix/merge soon, e.g., because it is a problem in important configurations enhancement new capability or improved behavior of existing capability labels Aug 19, 2023

Sort history fields by name of level dimension rather than size

3c828c3

This prevents the interleaving of fields that have the same size of their level dimension despite having different level dimensions; this is a more intuitive ordering.

billsacks approved these changes Aug 21, 2023

View reviewed changes

Merge branch 'master' into jedwards/history_write_performance

8bfed6e

billsacks merged commit 8bfed6e into ESCOMP:master Aug 22, 2023

jedwards4b deleted the jedwards/history_write_performance branch September 7, 2023 16:15

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sort the history file by vertical dim#2114

sort the history file by vertical dim#2114
billsacks merged 4 commits intoESCOMP:masterfrom
jedwards4b:jedwards/history_write_performance

jedwards4b commented Aug 18, 2023

Uh oh!

billsacks commented Aug 18, 2023

Uh oh!

billsacks left a comment

Uh oh!

billsacks commented Aug 21, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jedwards4b commented Aug 18, 2023

Description of changes

Specific notes

Uh oh!

billsacks commented Aug 18, 2023

Uh oh!

billsacks left a comment

Choose a reason for hiding this comment

Uh oh!

billsacks commented Aug 21, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants