Id fix for 32 bit #51

adcroft · 2017-06-19T00:39:59Z

Replaces the 32-bit iceberg_num with:
- a 64-bit id within the code, and
- two 32-bit id-like variables in the restart files.
Reads (old) restart files with 32-bit iceberg_num and generates new IDs for bergs.
Always writes the new restart file.
Trajectory files no long have the "marker" entries but instead all records have the 64-bit equivalent id.
This should not change answers but just changes the book keeping internally and for i/o.
I removed the "read_orig" ability since maintaining compatibility with 3 year old formats is impossible when I don't have any such files around to test with.

I've assigned this to @nikizadehgfdl to handle - we need confirmation this fixes the issue in Bill Hurlin's run. Once (assuming) it has passed your tests, can you complete the merge by clicking green buttons? I believe that Bill then just needs to use the merged version and his run should magically continue from where it stopped.

@sternalon Can you check that the bond restarts, and indeed the bonds, still work? I don't have a test case with bonds active. Note that the restart file format is different in order to accommodate the 64-bit id. I did not code up a backward compatibility for bond restarts because I couldn't debug it. If we need backward compatibility, it will not be hard to follow how it works in read_restart_bergs() - there's about 5 lines that handle it. There are a lot of changes in all but made in small commits so should not be to hard to follow the code evolution.

- Two new functions to cobine two 32-bit integers into a single 64-bit integer and back again. - When we wrote the iceberg_num (iceberg id code) we assumed integers were 64-bit but by default they are 32-bit with the compilation options used with FMS codes, and netcdf only supports 32-bit integers! - Also renamed unitTests() to unit_tests().

- A new member %id is a 64-bit integer for a better constructed id. - New function, generate_id(), returns a 64-bit integer hash of counter and position.

- This calls the new generate_id() function and stores the 64-bit %id in the iceberg everywhere we first create %iceberg_num.

- When the restart file exists but has zero length data in then the model was correctly NOT allocating but was then proceeding to try and read into the non-allocated array. This avoids that condition. - I thought this was fixed before but apparently not.

- There were unnecessary use and public lines for the functions pack_berg_into_buffer2() and unpack_berg_from_buffer2().

- Previously the pack_berg_into_buffer2() and unpack_berg_from_buffer2() routines would place particular members in the berg type into a particular position of a 2D array. This makes adding variables or re-ordering variables cumbersome since fixed integer locations have to be made consistent. - Now we treat a row of the 2d array as a fifo stack so that only the order matters in the code, and the absolute position does not need to appear in the code. The order of packing and unpacking must still be consistent.

- In order to switch over to using the 64-bit IDs I needed to be able calculate new 64-bit IDs from old 32-bit IDs which involved splitting the generate_id() function into more modular parts.

- As for pack_berg_into_buffer2(), the trajectory packing and unpacking now uses the FIFO stack-style approach.

- %id is set everywhere that %iceberg_num is set. - When resading restarts we convert the 32-bit id to a 64-bit id. - New bergs are assigned both a 32-bit and 64-bit id.

- To reduce amount of code that needs to edited when deprecating $iceberg_num I've uniformly updated every print statement that uses %iceberg_num.

- gnu was casting a constant integer to the kind of the result, while intel and pgi compilers are making all constants integer*4. This commit forces the expression to be evaluated in integer*8 terms.

- Everywhere tests are made now uses %id instead of $iceberg_num, except for the restart files.

- Removed code used to read legacy restarts. - Added FATAL error message if "read_orig" is set in the namelist.

- Add id (as id_cnt,id_ij) to restart files (not yet used when reading).

- I can not figure out why we would initialize the counter to 1 and then increment after generating a berg. Starting at 0 and incrementing before generating a berg means the counter is now a counter of how many bergs were calved in that cell.

- When reading the 32-bit iceberg_num from restarts we will need to recover the starting i,j for the berg. - Also removed comments abotu moving read_restart_calving().

- We now read id_cnt and id_ij to create an id, unless iceberg_num is present. - No longer write iceberg_num (ever).

- iceberg_num now only appears when reading an old restart with the 32-bit reperesentation of iceberg_num.

nikizadehgfdl · 2017-06-20T15:03:49Z

I reproduced the bug (by providing year 225 restart of Bill's coupled model run as initCond) and verified that
this commit fixes the problem.

My regression tests also passed, except for one, the iceberg distributed restart does not work.
Given the urgency for this fix I am recommending it to be pulled in and I filed issue #52 for the distributed restarts (which is not actively used yet).

…arts - When a restart file is empty (zero record length) we were skipping the read step to avoid segmentation violations when passing unallocated arrays to mpp. However, an mpp_sync inside mpp_io then caused some PEs to hang. - In order to call mpp_io on all PEs we are now allocating zero-sized arrays which happily allows the everything to work inside mpp. - Closes NOAA-GFDL#52 - Also closes NOAA-GFDL#48 which should have been closed with PRs NOAA-GFDL#50 and NOAA-GFDL#51 for which this commit is a further fix.

adcroft added 19 commits June 14, 2017 23:21

Added function generate_id()

33e090d

- A new member %id is a 64-bit integer for a better constructed id. - New function, generate_id(), returns a 64-bit integer hash of counter and position.

Set berg%id everywhere %iceberg_num is set

e7fc83f

- This calls the new generate_id() function and stores the 64-bit %id in the iceberg everywhere we first create %iceberg_num.

Merge branch 'fix-empty-restart-files' into id-fix-for-32-bit

2ec7bd0

Removed use/public lines for pack_berg_into_buffer2

85dfb93

- There were unnecessary use and public lines for the functions pack_berg_into_buffer2() and unpack_berg_from_buffer2().

Modularized generate_id(), added convert_old_id()

905f4f2

- In order to switch over to using the 64-bit IDs I needed to be able calculate new 64-bit IDs from old 32-bit IDs which involved splitting the generate_id() function into more modular parts.

Use push/pull_buffer_value() in pack_traj_into_buffer2()

a5d3576

- As for pack_berg_into_buffer2(), the trajectory packing and unpacking now uses the FIFO stack-style approach.

Set %id everywhere %iceberg_num is set

bbde6c2

- %id is set everywhere that %iceberg_num is set. - When resading restarts we convert the 32-bit id to a 64-bit id. - New bergs are assigned both a 32-bit and 64-bit id.

Use %id in all print statements

1efcdaf

- To reduce amount of code that needs to edited when deprecating $iceberg_num I've uniformly updated every print statement that uses %iceberg_num.

Fix id_from_2_ints() for Intel and PGI

04e827b

- gnu was casting a constant integer to the kind of the result, while intel and pgi compilers are making all constants integer*4. This commit forces the expression to be evaluated in integer*8 terms.

Use %id instead of $iceberg_num

32eba0a

- Everywhere tests are made now uses %id instead of $iceberg_num, except for the restart files.

Removed read_restart_bergs_orig()

22bb2a0

- Removed code used to read legacy restarts. - Added FATAL error message if "read_orig" is set in the namelist.

Add id to write_restart()

df4e43a

- Add id (as id_cnt,id_ij) to restart files (not yet used when reading).

Split convert_old_id() in cij_from_old_id()

08ea446

- When reading the 32-bit iceberg_num from restarts we will need to recover the starting i,j for the berg. - Also removed comments abotu moving read_restart_calving().

Read id from restarts or generate ids if iceberg_num is present

be47643

- We now read id_cnt and id_ij to create an id, unless iceberg_num is present. - No longer write iceberg_num (ever).

Removed last vestiges of %iceberg_num

f7988fe

- iceberg_num now only appears when reading an old restart with the 32-bit reperesentation of iceberg_num.

adcroft added the bug label Jun 19, 2017

adcroft assigned nikizadehgfdl Jun 19, 2017

nikizadehgfdl mentioned this pull request Jun 20, 2017

icebergs cannot restart from distributed restart files #52

Closed

adcroft merged commit a527054 into NOAA-GFDL:dev/gfdl Jun 20, 2017

adcroft mentioned this pull request Jun 20, 2017

Fixes hang when mpp_io is not called on some PEs with empty berg restarts #53

Merged

adcroft deleted the id-fix-for-32-bit branch July 15, 2020 02:43

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Id fix for 32 bit #51

Id fix for 32 bit #51

adcroft commented Jun 19, 2017

nikizadehgfdl commented Jun 20, 2017

Id fix for 32 bit #51

Id fix for 32 bit #51

Conversation

adcroft commented Jun 19, 2017

nikizadehgfdl commented Jun 20, 2017