GitHub Issue NOAA-EMC/GSI#313. Store data files for global sources in tar files. #338

Merged: MichaelLueken merged 1 commit into NOAA-EMC:master on Mar 23, 2022

EdwardSafford-NOAA
Contributor

… Completes issue #313

The storage change is primarily meant to help on hera, where file count (not size) has been an ongoing problem for developers. Additional changes fix build issues on wcoss2.
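
For readers unfamiliar with the approach, here is a minimal sketch of how bundling many small data files into one tar archive reduces file count. The directory layout, file names, and archive name (`radmon_time.tar`) are hypothetical placeholders, not the names used by the RadMon scripts in this PR.

```shell
#!/usr/bin/env bash
# Sketch only: bundle many small data files into a single tar archive so the
# directory holds one file instead of many. All paths and names below are
# hypothetical and are not taken from the RadMon scripts.
set -euo pipefail

workdir=$(mktemp -d)
trap 'rm -rf "${workdir}"' EXIT
mkdir -p "${workdir}/data"

# Stand-ins for per-satellite, per-cycle data files (3 x 4 = 12 files).
for sat in sat_a sat_b sat_c; do
  for hr in 00 06 12 18; do
    echo "sample" > "${workdir}/data/time.${sat}.${hr}.ieee_d"
  done
done

files_before=$(find "${workdir}/data" -type f | wc -l | tr -d ' ')

# Archive the data files; remove the originals only if tar succeeded.
( cd "${workdir}/data" && tar -cf ../radmon_time.tar ./*.ieee_d && rm -f ./*.ieee_d )

files_after=$(find "${workdir}" -type f | wc -l | tr -d ' ')
members=$(tar -tf "${workdir}/radmon_time.tar" | wc -l | tr -d ' ')

echo "before=${files_before} after=${files_after} members=${members}"
```

The total data volume is essentially unchanged; only the number of files on disk drops, which is the quantity the comment above identifies as the problem on hera.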

@EdwardSafford-NOAA
Contributor Author

@DavidHuber-NOAA if you have time I'd welcome your review.

@EdwardSafford-NOAA
Contributor Author

@MichaelLueken-NOAA I'm guessing David is out on leave or otherwise unable to review this. I'm on leave next week. If you want to look this over and merge it I'll be around today to handle any problems you might spot. Otherwise you could wait until David can do a review and/or I'm back. Your call and I'm good either way.

@MichaelLueken
Contributor

@EdwardSafford-NOAA I'll go ahead and clone the work and make sure that everything compiles correctly. I'll then be able to merge it to the authoritative repository next Tuesday (after the current work that was submitted earlier this week has been merged). I'll let you know if anything jumps out.

@DavidHuber-NOAA
Collaborator

@EdwardSafford-NOAA @MichaelLueken-NOAA I am not officially back from leave, but I will take a quick look today.

Collaborator

@DavidHuber-NOAA DavidHuber-NOAA left a comment

Just one question, see below.

@@ -260,7 +307,10 @@ for sat in ${big_satlist}; do
(( ii=ii+1 ))
done

if [[ ! $MY_MACHINE = "jet" ]]; then
if [[ $MY_MACHINE = "hera" ]]; then
$SUB --account ${ACCOUNT} -n $ii -o ${logfile} -D . -J ${jobname} --time=4:00:00 \
Collaborator

Why is --mem=0 required? Wouldn't all of the memory for the node be made available anyway? Or is it different for MPMD tasks?

Contributor Author

Apparently there was an upgrade to the batch processing software on hera somewhere around March 1, and this process began running out of memory. You're right that all the memory for the node used to be available by default, but after the upgrade it apparently isn't. I didn't go too far into the weeds, but I learned that, counterintuitively, --mem=0 makes the maximum memory for the node available.
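
For context, the Slurm `sbatch` documentation treats a memory size of zero as a special case that grants the job access to all of the memory on each node. The sketch below shows where such a flag might sit in a submission command. It only prints the command rather than submitting it; the account, task count, log file, job name, and `job_script.sh` are hypothetical values, and the position of `--mem=0` is an assumption, since the diff line above is truncated.

```shell
#!/usr/bin/env bash
# Sketch only: assemble a Slurm submission that requests all node memory via
# --mem=0. Values below are hypothetical placeholders, and the placement of
# --mem=0 is assumed; the PR's actual command line is truncated in the diff.
set -euo pipefail

SUB=${SUB:-sbatch}
ACCOUNT=${ACCOUNT:-my_account}
ntasks=4
logfile=job.log
jobname=example_job

args=(
  --account "${ACCOUNT}"
  -n "${ntasks}"
  -o "${logfile}"
  -D .
  -J "${jobname}"
  --time=4:00:00
  --mem=0   # zero is a special case: all memory on each allocated node
)

# Print the command instead of running it, so the sketch works without Slurm.
submit_cmd="${SUB} ${args[*]} job_script.sh"
echo "${submit_cmd}"
```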

Collaborator

@DavidHuber-NOAA DavidHuber-NOAA left a comment

Okie dokie, thanks for the detailed answer.

@EdwardSafford-NOAA
Contributor Author

And thanks @DavidHuber-NOAA for taking a look at this on your day off. That was very kind of you.

@MichaelLueken MichaelLueken changed the title from "Github issue #313. Store data files for global sources in tar files.…" to "GitHub Issue NOAA-EMC/GSI#313. Store data files for global sources in tar files." on Mar 23, 2022

@MichaelLueken
Contributor

Since there are no source code changes to this update and @DavidHuber-NOAA has approved the changes, I will now give final approval to these changes and merge them to the authoritative repository.

@MichaelLueken MichaelLueken merged commit bf25a3f into NOAA-EMC:master Mar 23, 2022
AndrewEichmann-NOAA pushed a commit to AndrewEichmann-NOAA/GSI that referenced this pull request Jun 6, 2022
GitHub Issue NOAA-EMC#313.  Store data files for global sources in tar files.
Successfully merging this pull request may close these issues.

Revise RadMon storage strategy to reduce file count
3 participants