add SCR_FETCH_BYPASS config param #533

Merged
merged 2 commits into from
Feb 15, 2023

adammoody
Contributor

@adammoody adammoody commented Feb 15, 2023

This restores the ability to completely disable fetch by setting SCR_FETCH=0. When bypass support was added, SCR_FETCH=0 was instead tied to fetch bypass: the fetch stayed active but switched to "fetch bypass" mode. Fetch bypass lets an application read files directly from the prefix directory on a restart, rather than having SCR first copy those files to cache.

However, a problem arises when restarting an application with a different number of ranks. Even in fetch bypass mode, SCR executes a number of checks, including whether the number of ranks listed in the checkpoint matches the number of ranks in the current run. If there is a mismatch, SCR prints a warning and moves on to the next checkpoint. This means SCR prints a series of warning messages even for apps that intend to restart with a different number of ranks.

To solve this, fetch can once again be fully disabled by setting SCR_FETCH=0. Apps that need to restart with a different number of ranks can disable fetch to skip those SCR checks and silence the warnings.
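
A minimal sketch of that setting in a job script, assuming the application locates and reads its own checkpoint files once fetch is off (that division of work is an assumption, not something this PR spells out):

```shell
# Disable SCR fetch entirely so the rank-count checks, and the warnings
# they produce on a mismatch, are skipped on restart.
export SCR_FETCH=0
```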

A new SCR_FETCH_BYPASS param has been added for those who want to use fetch in bypass mode. An app can now rely on SCR to identify the most recent checkpoint while telling SCR not to copy those files to cache during restart. Instead, SCR_Route_file points to the files directly in the prefix directory:

export SCR_FETCH=1
export SCR_FETCH_BYPASS=1
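
The settings described in this PR can be collected into one job-script helper. The sketch below is illustrative only: `scr_set_fetch_mode` and its mode names (`off`, `bypass`, `cache`) are hypothetical and not part of SCR, and it assumes that SCR_FETCH_BYPASS=0 selects the ordinary copy-to-cache fetch.

```shell
# Hypothetical helper (not part of SCR): choose a restart mode by name.
scr_set_fetch_mode() {
  case "$1" in
    # Fetch fully disabled: SCR skips its restart checks, including the rank-count check.
    off)    export SCR_FETCH=0 ;;
    # SCR identifies the most recent checkpoint; files are read from the prefix directory.
    bypass) export SCR_FETCH=1 SCR_FETCH_BYPASS=1 ;;
    # Assumed: SCR copies checkpoint files to cache before the app reads them.
    cache)  export SCR_FETCH=1 SCR_FETCH_BYPASS=0 ;;
    *)      echo "unknown fetch mode: $1" >&2; return 1 ;;
  esac
}

# Example: restart in bypass mode.
scr_set_fetch_mode bypass
```

Calling `scr_set_fetch_mode off` before launching would instead suit a restart that changes the number of ranks.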

@adammoody adammoody force-pushed the fetch_enable branch 2 times, most recently from e2dfd3b to 8a3cfbe Compare February 15, 2023 19:24
@adammoody adammoody merged commit 0a9c281 into develop Feb 15, 2023
@adammoody adammoody deleted the fetch_enable branch February 15, 2023 19:54