client: increase finish-file timeout #3019

Merged (1 commit) on Mar 30, 2019

@davidpanderson commented on Feb 12, 2019

When an app finishes, it writes a "finish file",
which assures the client that the app really finished.

If the app process still exists N seconds after the finish file appears,
the client assumes that something went wrong and aborts the job.

Previously N was 10.
This was too small during periods of heavy paging.
I increased it to 300.
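
For illustration, the check described above could be sketched roughly as follows. This is not the client's actual code: it is written in Python for brevity, and the names `FINISH_FILE_TIMEOUT` and `finish_file_timed_out` are hypothetical. Whether the real comparison is strict or inclusive is also an assumption.

```python
import time

# Hypothetical constant name; the PR raises the value from 10 to 300 seconds.
FINISH_FILE_TIMEOUT = 300.0


def finish_file_timed_out(finish_file_seen_at, process_alive,
                          now=None, timeout=FINISH_FILE_TIMEOUT):
    """Return True if the app wrote its finish file but is still running
    more than `timeout` seconds later.

    finish_file_seen_at: time the finish file was first seen, or None.
    process_alive: whether the app process still exists.
    """
    if now is None:
        now = time.monotonic()
    # No finish file yet, or the process already exited: nothing to time out.
    if finish_file_seen_at is None or not process_alive:
        return False
    return now - finish_file_seen_at > timeout
```

With the old value of 10, a process that needed 11 seconds to exit under heavy paging would be aborted; with 300 it would not.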

It has been pointed out that if the app creates the finish file,
and its output files are present,
it should be treated as successful regardless of whether it exits.
This is probably true, but right now we don't have a mechanism
for killing a job and marking it as success.
The longer timeout makes this moot.
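
To make the difference between the two policies concrete, here is a simplified sketch in Python. Everything in it is hypothetical: the function name, the `lenient` flag, and the outcome strings are invented for illustration, and the lenient branch describes behavior the client does not currently have.

```python
import os


def outcome_after_finish_file(process_alive, timed_out, output_paths,
                              lenient=False, exists=os.path.exists):
    """Classify a job whose finish file has already appeared.

    lenient=False models the behavior described in this PR.
    lenient=True models the suggested alternative (hypothetical).
    """
    if not process_alive:
        # Simplification: a real client would also check the exit status.
        return "success"
    if not timed_out:
        return "waiting"
    # The process is still alive after the timeout.
    if lenient and all(exists(p) for p in output_paths):
        # Would require a way to kill the job and mark it as success.
        return "success"
    return "aborted"
```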

Fixes #3017

@JuhaSointusalo merged commit 9d89580 into master on Mar 30, 2019
@JuhaSointusalo deleted the dpa_finish_file branch on March 30, 2019 21:07
@JuhaSointusalo

> It has been pointed out that if the app creates the finish file,
> and its output files are present,
> it should be treated as successful regardless of whether it exits.

FWIW, it has also been argued that if an app hangs at exit there's no trusting anything that happened before the hang. The context in that discussion was GPU apps. If GPU drivers cause an app to hang at exit then the drivers might not have processed computing commands correctly either.
