-
Notifications
You must be signed in to change notification settings - Fork 449
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
client: on startup, check for active task finish files #3303
Conversation
Sometimes jobs finish (and create finish files) while the client is shutting down, and the client doesn't notice they finished. This can result in the job starting from the beginning. Fix this by checking for finish files on startup.
Maybe it's better to change function to return error code instead of adding it as an output parameter? |
The return value is whether a finish file is present |
Reasonable. Agreed |
@AenBleidd - were you going to continue this review and merge? If not let me know and I can do it. |
@TheAspens, generally I'm ok with the code. But I didn't test it. So please merge it if you believe it's fine and no additional verification needed (or done by you or someone else) |
I was looking at this one too. It puts us into a bit of a bind with the 'feature freeze' for 7.16 It's part of (an extension of) the #3017 / #3019 pairing, which was merged before the release branch was split (back in March, even before the first abortive split). Like @AenBleidd, the code looks OK, but it's a hard problem to generate a conclusive test for in live running. And we really don't want to delay 7.16 any longer, given the slow rate of testing responses apart from the platform-specific issues with Mac and Android. I'll leave it to you: we've lived with the problem for many years, we can leave it on the slate for the next release if necessary. |
I working on reproducing it with a build from master right now. Once I can reliably reproduce, then I will build from this branch and repeat the test. I don't think it will be too hard (famous last words). |
I was able to test by downloading a "uppercase" workunit from a test project and then exiting the client in the middle of the task. I then completed the task standalone and started the client again. The version in master restarted the task from scratch (since I don't think uppercase has any checkpointing) whereas the code in this branch saw the finish file and immediately uploaded and reported the task. This looks good to me. Thanks @davidpanderson for the fix. I'm merging. |
Should fix problem where jobs finish while client is shutting down, as described in #3300