FAHCore_a7 and _22 don't always sync files on Windows Shutdown. #1314

Open
bb30994 opened this issue Mar 8, 2020 · 7 comments
Labels
1.Type - Defect Reported issue is a defect. 3.Component - GROMACS Core Reported issue relates to FahCore_a7. 3.Component - OpenMM Core Reported issue relates to FahCore_21/FahCore_22. 4.OS - Windows Reported issue occurs on Windows (Windows 10, Windows 8, Windows 7).

Comments

@bb30994

bb30994 commented Mar 8, 2020

When a Windows Shutdown is initiated, all active processes are notified of the pending shutdown and given a period of time to clean up their files. I've noticed that FAHCore_a7 does not process that information correctly. After the wait interval expires, Windows forces the remaining processes to die. This may or may not leave corrupted files (needing a sync process). This increases the probability of a GURU MEDITATION error.

If I manually pause all A7 WUs, the shutdown does not leave corrupted files.

Why can't a7 (or its wrapper) manage these files correctly when the shutdown warning is issued?
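For illustration only, here is one common way to keep a forced kill from leaving a half-written checkpoint. This is a minimal Python sketch, not how FAHCore_a7 actually writes its files. The function name `write_checkpoint_atomically` and the `.ckpt-` temp-file prefix are hypothetical. The idea is to write the new checkpoint to a temporary file, flush it to disk, and only then swap it into place, so the previous checkpoint survives if the process dies mid-write.

```python
import os
import tempfile


def write_checkpoint_atomically(path: str, data: bytes) -> None:
    """Replace `path` with `data` without ever exposing a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the target so the final rename
    # stays on the same volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-", suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # ask the OS to push the bytes to disk
        # Swap the finished file into place. This is atomic on POSIX; on
        # Windows it replaces the target in a single call.
        os.replace(tmp_path, path)
    except BaseException:
        # The old checkpoint is still intact; remove the partial temp file.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

If the process is killed before `os.replace` runs, the worst case is a leftover temporary file next to an older but still readable checkpoint.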

See also https://foldingforum.org/viewtopic.php?f=106&t=32011&p=310002#p310002 and #1289.

January is history. When can we beta-test those pending changes?

@shorttack shorttack added 3.Component - FAHClient Reported issue relates to FAHClient. 4.OS - Windows Reported issue occurs on Windows (Windows 10, Windows 8, Windows 7). labels Apr 14, 2020
@shorttack shorttack added the 1.Type - Enhancement Reported issue is an enhancement. label Apr 15, 2020
@shorttack

@bb30994 Bruce, is this still an issue with the 7.6 beta? If so, I'll call a defect.

@bb30994
Author

bb30994 commented Apr 18, 2020

I got a Guru Meditation error when I shut down 7.4.4 to upgrade. Bye-bye one Gromacs WU. I'll test it again, but checkpoints are not a FAHClient issue; they're a FAHCore issue. 0.0.2 has not changed (yet).

@bb30994
Author

bb30994 commented Apr 24, 2020

I contend that this is a bug, not an enhancement.

Any Windows program should be able to close its files during an orderly shutdown of Windows. When the OS notifies a program that a shutdown is being processed, it waits a bit for programs to close and then rechecks. (If somebody is editing a file, they have long enough to save their file.) Then it's more forceful, and it lists specific programs which are "preventing Windows from shutting down".

The FAHCore has the option of refusing to shut down if it needs longer than the allotted time and then closing itself once the files have been closed.

Being uncooperative with Windows shutdown requests is not the way to write a program.
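To make the cooperative behavior concrete, here is a rough Python sketch of the pattern. It is an assumption about how a core could be structured, not the actual FAHCore code. The class `SimulationWorker` and the callables `run_step` and `save_checkpoint` are hypothetical names. Whatever handler receives the OS shutdown notification (not shown here) would only set a flag; the main loop checks that flag between steps, writes a final checkpoint, and returns so the process can exit with its files closed.

```python
import threading
from typing import Callable


class SimulationWorker:
    """Runs steps until finished or until a shutdown is requested."""

    def __init__(self, total_steps: int,
                 run_step: Callable[[int], None],
                 save_checkpoint: Callable[[int], None]) -> None:
        self.total_steps = total_steps
        self.run_step = run_step
        self.save_checkpoint = save_checkpoint
        self.shutdown_requested = threading.Event()
        self.steps_done = 0

    def request_shutdown(self) -> None:
        # Called from the OS notification handler; it only sets a flag so
        # the handler returns quickly and never touches the files itself.
        self.shutdown_requested.set()

    def run(self) -> int:
        while self.steps_done < self.total_steps:
            if self.shutdown_requested.is_set():
                break  # stop between steps, never in the middle of one
            self.run_step(self.steps_done)
            self.steps_done += 1
        # Always leave a consistent checkpoint behind before exiting.
        self.save_checkpoint(self.steps_done)
        return self.steps_done
```

The key design choice is that the shutdown request is honored at a step boundary, which is the point where the files are in a known state.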

@bb30994
Author

bb30994 commented May 1, 2020

This is an issue for FAHCore_A7, not for FAHClient.

@PantherX PantherX added 3.Component - GROMACS Core Reported issue relates to FahCore_a7. and removed 3.Component - FAHClient Reported issue relates to FAHClient. labels May 1, 2020
@shorttack shorttack added 3.Component - FAHCoreWrapper Reported issue relates to FAHCoreWrapper. 3.Component - GROMACS Core Reported issue relates to FahCore_a7. and removed 3.Component - GROMACS Core Reported issue relates to FahCore_a7. labels May 2, 2020
@bb30994
Author

bb30994 commented May 4, 2020

The same issue is reported in #1458, but in a very different context.

@shorttack

See also #1468, #1436

@bb30994
Author

bb30994 commented Jun 22, 2020

The following messages were displayed in a recent log:
19:39:31:WU02:FS01:0x22:Watchdog triggered, requesting soft shutdown down
19:49:30:WU02:FS01:0x22:Watchdog shutdown failed, hard shutdown triggered

Whatever is preventing the soft shutdown request from being honored needs to be examined. The FAHCore should accept that soft shutdown request unless a checkpoint is actively being written. The trigger for the hard shutdown should allow long enough to complete the writing of an active checkpointing process.
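The policy described above can be sketched as a small decision function. This is a hypothetical Python model, not the watchdog's real logic. The names `CoreStatus` and `watchdog_action` are made up. The 600-second grace period is only a guess based on the roughly ten minutes between the two log lines, and the 300-second extension is an arbitrary illustrative value.

```python
from dataclasses import dataclass


@dataclass
class CoreStatus:
    exited: bool                  # the core process has already ended
    checkpoint_in_progress: bool  # a checkpoint write is underway right now


def watchdog_action(elapsed_s: float, status: CoreStatus,
                    grace_s: float = 600.0,
                    extension_s: float = 300.0) -> str:
    """Decide what to do `elapsed_s` seconds after the soft shutdown request."""
    if status.exited:
        return "done"  # the soft shutdown was honored
    if elapsed_s < grace_s:
        return "wait"  # still inside the normal grace period
    if status.checkpoint_in_progress and elapsed_s < grace_s + extension_s:
        return "wait"  # let the active checkpoint write finish
    return "hard_shutdown"


# A core that is mid-checkpoint gets extra time, but only a bounded amount.
print(watchdog_action(700, CoreStatus(exited=False, checkpoint_in_progress=True)))
print(watchdog_action(900, CoreStatus(exited=False, checkpoint_in_progress=True)))
```

Bounding the extension keeps a hung core from blocking the hard shutdown forever, while a core that is merely busy writing a checkpoint is allowed to finish.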

@bb30994 bb30994 changed the title FAHCore_a7 doesn't sync files on Windows Shutdown. FAHCore_a7 and _22 don't always sync files on Windows Shutdown. Jun 22, 2020
@PantherX PantherX added 1.Type - Defect Reported issue is a defect. 3.Component - OpenMM Core Reported issue relates to FahCore_21/FahCore_22. and removed 1.Type - Enhancement Reported issue is an enhancement. 3.Component - FAHCoreWrapper Reported issue relates to FAHCoreWrapper. labels Aug 28, 2020

3 participants