Please see the [wiki page](https://github.com/aws/aws-parallelcluster/wiki/Slurm-Issues#slurm-will-not-mark-non-responsive-node-as-down-if-node-is-within-resumetimeout) for details regarding this issue.

This is the root cause of https://github.com/aws/aws-parallelcluster/issues/2117.

We will track any feedback and comments in this GitHub issue. Thank you!