diff --git a/docs/man-pages/condor_submit.rst b/docs/man-pages/condor_submit.rst
index d1567661d97..7a73bc70e7e 100644
--- a/docs/man-pages/condor_submit.rst
+++ b/docs/man-pages/condor_submit.rst
@@ -1462,6 +1462,14 @@ POLICY COMMANDS :index:`max_retries`
     successfully taking a checkpoint. The checkpoint will transferred
     and the executable restarted. See :ref:`users-manual/self-checkpointing-applications:Self-Checkpointing Applications`
     for details.
+ :index:`max_checkpoint_interval`
+
+ max_checkpoint_interval =
+    The number of seconds a self-checkpointing job has to write out its
+    next checkpoint. Exceeding this duration puts the job on hold. To
+    be clear: a job has this amount of time to write out its first
+    checkpoint, each subsequent checkpoint, and to finish after its
+    last checkpoint.
  :index:`hold`
 
  hold =
diff --git a/docs/version-history/development-release-series-91.rst b/docs/version-history/development-release-series-91.rst
index 5cf9d3594f5..367527684c6 100644
--- a/docs/version-history/development-release-series-91.rst
+++ b/docs/version-history/development-release-series-91.rst
@@ -23,6 +23,12 @@ New Features:
   ``OpSysLongName`` is now ``"macOS 10.15"`` instead of ``"MacOSX 15.4"``.
   :jira:`627`
 
+- Added ``max_checkpoint_interval`` to the submit language. It specifies
+  the largest permitted interval, in seconds, between checkpoints. Exceeding
+  this interval puts the job on hold. This command is intended to prevent
+  "stuck" self-checkpointing jobs from wasting resources.
+  :jira:`650`
+
 - Improved and simplified how HTCondor locates the blahp software.
   Configuration parameter ``GLITE_LOCATION`` has been replaced by
   ``BLAHPD_LOCATION``.
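
Reviewer note: the timing rule described for ``max_checkpoint_interval`` (one deadline for the first checkpoint, for each subsequent checkpoint, and for finishing after the last one) can be modelled as below. This is a minimal sketch of the documented semantics, not HTCondor's implementation. The function name `hold_time` and its parameters are hypothetical, and it assumes that "exceeding" means strictly greater than the limit.

```python
from typing import Optional, Sequence


def hold_time(
    start: float,
    checkpoints: Sequence[float],
    finish: float,
    max_interval: float,
) -> Optional[float]:
    """Return when the job would be put on hold, or None if it never is.

    Hypothetical model: all times are in seconds, `checkpoints` is sorted
    ascending, and `finish` is the time the job exits after its last
    checkpoint.
    """
    last = start
    # The same limit applies to every gap: start -> first checkpoint,
    # checkpoint -> next checkpoint, and last checkpoint -> finish.
    for event in list(checkpoints) + [finish]:
        if event - last > max_interval:
            # The deadline passed before this event happened.
            return last + max_interval
        last = event
    return None


# Checkpoints every 50 s with a 60 s limit: the job is never held.
print(hold_time(0, [50, 100], 140, 60))
# The job stalls after its second checkpoint and is held at 160 s.
print(hold_time(0, [50, 100], 200, 60))
```

Under this model a job that takes exactly `max_interval` seconds between two events is not held. Whether HTCondor treats that boundary the same way is not stated in the patch.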