diff --git a/docs/config_reference.rst b/docs/config_reference.rst
index 03e21008b8..c708ab484b 100644
--- a/docs/config_reference.rst
+++ b/docs/config_reference.rst
@@ -1511,6 +1511,7 @@ General Configuration
    ReFrame's asynchronous execution policy will try to advance as many tests as possible in their pipeline, but some tests may take too long to proceed (e.g., due to copying of large files) blocking the advancement of previously started tests.
    If this timeout value is exceeded and at least one test has progressed, ReFrame will stop processing new tests and it will try to further advance tests that have already started.
+   See :ref:`pipeline-timeout` for more guidance on how to set this.

    :required: No
    :default: ``10``

diff --git a/docs/manpage.rst b/docs/manpage.rst
index 13df9fa0d1..84c232d682 100644
--- a/docs/manpage.rst
+++ b/docs/manpage.rst
@@ -1413,6 +1413,23 @@ Whenever an environment variable is associated with a configuration option, its
       ================================== ==================


+.. envvar:: RFM_PIPELINE_TIMEOUT
+
+   Timeout in seconds for advancing the pipeline in the asynchronous execution policy.
+   See :ref:`pipeline-timeout` for more guidance on how to set this.
+
+
+   .. table::
+      :align: left
+
+      ================================== ==================
+      Associated command line option     N/A
+      Associated configuration parameter :attr:`~config.general.pipeline_timeout`
+      ================================== ==================
+
+   .. versionadded:: 3.10.0
+
+
 .. envvar:: RFM_PREFIX

    General directory prefix for ReFrame-generated directories.
diff --git a/docs/pipeline.rst b/docs/pipeline.rst
index f0a51e2915..b6efb3edef 100644
--- a/docs/pipeline.rst
+++ b/docs/pipeline.rst
@@ -197,6 +197,23 @@ To control the concurrency of the ReFrame execution context, users should set th
    Execution contexts were formalized.


+.. _pipeline-timeout:
+
+-------------------------------------------------------------------------------------------
+Tweaking the throughput and interactivity of test jobs in the asynchronous execution policy
+-------------------------------------------------------------------------------------------
+
+ReFrame's asynchronous execution policy will iteratively cycle through all the in-flight tests and will try to advance the state (see state diagram above) of as many as possible within a given time slot.
+The duration of this time slot is controlled by the :attr:`~config.general.pipeline_timeout` configuration option or the :envvar:`RFM_PIPELINE_TIMEOUT` environment variable.
+If this timeout expires and at least one test has progressed, ReFrame will stop processing new tests in this time slot.
+In the next time slot, it will try to further advance tests that have already started and, if there is enough time left, it will also start new tests.
+Essentially, a small timeout value gives preference to tests that have already started, thus pushing them more quickly down their pipeline, whereas higher values give preference to overall test throughput, as more tests will be running concurrently.
+The default timeout is 10 seconds in order to balance interactivity and overall throughput.
+
+There are cases when some tests take too long to proceed (e.g., due to copying of large files) and as a result they are blocking more tests from starting their pipeline.
+In these cases, a higher timeout value will help to increase the test concurrency and therefore the overall throughput.
+
+
 Timing the Test Pipeline
 ------------------------