simulation modes: proportional run length and more #2220

Merged
merged 9 commits into from
Apr 6, 2017
2 changes: 0 additions & 2 deletions bin/cylc
@@ -234,7 +234,6 @@ control_commands['checkpoint'] = ['checkpoint']
 utility_commands = OrderedDict()
 utility_commands['cycle-point'] = [
     'cycle-point', 'cyclepoint', 'datetime', 'cycletime']
-utility_commands['random'] = ['random', 'rnd']
 utility_commands['scp-transfer'] = ['scp-transfer']
 utility_commands['suite-state'] = ['suite-state']
 utility_commands['ls-checkpoints'] = ['ls-checkpoints']
@@ -436,7 +435,6 @@ comsum['job-submit'] = '(Internal) Submit a job'

 # utility
 comsum['cycle-point'] = 'Cycle point arithmetic and filename templating'
-comsum['random'] = 'Generate a random integer within a given range'
 comsum['jobscript'] = 'Generate a task job script and print it to stdout'
 comsum['scp-transfer'] = 'Scp-based file transfer for cylc suites'
 comsum['suite-state'] = 'Query the task states in a suite'
16 changes: 16 additions & 0 deletions bin/cylc-cycle-point
@@ -96,6 +96,11 @@ def main():
         help="Add an ISO 8601-based interval representation to CYCLE",
         action="store", dest="offset")

+    parser.add_option(
+        "--equal", metavar="POINT2",
+        help="Succeed if POINT2 is equal to POINT (format agnostic).",
+        action="store", dest="point2")
+
     parser.add_option(
         "--template", metavar="TEMPLATE",
         help="Filename template string or variable",
@@ -198,6 +203,17 @@ def main():
     except ValueError as exc:
         parser.error('ERROR: invalid cycle: %s' % exc)

+    if options.point2:
+        try:
+            cycle_point2 = iso_point_parser.parse(
+                options.point2, dump_as_parsed=(template is None))
+        except ValueError as exc:
+            parser.error('ERROR: invalid cycle: %s' % exc)
+        if cycle_point2 == cycle_point:
+            sys.exit(0)
+        else:
+            sys.exit(1)
+
     offset_props = {}

     if options.offsethours:
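
The new `--equal` option compares two cycle points after parsing, so two strings that spell the same instant differently count as equal, and the result is reported through the exit status. The following is a minimal sketch of that idea using only the Python 3 standard library. It does not use Cylc's `iso_point_parser`; the names `parse_point`, `points_equal` and `_FORMATS` are hypothetical, and only two ISO 8601 layouts are assumed.

```python
from datetime import datetime, timezone

# Hypothetical, deliberately short list of accepted layouts (basic and
# extended ISO 8601 with minutes, UTC only). Cylc's real parser accepts
# many more forms; this list is an assumption for illustration.
_FORMATS = ("%Y%m%dT%H%MZ", "%Y-%m-%dT%H:%MZ")


def parse_point(text):
    """Parse a UTC cycle point string written in any of _FORMATS."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError("invalid cycle point: %s" % text)


def points_equal(point, point2):
    """Compare parsed values, not the original strings."""
    return parse_point(point) == parse_point(point2)


def equal_exit_code(point, point2):
    """Mirror the diff above: exit status 0 if equal, 1 otherwise."""
    return 0 if points_equal(point, point2) else 1
```

Comparing parsed values rather than raw strings is what makes the check format agnostic; a plain string comparison would report the basic and extended forms of the same point as different.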
7 changes: 6 additions & 1 deletion bin/cylc-get-suite-config
@@ -109,6 +109,11 @@ def main():
         "[DEPRECATED: use 'cylc list SUITE'].",
         action="store_true", default=False, dest="tasks")

+    parser.add_option(
+        "-u", "--run-mode",
+        help="Get config for suite run mode.", action="store", default="live",
+        dest="run_mode", choices=['live', 'dummy', 'simulation'])
+
     (options, args) = parser.parse_args()
     suite, suiterc = SuiteSrvFilesManager().parse_suite_arg(options, args[0])

Expand All @@ -120,7 +125,7 @@ def main():
config = SuiteConfig(
suite, suiterc,
load_template_vars(options.templatevars, options.templatevars_file),
cli_initial_point_string=options.icp)
cli_initial_point_string=options.icp, run_mode=options.run_mode)
if options.tasks:
for task in config.get_task_name_list():
print prefix + task
54 changes: 0 additions & 54 deletions bin/cylc-random

This file was deleted.

7 changes: 6 additions & 1 deletion bin/cylc-validate
@@ -62,6 +62,11 @@ def main():
         "--profile", help="Output profiling (performance) information",
         action="store_true", default=False, dest="profile_mode")

+    parser.add_option(
+        "-u", "--run-mode", help="Validate for run mode.", action="store",
+        default="live", dest="run_mode",
+        choices=['live', 'dummy', 'dummy-local', 'simulation'])
+
     (options, args) = parser.parse_args()

     profiler = Profiler(options.profile_mode)
Expand All @@ -73,7 +78,7 @@ def main():
suite, suiterc,
load_template_vars(options.templatevars, options.templatevars_file),
cli_initial_point_string=options.icp,
validation=True, strict=options.strict,
validation=True, strict=options.strict, run_mode=options.run_mode,
output_fname=options.output,
mem_log_func=profiler.log_memory)
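
The `--run-mode` option added above relies on `optparse`'s `choices` argument to reject unknown modes before the suite configuration is loaded. As a rough, self-contained sketch of that behaviour (written for Python 3, whereas the scripts in this diff are Python 2; `build_parser` and `parse_run_mode` are hypothetical helper names, not part of Cylc):

```python
from optparse import OptionParser

# Run modes accepted by the validate command in the diff above.
RUN_MODES = ['live', 'dummy', 'dummy-local', 'simulation']


def build_parser():
    """Build a parser with only the --run-mode option, for illustration."""
    parser = OptionParser()
    # When `choices` is given, optparse treats the option type as "choice"
    # and exits with an error for any value outside the list.
    parser.add_option(
        "-u", "--run-mode", help="Validate for run mode.", action="store",
        default="live", dest="run_mode", choices=RUN_MODES)
    return parser


def parse_run_mode(argv):
    """Return the selected run mode for a list of command-line arguments."""
    options, _ = build_parser().parse_args(argv)
    return options.run_mode
```

With no arguments the mode falls back to `live`. An unlisted value makes `optparse` print a usage error and raise `SystemExit`, so the calling code never sees an invalid mode.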

15 changes: 7 additions & 8 deletions conf/cylc.lang
@@ -110,14 +110,19 @@
 <keyword>started handler</keyword>
 <keyword>stalled handler</keyword>
 <keyword>simulation mode suite timeout</keyword>
-<keyword>simulate failure</keyword>
+<keyword>disable suite event handlers</keyword>
+<keyword>default run length</keyword>
+<keyword>speedup factor</keyword>
+<keyword>time limit buffer</keyword>
+<keyword>fail cycle points</keyword>
+<keyword>fail try 1 only</keyword>
+<keyword>disable task event handlers</keyword>
 <keyword>shutdown handler</keyword>
 <keyword>shell</keyword>
 <keyword>sequential</keyword>
 <keyword>script</keyword>
 <keyword>runahead limit</keyword>
 <keyword>run-dir</keyword>
-<keyword>run time range</keyword>
 <keyword>root</keyword>
 <keyword>retry handler</keyword>
 <keyword>retrieve job logs retry delays</keyword>
@@ -174,12 +179,6 @@
 <keyword>exclude</keyword>
 <keyword>env-script</keyword>
 <keyword>enable resurrection</keyword>
-<keyword>dummy mode suite timeout</keyword>
-<keyword>disable task event hooks</keyword>
-<keyword>disable suite event hooks</keyword>
-<keyword>disable retries</keyword>
-<keyword>disable pre-script</keyword>
-<keyword>disable post-script</keyword>
 <keyword>disable automatic shutdown</keyword>
 <keyword>description</keyword>
 <keyword>default node attributes</keyword>
14 changes: 7 additions & 7 deletions conf/cylc.xml
@@ -37,14 +37,19 @@
 <RegExpr attribute='Keyword' String=' started handler '/>
 <RegExpr attribute='Keyword' String=' stalled handler '/>
 <RegExpr attribute='Keyword' String=' simulation mode suite timeout '/>
-<RegExpr attribute='Keyword' String=' simulate failure '/>
+<RegExpr attribute='Keyword' String=' disable suite event handlers '/>
+<RegExpr attribute='Keyword' String=' default run length '/>
+<RegExpr attribute='Keyword' String=' speedup factor '/>
+<RegExpr attribute='Keyword' String=' time limit buffer '/>
+<RegExpr attribute='Keyword' String=' fail cycle points '/>
+<RegExpr attribute='Keyword' String=' fail try 1 only '/>
+<RegExpr attribute='Keyword' String=' disable task event handlers '/>
 <RegExpr attribute='Keyword' String=' shutdown handler '/>
 <RegExpr attribute='Keyword' String=' shell '/>
 <RegExpr attribute='Keyword' String=' sequential '/>
 <RegExpr attribute='Keyword' String=' script '/>
 <RegExpr attribute='Keyword' String=' runahead limit '/>
 <RegExpr attribute='Keyword' String=' run-dir '/>
-<RegExpr attribute='Keyword' String=' run time range '/>
 <RegExpr attribute='Keyword' String=' root '/>
 <RegExpr attribute='Keyword' String=' retry handler '/>
 <RegExpr attribute='Keyword' String=' retrieve job logs retry delays '/>
@@ -102,11 +107,6 @@
 <RegExpr attribute='Keyword' String=' env-script '/>
 <RegExpr attribute='Keyword' String=' enable resurrection '/>
 <RegExpr attribute='Keyword' String=' dummy mode suite timeout '/>
-<RegExpr attribute='Keyword' String=' disable task event hooks '/>
-<RegExpr attribute='Keyword' String=' disable suite event hooks '/>
-<RegExpr attribute='Keyword' String=' disable retries '/>
-<RegExpr attribute='Keyword' String=' disable pre-script '/>
-<RegExpr attribute='Keyword' String=' disable post-script '/>
 <RegExpr attribute='Keyword' String=' disable automatic shutdown '/>
 <RegExpr attribute='Keyword' String=' description '/>
 <RegExpr attribute='Keyword' String=' default node attributes '/>
110 changes: 46 additions & 64 deletions doc/src/cylc-user-guide/cug.tex
@@ -502,6 +502,7 @@ \subsubsection{Create A Site Config File}
in~\ref{SiteRCReference}.

\subsubsection{Configure Site Environment on Job Hosts}
\label{Configure Site Environment on Job Hosts}

If your users submit task jobs to hosts other than the hosts they use to run
their suites, you should ensure that the job hosts have the correct environment
@@ -6697,79 +6698,60 @@ \subsection{The Meaning And Use Of Initial Cycle Point}
Note however that an initial \lstinline=R1= graph section is now the preferred
way to get different behaviour at suite start-up.

\subsection{The Simulation And Dummy Run Modes}
\subsection{Simulating Suite Behaviour}
\label{SimulationMode}

Since cylc-4.6.0 any cylc suite can run in {\em live}, {\em simulation},
or {\em dummy} mode. Prior to that release simulation mode was a
hybrid mode that replaced real tasks with local dummy tasks. This
allowed local simulation testing of any suite, to get the scheduling
right without running real tasks, but running dummy tasks locally does
not add much value over a pure simulation (in which no tasks are
submitted at all) because all job submission configuration has to be
ignored and most task job script sections have to be cut out to avoid
any code that could potentially be specific to the intended task host.
So at 4.6.0 we replaced this with a pure simulation mode (task proxies
go through the {\em running} state automatically within cylc, and no
dummy tasks are submitted to run) and a new dummy mode in which only the
real task scripting is dummied out - each dummy task is
submitted exactly as the task it represents on the correct host and in
the same execution environment. A successful dummy run confirms not only
that the scheduling works correctly but also tests real job submission,
communication from remote task hosts, and the real task job scripts (in
which errors such as use of undefined variables will cause a task to
fail).

The run mode, which defaults to {\em live}, is set on the command line
(for run and restart):
Several suite run modes allow you to simulate suite behaviour quickly without
running the suite's real jobs - which may be long-running and resource-hungry:

\begin{myitemize}
\item {\em dummy mode} - runs dummy tasks as background jobs on configured
job hosts.
\begin{myitemize}
\item simulates scheduling, job host connectivity, and
generates all job files on suite and job hosts.
\end{myitemize}
\item {\em dummy-local mode} - runs real dummy tasks as background jobs on
the suite host, which allows dummy-running suites from other sites.
\begin{myitemize}
\item simulates scheduling and generates all job files on the
suite host.
\end{myitemize}
\item {\em simulation mode} - does not run any real tasks.
\begin{myitemize}
\item simulates scheduling without generating any job files.
\end{myitemize}
\end{myitemize}

Set the run mode (default {\em live}) in the GUI suite start dialog box, or on
the command line:
\lstset{language=transcript}
\begin{lstlisting}
$ cylc run --mode=dummy SUITE
$ cylc restart --mode=dummy SUITE
\end{lstlisting}
but you can configure the suite to force a particular run mode:
\lstset{language=suiterc}
\begin{lstlisting}
[cylc]
force run mode = simulation
\end{lstlisting}
This can be used, for example, for demo suites that necessarily run out
of their original context; or to temporarily prevent accidental
execution of expensive real tasks during suite development.

Dummy mode task scripting just prints a message and sleeps for ten
seconds by default, but you can override this behaviour for particular
tasks or task groups if you like. Here's how to make a task sleep for
twenty seconds and then fail in dummy mode:
\lstset{language=suiterc}
\begin{lstlisting}
[runtime]
[[foo]]
script = "run-real-task.sh"
[[[dummy mode]]]
script = """
echo "hello from dummy task $CYLC_TASK_ID"
sleep 20
echo "ABORTING"
/bin/false"""
\end{lstlisting}
You can get specified tasks to fail in these modes, for more flexible suite
testing. See Section~\ref{suiterc-sim-config} for simulation configuration.

Finally, in simulation mode each task takes between 1 and 15 seconds to
``run'' by default, but you can also alter this for particular tasks or
groups of tasks:
\lstset{language=suiterc}
\begin{lstlisting}
[runtime]
[[foo]]
[[[simulation mode]]]
run time range = PT20S,PT31S # (between 20 and 30 seconds)
\end{lstlisting}
Note that to get a failed simulation or dummy mode task to succeed on
re-triggering, just change the suite.rc file appropriately and reload
the suite definition at run time with \lstinline=cylc reload SUITE=
before re-triggering the task.
\subsubsection{Proportional Simulated Run Length}

If task \lstinline=[job]execution time limit= is set, Cylc divides it by
\lstinline=[simulation]speedup factor= (default \lstinline=10.0=) to compute
the simulated task run length. Tasks without a time limit use the default
simulated run length (10 seconds).
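
The proportional run length rule can be illustrated with a short Python sketch. It assumes that the 10 second default applies to tasks with no execution time limit, and it works in plain seconds rather than ISO 8601 durations; the function name `simulated_run_length` and its parameters are hypothetical and are not taken from the Cylc source.

```python
def simulated_run_length(execution_time_limit=None,
                         speedup_factor=10.0,
                         default_run_length=10.0):
    """Return a simulated task run length in seconds.

    execution_time_limit: the task's [job]execution time limit in seconds,
        or None if the task does not set one.
    speedup_factor: divisor applied to the time limit (10.0 as stated above).
    default_run_length: used when no time limit is set (assumed reading of
        "default 10 seconds").
    """
    if execution_time_limit is None:
        return default_run_length
    return execution_time_limit / speedup_factor


# A task limited to one hour is simulated in six minutes.
one_hour_task = simulated_run_length(3600.0)

# A task with no time limit falls back to the default run length.
unlimited_task = simulated_run_length()
```

Scaling by the time limit keeps the relative ordering of long and short tasks, so a simulated run exercises roughly the same scheduling pattern as the live suite, only faster.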

\subsubsection{Limitations Of Suite Simulation}

Dummy mode ignores batch scheduler settings because Cylc does not know which
job resource directives (requested memory, number of compute nodes, etc.) would
need to be changed for the dummy jobs. If you need to dummy-run jobs on a
batch scheduler, manually comment out \lstinline=script= items and modify
directives in your live suite, or else use a custom live mode test suite.

Dummy mode is equivalent to removing all user-defined task scripting
to expose the default scripting.
Note that the dummy modes ignore all configured task \lstinline=script= items,
including \lstinline=init-script=. If your \lstinline=init-script= is required
to run even dummy tasks on a job host, be aware that host environment setup
should be done elsewhere - see~\ref{Configure Site Environment on Job Hosts}.

\subsubsection{Restarting Suites With A Different Run Mode?}
