Remove reference to removed script command. (#133)
* Remove reference to removed script command.

* Remove references to "script" command (no longer exists).

Co-authored-by: Bradley Dice <bdice@bradleydice.com>
csadorf and bdice committed Jun 14, 2021
1 parent 5b32638 commit a476c33
Showing 5 changed files with 13 additions and 167 deletions.
6 changes: 2 additions & 4 deletions docs/source/cluster_submission.rst
@@ -4,7 +4,7 @@
Cluster Submission
==================

While it is always possible to manually submit scripts like the one shown in the :ref:`previous section <project-script>` to a cluster, using the *flow interface* will allows us to **keep track of submitted operations** for example to prevent the resubmission of active operations.
While it is always possible to manually write and submit scripts to a cluster, using the *flow interface* to generate and submit scripts on our behalf will allow **signac-flow** to **keep track of submitted operations** and prevent the resubmission of active operations.

In addition, **signac-flow** uses :ref:`environment profiles <environments>` to select which :ref:`base template <templates>` to use for the cluster job script generation.
All base templates are in essence highly similar, but are adapted for a specific cluster environment.
@@ -62,11 +62,9 @@ For example the following command would submit up to 5 ``hello`` operations, whe
~/my_project $ python project.py submit -o hello -n 5 -f a.\$lt 5
The submission scripts are generated using the same templating system as the ``script`` command.

.. tip::

Use the ``--pretend`` or ``--test`` option to pre-view the generated submission scripts on screen instead of submitting them.
Use the ``--pretend`` option to preview the generated submission scripts on screen instead of submitting them.


Parallelization and Bundling
61 changes: 6 additions & 55 deletions docs/source/flow-project.rst
@@ -34,7 +34,8 @@ Executing this script on the command line will give us access to this project's
.. code-block:: bash
~/my_project $ python project.py
usage: project.py [-h] [-d] {status,next,run,script,submit,exec} ...
Using environment configuration: StandardEnvironment
usage: project.py [-h] [-v] [--show-traceback] [--debug] {status,next,run,submit,exec} ...
.. note::

@@ -214,63 +215,13 @@ As shown before, all *eligible* operations can then be executed with:
~/my_project $ python project.py run
The status determination is by default parallelized with threads, however this can be turned off or switched to using processes by setting a value for the ``flow.status_parallelization`` configuration key.
Possible values are ``thread``, ``process`` or ``none`` with ``thread`` being the default value and ``none`` turning off all parallelization.
The status determination operates in serial by default, because the overhead costs of using threads or processes are typically large. However, this can be configured by setting a value for the ``flow.status_parallelization`` configuration key.
Possible values are ``thread``, ``process``, or ``none``, with ``none`` being the default value (turning off parallelization).
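The three modes can be sketched with a stand-in status function. ``fetch_status`` and ``get_statuses`` below are hypothetical names chosen for illustration; signac-flow's real status machinery is more involved.

```python
# Illustrative sketch of the three status-parallelization modes.
# "fetch_status" is a hypothetical stand-in for a per-job status check;
# this is not signac-flow's actual implementation.
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor


def fetch_status(job_id):
    # Placeholder: a real check would inspect the job's workspace.
    return (job_id, "eligible")


def get_statuses(job_ids, mode="none"):
    if mode == "thread":
        with ThreadPoolExecutor() as executor:
            return list(executor.map(fetch_status, job_ids))
    if mode == "process":
        with ProcessPoolExecutor() as executor:
            return list(executor.map(fetch_status, job_ids))
    # mode == "none": a plain serial loop (the default).
    return [fetch_status(job_id) for job_id in job_ids]
```

Serial execution avoids the startup and pickling overhead of the pooled modes, which is why it makes a sensible default for small projects.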

We can set the ``flow.status_parallelization`` configuration value by directly editing the configuration file(s) or via the command line, for example:
We can set the ``flow.status_parallelization`` configuration value by directly editing the configuration file(s) or via the command line:

.. code-block:: bash
~/my_project $ signac config set flow.status_parallelization process
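Assuming the standard INI-style signac configuration layout, the command above would result in an entry like the following in the configuration file (a sketch; consult your actual configuration file for the exact location and contents):

```ini
[flow]
status_parallelization = process
```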
.. _project-script:

Generating Execution Scripts
============================

Instead of executing operations directly we can also create a script for execution.
If we have any pending operations, a script might look like this:

.. code-block:: bash
~/my_project $ python project.py script
set -e
set -u
cd /Users/csadorf/my_project
# Operation 'hello' for job '14fb5d016557165019abaac200785048':
/Users/csadorf/miniconda3/bin/python project.py exec hello 14fb5d016557165019abaac200785048
# Operation 'hello' for job '2af7905ebe91ada597a8d4bb91a1c0fc':
/Users/csadorf/miniconda3/bin/python project.py exec hello 2af7905ebe91ada597a8d4bb91a1c0fc
# Operation 'hello' for job '42b7b4f2921788ea14dac5566e6f06d0':
/Users/csadorf/miniconda3/bin/python project.py exec hello 42b7b4f2921788ea14dac5566e6f06d0
# Operation 'hello' for job '9bfd29df07674bc4aa960cf661b5acd2':
/Users/csadorf/miniconda3/bin/python project.py exec hello 9bfd29df07674bc4aa960cf661b5acd2
# Operation 'hello' for job '9f8a8e5ba8c70c774d410a9107e2a32b':
/Users/csadorf/miniconda3/bin/python project.py exec hello 9f8a8e5ba8c70c774d410a9107e2a32b
These scripts can be used for the execution of operations directly, or they could be submitted to a cluster environment for remote execution.
For more information about how to submit operations for execution to a cluster environment, see the :ref:`cluster-submission` chapter.

This script is generated from a default jinja2_ template, which is shipped with the package.
We can extend this default template or write our own to customize the script generation process.

.. _jinja2: http://jinja.pocoo.org/

Here is an example for such a template, that would essentially generate the same output:

.. code-block:: bash
cd {{ project.config.project_dir }}
{% for operation in operations %}
{{ operation.cmd }}
{% endfor %}
.. note::
Unlike the default template, this exemplary template would not allow for ``parallel`` execution.
Checkout the :ref:`next section <cluster-submission>` for a guide on how to submit operations to a cluster environment.
Check out the :ref:`next section <cluster-submission>` for a guide on how to submit operations to a cluster environment.
7 changes: 1 addition & 6 deletions docs/source/recipes.rst
@@ -196,7 +196,7 @@ Then, we could implement a simple operation that passes it some metadata paramet
def compute_volume(job):
return "matlab -r 'prog {job.sp.foo} {job.sp.bar}' > {job.ws}/output.txt"
Executing this operation will store the output of the matlab script within the job's workspace within a file called ``output.txt``.
Executing this operation will store the output of the MATLAB script within the job's workspace within a file called ``output.txt``.

.. todo::

@@ -209,11 +209,6 @@ Running MPI-parallelized operations

There are basically two strategies to implement :class:`~.flow.FlowProject` operations that are MPI-parallelized, one for external programs and one for Python scripts.

.. tip::

Fully functional scripts can be found in the signac-docs repository under ``examples/MPI``.


MPI-operations with mpi4py or similar
-------------------------------------

6 changes: 1 addition & 5 deletions docs/source/templates.rst
@@ -56,15 +56,11 @@ The third line is the actual command that we want to add and the fourth line ens
The base template
=================

The **signac-flow** package will select a different base script template depending on whether you are simply generating a script using the ``script`` command or whether you are submitting to a scheduling system with ``submit``.
In the latter case, the base script template is selected based on whether you are on any of the :ref:`officially supported environments <supported-environments>`, and if not, whether one of the known scheduling systems (e.g. Slurm, PBS, or LSF) is available.
The **signac-flow** package will select the base script template depending on whether you are on any of the :ref:`officially supported environments <supported-environments>`, and if not, whether one of the known scheduling systems (e.g. Slurm, PBS, or LSF) is available.
This is a short illustration of that heuristic:

.. code-block:: bash
# The `script` command always uses the same base script template:
project.py script --> base_script='base_script.sh'
# On system with SLURM scheduler:
project.py submit --> base_script='slurm.sh' (extends 'base_script.sh')
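The selection heuristic above can be sketched as a small function. The template filenames follow the illustration, and the function name and parameters are hypothetical; this is not signac-flow's actual code.

```python
# Hypothetical sketch of the base-template selection heuristic described
# above; filenames mirror the illustration, but this is not
# signac-flow's actual implementation.
def select_base_template(supported_environment=None, scheduler=None):
    if supported_environment is not None:
        # An officially supported environment ships its own template,
        # which extends the generic base script.
        return f"{supported_environment}.sh"
    if scheduler in ("slurm", "pbs", "lsf"):
        # A known scheduler was detected on this system.
        return f"{scheduler}.sh"
    # Fallback: the generic base script template.
    return "base_script.sh"
```

For example, on a system with a SLURM scheduler but no officially supported environment profile, the heuristic would land on the SLURM template.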
100 changes: 3 additions & 97 deletions docs/source/tutorial.rst
@@ -496,105 +496,11 @@ The ``job.data`` property is a short-cut for ``job.stores['signac_data']``, you

See :ref:`project-job-data` for an in-depth discussion.

Job scripts and cluster submission
==================================

Generating scripts
------------------

So far, we executed all operations directly on the command line with the ``run`` command.
However we can also generate scripts for execution, which is especially relevant if you intend to submit the workflow to a scheduling system typically encountered in high-performance computing (HPC) environments.

Scripts are generated using the `jinja2`_ templating system, but you don't have to worry about that unless you want to change any of the default templates.

.. todo::
Once we have templates documentation, point to it here.

.. _jinja2: http://jinja.pocoo.org/

We can generate a script for the execution of the *next eligible operations* with the ``script`` command.
We need to reset our workflow before we can test that:

.. code-block:: bash
~/ideal_gas_project $ rm -r workspace/
~/ideal_gas_project $ python init.py
Let's start by generating a script for the execution of up to two *eligible* operations:

.. code-block:: bash
~/ideal_gas_project $ python project.py script -n 2
set -e
set -u
cd /Users/csadorf/ideal_gas_project
# Operation 'compute_volume' for job '03585df0f87fada67bd0f540c102cce7':
python project.py exec compute_volume 03585df0f87fada67bd0f540c102cce7
# Operation 'compute_volume' for job '22a51374466c4e01ef0e67e65f73c52e':
python project.py exec compute_volume 22a51374466c4e01ef0e67e65f73c52e
By default, the generated script will change into the *project root directory* and then execute the command for each next eligible operation for all selected jobs.
We then have two ways to run this script.
One option would be to pipe it into a file and then execute it:

.. code-block:: bash
~/ideal_gas_project $ python project.py script > run.sh
~/ideal_gas_project $ /bin/bash run.sh
Alternatively, we could pipe it directly into the command processor:

.. code-block:: bash
~/ideal_gas_project $ python project.py script | /bin/bash
Executing the ``script`` command again, we see that it would now execute both the ``store_volume_in_document`` and the ``store_volume_in_json_file`` operation, since both share the same pre-conditions:

.. code-block:: bash
~/ideal_gas_project $ python project.py script -n 2
set -e
set -u
cd /Users/csadorf/ideal_gas_project
# Operation 'store_volume_in_document' for job '03585df0f87fada67bd0f540c102cce7':
python project.py exec store_volume_in_document 03585df0f87fada67bd0f540c102cce7
# Operation 'store_volume_in_json_file' for job '03585df0f87fada67bd0f540c102cce7':
python project.py exec store_volume_in_json_file 03585df0f87fada67bd0f540c102cce7
If we wanted to customize the script generation, we could either extend the base template or simply replace the default template with our own.
To replace the default template, we can put a template script called ``script.sh`` into a directory called ``templates`` within the project root directory.
A simple template script might look like this:

.. code-block:: bash
cd {{ project.config.project_dir }}
{% for operation in operations %}
{{ operation.cmd }}
{% endfor %}
Storing the above template within a file called ``templates/script.sh`` will now change the output of the ``script`` command to:
.. code-block:: bash
~/ideal_gas_project $ python project.py script -n 2
cd /Users/csadorf/ideal_gas_project
python project.py exec store_volume_in_document 03585df0f87fada67bd0f540c102cce7
python project.py exec store_volume_in_json_file 03585df0f87fada67bd0f540c102cce7
Please see ``$ python project.py script --template-help`` to get more information on how to write and use custom templates.
Submit operations to a scheduling system
----------------------------------------
========================================

In addition to executing operations directly on the command line and generating scripts, **signac** can also submit operations to a scheduler such as SLURM_.
This is essentially equivalent to generating a script as described in the previous section, but in this case the script will also contain the relevant scheduler directives such as the number of processors to request.
In addition to executing operations directly on the command line, **signac** can also submit operations to a scheduler such as SLURM_.
The submit command will generate and submit a script containing the operations to run and relevant scheduler directives such as the number of processors to request.
In addition, **signac** keeps track of submitted operations as well as workflow progress, which largely automates the submission process and prevents the accidental repeated submission of operations.

.. _SLURM: https://slurm.schedmd.com/
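The bookkeeping that prevents repeated submission can be sketched as follows. ``SubmissionTracker`` is a hypothetical illustration of the idea, not signac-flow's API; the real tracking is scheduler-aware and queries cluster job status.

```python
# Hypothetical sketch of submission bookkeeping: once an operation has
# been recorded as submitted for a job, it is not submitted again.
class SubmissionTracker:
    def __init__(self):
        self._submitted = set()

    def submit(self, job_id, operation):
        key = (job_id, operation)
        if key in self._submitted:
            return False  # active or already submitted: skip resubmission
        self._submitted.add(key)
        return True  # here a real tool would hand the script to the scheduler
```

Calling ``submit`` twice with the same job and operation succeeds only the first time, which is the behavior the paragraph above describes.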
