Updated docs on resubmitting pulsar jobs
nuwang committed Nov 28, 2018
1 parent 81c140c commit 2338029
Showing 5 changed files with 16 additions and 5 deletions.
2 changes: 2 additions & 0 deletions docs/samples/job_conf.xml.basic
@@ -17,6 +17,8 @@
 <param id="pulsar_runner_id">pulsar</param>
 <!-- Destination to fallback to if no nodes are available -->
 <param id="fallback_destination">local</param>
+<!-- Pick next available server and resubmit if an unknown error occurs -->
+<resubmit condition="unknown_error and attempt &lt;= 3" destination="galaxycloudrunner" />
 </destination>
 </destinations>
 <tools>
2 changes: 2 additions & 0 deletions docs/samples/job_conf.xml.burst_if_queued
@@ -25,6 +25,8 @@
 <param id="pulsar_runner_id">pulsar</param>
 <!-- Destination to fallback to if no nodes are available -->
 <param id="fallback_destination_id">local</param>
+<!-- Pick next available server and resubmit if an unknown error occurs -->
+<resubmit condition="unknown_error and attempt &lt;= 3" destination="galaxycloudrunner" />
 </destination>
 </destinations>
 <tools>
2 changes: 2 additions & 0 deletions docs/samples/job_conf.xml.burst_if_size
@@ -33,6 +33,8 @@
 <param id="pulsar_runner_id">pulsar</param>
 <!-- Destination to fallback to if no nodes are available -->
 <param id="fallback_destination_id">local</param>
+<!-- Pick next available server and resubmit if an unknown error occurs -->
+<resubmit condition="unknown_error and attempt &lt;= 3" destination="galaxycloudrunner" />
 </destination>
 </destinations>
 <tools>
2 changes: 1 addition & 1 deletion docs/topics/configure_galaxy.rst
@@ -20,7 +20,7 @@ Configuring Galaxy v19.01 or higher
 .. literalinclude:: ../samples/job_conf.xml.basic
 :language: xml
 :linenos:
-:emphasize-lines: 7,9-20
+:emphasize-lines: 7,9-22

 3. Launch as many Pulsar nodes as you need through `CloudLaunch`_. The job rule
 will periodically query CloudLaunch, discover these new nodes, and route jobs
13 changes: 9 additions & 4 deletions docs/topics/configure_job_conf.rst
@@ -10,7 +10,7 @@ started.
 .. literalinclude:: ../samples/job_conf.xml.basic
 :language: xml
 :linenos:
-:emphasize-lines: 7,9-20
+:emphasize-lines: 7,9-22

 In this simple configuration, all jobs are routed to GalaxyCloudRunner by
 default. This works as follows:
@@ -24,8 +24,10 @@ default. This works as follows:
 the GalaxyCloudRunner. This has implications for node addition and in
 particular removal. When adding a node, there could be a delay of a few
 minutes before the node is picked up. If a Pulsar node is removed, your jobs
-may be routed to a dead node for the duration of the caching period. See
-:ref:`additional-configuration` on how to change this cache period.
+may be routed to a dead node for the duration of the caching period.
+Therefore, we recommend attempting a job resubmission through the resubmit
+tag as shown in the example. See :ref:`additional-configuration` on how to
+change this cache period.
 4. If no node is available, it will return the ``fallback_destination_id``, if
 specified, in which case the job will be routed there. If no
 ``fallback_destination_id`` is specified, the job will be re-queued till a node
@@ -101,7 +103,10 @@ default. This works as follows:
 the GalaxyCloudRunner. This has implications for node addition and in
 particular removal. When adding a node, there could be a delay of a few
 minutes before the node is picked up. If a Pulsar node is removed, your jobs
-may be routed to a dead node for the duration of the caching period. See
+may be routed to a dead node for the duration of the caching period.
+Therefore, we recommend a job resubmission through a resubmit tag. However,
+Galaxy versions prior to 19.01 do not support resubmissions for Pulsar, and
+you may need to change the cache period to zero to handle this scenario. See
 :ref:`additional-configuration` on how to change this cache period.
 4. If no node is available, it will return the ``fallback_destination_id``, if
 specified, in which case the job will be routed there. If no
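
Note on the resubmit condition added in this commit: the expression ``unknown_error and attempt <= 3`` acts as a bounded retry policy. The Python sketch below is only an illustration of that logic, not Galaxy's resubmission code. The names ``should_resubmit``, ``run_with_resubmit`` and ``run_job`` are hypothetical, and the sketch assumes that ``attempt`` starts at 1 for the first run and that every failure counts as an unknown error.

```python
from typing import Callable, Tuple

# Illustrative sketch only: this is NOT Galaxy's implementation.
# It mimics how a condition such as "unknown_error and attempt <= 3"
# (taken from the sample job_conf.xml files) bounds the number of retries.

MAX_ATTEMPT = 3  # mirrors "attempt <= 3" in the sample condition


def should_resubmit(unknown_error: bool, attempt: int) -> bool:
    """Return True when the sample condition would allow a resubmission."""
    return unknown_error and attempt <= MAX_ATTEMPT


def run_with_resubmit(run_job: Callable[[int], bool]) -> Tuple[bool, int]:
    """Run the hypothetical run_job(attempt) until it succeeds or the
    condition no longer allows a resubmission.

    Returns (succeeded, number_of_attempts_made).
    Assumption: attempt numbering starts at 1 and every failure is
    treated as an unknown error.
    """
    attempt = 1
    while True:
        if run_job(attempt):
            return True, attempt
        if not should_resubmit(unknown_error=True, attempt=attempt):
            return False, attempt
        attempt += 1  # resubmit: the next run is a new attempt
```

Under these assumptions, a job that keeps hitting a dead Pulsar node would be tried once and then resubmitted up to three more times before it is finally marked as failed. Each resubmission goes back to the ``galaxycloudrunner`` destination, so a different node can be selected once the cached node list refreshes.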
