Update fusion docs [ci skip]
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
pditommaso committed Feb 7, 2023
1 parent d35cdc0 commit 12782ee
Showing 1 changed file with 22 additions and 2 deletions.
24 changes: 22 additions & 2 deletions docs/fusion.rst
@@ -13,8 +13,6 @@ It bridges the gap between cloud-native storage and data analysis workflow by im
that allows any existing application to access object storage using the standard POSIX interface, thus simplifying
and speeding up most operations. Currently it supports AWS S3.

.. warning:: This is an incubating feature. It may change in future Nextflow releases.

Getting started
===============

@@ -171,6 +169,28 @@ Having the above configuration in place, you can run your pipeline using the fol

    nextflow run <YOUR PIPELINE> -work-dir s3://<YOUR BUCKET>/scratch


NVMe storage
=============

The Fusion file system implements a lazy download and upload algorithm that runs in the background to transfer files
in parallel between object storage and a container-local temporary folder. This means that the performance of
the temporary folder inside the container (``/tmp`` in a default setup) is key to achieving maximum performance.

The temporary folder is used only as a cache, so the size of the volume can be much smaller than the actual
storage needs of your pipeline processes. Fusion has a built-in garbage collector that constantly monitors the
remaining disk space in the temporary folder and immediately evicts old cached entries when necessary.
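
The idea of a cache volume that is smaller than the data it serves can be illustrated with a toy model.
The sketch below is not Fusion's implementation: the ``ScratchCache`` class, its method names, and the
least-recently-used eviction order are all assumptions made for illustration. It only shows why old entries can be
dropped safely once space runs short.

```python
from collections import OrderedDict


class ScratchCache:
    """Toy model of a size-bounded cache folder (hypothetical, not Fusion's code)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # Maps file name -> size in bytes; least recently used entries come first.
        self.entries = OrderedDict()

    def touch(self, name, size_bytes):
        """Record an access to `name` and return the names evicted to make room."""
        if size_bytes > self.capacity_bytes:
            raise ValueError("file is larger than the whole cache volume")
        if name in self.entries:
            # Already cached: only mark it as most recently used.
            self.entries.move_to_end(name)
            return []
        evicted = []
        # Assumed policy: evict least recently used entries until the new file fits.
        while self.used_bytes + size_bytes > self.capacity_bytes:
            old_name, old_size = self.entries.popitem(last=False)
            self.used_bytes -= old_size
            evicted.append(old_name)
        self.entries[name] = size_bytes
        self.used_bytes += size_bytes
        return evicted
```

In this model, a 100-byte volume that holds ``a.bam`` (60 bytes) and ``b.bam`` (30 bytes) evicts whichever of the two
was used least recently when a 40-byte file arrives. An evicted entry is only a local copy, so it can be fetched
again from object storage if a later task needs it.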

The recommended setup to get maximum performance is to mount an NVMe disk as the temporary folder and run the pipeline
with the Nextflow :ref:`scratch <process-scratch>` directive set to ``false`` to also avoid stage-out transfer time.

The following example shows the extra configuration needed when using AWS Batch with `NVMe disks <https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ssd-instance-store.html>`_
to maximize performance::

    aws.batch.volumes = '/path/to/ec2/nvme:/tmp'
    process.scratch = false


More examples
=============
