diff --git a/docs/img/Menu1.png b/docs/img/Menu1.png
new file mode 100644
index 000000000..a243bd270
Binary files /dev/null and b/docs/img/Menu1.png differ
diff --git a/docs/img/Menu2.png b/docs/img/Menu2.png
new file mode 100644
index 000000000..dcac6e4ec
Binary files /dev/null and b/docs/img/Menu2.png differ
diff --git a/docs/img/Menu3.png b/docs/img/Menu3.png
new file mode 100644
index 000000000..09df9bb00
Binary files /dev/null and b/docs/img/Menu3.png differ
diff --git a/docs/img/Menu4.png b/docs/img/Menu4.png
new file mode 100644
index 000000000..8baa19ad3
Binary files /dev/null and b/docs/img/Menu4.png differ
diff --git a/docs/img/Menu5.png b/docs/img/Menu5.png
new file mode 100644
index 000000000..094b67e1f
Binary files /dev/null and b/docs/img/Menu5.png differ
diff --git a/docs/img/Menu6.png b/docs/img/Menu6.png
new file mode 100644
index 000000000..9f6b10788
Binary files /dev/null and b/docs/img/Menu6.png differ
diff --git a/docs/img/Menu7.png b/docs/img/Menu7.png
new file mode 100644
index 000000000..de03546cc
Binary files /dev/null and b/docs/img/Menu7.png differ
diff --git a/docs/img/launch_choice.png b/docs/img/launch_choice.png
new file mode 100644
index 000000000..f18b43bd4
Binary files /dev/null and b/docs/img/launch_choice.png differ
diff --git a/docs/img/progress.png b/docs/img/progress.png
new file mode 100644
index 000000000..6437b6b2a
Binary files /dev/null and b/docs/img/progress.png differ
diff --git a/docs/img/slurm_partitions_quickstart.png b/docs/img/slurm_partitions_quickstart.png
new file mode 100644
index 000000000..f2e02f901
Binary files /dev/null and b/docs/img/slurm_partitions_quickstart.png differ
diff --git a/docs/source/installation.rst b/docs/source/installation.rst
index b73e817a5..1fec1dcb9 100644
--- a/docs/source/installation.rst
+++ b/docs/source/installation.rst
@@ -1,3 +1,5 @@
+.. _installation-page:
+
 ============
 Installation
 ============
diff --git a/docs/source/nextflow-workflow.rst b/docs/source/nextflow-workflow.rst
index 26112bc92..ba097318e 100644
--- a/docs/source/nextflow-workflow.rst
+++ b/docs/source/nextflow-workflow.rst
@@ -135,6 +135,298 @@ Example ``sample_sheet.csv``
 +-------------------------------------------------+------------------------------------------------------+---------------------------------------------------------------------+
 
 
+Quick Start
+###########
+
+The following is a condensed summary of the steps required to get Autometa installed, configured and running.
+There are links throughout to the appropriate documentation sections that can provide more detail if required.
+
+Installation
+************
+
+For full installation instructions, please see the :ref:`installation-page` section.
+
+If you would like to install Autometa via conda (I'd recommend it, it's almost foolproof!),
+you'll need to first install Miniconda on your system. You can do this in a few easy steps:
+
+1. Type in the following and then hit enter. This will download the Miniconda installer to your home directory.
+
+.. code-block:: bash
+
+    wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O $HOME/Miniconda3-latest-Linux-x86_64.sh
+
+.. note::
+
+    ``$HOME`` is synonymous with ``/home/user`` and in my case is ``/home/sam``
+
+2. Now let’s run the installer. Type in the following and hit enter:
+
+.. code-block:: bash
+
+    bash $HOME/Miniconda3-latest-Linux-x86_64.sh
+
+3. Follow all of the prompts. Keep pressing enter until it asks you to accept. Then type ``yes`` and hit enter. Say yes to everything.
+
+.. note::
+
+    If, for whatever reason, you accidentally said no to the initialization, do not fear.
+    We can fix this by running the initialization with the following command:
+
+    .. code-block:: bash
+
+        cd $HOME/miniconda3/bin/
+        ./conda init
+
+
+4. Finally, for the changes to take effect, you'll need to run the following line of code, which effectively acts as a "refresh":
+
+.. code-block:: bash
+
+    source ~/.bashrc
+
+Now that you have conda up and running, it's time to install the Autometa conda environment. Run the following code:
+
+.. code-block:: bash
+
+    conda env create --file=https://raw.githubusercontent.com/KwanLab/Autometa/main/nextflow-env.yml
+
+.. attention::
+
+    You will only need to run the installation (code above) once. The installation does NOT need to be performed every time you wish to use Autometa.
+    Once installation is complete, the conda environment (which holds all the tools that Autometa needs) will live on your server/computer
+    much like any other program you install.
+
+Any time you would like to run Autometa, you'll need to activate the conda environment. To activate the environment, run the following command:
+
+.. code-block:: bash
+
+    conda activate autometa-nf
+
+Configuring a scheduler
+***********************
+
+For full details on how to configure your scheduler, please see the :ref:`Configuring your process executor` section.
+
+If you are using a Slurm scheduler, you will need to create a configuration file. If you do not have a scheduler, skip ahead to :ref:`Running Autometa`.
+
+First you will need to know the name of your Slurm partition. Run :code:`sinfo` to find this. In the example below, the partition name is "agrp".
+
+.. image:: ../img/slurm_partitions_quickstart.png
+
+Next, generate a new file called ``slurm_nextflow.config`` via nano:
+
+.. code-block:: bash
+
+    nano slurm_nextflow.config
+
+Then copy the following code block into that new file ("agrp" is the Slurm partition to use in our case):
+
+.. code-block:: bash
+
+    profiles {
+        slurm {
+            process.executor = "slurm"
+            process.queue = "agrp" // <<-- change this to whatever your partition is called
+            docker.enabled = true
+            docker.userEmulation = true
+            singularity.enabled = false
+            podman.enabled = false
+            shifter.enabled = false
+            charliecloud.enabled = false
+            executor {
+                queueSize = 8
+            }
+        }
+    }
+
+Keep this file somewhere central to you. For the sake of this example I will be keeping it in a folder called "Useful_scripts" in my home directory
+because that is a central point for me where I know I can easily find the file and it won't be moved, e.g.
+:code:`/home/sam/Useful_scripts/slurm_nextflow.config`
+
+Save your new file with Ctrl+O and then exit nano with Ctrl+X.
+
+Installation and set up is now complete. 🎉 🥳
+
+Running Autometa
+****************
+
+For a comprehensive list of features and options and how to use them, please see :ref:`Running the pipeline`.
+
+Autometa can bin one or several metagenomic datasets in one run. Regardless of the number of metagenomes you
+want to process, you will need to provide a sample sheet which specifies the name of your sample, the full path to
+where that data is found and how to retrieve the sample's contig coverage information.
+
+If the metagenome was assembled via SPAdes, Autometa can extract coverage and contig length information from the sequence headers.
+
+If you used a different assembler, you will need to provide either raw reads or a table containing contig/scaffold coverage information.
+Full details for data preparation may be found under :ref:`sample-sheet-preparation`.
+
+First ensure that your Autometa conda environment is activated. You can activate your environment by running:
+
+.. code-block:: bash
+
+    conda activate autometa-nf
+
+Run the following code to launch Autometa:
+
+.. code-block:: bash
+
+    nf-core launch KwanLab/Autometa
+
+.. note::
+
+    You may want to note where you have saved your input sample sheet prior to running the launch command.
+    It is much easier (and less error-prone) to copy/paste the sample sheet file path when specifying the input (we'll get to this later in :ref:`quickstart-menu-4`).
+
+You will now use the arrow keys to move up and down between your options and hit your "Enter" or "Return" key to make your choice.
+
+**KwanLab/Autometa nf-core parameter settings:**
+
+#. :ref:`quickstart-menu-1`
+#. :ref:`quickstart-menu-2`
+#. :ref:`quickstart-menu-3`
+#. :ref:`quickstart-menu-4`
+#. :ref:`quickstart-menu-5`
+#. :ref:`quickstart-menu-6`
+#. :ref:`quickstart-menu-7`
+#. :ref:`quickstart-menu-8`
+
+.. _quickstart-menu-1:
+
+Choose a version
+----------------
+
+The double, right-handed arrows should already indicate the latest release of Autometa (in our case ``2.0.0``).
+The latest version of the tool will always be at the top of the list with older versions descending below.
+To select the latest version, ensure that the double, right-handed arrows are next to ``2.0.0``, then hit "Enter".
+
+.. image:: ../img/Menu1.png
+
+.. _quickstart-menu-2:
+
+Choose nf-core interface
+------------------------
+
+Pick the ``Command line`` option.
+
+.. note::
+
+    Unless you've done some fancy server networking (i.e. tunneling and port-forwarding),
+    or are using Autometa locally, ``Command line`` is your *only* option.
+
+.. image:: ../img/Menu2.png
+
+.. _quickstart-menu-3:
+
+General nextflow parameters
+---------------------------
+
+If you are using a scheduler (Slurm in this example), ``-profile`` is the only option you'll need to change.
+If you are not using a scheduler, you may skip this step.
+
+.. image:: ../img/Menu3.png
+
+.. _quickstart-menu-4:
+
+Input and Output
+----------------
+
+Now we need to give Autometa the full paths to our input sample sheet, output results folder
+and output logs folder (i.e. where trace files are stored).
+
+.. note::
+
+    A new folder, named by its respective sample value, will be created within the output results folder for
+    each metagenome listed in the sample sheet.
+
+.. image:: ../img/Menu4.png
+
+.. _quickstart-menu-5:
+
+Binning parameters
+------------------
+
+If you're not sure what you're doing, I would recommend only changing ``length_cutoff``.
+The default cutoff is 3000bp, which means that any contigs/scaffolds smaller than 3000bp will not be considered for binning.
+
+.. note::
+
+    This cutoff will depend on how good your assembly is: e.g. if your N50 is 1200bp, I would choose a cutoff of 1000.
+    If your N50 is more along the lines of 5000, I would leave the cutoff at the default 3000. I would strongly recommend
+    against choosing a number below 900 here. In the example below, I have chosen a cutoff of 1000bp as my assembly was
+    not particularly great (the N50 is 1100bp).
+
+.. image:: ../img/Menu5.png
+
+.. _quickstart-menu-6:
+
+Additional Autometa options
+---------------------------
+
+Here you have a choice to make:
+
+* By enabling taxonomy-aware mode, Autometa will attempt to use taxonomic data to make your bins more accurate.
+
+  However, this is a more computationally expensive step and will make the process take longer.
+
+* By leaving this option as the default ``False`` option, Autometa will bin according to coverage and k-mer patterns.
+
+Regardless of your choice, you will need to provide a path to the necessary databases using the ``single_db_dir`` option.
+In the example below, I have enabled the taxonomy-aware mode and provided the path to where the databases are stored
+(in my case this is :code:`/home/sam/Databases`).
+
+For additional details on required databases, see the :ref:`Databases` section.
+
+.. image:: ../img/Menu6.png
+
+.. _quickstart-menu-7:
+
+Computational parameters
+------------------------
+
+This will depend on the computational resources you have available. You could start with the default values and see
+how the binning goes. If you have particularly complex datasets you may want to bump this up a bit. For your
+average metagenome, you won't need more than 150 GB of memory. I've opted to use 75 GB as a
+starting point for a few biocrust (somewhat diverse) metagenomes.
+
+.. note::
+
+    These options correspond to the resources provided to *each* process of Autometa, *not* the entire workflow itself.
+
+    Also, for terabytes of assembled data you may want to try the :ref:`autometa-bash-workflow` using the
+    `autometa-large-data-mode.sh `_ template
+
+.. image:: ../img/Menu7.png
+
+.. _quickstart-menu-8:
+
+Do you want to run the nextflow command now?
+--------------------------------------------
+
+You will now be presented with a choice. If you are NOT using a scheduler, you can go ahead and type ``y`` to launch the workflow.
+If you are using a scheduler, type ``n`` - we have one more step to go. In the example below, I am using the Slurm scheduler so I have typed ``n``
+to prevent immediately performing the nextflow run command.
+
+.. image:: ../img/launch_choice.png
+
+If you recall, we created a file called :code:`slurm_nextflow.config` that contains the information Autometa will need to communicate with the Slurm scheduler.
+We need to include that file using the :code:`-c` flag (or configuration flag). Therefore, to launch the Autometa workflow, run the following command:
+
+.. note::
+
+    You will need to change the :code:`/home/sam/Useful_scripts/slurm_nextflow.config` file path to what is appropriate for your system.
+
+.. code-block:: bash
+
+    nextflow run KwanLab/Autometa -r 2.0.0 -profile "slurm" -params-file "nf-params.json" -c "/home/sam/Useful_scripts/slurm_nextflow.config"
+
+Once you have hit the "Enter" key to submit the command, nextflow will display the progress of your binning run, such as the one below:
+
+.. image:: ../img/progress.png
+
+When the run is complete, output will be stored in your designated output folder, in my case ``/home/sam/Trial/Autometa_output`` (see :ref:`quickstart-menu-4`).
+
+
 
 Basic
 #####