---
title: "Metagenomics Automation"
editor: visual
jupyter: python3
---


::: callout-warning
## Warning

Make sure you have read the Preparation chapter!
:::

In the previous chapters, you learned how to perform each step of the Nanopore metagenomics sequencing analysis pipeline manually. While this is a valuable learning experience, it’s not practical for analyzing large datasets or for ensuring reproducibility in the long term.

We can make use of a tool called [Snakemake](https://snakemake.readthedocs.io/en/stable/) to automate the previous steps into a single pipeline. With Snakemake, you can define the steps of your analysis in a Snakefile and then let Snakemake handle the execution, dependency management and error handling.

## Preparing to run the workflow

To run the automated pipeline, fill out the Excel sheet **file_paths.xlsx** within the **Metagenomics_automation** folder exactly as shown in the example below. Do not change the file name when saving! This sheet tells Snakemake where to find your data.

![Example](excel_fill.jpg)

::: callout-warning
## Warning

Remember to update the run paths and barcodes for each sample!
:::

You are nearly there!

Make sure you are in the right folder to run the script.

``` bash
cd /mnt/viro0002-data/sequencedata/processed/Diagnostics_metagenomics/Metagenomics_automation/ 
```
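
If you want to double-check the folder before starting, the sketch below looks for the input sheet and the workflow definition in the current folder. `check_workflow_dir` is a hypothetical helper, not part of the workflow, and it assumes the workflow file is named `Snakefile`; adjust the names if your setup differs.

``` bash
# Hypothetical pre-flight check: confirm that the current folder contains
# the input sheet and the workflow file before starting a run.
# ASSUMPTION: the workflow file is called "Snakefile".
check_workflow_dir() {
    local f missing=0
    for f in file_paths.xlsx Snakefile; do
        if [ ! -f "$f" ]; then
            echo "Missing: $f (current folder: $(pwd))" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_workflow_dir` prints nothing when both files are present and lists whatever is missing otherwise.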

Now copy the command below into the terminal and run it.

``` bash
snakemake --cores 16
```

::: callout-tip
## Tip

If it's all green, everything is working. Just let it run in the background.
:::

Once the run is finished, please see the figure below to find your data. 

![Example](snakemake.jpg)

::: callout-warning
## Warning

Rename the **results** folder with the run name (for example Viro_Run_0001).  
This step is critical: if another Snakemake run is started without renaming, the previous output will be overwritten.
:::

To rename the folder, use the example below:

``` bash
cd /mnt/viro0002-data/sequencedata/processed/Diagnostics_metagenomics/Metagenomics_automation/
mv results Viro_Run_0001
```
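
A plain `mv` moves **results** *inside* the target folder if a folder with that run name already exists, which is easy to miss. The sketch below wraps the rename in a check that refuses to continue in that case. `rename_results` is a hypothetical helper written for illustration; it is not part of the workflow.

``` bash
# Hypothetical helper: rename the results folder to a run name, but stop
# if the results folder is missing or the run name is already in use.
rename_results() {
    local run_name="$1"
    if [ -z "$run_name" ]; then
        echo "Usage: rename_results <run_name>" >&2
        return 1
    fi
    if [ ! -d results ]; then
        echo "No results folder found in $(pwd)" >&2
        return 1
    fi
    if [ -e "$run_name" ]; then
        echo "$run_name already exists, choose another run name" >&2
        return 1
    fi
    mv results "$run_name"
}
```

From the **Metagenomics_automation** folder, `rename_results Viro_Run_0001` then behaves like the `mv` command above, with the extra checks.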

### False positive findings

After viewing the Krona plot within the **krona** folder, you may want to remove some false positive findings.

You can do this by opening the **report_standard.txt** file in the **classification** folder. There, set the number of reads for the false positive finding back to zero. Remember that changing the number of reads is not enough: you must also set the normalized percentage to zero. Save the file under the same name.
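
The same edit can be scripted. The sketch below is only an illustration and rests on an assumption that the source does not confirm: that **report_standard.txt** is a tab-separated, Kraken2-style report with the percentage in column 1, the read counts in columns 2 and 3, and the (indented) taxon name in column 6. Check the layout of your own file first. `zero_taxon` is a hypothetical helper, and it only changes the line of the named taxon, not the totals of its parent groups.

``` bash
# Hypothetical helper: zero the percentage and read counts of one taxon.
# ASSUMPTION: tab-separated, Kraken2-style columns:
#   1 percentage, 2 clade reads, 3 direct reads, 4 rank, 5 taxid, 6 name
zero_taxon() {
    local report="$1" taxon="$2"
    cp "$report" "$report.bak"   # keep a backup of the unedited report
    awk -F'\t' -v OFS='\t' -v taxon="$taxon" '
        {
            name = $6
            gsub(/^ +| +$/, "", name)   # strip the indentation around the name
            if (name == taxon) { $1 = "0.00"; $2 = 0; $3 = 0 }
            print
        }' "$report.bak" > "$report"
}
```

For example, `zero_taxon report_standard.txt "Virus A"` (with a made-up taxon name) overwrites the report under the same name and leaves the original in `report_standard.txt.bak`.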

::: callout-warning
## Warning

Make sure you are in the correct folder!
:::

``` bash
cd /mnt/viro0002-data/sequencedata/processed/Diagnostics_metagenomics/Metagenomics_automation/Viro_Run_0001/barcode01/krona/
```

Remove the original Krona plot:

``` bash
rm report.html
```

Then go back to the **Metagenomics_automation** folder and re-run Snakemake. It will skip the files that already exist and only regenerate your updated Krona plot.

``` bash
snakemake --cores 16
```

To access your Krona plot (and other files) from Windows, you need to set the file permissions. Paste the following into the terminal:

``` bash
chmod -R 777 /mnt/viro0002-data/sequencedata/processed/Diagnostics_metagenomics/Metagenomics_automation/Viro_Run_0001/
```
This will take some