# **TITLE**
### Author : Andrea Grecu

## **Background | Pūtake**

Orr et al. (2020) described a phylogenomic method for detecting somatic mutations within a phenotypically mosaic plant individual and for estimating the somatic mutation rate of that individual. Both the bioinformatic workflow (*aka. pipeline*) and the input data are open access and can be found via the original journal publication linked below.

[Orr et al., ORIGINAL!](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7126060/)

The original pipeline created by Orr et al. (2020) was published in a GitHub repository and uses many bioinformatics tools, which are cumulatively written in 5+ programming languages. The suggested way to replicate the pipeline is to use the makefiles provided.

### Āheitanga

***While the original pipeline is open access, it is not particularly user friendly.***

Computational methods within bioinformatics often fall short in reproducibility due to insufficient or inaccessible documentation (Birmingham, 2017). It is imperative to the progression of bioinformatics research that an effort is made to produce interactive methods which are accessible to a wider audience, including those with limited computer literacy.

> ### Thus, the purpose of this notebook is to provide an easy to follow replication of the original pipeline created by Orr et al. (2020) to detect somatic mutations. 
>The notebook provides a *computational narrative* incorporating inputs, outputs and easy to understand explanations of each step. 
> By detailing the assumptions, logic, necessary input data and expected output, this notebook should enable this pipeline to be applied to a new set of collected data.


## **Pipeline Technological Limitations | Tepenga**

Because this pipeline processes many (24+) sets of whole-genome reads, it requires substantial computational capacity, which restricts the hardware on which it can be run successfully.

While the scripts can be adapted to the RAM and thread count of your device, it is not recommended to run this pipeline on a device with capacity much lower than the default (*as it will likely take a very long time to run and use much of your device's memory*). 

> ### Default Capacity
> **RAM = 64 GB**
>
> **CPU = 20 Threads**
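
Before starting, it can help to compare your own machine against these defaults. The cell below is a minimal sketch and not part of the original repository: the function names `detect_capacity` and `capacity_report` are hypothetical, and the RAM lookup uses `os.sysconf`, which is assumed to be available (it is on Linux, but not on Windows).

```python
import os

# Default capacity stated for this pipeline
DEFAULT_RAM_GB = 64
DEFAULT_THREADS = 20

def detect_capacity():
    """Return (ram_gb, threads); ram_gb is None if it cannot be determined."""
    threads = os.cpu_count() or 1
    try:
        ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    except (ValueError, OSError, AttributeError):
        # os.sysconf or these keys are not available on this platform
        return None, threads
    return ram_bytes / 1024 ** 3, threads

def capacity_report(ram_gb, threads):
    """Compare a machine against the default capacity."""
    return {
        "ram_ok": ram_gb is not None and ram_gb >= DEFAULT_RAM_GB,
        "threads_ok": threads >= DEFAULT_THREADS,
    }

ram_gb, threads = detect_capacity()
print("Detected RAM (GB):", ram_gb, "| threads:", threads)
print(capacity_report(ram_gb, threads))
```

If either value is reported as not OK, consider lowering the RAM and thread settings in the scripts before running the pipeline.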




## **Required Software Installation |  Tāuta**

Prior to running this pipeline, the appropriate software must be installed. 
This can be done directly within this notebook by running the cells below.

#### Bioconda 
Most of the packages used further along in this pipeline are provided via the **Bioconda** channel of the conda package manager. 

More information about Bioconda can be found [here](https://bioconda.github.io/)

***To set up the Bioconda channels run the code cell below.***

In [5]:
import sys 
!conda config --add channels defaults
!conda config --add channels bioconda
!conda config --add channels conda-forge
!conda config --set channel_priority strict



#### Khmer

The khmer software allows for nucleotide sequence analysis, and is necessary for Steps Two and Three, which are run by the clean_reads.sh script (Crusoe et al., 2015). 

More information on Khmer can be found [here](https://github.com/dib-lab/khmer)

***To install Khmer run the code cell below.***

In [6]:
import sys
!conda install --yes --prefix {sys.prefix} khmer

Collecting package metadata (current_repodata.json): done
Solving environment: done


  current version: 22.11.0
  latest version: 22.11.1

Please update conda by running

    $ conda update -n base -c defaults conda

Or to minimize the number of packages updated during conda update use

     conda install conda=22.11.1



# All requested packages already installed.



#### Rcorrector

The Rcorrector software corrects sequencing errors in Illumina RNA-seq reads, and is necessary for cleaning the reads in Step One (Song & Florea, 2015). 

More information on Rcorrector can be found [here](https://github.com/mourisl/Rcorrector)

***To install Rcorrector run the code cell below.***

In [40]:
import sys
!conda install --yes --prefix {sys.prefix} rcorrector



Retrieving notices: ...working... done
Collecting package metadata (current_repodata.json): done
Solving environment: done





# All requested packages already installed.



#### NextGenMap

The NextGenMap software allows for short read mapping with a high sensitivity threshold, and is necessary for Step Four, where reads are aligned to the reference genome (Sedlazeck et al., 2013).

More information on NextGenMap can be found [here](https://github.com/Cibiv/NextGenMap/wiki)

***To install NextGenMap run the code cell below.***

In [2]:
import sys
!conda install --yes --prefix {sys.prefix} nextgenmap

Collecting package metadata (current_repodata.json): done
Solving environment: done





# All requested packages already installed.



#### GNU Parallel

GNU Parallel is a free shell tool used to run jobs in parallel, and is necessary for step ?* (Tange, 2018).

More information on GNU Parallel can be found [here](https://www.gnu.org/software/parallel/)

***To install GNU Parallel run the code cell below.***

In [4]:
import sys
!conda install --yes --prefix {sys.prefix} parallel

Collecting package metadata (current_repodata.json): done
Solving environment: done





# All requested packages already installed.



#### Samtools

The Samtools software enables manipulation of next-generation sequencing data, and is necessary for step ?* (Danecek et al., 2021).

More information on Samtools can be found [here](https://github.com/samtools/samtools)

***To install Samtools run the code cell below.***

In [11]:
import sys
!conda install --yes --prefix {sys.prefix} samtools

Collecting package metadata (current_repodata.json): done
Solving environment: done





# All requested packages already installed.



#### BCFtools

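The BCFtools software provides utilities for variant calling and for manipulating VCF and BCF files. It is developed alongside Samtools and HTSlib, and is used in Step Five to build the consensus sequence (Danecek et al., 2021).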

More information on BCFtools can be found [here](https://github.com/samtools/bcftools)

***To install BCFtools run the code cell below.***

In [None]:
import sys
!conda install --yes --prefix {sys.prefix} bcftools

Collecting package metadata (current_repodata.json): done
Solving environment: done





# All requested packages already installed.



#### HTSlib

HTSlib is a C library for reading and writing high-throughput sequencing data formats, and is used within Samtools and BCFtools (Bonfield et al., 2021).

More information on HTSlib can be found [here](https://github.com/samtools/htslib)

***To install HTSlib run the code cell below.***

In [13]:
import sys
!conda install --yes --prefix {sys.prefix} htslib

Collecting package metadata (current_repodata.json): done
Solving environment: done





# All requested packages already installed.



#### GATK

GATK is a **G**enome **A**nalysis **T**ool**K**it designed to identify variants in genomes and is used throughout the pipeline (O’Connor & Van der Auwera, 2020).

More information on GATK can be found [here](https://gatk.broadinstitute.org/hc/en-us)

***To install GATK run the code cell below.***

In [14]:
import sys
!conda install --yes --prefix {sys.prefix} gatk

Collecting package metadata (current_repodata.json): done
Solving environment: done





# All requested packages already installed.



#### RAxML

RAxML (**R**andomized **Ax**elerated **M**aximum **L**ikelihood) is a program for maximum likelihood phylogenetic tree searches and is used in step ?* (Stamatakis, 2006).

More information on RAxML can be found [here](https://cme.h-its.org/exelixis/web/software/raxml/index.html)

***To install RAxML run the code cell below.***

In [16]:
import sys
!conda install --yes --prefix {sys.prefix} raxml

Collecting package metadata (current_repodata.json): done
Solving environment: done





# All requested packages already installed.



#### Bedtools

The bedtools software encompasses many algorithmic programs used for genome analysis and is used in step ?* (Quinlan & Hall, 2010).

More information on bedtools can be found [here](https://bedtools.readthedocs.io/en/latest/)

***To install Bedtools run the code cell below.***

In [17]:
import sys
!conda install --yes --prefix {sys.prefix} bedtools

Collecting package metadata (current_repodata.json): done
Solving environment: done





# All requested packages already installed.



#### UCSC LiftOver

LiftOver is a tool provided by the UCSC Genome Browser that converts genome coordinates between assemblies or builds, so that analyses made against different versions can be brought onto the same coordinate system (Hinrichs et al., 2006).

More information on LiftOver can be found [here](http://hgdownload.cse.ucsc.edu/admin/exe/)

***To install LiftOver run the code cell below.***

In [18]:
import sys
!conda install --yes --prefix {sys.prefix} ucsc-liftover

Collecting package metadata (current_repodata.json): done
Solving environment: done





# All requested packages already installed.



#### VCFtools

The VCFtools program package enables operations such as filtering and categorising variants of VCF (**V**ariant **C**all **F**ormat) files and is used in step ?* (Danecek et al., 2011).

More information on VCFtools can be found [here](https://vcftools.github.io/)

***To install VCFtools run the code cell below.***

In [19]:
import sys
!conda install --yes --prefix {sys.prefix} vcftools

Collecting package metadata (current_repodata.json): done
Solving environment: done





# All requested packages already installed.
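
Once the install cells have run, it is worth confirming that the command-line tools can actually be found. The cell below is a minimal sketch using Python's `shutil.which`. The helper name `missing_tools` is hypothetical, and the list of executable names is an assumption about what each conda package installs (only names that are fairly standard are listed), so adjust it for your environment.

```python
import shutil

def missing_tools(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]

# Assumed executable names; edit this list to match your installation
EXPECTED = ["samtools", "bcftools", "tabix", "bedtools", "vcftools", "gatk", "ngm", "parallel"]

not_found = missing_tools(EXPECTED)
if not_found:
    print("Not found on PATH:", ", ".join(not_found))
else:
    print("All listed tools were found on PATH.")
```

A tool reported as missing may simply be installed under a different executable name, so check the package documentation before reinstalling.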



## **Data Input Requirements | Tāuru Raraunga**

### Data Collection 

Orr et al. (2020) sampled data from a phenotypically mosaic *Eucalyptus melliodora* individual (aka. a Yellow Box tree) in which one branch was known to express resistance to defoliation by Christmas beetles (genus *Anoplognathus*). Three replicate samples were taken from the leaf tips of each of 8 distinct branches in order to sequence the full genome. 

Data collected for use within this pipeline is not restricted in the number of samples, but rather in the number of replicates per sample:

> #### **!!! TIP !!!**
>*There must be at least 2 replicates per sample and an equal number of replicates across all samples. Each sample should be whole-genome sequenced using Illumina. The name of each replicate of one sample should end in a replicate letter, formatted as follows:*
>**Sample1a, Sample1b, Sample1c etc.**
>
>*The raw input data should be paired sequencing reads for each replicate in FASTQ format. The two files of each pair should end in* **"R1.fastq"** *and* **"R2.fastq"**.
>
>> Raw data should be placed in a new folder "my_data" inside the raw folder (/rep_files/data/raw/). The raw data used by Orr et al. (2020) is also found in the raw folder and will be used by default.
>> Each **replicate** of one **sample** should have two files named in the following format: ***"Sample1a_R1.fastq"*** and ***"Sample1a_R2.fastq"***.
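
The naming rules in the tip above can be checked before running the pipeline. The cell below is a minimal sketch and not part of the original repository: the function name `check_replicates` and the regular expression are assumptions based on the pattern described in the tip (a sample name, one lowercase replicate letter, then `_R1.fastq` or `_R2.fastq`).

```python
import re
from collections import defaultdict

# Assumed pattern: <Sample><replicate letter>_R1.fastq or _R2.fastq
NAME_RE = re.compile(r"^(?P<sample>.+?)(?P<rep>[a-z])_R(?P<mate>[12])\.fastq$")

def check_replicates(filenames):
    """Return a list of problems found in a list of raw read file names."""
    mates = defaultdict(set)  # (sample, replicate) -> set of mates seen
    problems = []
    for name in filenames:
        match = NAME_RE.match(name)
        if match is None:
            problems.append(f"{name}: does not match <Sample><rep>_R1/_R2.fastq")
            continue
        mates[(match["sample"], match["rep"])].add(match["mate"])

    reps_per_sample = defaultdict(int)
    for (sample, rep), found in sorted(mates.items()):
        if found != {"1", "2"}:
            problems.append(f"{sample}{rep}: missing R1 or R2 file")
        reps_per_sample[sample] += 1

    for sample, count in sorted(reps_per_sample.items()):
        if count < 2:
            problems.append(f"{sample}: fewer than 2 replicates")
    if len(set(reps_per_sample.values())) > 1:
        problems.append("samples have unequal numbers of replicates")
    return problems

example = ["Sample1a_R1.fastq", "Sample1a_R2.fastq",
           "Sample1b_R1.fastq", "Sample1b_R2.fastq"]
print(check_replicates(example))
```

To check your own data, pass in the file names from your data folder, for example `check_replicates(os.listdir("rep_files/data/raw/my_data"))` after `import os`. An empty list means no problems were found.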

As always, considerations into collecting and using data in a respectful and responsible manner towards communities (across all taxa) should take precedence. 

### Pseudogenome

There was no high-quality reference genome available for *E. melliodora*, so a pseudoreference genome was created using the reference genome of the closely related *Eucalyptus grandis*, aka. the Rose Gum tree (Bartholomé et al., 2014). 
> #### **!!! TIP !!!**
>*If replicating with your own data, use either the most closely related high quality reference genome available for your sampled taxa OR for your exact taxa if available.*
>
>*The genome should be in FASTA format, in a file called "ref.fa" inside a new folder labelled "my_ref" within the data folder (rep_files/data/my_ref/).*
>
>The *E. grandis* genome found in the "e_grandis" folder (/rep_files/data/e_grandis/) will be used by default otherwise.
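
A quick sanity check of a reference file can catch formatting problems early. The cell below is a minimal sketch assuming a plain-text (uncompressed) FASTA file; the function name `summarise_fasta` is hypothetical and not part of the original pipeline.

```python
def summarise_fasta(path):
    """Return (number of sequences, total bases) for a plain-text FASTA file."""
    n_seqs, n_bases = 0, 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # ignore blank lines
            if line.startswith(">"):
                n_seqs += 1  # header line starts a new record
            elif n_seqs == 0:
                raise ValueError("sequence data found before the first '>' header")
            else:
                n_bases += len(line)
    if n_seqs == 0:
        raise ValueError("no FASTA records found")
    return n_seqs, n_bases
```

For example, `summarise_fasta("rep_files/data/my_ref/ref.fa")` should return the number of sequences and bases in your reference. A `ValueError` suggests the file is not valid FASTA.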


# **Analysis Makefile**
The first makefile that Orr et al. (2020) suggest executing is found within the analysis folder of the repository files. This makefile uses 4 different scripts from the scripts folder, and can be broken down into **? main steps**.
> *Below are descriptions of each stage considering inputs, outputs, set variables and any potential errors.*

### **Step One | Read Correction**
The first step of the pipeline is to *correct* the raw reads.

This step uses the algorithms provided by the **[Rcorrector](https://www.researchgate.net/publication/283260409_Rcorrector_efficient_and_accurate_error_correction_for_Illumina_RNA-seq_reads)** software to determine trusted k-mers (using a de Bruijn graph), which are then used in the following steps to correct random sequencing errors and produce sliced reads.

#### **Input Data**
> Raw reads with the suffix "R1.fastq" within the directory specified via the **READSFOLDER** variable are utilized. *I.e. the first of the paired reads for one replicate*. By default this is **"../data/raw/"**.
> 
> If using your own raw reads, you may change the directory utilising the code in the stage one cell below.

#### **Scripts Used**
> Lines 100-103 of the script clean_reads.sh execute this step. 
>
> The clean_reads.sh script is called in line 25 of the Makefile
> (This uses the directories set via **SCRIPTDIR** and **CLEANREADS**).

#### **Output Produced**
> The corrected reads will be outputted in a folder named "corrected" found in the "./cleaned_reads" folder.
> 
> The code in the cell below can be used to check if the correct files have been produced in the corrected folder.

In [3]:
# STEP ONE CODE
# TIP - Remove hash from code below and run cell to execute!

# Use own raw data 
#!sed -i '/READSFOLDER=/c\READSFOLDER=*path/to/your/raw/data*' *your/path/here*/analysis/Makefile

# Check Corrected Reads 
#!ls [your path]rep_files/analysis/cleaned_reads/corrected

SRR9650833_RRR1.cor.fq	SRR9650841_RRR1.cor.fq	SRR9650849_RRR1.cor.fq
SRR9650833_RRR2.cor.fq	SRR9650841_RRR2.cor.fq	SRR9650849_RRR2.cor.fq
SRR9650834_RRR1.cor.fq	SRR9650842_RRR1.cor.fq	SRR9650850_RRR1.cor.fq
SRR9650834_RRR2.cor.fq	SRR9650842_RRR2.cor.fq	SRR9650850_RRR2.cor.fq
SRR9650835_RRR1.cor.fq	SRR9650843_RRR1.cor.fq	SRR9650851_RRR1.cor.fq
SRR9650835_RRR2.cor.fq	SRR9650843_RRR2.cor.fq	SRR9650851_RRR2.cor.fq
SRR9650836_RRR1.cor.fq	SRR9650844_RRR1.cor.fq	SRR9650852_RRR1.cor.fq
SRR9650836_RRR2.cor.fq	SRR9650844_RRR2.cor.fq	SRR9650852_RRR2.cor.fq
SRR9650837_RRR1.cor.fq	SRR9650845_RRR1.cor.fq	SRR9650853_RRR1.cor.fq
SRR9650837_RRR2.cor.fq	SRR9650845_RRR2.cor.fq	SRR9650853_RRR2.cor.fq
SRR9650838_RRR1.cor.fq	SRR9650846_RRR1.cor.fq	SRR9650854_RRR1.cor.fq
SRR9650838_RRR2.cor.fq	SRR9650846_RRR2.cor.fq	SRR9650854_RRR2.cor.fq
SRR9650839_RRR1.cor.fq	SRR9650847_RRR1.cor.fq	SRR9650855_RRR1.cor.fq
SRR9650839_RRR2.cor.fq	SRR9650847_RRR2.cor.fq	SRR9650855_RRR2.cor.fq
SRR9650840_RRR1.cor.fq	SRR9650848_
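
Rather than checking a long listing by eye, the file names can be checked for missing partners. The cell below is a minimal sketch: the helper name `find_unpaired` is hypothetical, and the `_RRR1.` and `_RRR2.` tags are taken from the listing above, so change them if your files are named differently.

```python
def find_unpaired(filenames, r1_tag="_RRR1.", r2_tag="_RRR2."):
    """Return file names whose R1/R2 partner is absent from `filenames`."""
    names = set(filenames)
    unpaired = []
    for name in sorted(names):
        if r1_tag in name:
            partner = name.replace(r1_tag, r2_tag, 1)
        elif r2_tag in name:
            partner = name.replace(r2_tag, r1_tag, 1)
        else:
            continue  # not a paired read file; ignore it
        if partner not in names:
            unpaired.append(name)
    return unpaired

listing = ["SRR9650833_RRR1.cor.fq", "SRR9650833_RRR2.cor.fq", "SRR9650834_RRR1.cor.fq"]
print(find_unpaired(listing))
```

In this example the third file is reported because its `_RRR2.` partner is not in the list. To check a real folder, pass in `os.listdir(...)` for the corrected folder.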

### **Step Two | Khmer Count Graph**
The second step is to store trusted k-mers in a khmer count graph, which is then used to filter out corrected reads with excessively high coverage, reducing errors when mapping reads to the reference genome. This is done using the [Khmer](https://www.researchgate.net/publication/282249513_The_khmer_software_package_Enabling_efficient_nucleotide_sequence_analysis) software.

#### **Input Data**
> Corrected reads produced from step one which were deposited in the "./cleaned_reads/corrected" folder are used. 

#### **Scripts Used**
> Line 106 of the script clean_reads.sh executes this step, calling on the script load-into-counting-COR.py provided (due to changes in Khmer software scripts).
>
> The clean_reads.sh script is called in line 25 of the Makefile
> (This uses the directories set via **SCRIPTDIR** and **CLEANREADS**).

#### **Outputs Produced**
> The Khmer graph itself is outputted within the "./cleaned_reads" folder as a binary file labelled **"khmer_count.graph"**, alongside a text file labelled **"khmer_count.graph.info"**.
>
> The text file displays which khmer software version is installed, which files the k-mers were obtained from, the total number of unique k-mers obtained and the **fp** *(false positive)* **rate**.
>
> If the last two values are displayed as zero, it is likely that an error occurred!
> **Check the values by running the code in the cells below!**

In [8]:
# STEP TWO CODE

# Run the lines below to check the fp rate and the total number of unique k-mers

import os
os.getcwd()  # run this line first to find your current path, then replace [your path] below
os.chdir('[your path]/SRS-AG/rep_files/analysis/cleaned_reads')
os.getcwd()  # run to check that the working directory is correct
!tail -3 khmer_count.graph.info

Total number of unique k-mers: 4536784228
fp rate estimated to be 0.004
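
The two values printed above can also be read and checked programmatically. The cell below is a minimal sketch that assumes the info file contains lines worded exactly like the output above; the function names `parse_khmer_info` and `looks_valid` are hypothetical.

```python
import re

def parse_khmer_info(text):
    """Extract (unique k-mer count, fp rate) from the text of the .info file."""
    kmers = re.search(r"Total number of unique k-mers:\s*(\d+)", text)
    fp = re.search(r"fp rate estimated to be\s*([0-9.eE+-]+)", text)
    if kmers is None or fp is None:
        raise ValueError("expected lines not found in info text")
    return int(kmers.group(1)), float(fp.group(1))

def looks_valid(unique_kmers, fp_rate):
    """Both values being zero (or negative) suggests the step failed."""
    return unique_kmers > 0 and fp_rate > 0

sample_text = "Total number of unique k-mers: 4536784228\nfp rate estimated to be 0.004\n"
unique_kmers, fp_rate = parse_khmer_info(sample_text)
print(unique_kmers, fp_rate, looks_valid(unique_kmers, fp_rate))
```

To check your own run, read the file with `open("khmer_count.graph.info").read()` from inside the cleaned_reads folder and pass the text to `parse_khmer_info`.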



### **Step Three | Read Slicing**
The third step of the pipeline is to slice the corrected reads, removing reads with excessively high or low coverage. 

The default maximum coverage specified is 40000; this can be altered in the cell below. 

*Information on filtering reads using coverage and what threshold might be appropriate for your data can be found [here](https://khmer-recipes.readthedocs.io/en/latest/001-extract-reads-by-coverage/).*

#### **Input Data**
> Corrected reads produced from step one which were deposited in the "./cleaned_reads/corrected" folder are used, as well as their respective unique k-mers stored in the khmer_count.graph file in the "./cleaned_reads" folder.

#### **Scripts Used**
> Lines 108-122 of the script clean_reads.sh execute this step, calling on the script slice-paired-reads-by-coverage.py provided (altered from khmer's script to handle paired-end reads). 
>
> The clean_reads.sh script is called in line 25 of the Makefile
> (This uses the directories set via **SCRIPTDIR** and **CLEANREADS**).

#### **Outputs Produced**
> Paired reads which passed the coverage filtering will be outputted in a folder "sliced" found in the "./cleaned_reads" folder. 
>
> Reads which passed filtering while their pair did not will also be outputted in the sliced folder and can be identified with the suffix ".cor_singletons.fastq"
>
> The code in the cell below can be used to check if the correct files have been  produced in the sliced folder. 


In [17]:
# STEP THREE CODE

# Change the max coverage to desired value below-> ** = replace
#!sed -i '/COVERAGE=/c\COVERAGE=*VALUE*' *your/path/here*/clean_reads.sh 

# Check Sliced Reads 
#!ls [your_path]/rep_files/analysis/cleaned_reads/sliced

SRR9650833_RR.cor_singletons.fastq  SRR9650845_RR.cor_singletons.fastq
SRR9650833_RRR1.cor_sliced.fastq    SRR9650845_RRR1.cor_sliced.fastq
SRR9650833_RRR2.cor_sliced.fastq    SRR9650845_RRR2.cor_sliced.fastq
SRR9650834_RR.cor_singletons.fastq  SRR9650846_RR.cor_singletons.fastq
SRR9650834_RRR1.cor_sliced.fastq    SRR9650846_RRR1.cor_sliced.fastq
SRR9650834_RRR2.cor_sliced.fastq    SRR9650846_RRR2.cor_sliced.fastq
SRR9650835_RR.cor_singletons.fastq  SRR9650847_RR.cor_singletons.fastq
SRR9650835_RRR1.cor_sliced.fastq    SRR9650847_RRR1.cor_sliced.fastq
SRR9650835_RRR2.cor_sliced.fastq    SRR9650847_RRR2.cor_sliced.fastq
SRR9650836_RR.cor_singletons.fastq  SRR9650848_RR.cor_singletons.fastq
SRR9650836_RRR1.cor_sliced.fastq    SRR9650848_RRR1.cor_sliced.fastq
SRR9650836_RRR2.cor_sliced.fastq    SRR9650848_RRR2.cor_sliced.fastq
SRR9650837_RR.cor_singletons.fastq  SRR9650849_RR.cor_singletons.fastq
SRR9650837_RRR1.cor_sliced.fastq    SRR9650849_RRR1.cor_sliced.fastq
SRR9650837_RRR2.cor_slic
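
To see how many reads survived slicing, the number of records in a FASTQ file can be counted and compared before and after this step. The cell below is a minimal sketch that assumes uncompressed FASTQ files with exactly four lines per read (no wrapped sequences); the function name `count_fastq_reads` is hypothetical.

```python
def count_fastq_reads(path):
    """Count reads in an uncompressed FASTQ file with 4 lines per record."""
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: line count {n_lines} is not a multiple of 4")
    return n_lines // 4
```

Comparing the count for a file in the corrected folder with the count for its counterpart in the sliced folder gives a rough idea of how strict the coverage threshold is for your data.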

### **Step Four | Align Reads to Reference Genome**
The fourth step of the pipeline is to align/map reads to the reference genome (*Eucalyptus grandis* in the original experiment) using the [NextGenMap](https://cibiv.github.io/NextGenMap/) algorithm.

#### **Input Data**
> The input data is set by the -i option of the ngm_aligner.sh script and changes according to which iteration is running. 
>> **!!! SEE ITERATIONS OF STEPS FOUR AND FIVE !!!**

#### **Scripts Used**
> The script ngm_aligner.sh is used for this step, found in the scripts folder of the repository.
>
> This script is first called in line 47 of the Makefile (*due to iterations*).

#### **Outputs Produced**
> The output of step four is a BAM file (*aka. a Binary Alignment Map*) of the aligned reads, set by the -o option of ngm_aligner.sh. By default it is named using the reference genome name and the iteration number, in the following format: **"e_mel_1.bam"**. 
>
> To check if the bam file per iteration has been produced and is not empty run the code in the cell below.

In [None]:
# STEP FOUR CODE

# Change directory to folder of current iteration
#import os
#os.chdir('/your/path/to/analysis/e_mel_*iteration num*')

# Check presence and size of bam file
#!du -bsh *
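
The same check can be done in Python without changing directory. The cell below is a minimal sketch; the function name `check_bam` is hypothetical. It relies on the fact that BAM files are BGZF-compressed and therefore begin with the gzip magic bytes `0x1f 0x8b`. It does not validate the alignments themselves.

```python
import os

def check_bam(path):
    """Return 'missing', 'empty', 'not gzip/BGZF' or 'ok' for a BAM path."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    with open(path, "rb") as handle:
        magic = handle.read(2)
    # BGZF (and therefore BAM) files start with the gzip magic number
    if magic != b"\x1f\x8b":
        return "not gzip/BGZF"
    return "ok"
```

For the first iteration this could be called as `check_bam("e_mel_1/e_mel_1.bam")` from inside the analysis folder.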

### **Step Five | Create a Consensus Sequence from the BAM**
The fifth step of the pipeline is to create a consensus FASTA sequence using [bcftools consensus](https://samtools.github.io/bcftools/howtos/consensus-sequence.html) and [tabix](http://www.htslib.org/doc/tabix.html), both maintained by the Samtools project. 

The final iteration of the consensus sequence will be used as the pseudoreference genome.

#### **Input Data**
> The inputted data is the BAM file of the previous iteration produced by step 4 above.
>> **!!! SEE ITERATIONS OF STEPS FOUR AND FIVE !!!**

#### **Scripts Used**
> The script create_consensus.sh is used for this step, found in the scripts folder of the repository.
>
> This script is first called in line 35 of the Makefile (*due to iterations*).

#### **Outputs Produced**
> The consensus FASTA sequence is outputted as a file set by the -o option of create_consensus.sh. By default it is named using the reference genome name and the iteration number, in the following format: **"e_mel_1.fa"**. 
>
> Additionally, a chain file is produced which documents each rearrangement between the input BAM file and the output consensus file. By default this has the same name as the outputted consensus sequence with the suffix ".chain".
>
> While running bcftools, a VCF file containing genotype calls is generated and used to create the consensus sequence, but by default it is NOT SAVED as an output file. This is because this pipeline relies on the [GATK best practices](https://gatk.broadinstitute.org/hc/en-us/sections/360007226651) workflow to call the genotype variants further along in the pipeline.


In [None]:
# STEP FIVE CODE

# To save the bcf tools generated vcf file 
# !sed -i '/BCFTOOLSFILE=/c\BCFTOOLSFILE="*your_bcf_file.vcf"' *your-path-here*/create_consensus.sh

In [11]:
# Change Directory
import os
os.getcwd() #run this line first 
os.chdir('/home/agre945/anaconda3/envs/studyenv/GitHub/SRS-AG/rep_files/analysis')
os.getcwd()# run to check working directory correct

'/home/agre945/anaconda3/envs/studyenv/GitHub/SRS-AG/rep_files/analysis'

In [None]:
# Add the scripts folder to PATH FIRST
# (a "!export" line runs in a temporary subshell, so its change would not reach "!make";
# setting os.environ changes the notebook's own environment, which "!" commands inherit)
import os
os.environ["PATH"] += os.pathsep + "/home/agre945/anaconda3/envs/studyenv/GitHub/SRS-AG/rep_files/scripts"
# Run the lines below SECOND ** = replace
#!sed -i '/LOAD_COUNTING=/c\LOAD_COUNTING="*your-path-here*/load-into-counting-COR.py"' *your-path-here*/clean_reads.sh
#!sed -i '/SLICE_BY_COV=/c\SLICE_BY_COV="*your-path-here*/slice-paired-reads-by-coverage.py"' *your-path-here*/clean_reads.sh
!make


mkdir -p e_mel_1
../scripts/ngm_aligner.sh -s 0.3 -d ./tmp/ -r ../data/e_grandis/ref.fa -o e_mel_1/e_mel_1.bam -i ./cleaned_reads/
mktemp: failed to create directory via template ‘./tmp/ngm_aligner.sh_tmp_XXXXXX’: No such file or directory
Specified files are: ./cleaned_reads/sliced/SRR9650856_RRR1.cor_sliced.fastq ./cleaned_reads/sliced/SRR9650852_RRR1.cor_sliced.fastq ./cleaned_reads/sliced/SRR9650843_RRR1.cor_sliced.fastq ./cleaned_reads/sliced/SRR9650842_RRR1.cor_sliced.fastq ./cleaned_reads/sliced/SRR9650840_RRR1.cor_sliced.fastq ./cleaned_reads/sliced/SRR9650851_RRR1.cor_sliced.fastq ./cleaned_reads/sliced/SRR9650839_RRR1.cor_sliced.fastq ./cleaned_reads/sliced/SRR9650849_RRR1.cor_sliced.fastq ./cleaned_reads/sliced/SRR9650844_RRR1.cor_sliced.fastq ./cleaned_reads/sliced/SRR9650850_RRR1.cor_sliced.fastq ./cleaned_reads/sliced/SRR9650855_RRR1.cor_sliced.fastq ./cleaned_reads/sliced/SRR9650838_RRR1.cor_sliced.fastq ./cleaned_reads/sliced/SRR9650845_RRR1.cor_sliced.fastq ./cleaned_r