ISSRseq_ReferenceBased

Brandon Sinn edited this page Mar 2, 2022 · 15 revisions

Overview

ISSRseq_ReferenceBased.sh creates the same directory structure as ISSRseq_AssembleReference, but uses a user-supplied ISSRseq assembly or genome as a reference for the next step in the pipeline: ISSRseq_CreateBAMs.

Neither contaminant filtering nor negative reference filtering is conducted, because the user-supplied reference is assumed to consist only of chromosomes or ISSR amplicons of interest. This script can retain more reads than ISSRseq_AssembleReference.sh because it conducts only one round of read trimming, to remove adapter sequences. ISSRseq_AssembleReference.sh conducts two rounds in an attempt to remove SSR motifs and adapters from both ends of the reads, since those reads are used for de novo assembly.

Before you start

  1. Copy your reads to a new directory. Decompress your reads. Ensure that this directory only contains your reads.

  2. Rename forward and reverse reads using the following convention:

    sample1_R1.fastq
    sample1_R2.fastq

  3. Create a plain UNIX-encoded text file listing the read file prefix used for each sample, one per line. Leave a blank line at the bottom of the file. For example:

    sample1
    sample2
    etc ...

  4. Prepare a reference assembly in FASTA format. Exclude organellar contigs. Contig names should be numeric and unique.

Do not save the samples file in the read directory.
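
The samples file from step 3 can also be generated from the read file names rather than typed by hand. The following is a minimal sketch, not part of the ISSRseq scripts: the function name make_samples_file is hypothetical, it assumes reads are already decompressed and named as in step 2, and it writes one extra empty line at the end as one reading of "leave a blank line at the bottom of the file".

```shell
# Hypothetical helper (not part of ISSRseq): list sample prefixes, one per line.
# Usage: make_samples_file <reads_dir> <samples_file>
make_samples_file() {
    reads_dir="$1"
    samples_file="$2"

    # Refuse to write the samples file into the read directory.
    reads_abs="$(cd "$reads_dir" && pwd -P)" || return 1
    out_abs="$(cd "$(dirname "$samples_file")" && pwd -P)" || return 1
    if [ "$reads_abs" = "$out_abs" ]; then
        echo "error: samples file must not be saved in the read directory" >&2
        return 1
    fi

    : > "$samples_file"
    for r1 in "$reads_dir"/*_R1.fastq; do
        # Skip the unexpanded pattern when no forward reads are present.
        [ -e "$r1" ] || continue
        prefix="$(basename "$r1" _R1.fastq)"
        # Only list samples that have both forward and reverse reads.
        if [ ! -e "$reads_dir/${prefix}_R2.fastq" ]; then
            echo "warning: no ${prefix}_R2.fastq found, skipping $prefix" >&2
            continue
        fi
        printf '%s\n' "$prefix" >> "$samples_file"
    done

    # Trailing blank line at the bottom of the file.
    printf '\n' >> "$samples_file"
}
```

For example, make_samples_file /path/to/reads /path/to/samples.txt would write the prefixes found in /path/to/reads to a file outside that directory. Both paths are placeholders.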

Usage

To print a copy of the guide below at the prompt, execute: ISSRseq_ReferenceBased.sh help

Each of the flags below is required, and each is a capital letter.

-O [desired prefix of output directory]

-I [path to directory containing sequences]

-S [path to the samples file]

-R [path to reference assembly]

-T [number of parallel processing threads -- I recommend not exceeding the number of virtualized cores]

-M [minimum post-trim read length]

-H [number of bases to hard trim from the end of reads]

-P [fasta file of ISSR motifs used]

-X [bbduk trimming kmer, equal to or longer than shortest primer used]
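
Putting the flags together, a complete call might look like the sketch below. Only the flag letters come from the guide above. Every path and number is a hypothetical placeholder, not a recommended setting, and the ISSRSEQ_RUN variable is an invention of this example that keeps it a dry run unless set to "yes".

```shell
# All values are hypothetical placeholders; substitute your own.
set -- \
    -O my_run \
    -I /path/to/reads \
    -S /path/to/samples.txt \
    -R /path/to/reference.fasta \
    -T 8 \
    -M 50 \
    -H 0 \
    -P /path/to/issr_motifs.fasta \
    -X 18

# Dry run by default: print the command instead of executing it.
# ISSRSEQ_RUN is an assumption of this sketch, not an ISSRseq option.
if [ "${ISSRSEQ_RUN:-no}" = "yes" ]; then
    ISSRseq_ReferenceBased.sh "$@"
else
    printf 'Would run: ISSRseq_ReferenceBased.sh %s\n' "$*"
fi
```

Because all nine flags are required, it is worth checking the printed command for a missing flag before running it for real.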