2. Prerequisites
Premal Shah edited this page Apr 10, 2018 · 10 revisions
- Cutadapt
- Hisat2
- We are working to add back a Bowtie alignment option
- Python installation, with packages
- R installation, release 2.14.0 or later (including parallel package), with additional packages
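Before going further, it can help to confirm that the tools listed above are on your `PATH`. The following is a minimal sketch, not part of the pipeline: the function name `check_tools` is hypothetical, and the executable names in the example call (`cutadapt`, `hisat2`, `python`, `R`) are assumed defaults that may differ on your system.

```shell
# Sketch: report which command-line tools are missing from PATH.
# check_tools is a hypothetical helper, not part of the pipeline.
check_tools() {
    status=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool" >&2
            status=1
        fi
    done
    # Non-zero return if any tool was not found.
    return "$status"
}

# Example call; executable names are assumptions (for instance, your
# system may provide python3 rather than python):
# check_tools cutadapt hisat2 python R
```

This only checks that the executables can be found. It does not check versions or the additional Python and R packages.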
For each condition/sample in your experiment, merge all fastq files into a single gzipped fastq file, e.g. condition1.fastq.gz. Note that these files are gzip-compressed to save space.
cat condition1_subsetA.fastq condition1_subsetB.fastq | gzip >> condition1.fastq.gz
cat condition2_subsetC.fastq condition2_subsetD.fastq | gzip >> condition2.fastq.gz
.
.
.
It is easiest if you put all these files in a single input directory. Alternatively, you could symlink them from a single directory.
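If you have many conditions, the merge step can be wrapped in a small function. This is a rough sketch under stated assumptions: `merge_condition` is a hypothetical helper name, and the `input/` directory and file names in the example call are illustrative only. It writes with `>` rather than `>>`, so re-running it replaces the output instead of appending the same reads a second time.

```shell
# Sketch: merge all fastq parts for one condition into a single gzipped file.
# merge_condition is a hypothetical helper, not part of the pipeline.
merge_condition() {
    # usage: merge_condition OUTPUT.fastq.gz PART1.fastq [PART2.fastq ...]
    out="$1"
    shift
    # Fail early if any input part is missing or unreadable.
    for part in "$@"; do
        [ -r "$part" ] || { echo "missing input: $part" >&2; return 1; }
    done
    # Create the input directory if it does not exist yet.
    mkdir -p "$(dirname "$out")"
    # '>' overwrites, so repeated runs do not duplicate reads.
    cat "$@" | gzip > "$out"
    # Verify the compressed file is intact.
    gzip -t "$out"
}

# Example calls (illustrative file names):
# merge_condition input/condition1.fastq.gz condition1_subsetA.fastq condition1_subsetB.fastq
# merge_condition input/condition2.fastq.gz condition2_subsetC.fastq condition2_subsetD.fastq
```

If the merged files already exist elsewhere, `ln -s` can link them into the input directory instead of copying them.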
For your organism, you need transcript sequences in one fasta file, and ribosomal RNA (rRNA) and other contaminant sequences in another fasta file. Transcript sequences need to contain both the coding regions and their flanking regions (which could be of fixed length, or coincident with measured UTRs). You also need a GTF/GFF3 file that gives the locations of the coding sequences within the transcripts.
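One quick sanity check on these files is that every CDS record in the annotation refers to a sequence that is actually present in the transcript fasta. The sketch below assumes that column 1 of the GTF/GFF3 file holds the transcript name exactly as it appears in the fasta header (up to the first whitespace), and that coding regions use the feature type `CDS`. The function name `check_cds_seqids` is hypothetical.

```shell
# Sketch: list sequence names used by CDS records in a GTF/GFF3 file that
# are absent from the transcript fasta. Hypothetical helper; assumes that
# annotation column 1 matches the fasta header IDs.
check_cds_seqids() {
    fasta="$1"
    gff="$2"
    fasta_ids=$(mktemp)
    gff_ids=$(mktemp)
    # Fasta IDs: header text up to the first whitespace, without '>'.
    grep '^>' "$fasta" | sed 's/^>//' | awk '{print $1}' | sort -u > "$fasta_ids"
    # Column 1 is the sequence name, column 3 the feature type.
    awk -F '\t' '!/^#/ && $3 == "CDS" {print $1}' "$gff" | sort -u > "$gff_ids"
    # Keep only names that occur in the annotation but not in the fasta.
    missing=$(comm -13 "$fasta_ids" "$gff_ids")
    rm -f "$fasta_ids" "$gff_ids"
    if [ -n "$missing" ]; then
        echo "CDS records refer to sequences not in $fasta:" >&2
        echo "$missing" >&2
        return 1
    fi
}

# Example call (illustrative file names):
# check_cds_seqids transcripts.fa transcripts.gff3
```

The check says nothing about whether the coordinates themselves are correct, only that the names line up.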
For S. cerevisiae, download