Premal Shah edited this page Apr 10, 2018 · 10 revisions

Required software

Input files

For each condition/sample in your experiment, merge all fastq files into a single gzipped fastq file, e.g. condition1.fastq.gz. The merged files are gzip-compressed to save space.

cat condition1_subsetA.fastq condition1_subsetB.fastq | gzip > condition1.fastq.gz
cat condition2_subsetC.fastq condition2_subsetD.fastq | gzip > condition2.fastq.gz
.
.
.
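If there are many conditions, the merge step can be wrapped in a small helper. The following is only a sketch, not part of the pipeline itself: the function names merge_fastq and count_reads are made up for illustration, and count_reads assumes the usual four-lines-per-read fastq layout.

```shell
# merge_fastq OUT.fastq.gz IN1.fastq [IN2.fastq ...]
# Concatenate uncompressed fastq files and write one gzipped file.
# (Hypothetical helper, not part of the pipeline.)
merge_fastq() {
  out="$1"
  shift
  # ">" overwrites an existing output, so re-running does not duplicate reads.
  cat "$@" | gzip > "$out"
}

# count_reads FILE.fastq.gz
# Print the number of reads, assuming 4 lines per fastq record.
count_reads() {
  gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Example usage:
# merge_fastq condition1.fastq.gz condition1_subsetA.fastq condition1_subsetB.fastq
# count_reads condition1.fastq.gz
```

Comparing the read count of the merged file with the sum over its inputs is a cheap way to confirm that nothing was dropped or written twice.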

It is easiest if you put all these files in a single input directory. Alternatively, you could symlink them into a single directory.
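One possible way to set up the symlinks is sketched here. The function name link_inputs and the directory names in the usage line are placeholders, and the sketch assumes the merged files end in .fastq.gz.

```shell
# link_inputs SRC_DIR INPUT_DIR
# Symlink every *.fastq.gz file in SRC_DIR into INPUT_DIR.
# (Hypothetical helper, not part of the pipeline.)
link_inputs() {
  # Resolve SRC_DIR to an absolute path so the links work from any directory.
  src=$(cd "$1" && pwd) || return 1
  mkdir -p "$2"
  for f in "$src"/*.fastq.gz; do
    [ -e "$f" ] || continue   # the glob matched nothing
    ln -sf "$f" "$2/"         # -f replaces a stale link of the same name
  done
}

# Example usage (placeholder paths):
# link_inputs /path/to/merged_fastq input
```

Call it once per source directory if the merged files live in several places.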

Download transcript and rRNA sequences and gff3 file

For your organism, you need transcript sequences in one fasta file, and ribosomal RNA (rRNA) and other contaminant sequences in another fasta file. Transcript sequences need to contain coding regions and flanking regions (which could be fixed, or coincident with measured UTRs). The GTF/gff3 file needs to give the locations of coding sequences within the transcripts.
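Before running anything, it can help to confirm that the annotation really refers to the transcript fasta. The sketch below makes several assumptions: the function name check_cds is hypothetical, column 1 of the gff3 file is assumed to match the first word of each fasta header, and coding sequences are assumed to be rows whose type (column 3) is CDS, with 1-based start and end in columns 4 and 5 as in the GFF3 format.

```shell
# check_cds TRANSCRIPTS.fa ANNOTATION.gff3
# Print one line for every CDS row that does not fit inside its transcript.
# (Hypothetical helper, not part of the pipeline.)
check_cds() {
  awk -F '\t' '
    # First file (fasta): record the sequence length of each transcript ID.
    NR == FNR {
      if (substr($0, 1, 1) == ">") {
        split(substr($0, 2), h, /[ \t]/)   # ID = first word of the header
        id = h[1]
        len[id] = 0
      } else if (id != "") {
        len[id] += length($0)
      }
      next
    }
    # Second file (gff3): ignore comment lines and non-CDS features.
    /^#/ || $3 != "CDS" { next }
    !($1 in len) { print "unknown transcript: " $1; next }
    $4 < 1 || $5 > len[$1] || $4 > $5 {
      print "CDS out of range: " $1 " " $4 "-" $5 " (length " len[$1] ")"
    }
  ' "$1" "$2"
}

# Example usage (placeholder file names):
# check_cds transcripts.fa annotation.gff3
```

No output means every CDS row refers to a known transcript and lies within its length.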

For S. cerevisiae, download

  • rRNA sequences from here.
  • transcript sequences from here.
  • gff file from here.