0.b. Input files

Qingqing Wang edited this page Aug 19, 2018 · 2 revisions

Read alignment

  • JUM takes alignment output files produced by STAR as input for alternative splicing (AS) analysis.

  • JUM recommends that users run the alignment in 2-pass mapping mode, which has been shown to greatly improve splice junction quantification.

  • The recommended mapping procedure is as follows. As an example, suppose the user has three replicates for the control condition and three replicates for the treatment condition, all with 100 bp paired-end reads, mapped to the human hg38 genome:

    • genome indexing:
    $ mkdir genome_index_STAR_r1
    $ STAR --runThreadN 3 --runMode genomeGenerate --genomeDir genome_index_STAR_r1 --genomeFastaFiles hg38.fa --sjdbGTFfile hg38_genes.gtf --sjdbOverhang 99
    • 1st pass mapping, then move selected STAR outputs from the 1st pass into a folder named 1st_SJ:
    $ STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 --outFileNamePrefix ctrl_1 --readFilesIn ctrl_1_R01.fastq ctrl_1_R02.fastq --outSJfilterReads Unique
    $ STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 --outFileNamePrefix ctrl_2 --readFilesIn ctrl_2_R01.fastq ctrl_2_R02.fastq --outSJfilterReads Unique
    $ STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 --outFileNamePrefix ctrl_3 --readFilesIn ctrl_3_R01.fastq ctrl_3_R02.fastq --outSJfilterReads Unique
    $ STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 --outFileNamePrefix treat_1 --readFilesIn treat_1_R01.fastq treat_1_R02.fastq --outSJfilterReads Unique
    $ STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 --outFileNamePrefix treat_2 --readFilesIn treat_2_R01.fastq treat_2_R02.fastq --outSJfilterReads Unique
    $ STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 --outFileNamePrefix treat_3 --readFilesIn treat_3_R01.fastq treat_3_R02.fastq --outSJfilterReads Unique
    $ mkdir 1st_SJ
    $ mv *SJ.out.tab 1st_SJ
    $ mv *Log.final.out 1st_SJ
    $ mv *Log.progress.out 1st_SJ
    $ mv *Log.out 1st_SJ
    $ rm *Aligned.out.sam
    • 2nd pass mapping:
    $ STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 --outFileNamePrefix ctrl_1 --readFilesIn ctrl_1_R01.fastq ctrl_1_R02.fastq --outSJfilterReads Unique --outSAMstrandField intronMotif --outFilterMultimapNmax 1 --sjdbFileChrStartEnd 1st_SJ/ctrl_1SJ.out.tab 1st_SJ/ctrl_2SJ.out.tab 1st_SJ/ctrl_3SJ.out.tab 1st_SJ/treat_1SJ.out.tab 1st_SJ/treat_2SJ.out.tab 1st_SJ/treat_3SJ.out.tab
    $ STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 --outFileNamePrefix ctrl_2 --readFilesIn ctrl_2_R01.fastq ctrl_2_R02.fastq --outSJfilterReads Unique --outSAMstrandField intronMotif --outFilterMultimapNmax 1 --sjdbFileChrStartEnd 1st_SJ/ctrl_1SJ.out.tab 1st_SJ/ctrl_2SJ.out.tab 1st_SJ/ctrl_3SJ.out.tab 1st_SJ/treat_1SJ.out.tab 1st_SJ/treat_2SJ.out.tab 1st_SJ/treat_3SJ.out.tab
    $ STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 --outFileNamePrefix ctrl_3 --readFilesIn ctrl_3_R01.fastq ctrl_3_R02.fastq --outSJfilterReads Unique --outSAMstrandField intronMotif --outFilterMultimapNmax 1 --sjdbFileChrStartEnd 1st_SJ/ctrl_1SJ.out.tab 1st_SJ/ctrl_2SJ.out.tab 1st_SJ/ctrl_3SJ.out.tab 1st_SJ/treat_1SJ.out.tab 1st_SJ/treat_2SJ.out.tab 1st_SJ/treat_3SJ.out.tab
    $ STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 --outFileNamePrefix treat_1 --readFilesIn treat_1_R01.fastq treat_1_R02.fastq --outSJfilterReads Unique --outSAMstrandField intronMotif --outFilterMultimapNmax 1 --sjdbFileChrStartEnd 1st_SJ/ctrl_1SJ.out.tab 1st_SJ/ctrl_2SJ.out.tab 1st_SJ/ctrl_3SJ.out.tab 1st_SJ/treat_1SJ.out.tab 1st_SJ/treat_2SJ.out.tab 1st_SJ/treat_3SJ.out.tab
    $ STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 --outFileNamePrefix treat_2 --readFilesIn treat_2_R01.fastq treat_2_R02.fastq --outSJfilterReads Unique --outSAMstrandField intronMotif --outFilterMultimapNmax 1 --sjdbFileChrStartEnd 1st_SJ/ctrl_1SJ.out.tab 1st_SJ/ctrl_2SJ.out.tab 1st_SJ/ctrl_3SJ.out.tab 1st_SJ/treat_1SJ.out.tab 1st_SJ/treat_2SJ.out.tab 1st_SJ/treat_3SJ.out.tab
    $ STAR --runThreadN 3 --genomeDir genome_index_STAR_r1 --outFileNamePrefix treat_3 --readFilesIn treat_3_R01.fastq treat_3_R02.fastq --outSJfilterReads Unique --outSAMstrandField intronMotif --outFilterMultimapNmax 1 --sjdbFileChrStartEnd 1st_SJ/ctrl_1SJ.out.tab 1st_SJ/ctrl_2SJ.out.tab 1st_SJ/ctrl_3SJ.out.tab 1st_SJ/treat_1SJ.out.tab 1st_SJ/treat_2SJ.out.tab 1st_SJ/treat_3SJ.out.tab
    • Convert and sort the resulting alignment files (please use the exact naming convention shown below). Here we show the ctrl_1 sample as an example, but users need to do this for all the samples:
    $ samtools view -bS ctrl_1Aligned.out.sam > ctrl_1Aligned.out.bam
    $ samtools sort -o ctrl_1Aligned.out_sorted.bam -T ctrl_1_temp ctrl_1Aligned.out.bam
    $ samtools index ctrl_1Aligned.out_sorted.bam
    $ rm *Aligned.out.bam
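
Because the per-sample commands in the procedure above differ only in the sample name, they can be generated in a loop rather than typed out by hand. The following is a minimal sketch, not part of JUM itself. The function names (`run`, `sj_files`, `first_pass`, `stash_first_pass`, `second_pass`, `convert_and_sort`) and the `DRY_RUN` switch are hypothetical helpers introduced here for illustration. The sketch assumes the sample names, FASTQ naming, thread count, and index directory of the example; it writes the junction option in STAR's double-dash form (`--sjdbFileChrStartEnd`), derives the SAM/BAM file names from the `--outFileNamePrefix` values, and uses `samtools view -o` in place of the shell redirect so that each command can be printed before it is run. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and DRY_RUN are illustrative assumptions.
SAMPLES="ctrl_1 ctrl_2 ctrl_3 treat_1 treat_2 treat_3"
THREADS=3
GENOME_DIR=genome_index_STAR_r1
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands; 0 = print and execute

# Print a command, and execute it only when DRY_RUN=0.
run() {
    echo "$*"
    if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

# Space-separated list of all first-pass junction files in 1st_SJ.
sj_files() {
    for f in $SAMPLES; do printf '1st_SJ/%sSJ.out.tab ' "$f"; done
}

# 1st pass mapping for one sample ($1 = sample name / output prefix).
first_pass() {
    run STAR --runThreadN "$THREADS" --genomeDir "$GENOME_DIR" \
        --outFileNamePrefix "$1" --readFilesIn "${1}_R01.fastq" "${1}_R02.fastq" \
        --outSJfilterReads Unique
}

# Move first-pass junction and log files into 1st_SJ; drop first-pass SAM files.
stash_first_pass() {
    run mkdir -p 1st_SJ
    for f in $SAMPLES; do
        run mv "${f}SJ.out.tab" "${f}Log.final.out" "${f}Log.progress.out" "${f}Log.out" 1st_SJ
        run rm "${f}Aligned.out.sam"
    done
}

# 2nd pass mapping for one sample, using junctions from all samples.
second_pass() {
    # $(sj_files) is unquoted on purpose: it must split into one argument per file.
    run STAR --runThreadN "$THREADS" --genomeDir "$GENOME_DIR" \
        --outFileNamePrefix "$1" --readFilesIn "${1}_R01.fastq" "${1}_R02.fastq" \
        --outSJfilterReads Unique --outSAMstrandField intronMotif \
        --outFilterMultimapNmax 1 --sjdbFileChrStartEnd $(sj_files)
}

# SAM -> BAM, sort, index, then remove the unsorted BAM for one sample.
convert_and_sort() {
    run samtools view -bS -o "${1}Aligned.out.bam" "${1}Aligned.out.sam"
    run samtools sort -o "${1}Aligned.out_sorted.bam" -T "${1}_temp" "${1}Aligned.out.bam"
    run samtools index "${1}Aligned.out_sorted.bam"
    run rm "${1}Aligned.out.bam"
}

for sample in $SAMPLES; do first_pass "$sample"; done
stash_first_pass
for sample in $SAMPLES; do second_pass "$sample"; done
for sample in $SAMPLES; do convert_and_sort "$sample"; done
```

Running the script as is prints every command for review; setting `DRY_RUN=0` in the environment executes them in the same order. Genome indexing is left out because it is a single command that is run once.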