# Sort and Index the alignment results

In [2]:
source config.sh

## [Converting SAM to BAM with samtools](http://quinlanlab.org/tutorials/samtools/samtools.html)

Here I followed the instructions from the [Samtools Tutorial](http://quinlanlab.org/tutorials/samtools/samtools.html) created by Aaron Quinlan.

- Converting SAM to BAM with samtools "view"

```
### convert SAM to BAM
samtools view -S -b sample.sam > sample.bam

### print the results
samtools view sample.bam | head
```

- samtools "sort"

"When you align FASTQ files with all current sequence aligners, the alignments produced are in random order with respect to their position in the reference genome. In other words, the BAM file is in the order that the sequences occurred in the input FASTQ files."

```
### Before sort
samtools view sample.bam | head

### sort
samtools sort sample.bam -o sample.sorted.bam

### After sort
samtools view sample.sorted.bam | head
```
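The effect of sorting can also be checked programmatically rather than by eye. The sketch below assumes headerless SAM text on stdin, as printed by `samtools view`. `is_coordinate_sorted` is a hypothetical helper written for illustration, not part of samtools. It only checks that positions never decrease within a reference and that no reference name reappears later, so it cannot confirm that the references follow the order given in the header.

```shell
# Hypothetical helper (not a samtools command): reads headerless SAM records
# on stdin and exits 0 if they look coordinate-sorted, 1 otherwise.
is_coordinate_sorted() {
    awk -F'\t' '
        # Column 3 is RNAME: a new reference starts here
        $3 != chrom {
            if ($3 in seen) { bad = 1; exit }   # reference seen before: not sorted
            seen[$3] = 1
            chrom = $3
            last = 0
        }
        # Column 4 is POS: it must not decrease within one reference
        ($4 + 0) < last { bad = 1; exit }
        { last = $4 + 0 }
        END { exit bad + 0 }
    '
}
```

A possible use would be `samtools view sample.sorted.bam | is_coordinate_sorted && echo "sorted"`. The `@HD` header line printed by `samtools view -H` should also report `SO:coordinate` for a file written by `samtools sort`.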

In [4]:
DIR=( "$RAW_FASTQS" "$TRIMMED"    "$FILTER" )
OUT=( "$ALIGN_RAW"  "$ALIGN_TRIM" "$ALIGN_FILTER" )
REFFA=$GENOME/$FA

for ((i=0; i<${#OUT[@]}; ++i)); do
    echo "================================================"
    echo "${OUT[i]}"
    echo "++++++++++++++++++++++++"

    ### SAM to BAM (paths quoted so directories with spaces still work)
    samtools view -S -b "${OUT[i]}/aln_pe.sam" > "${OUT[i]}/aln_pe.bam"

    ### Sorting by coordinate
    samtools sort "${OUT[i]}/aln_pe.bam" -o "${OUT[i]}/aln_pe_sorted.bam"
done

/home/jovyan/work/Data/SRR4841864/align_raw
++++++++++++++++++++++++
[bam_sort_core] merging from 3 files and 1 in-memory blocks...
/home/jovyan/work/Data/SRR4841864/align_trim
++++++++++++++++++++++++
[bam_sort_core] merging from 3 files and 1 in-memory blocks...
/home/jovyan/work/Data/SRR4841864/align_filter
++++++++++++++++++++++++
[bam_sort_core] merging from 2 files and 1 in-memory blocks...


In [7]:
ls $ALIGN_RAW

aln_pe.bam  aln_pe.sam  aln_pe_sorted.bam


In [6]:
ls $ALIGN_TRIM

aln_pe.bam  aln_pe.sam  aln_pe_sorted.bam


In [8]:
ls $ALIGN_FILTER

aln_pe.bam  aln_pe.sam  aln_pe_sorted.bam
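The listings above show the sorted BAM files but no index files yet. A minimal sketch of the indexing step follows, assuming each directory holds an `aln_pe_sorted.bam` as listed. `index_sorted_bams` is a hypothetical helper name introduced here. `samtools index` itself writes a `.bai` file next to the coordinate-sorted BAM.

```shell
# Hypothetical helper: index aln_pe_sorted.bam in every directory passed as an argument.
index_sorted_bams() {
    local dir bam
    for dir in "$@"; do
        bam="$dir/aln_pe_sorted.bam"
        if [ ! -f "$bam" ]; then
            # Skip directories where the sorted BAM has not been created
            echo "skipping: $bam not found" >&2
            continue
        fi
        # Creates "$bam.bai"; the input must be coordinate-sorted
        samtools index "$bam"
    done
}
```

It could be called as `index_sorted_bams "$ALIGN_RAW" "$ALIGN_TRIM" "$ALIGN_FILTER"`, after which `ls` should also show `aln_pe_sorted.bam.bai` in each directory.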


## Inspect the BAM files

In [10]:
samtools view $ALIGN_FILTER/aln_pe.bam | head -3

SRR4841864.5	99	XIII	495383	60	99M	=	495405	123	AGACTTGATGCGACCAATGTTATTCCAGGCGCAAATAAGGCATGTTGTGGTATTACTAAAATCAGTTTGGATCATGAAAGCTTTTTAGGTAACACCTTG	=DDFFFHHGHHJIJJJJJJIJJJJJJIJJIJIJJJJJJJJIIEIGHICHI@DGIJIJHHHHHHFFCEFFDEEEEDDDDDDDDDDDDDDD:ACDDDDDDC	NM:i:0	MD:Z:99	MC:Z:101M	AS:i:99	XS:i:0
SRR4841864.5	147	XIII	495405	60	101M	=	495383	-123	TTCCAGGCGCAAATAAGGCATGTTGTGGTATTACTAAAATCAGTTTGGATCATGAAAGCTTTTTAGGTAACACCTTGTCTGAAATCTCTAAAGAGAAAGCA	DDDDDDDDDFEEEFFFFFFHHGHJJJJIIHHHCJJJJJIIJJJJJJJJJIIIJJJJJIIIIJJJIJIHDHGCJJJJJJJIJJJJJIIJHHHHHFFFFFCCC	NM:i:0	MD:Z:101	MC:Z:99M	AS:i:101	XS:i:0
SRR4841864.6	99	XVI	288099	60	99M	=	288150	152	ACCGTCACGTTTCAACAAATTTTTATACGTTGCCTCCAGTGATTTTCCTCCCATATATTCCATGGCAATGTATATTGAAGAACTCTGTTCGTCGGTAAA	=DDFDAHHHCFIJIIGIIGHIJJJJIJJI?FGIEIJJJJFDGGIJIIIIIIJAFIJGIIGIJIGIIIIEE;7777=ABDEFDEEEEC@CCDBDDD=AB@	NM:i:1	MD:Z:4C94	MC:Z:101M	AS:i:94	XS:i:0


In [11]:
samtools view $ALIGN_FILTER/aln_pe_sorted.bam | head -3

SRR4841864.604955	163	I	78	0	60M2I39M	=	488	512	AACACTACCCTAATCTAACCCTGATCAACCTGTCCCCCAACTTACCCTCCATTACCCTGCCTCTCCACTCGTTACCCTGTCCCATTTCACCTTACCACTCC	?@?DDDDDHFHH3EGEG@<FHFDFHGHII3?EF@FGCH<GHGGEEGHHHHAGBDGGFHDHIIIIICAC;AEEE?BCCCC@@AAAACDDDCCCC@CCAC3<?	NM:i:11	MD:Z:5A0G16G0C9T1T47C0A3A9	MC:Z:13M1D88M	AS:i:49	XS:i:48
SRR4841864.4430617	65	I	78	0	60M2I39M	IX	439520	0	AACACTACCCTAATCTAACCCTGATCAACCTGTCCCCCAACTTACCCTCCATTACCCTGCCTCTCCACTCGTTACCCTGTCCCATTTCACCTTACCACTCC	CCCFFFFFGGHHHJJJJJJJJJJJJJJJJJJJJJJJJJJJJJJIJJJIJJJJJJIJJJJJJIJJIJHHHHHFFFDEEEEEDDDDDDEEEDDDDDDDDDDDD	NM:i:11	MD:Z:5A0G16G0C9T1T47C0A3A9	MC:Z:100M	AS:i:49	XS:i:48
SRR4841864.1907733	99	I	85	39	28S55M2I15M	=	143	125	ACCCTATCTCAACCCTACTCTAACCCTACCCTACTCTAACCCTGACCATCCTGTCTCTCAGCTTACCCTCCATTACCCTGCCTCCCCACTCGTTACCCTG	CC@DFFFFHHHHHJJJIGIHIIEHGGGIJIJGBFIJIJGIJJJJHIHIGGHIFFGIIIGGIJJJJJIIJJHAEFFFFFFDBDCEDDDDDACDBBBDDDDC	NM:i:6	MD:Z:5A10G3A11A37	MC:Z:20M2I47M31S	AS:i:42	XS:i:51
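The second column of each record above is the SAM FLAG, a bitwise field (99, 147, 163 and 65 in these outputs). As a rough aid for reading it, the sketch below decodes a FLAG value into the bit names defined by the SAM specification. `decode_flag` and the short labels it prints are hypothetical names chosen for illustration. `samtools flags` offers similar functionality.

```shell
# Hypothetical helper: print the names of the bits set in a SAM FLAG value.
decode_flag() {
    local flag=$1 out="" i=0 name
    # Names are listed in bit order: 0x1, 0x2, 0x4, ... 0x800
    for name in paired proper_pair unmapped mate_unmapped reverse mate_reverse \
                read1 read2 secondary qc_fail duplicate supplementary; do
        if [ $(( flag & (1 << i) )) -ne 0 ]; then
            out="${out:+$out,}$name"
        fi
        i=$(( i + 1 ))
    done
    echo "$out"
}

decode_flag 99    # paired,proper_pair,mate_reverse,read1
decode_flag 147   # paired,proper_pair,reverse,read2
decode_flag 65    # paired,read1
```

Read this way, the first two unsorted records (flags 99 and 147) are the two mates of one properly paired fragment, while flag 65 in the sorted output marks a first-in-pair read without the proper-pair bit, which fits its mate being reported on chromosome IX.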
