# Sorting BAM files
BAM files are Binary sAM files. SAM files are nothing more than big tables that tell us which read from the FastQ file mapped where exactly on the scaffolds of the metagenome assembly. If that sentence was too long, read it again.

The rows of this table that we call a SAM file are not ordered in any logical way. For many [computational purposes](https://www.google.com/search?client=ubuntu&hs=tg5&channel=fs&ei=3vYPW9uDOumRgAbF3LHoCw&q=why+sorting+is+necessary&oq=why+sort&gs_l=psy-ab.3.2.35i39k1j0l2j0i20i263k1j0l6.2238.7325.0.11765.8.8.0.0.0.0.76.455.8.8.0....0...1c.1.64.psy-ab..0.8.454...0i67k1j0i203k1.0.EJEFo_3okmA), we want to sort/order these rows according to:
1. the scaffold they mapped on
2. the position on that scaffold (in bp)
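To make the two sorting keys concrete, here is a minimal sketch on a made-up three-row table (read name, scaffold, position). The read and scaffold names are invented for illustration, and plain `sort` only stands in for what `samtools sort` does to a real BAM file. One difference to keep in mind: `samtools` orders scaffolds as they are listed in the BAM header, not alphabetically.

```shell
# Made-up table: read name, scaffold, position (tab-separated).
table=$(printf 'read3\tscaffold_2\t150\nread1\tscaffold_1\t1200\nread2\tscaffold_1\t900\n')

# Sort by scaffold (column 2) first, then numerically by position (column 3).
tab=$(printf '\t')
sorted_table=$(printf '%s\n' "$table" | sort -t "$tab" -k2,2 -k3,3n)
printf '%s\n' "$sorted_table"
```

Without the `n` in `-k3,3n`, position 1200 would be placed before 900, because text sorting compares character by character.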



We will achieve this with the `samtools` programme. As the name suggests, `samtools` comprises a lot of tools, one of which is `samtools sort`. We will use this tool to sort our BAM files.

`samtools` also contains the `samtools view` tool we used earlier. `samtools view` converts SAM to BAM, and also BAM to SAM.
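As a reminder of both directions, the two conversions could be wrapped like this. The function names `sam_to_bam` and `bam_to_sam` are hypothetical helpers, not part of `samtools`; the sketch assumes a reasonably recent `samtools` that detects the input format by itself.

```shell
# Hypothetical helper: convert a SAM file ($1) to a BAM file ($2).
sam_to_bam() {
    # -b asks for BAM output; -o names the output file.
    samtools view -b -o "$2" "$1"
}

# Hypothetical helper: convert a BAM file ($1) back to a SAM file ($2).
bam_to_sam() {
    # -h keeps the header lines in the SAM output.
    samtools view -h -o "$2" "$1"
}
```

Nothing runs until you call one of them, for example `bam_to_sam ./path/to/your/bamfile.bam ./path/to/your/bamfile.sam`.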

Before we proceed, let's have a quick look at the BAM files we produced with `samtools view`.

In [None]:
!samtools view ./path/to/your/bamfile.bam | head

You should see a lot of tab-delimited names, numbers and sequences. Perhaps search online for what a SAM/BAM file should look like, and check whether it corresponds to what you get.
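For comparison, the sketch below builds one made-up alignment record with the 11 mandatory columns described in the SAM specification and pulls out the two columns that matter for sorting. All values in the record are invented; your own output may carry extra optional columns at the end of each line.

```shell
# One made-up alignment record with the 11 mandatory SAM columns:
# QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL
record=$(printf 'read1\t0\tscaffold_1\t35\t42\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII')

# Count the tab-separated columns, then extract scaffold (3) and position (4).
n_columns=$(printf '%s\n' "$record" | awk -F '\t' '{ print NF }')
scaffold_and_pos=$(printf '%s\n' "$record" | cut -f 3,4)
echo "$n_columns columns; scaffold and position: $scaffold_and_pos"
```

On a real file, the same `cut -f 3,4` can be appended to the `samtools view ... | head` pipeline to see only the scaffold and position of the first reads.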

## Another loop

We have seen how to make a loop in the m3 part of this practical. Now let's do the same, but this time to sort the BAM files we created earlier.

The cell below contains a copy of the loop from the previous notebook. Edit it so that the loop sorts your BAM files.
1. Make sure to do this step by step. Test every little thing you change in the loop.
2. Don't forget to read the help page of `samtools sort` very carefully.

Make sure you use only one CPU/thread: all of us have to share this computer.
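Before touching the loop, it can help to try the command on a single file. The sketch below is one possible form, not the only one. The `.sorted.bam` suffix is an assumed naming convention, and the path is the same placeholder used earlier. According to the `samtools sort` help page, leaving out the `-@` option keeps the run on a single thread, but do check this against the version installed here.

```shell
# Placeholder path; replace it with one of your own BAM files.
unsorted="./path/to/your/bamfile.bam"
# Assumed naming convention: bamfile.bam becomes bamfile.sorted.bam.
sorted="${unsorted%.bam}.sorted.bam"

# No -@ option is given, so samtools should stay on its single-threaded default.
if command -v samtools >/dev/null 2>&1 && [ -f "$unsorted" ]; then
    samtools sort -o "$sorted" "$unsorted"
else
    echo "Would run: samtools sort -o $sorted $unsorted"
fi
```

Writing to a new file name with `-o` keeps the unsorted file intact, so you can compare the two before cleaning up. Building this into the loop is left for the cell below.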

In [None]:
%%bash
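# first I collect the unique sample names: the part of each fastq.gz file name before the first dot
samples=$(find ./data/reads -name '*.fastq.gz' -type f -printf '%P\n' | cut -d '.' -f 1 | sort | uniq)

# next I use this variable in my loop
for i in $samples
do
    echo "$i"
done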

## Check
After sorting, check again whether your BAM files were sorted correctly.
1. Use `ls --size` to see if the files exist and have a plausible size compared with the unsorted files.
2. Run `samtools view` to view your BAM files.
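A further check, sketched here under the assumption that `samtools sort` records the sort order in the `@HD` header line as `SO:coordinate`: look at the header instead of the reads. The function name `is_coordinate_sorted` is a hypothetical helper, and the header line fed to it below is made up to stand in for the output of `samtools view -H`.

```shell
# Hypothetical helper: succeed if a SAM header on stdin declares coordinate sorting.
is_coordinate_sorted() {
    grep -q '^@HD.*SO:coordinate'
}

# Made-up header line, standing in for the output of `samtools view -H`.
if printf '@HD\tVN:1.6\tSO:coordinate\n' | is_coordinate_sorted; then
    header_check="sorted"
else
    header_check="not sorted"
fi
echo "$header_check"
```

On a real file this would read `samtools view -H ./path/to/your/sorted.bam | is_coordinate_sorted && echo sorted`, with the path replaced by your own.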

## Clean
Did your BAM files sort correctly? Then remove the unsorted BAM files. We don't need them anymore, and removing them saves some disk space.
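Deleting is hard to undo, so one cautious pattern is to remove an unsorted file only when its sorted counterpart exists and is not empty. The sketch below rehearses this in a throw-away directory with stand-in files. The sample name and the `.sorted.bam` suffix are assumptions, so adapt them to the names you actually used.

```shell
# Throw-away directory with stand-in files (hypothetical names, not real BAM data).
workdir=$(mktemp -d)
touch "$workdir/sampleA.bam"
printf 'placeholder' > "$workdir/sampleA.sorted.bam"

# Remove an unsorted file only when its sorted counterpart exists and is not empty.
for unsorted in "$workdir"/*.bam; do
    # Skip the sorted files themselves.
    case "$unsorted" in *.sorted.bam) continue ;; esac
    sorted="${unsorted%.bam}.sorted.bam"
    if [ -s "$sorted" ]; then
        rm -- "$unsorted"
    fi
done
ls "$workdir"
```

Only `sampleA.sorted.bam` should remain in the listing.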