# Metagenomics analysis

As a first step, you decide to run shotgun metagenomic sequencing of the prokaryotic microbiome in two kinds of samples: 1) one taken during the high-temperature episodes, and 2) one taken right after the episodes, when the temperature is back to normal and there is an algal bloom.

After months of waiting, the sequencing results from your two metagenomic samples have just arrived! The raw read files (forward and reverse) were produced by Illumina paired-end sequencing and are located on your computing server:

1. Forward and reverse reads from the high temperature sample

>/home/2019_2020/data/metagenomics/hotspring-hightemp.1.fq.gz <br>
/home/2019_2020/data/metagenomics/hotspring-hightemp.2.fq.gz  
  
2. Forward and reverse reads from the normal temperature sample  

>/home/2019_2020/data/metagenomics/hotspring-normaltemp.1.fq.gz <br>
/home/2019_2020/data/metagenomics/hotspring-normaltemp.2.fq.gz


### 1. Generating taxonomic profiles
<font color='MidnightBlue'>
    A metagenomic sample probably contains sequences from many different genes and species. To find out which organisms are the most abundant in a sample, we have to profile it, either by taxonomy or by gene function. A functional profile would probably help explain the outburst of life after the episodes of high temperature in the volcano. However, since we want to identify organisms, taxonomic profiling looks like the better approach for now.

We perform the taxonomic profiling with the tool mOTUs (ref: https://motu-tool.org/tutorial.html) <br>
Reference for interpreting the resulting profile: https://github.com/motu-tool/mOTUs_v2/wiki/Explain-the-resulting-profile

In [1]:
# first we go to the directory with the files
cd /home/2019_2020/data/metagenomics
ls

hotspring-hightemp.1.fq.gz  hotspring-normaltemp.1.fq.gz
hotspring-hightemp.2.fq.gz  hotspring-normaltemp.2.fq.gz


In [2]:
# high temperature sample (forward reads with -f, reverse reads with -r)
motus profile -f hotspring-hightemp.1.fq.gz -r hotspring-hightemp.2.fq.gz -o /home/2019_2020/s.sanchez-heredero/Metagenomics_outputs/hightemp.motus -n hightemp

In [3]:
# normal temperature sample (forward reads with -f, reverse reads with -r)
motus profile -f hotspring-normaltemp.1.fq.gz -r hotspring-normaltemp.2.fq.gz -o /home/2019_2020/s.sanchez-heredero/Metagenomics_outputs/normaltemp.motus -n normaltemp
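
The two calls above differ only in the sample name, so they can be wrapped in a small helper. This is a minimal sketch: `profile_sample`, `IN_DIR`, `OUT_DIR` and the `DRY_RUN` switch are hypothetical names introduced here for illustration, while the paths and the `motus profile` flags (`-f`, `-r`, `-o`, `-n`) are exactly the ones used in the cells above.

```shell
# Paths copied from the cells above.
IN_DIR=/home/2019_2020/data/metagenomics
OUT_DIR=/home/2019_2020/s.sanchez-heredero/Metagenomics_outputs

# Hypothetical helper: profile one sample given its name (e.g. "hightemp").
# If DRY_RUN is set to a non-empty value, the command is printed, not run.
profile_sample() (
    sample="$1"
    # Build the same command line as in the cells above.
    set -- motus profile \
        -f "$IN_DIR/hotspring-$sample.1.fq.gz" \
        -r "$IN_DIR/hotspring-$sample.2.fq.gz" \
        -o "$OUT_DIR/$sample.motus" \
        -n "$sample"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
)

# Usage (not executed here):
#   for s in hightemp normaltemp; do profile_sample "$s"; done
```

Running it once with `DRY_RUN=1` is a cheap way to check the file names before starting the actual profiling.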

In [4]:
cd /home/2019_2020/s.sanchez-heredero/Metagenomics_outputs
ls

hightemp.motus  normaltemp.motus


In [5]:
# skip the 3 header lines, sort by relative abundance (column 2, descending) and show the top 5
tail -n+4  hightemp.motus | sort -t$'\t' -k2 -nr | head -n 5

Aquifex aeolicus [ref_mOTU_v25_10705]	0.6189655514
Pelagibacteraceae species incertae sedis [meta_mOTU_v25_13988]	0.0907761851
-1	0.0721252510
Pelagibacteraceae species incertae sedis [meta_mOTU_v25_13493]	0.0664534531
Porticoccaceae species incertae sedis [meta_mOTU_v25_13235]	0.0193465739
sort: write failed: standard output: Broken pipe
sort: write error
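
The `Broken pipe` messages are harmless: `head` closes the pipe after five lines while `sort` is still writing. A possible way to avoid them is to let the last command read all of its input, for example with `awk` instead of `head`. The sketch below is one such variant; `top_taxa` is a hypothetical helper name, and it assumes that every header line of the profile starts with `#` (rather than skipping a fixed number of lines).

```shell
# Hypothetical helper: print the N most abundant taxa (default 5) from a
# two-column, tab-separated profile.
# Assumption: header lines start with '#'.
top_taxa() (
    profile="$1"
    n="${2:-5}"
    tab=$(printf '\t')
    grep -v '^#' "$profile" \
        | LC_ALL=C sort -t "$tab" -k2,2 -nr \
        | awk -v n="$n" 'NR <= n'   # reads all input, so sort never hits a closed pipe
)

# Usage (not executed here):
#   top_taxa hightemp.motus 5
```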


In [9]:
# same for the normal temperature sample: skip the 3 header lines, sort by relative abundance and show the top 5
tail -n+4 normaltemp.motus | sort -t$'\t' -k2 -nr | head -n 5

Pelagibacteraceae species incertae sedis [meta_mOTU_v25_13988]	0.2385013523
-1	0.1892140031
Pelagibacteraceae species incertae sedis [meta_mOTU_v25_13493]	0.1743434863
Porticoccaceae species incertae sedis [meta_mOTU_v25_13235]	0.0507562117
Candidatus Aquiluna sp. IMCC13023 [ref_mOTU_v25_06613]	0.0484321303
sort: write failed: standard output: Broken pipe
sort: write error
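
To compare the two samples directly, the profiles can be joined on the taxon name. The following is a minimal sketch, not part of the mOTUs tool itself: `compare_profiles` is a hypothetical helper, it assumes two-column tab-separated profiles whose header lines start with `#`, and it treats a taxon missing from one file as having abundance 0.

```shell
# Hypothetical helper: join two profiles on the taxon name and print
#   taxon <TAB> abundance_in_file1 <TAB> abundance_in_file2 <TAB> change
# where change = file2 - file1. Rows are sorted by change, ascending,
# so the taxa that decreased the most come first.
# Assumption: header lines start with '#'.
compare_profiles() (
    tab=$(printf '\t')
    awk -F'\t' '
        FNR == 1 { file++ }              # track which input file is being read
        /^#/     { next }                # skip header lines
        file == 1 { a[$1] = $2; next }   # abundances from the first profile
                  { b[$1] = $2 }         # abundances from the second profile
        END {
            for (t in a) seen[t] = 1
            for (t in b) seen[t] = 1
            for (t in seen) {
                x = (t in a) ? a[t] : 0
                y = (t in b) ? b[t] : 0
                # drop taxa that are absent from both samples
                if (x + y > 0)
                    printf "%s\t%.10f\t%.10f\t%.10f\n", t, x, y, y - x
            }
        }
    ' "$1" "$2" | LC_ALL=C sort -t "$tab" -k4,4 -n
)

# Usage (not executed here):
#   compare_profiles hightemp.motus normaltemp.motus
```

With the high temperature profile as the first argument, taxa that are abundant only during the episodes appear at the top of the output, and taxa that increase once the temperature is back to normal appear at the bottom.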
