# Lecture 5 - Multiple Sequence Alignment

In the previous lecture, you learned to align pairs of sequences and to *BLAST* a sequence against a protein database. 

In this tutorial you will learn to align multiple sequences and use the alignment to understand how they evolved from a common ancestor. 

### Learning objectives:

- Perform MSA on the web using [**Clustal Omega**](https://www.ebi.ac.uk/jdispatcher/msa/clustalo).
- Generate and interpret phylogenetic trees. 




-----

### Exercise 1.1

**Glucose-6-phosphate isomerase** (GPI) is an enzyme that acts early in glycolysis, where it converts glucose-6-phosphate to fructose-6-phosphate:

![Reaction catalysed by GPI](files/R00771.png)

As you can imagine, this is a very conserved metabolic pathway that is present across all domains of life.

Inside the `files/` folder you will find the file [gpi.faa](files/gpi.faa), which contains protein sequences for this enzyme retrieved from different organisms. 
Let's run an MSA and use it to build a phylogenetic tree.
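
If you are curious about what a FASTA file contains before uploading it, the optional sketch below shows one way to read the format in Python. It is a minimal illustration, not a full parser: the function name `parse_fasta` and the two short sequences are made up for this example, and real files such as `gpi.faa` may have longer headers.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping record ID to sequence.

    The record ID is taken as the first word after '>' (an assumption made
    for this sketch; some tools keep the whole header line instead).
    """
    chunks_by_id = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # Header line: start a new record
            current_id = line[1:].partition(" ")[0]
            chunks_by_id[current_id] = []
        elif current_id is not None:
            # Sequence line: may be wrapped over several lines
            chunks_by_id[current_id].append(line)
    return {seq_id: "".join(chunks) for seq_id, chunks in chunks_by_id.items()}


# Made-up toy input, NOT taken from gpi.faa
toy_fasta = """>seq1 made-up example
MKTAYIAK
QRQISFVK
>seq2 made-up example
MKSAYLAK
"""

records = parse_fasta(toy_fasta)
for seq_id, sequence in records.items():
    print(seq_id, len(sequence), sequence)
```

To try it on the real file, you could pass `open("files/gpi.faa").read()` to `parse_fasta` instead of the toy string.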

- Go to [Clustal Omega](https://www.ebi.ac.uk/jdispatcher/msa/clustalo), upload the FASTA file (or copy-paste its contents) and submit the job
- Explore the alignment and try to distinguish between the most and least conserved regions. 
- Go to the **Phylogenetic Tree** tab (and scroll down to *Phylogram*)
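
"Conserved" can also be expressed as a number. The optional sketch below scores each alignment column by the fraction of sequences that share the most common residue. This is just one simple measure chosen for illustration (it is not necessarily what the Clustal Omega viewer uses), and the function name `column_conservation` and the toy alignment are made up.

```python
from collections import Counter


def column_conservation(aligned_sequences):
    """Score each alignment column between 0 and 1.

    The score is the number of sequences carrying the most common
    non-gap residue, divided by the total number of sequences, so
    gaps lower the score of a column.
    """
    n_sequences = len(aligned_sequences)
    scores = []
    for column in zip(*aligned_sequences):
        residues = [c for c in column if c != "-"]
        if not residues:
            scores.append(0.0)  # column made only of gaps
            continue
        most_common_count = Counter(residues).most_common(1)[0][1]
        scores.append(most_common_count / n_sequences)
    return scores


# Made-up toy alignment (all rows must have the same length)
toy_alignment = [
    "MKT-LLV",
    "MKTALLI",
    "MRT-LLV",
    "MKSALLV",
]

scores = column_conservation(toy_alignment)
for position, score in enumerate(scores, start=1):
    print(position, score)
```

Columns scoring close to 1 are the most conserved; columns with low scores are variable or mostly gaps.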


**Discussion point:**
- Consider the position of the organisms in the tree. Is this what you would expect? 🤔 

----------

### Exercise 1.2

**ATP synthase** is a membrane-bound enzyme that uses an electrochemical proton gradient to synthesise ATP. It is a large protein complex composed of multiple subunits. 

![ATP synthase](files/atps.png)

Inside the `files/` folder you will find the file [atps_a.faa](files/atps_a.faa), which contains protein sequences for **subunit a** of the enzyme complex retrieved from different organisms. 

> Repeat the previous exercise using this protein.


**Discussion points**:

- Does the phylogenetic tree look similar to the previous one? 🤔
- Can you say something about the evolution of the species by looking at the evolution of individual genes?

-----------
### Exercise 1.3

Ribosomes are universal *"molecular machines"* present in all life forms. They are large complexes composed of ribosomal RNA (orange) and proteins (blue):

![ribosome](files/ribosome.png)

The (non-protein-coding) ribosomal RNA sequences are typically used for phylogenetic identification of species because they evolve slowly and are widely conserved. In particular, we use the **16S** rRNA in prokaryotes and the **18S** rRNA in eukaryotes, both of which are part of the small ribosomal subunit.

Inside the  `files/` folder you will find the file [18S.fna](files/18S.fna) that contains 18S rRNA sequences for the different organisms we analysed before. Let's once again build a phylogenetic tree.

- Go to [Clustal Omega](https://www.ebi.ac.uk/jdispatcher/msa/clustalo)
  - Remember to change the input format to **RNA**!
- Upload the FASTA file (or copy-paste its contents) and submit
- Go to *Phylogenetic Tree* 
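
Many tree-building methods start from pairwise distances between the aligned sequences. The optional sketch below computes the simplest such distance, the proportion of differing positions (often called the p-distance), on a made-up toy alignment. It is only meant to illustrate the idea and is not necessarily the exact calculation Clustal Omega performs; the function name, the labels and the sequences are all hypothetical.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing positions between two aligned sequences.

    Columns with a gap in either sequence are ignored.
    """
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("the two sequences share no ungapped columns")
    mismatches = sum(a != b for a, b in compared)
    return mismatches / len(compared)


# Made-up aligned RNA fragments, NOT real 18S sequences
toy_alignment = {
    "human": "ACGUACGUAC",
    "mouse": "ACGUACGUAU",
    "yeast": "ACGAAC-UUU",
}

distances = {
    (name_a, name_b): p_distance(toy_alignment[name_a], toy_alignment[name_b])
    for name_a, name_b in combinations(toy_alignment, 2)
}

for pair, distance in distances.items():
    print(pair, round(distance, 3))
```

In this toy example the two mammal-labelled sequences are the closest pair, so a distance-based method would join them first.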

**Discussion points**:

- Does this tree look like what you expected?
- 🧠 Can you think of a better way to build a phylogenetic **species tree**? 

--------
### Exercise 2

You will notice that many bioinformatics tools are interconnected and that there are often several ways of doing the same task.

Here is a different way to run a multiple sequence alignment:

- Go to [**UniProt**](https://www.uniprot.org/) and search for **p53**(*)
- Randomly select several entries (or click on the top-right to select the top 25)
- Click on **Align** (near the top of the results list) and submit the job on the next page
- Explore the result to see the evolution of this gene across species
- Are some regions more frequently mutated than others? 

(*) **p53** is an important tumor-suppressor protein, often called *The Guardian of the Genome*. If you have never heard of it, [you definitely should read about it](https://www.nature.com/articles/d41586-022-00567-9).

------
## Wrap-up

Today you did not have to write code, so you probably finished this tutorial quite quickly. 

Don't be shy: raise your hand and ask a question or start a (science-related) discussion 🙂