# Submodule 4
## Analyze Phylogenetic Tree
### 4.1 Interpret and Visually Represent Phylogenetic Trees

Visualization tools are essential for interpreting and presenting phylogenetic trees.

**Tools for Tree Visualization:**

iTOL (Interactive Tree of Life) is an online tool for displaying and annotating phylogenetic trees.
- Upload the seq_output.nwk file on the iTOL upload page: https://itol.embl.de/upload.cgi
- Visualize and customize your phylogenetic tree as needed.
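
Before uploading, it can help to check locally that the Newick file parses and contains the expected tips. The following is a minimal sketch using Biopython's `Bio.Phylo` module, assuming Biopython is already installed; the Newick string is made-up example data standing in for the contents of seq_output.nwk.

```python
from io import StringIO

from Bio import Phylo

# Made-up Newick string standing in for the contents of seq_output.nwk.
newick_text = "((A:0.1,B:0.2):0.3,(C:0.3,D:0.4):0.5);"

# Parse the tree; for a real file, pass the file name instead of the handle.
tree = Phylo.read(StringIO(newick_text), "newick")
tree.ladderize()  # sort clades by size for a tidier layout

# List the tip (leaf) names to confirm every sequence is present.
tip_names = [tip.name for tip in tree.get_terminals()]
print(f"{len(tip_names)} tips: {tip_names}")

# Print a plain-text rendering of the tree.
Phylo.draw_ascii(tree)
```

To check your own tree, replace the handle with the file name, for example `Phylo.read("seq_output.nwk", "newick")`.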


### 4.2 Importance of Visual Representation
**Visual representation of phylogenetic trees aids in:**
- Interpreting Results: Makes it easier to understand evolutionary relationships.
- Communication: Helps in conveying findings to a broader audience, including those who may not be specialists in phylogenetics.
- Highlighting Key Features: Emphasizes important evolutionary events and patterns.

### 4.3 Conduct Comparative Metagenomics along Different Branches
Comparative metagenomics compares the genetic content of different samples to uncover variation between them.

**Steps for Comparative Metagenomics:**

1. Installing BLAST:
- Install BLAST using conda:

In [None]:
!conda install -c bioconda blast

2. Creating a BLAST Database:
- Create a BLAST database from your sequence file:

In [None]:
!makeblastdb -in sequences.fasta -dbtype nucl -out seq_database

3. Running BLAST:
- Create a new file query_sequences.fasta containing the sequences you want to compare.
- Run BLAST to compare your query sequences against the database. The tab-separated results (-outfmt 6) will be saved in seq_results.txt:

In [None]:
!blastn -query query_sequences.fasta -db seq_database -out seq_results.txt -outfmt 6
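
The tabular file produced by `-outfmt 6` has twelve columns in BLAST's default order and no header line. The sketch below shows one way to read such a file and keep only hits under an e-value cutoff. The function name `read_hits`, the cutoff of 1e-5, and the sample rows are illustrative assumptions, not part of the original workflow.

```python
import csv
from io import StringIO

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def read_hits(handle, max_evalue=1e-5):
    """Yield hits as dicts, keeping only those with evalue <= max_evalue."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        hit = dict(zip(COLUMNS, row))
        hit["pident"] = float(hit["pident"])
        hit["evalue"] = float(hit["evalue"])
        hit["bitscore"] = float(hit["bitscore"])
        if hit["evalue"] <= max_evalue:
            yield hit


# Made-up example rows; use open("seq_results.txt") for real data.
sample = (
    "query1\tseqA\t98.50\t200\t3\t0\t1\t200\t1\t200\t1e-100\t360\n"
    "query1\tseqB\t80.00\t50\t10\t0\t1\t50\t5\t54\t0.5\t30.1\n"
    "query2\tseqA\t91.20\t180\t16\t0\t1\t180\t10\t189\t3e-60\t250\n"
)
hits = list(read_hits(StringIO(sample)))
for hit in hits:
    print(hit["qseqid"], hit["sseqid"], hit["pident"], hit["evalue"])
```

Filtering on e-value first keeps weak, likely spurious matches out of any later comparison between samples.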

### 4.4 Automate Comparative Metagenomics Analysis using Biopython
Automation can streamline comparative metagenomics analysis, making it more efficient.

**Script for Automation:**

In [2]:
!pip install biopython

Collecting biopython
  Downloading biopython-1.84-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (12 kB)
Collecting numpy (from biopython)
  Downloading numpy-2.0.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
Downloading biopython-1.84-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.2 MB)
Downloading numpy-2.0.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.5 MB)
Installing collected packages: numpy, biopython
Successfully installed biopython-1.84 numpy-2.0.1


In [3]:
import subprocess

from Bio.Blast import NCBIXML

# Function to run a local blastn search against a database built with makeblastdb.
# Results are written as XML (-outfmt 5), the format that NCBIXML parses.
def run_blast(query_file, db_file, output_file):
    subprocess.run(
        ["blastn", "-query", query_file, "-db", db_file,
         "-out", output_file, "-outfmt", "5"],
        check=True,  # raise an error if blastn exits with a failure
    )

# Run the BLAST
run_blast("data/cov/query_sequences.fasta", "data/cov/seq_database", "blast_results.xml")

# Parse the BLAST results
with open("blast_results.xml") as result_handle:
    blast_records = NCBIXML.parse(result_handle)
    for blast_record in blast_records:
        for alignment in blast_record.alignments:
            for hsp in alignment.hsps:
                print("****Alignment****")
                print(f"sequence: {alignment.title}")
                print(f"length: {alignment.length}")
                print(f"e value: {hsp.expect}")
                # Show the first 75 columns of the aligned query, match line, and subject
                print(f"{hsp.query[0:75]}...")
                print(f"{hsp.match[0:75]}...")
                print(f"{hsp.sbjct[0:75]}...")


### 4.5 Discuss Insights from Ancestral State Reconstruction
Ancestral state reconstruction provides insights into:
- Evolutionary Dynamics: Understanding how certain traits or genetic sequences have evolved over time.
- Diversity: Gaining a deeper understanding of the diversity within and between metagenomic samples.
- Evolutionary Pressures: Identifying the evolutionary pressures that have shaped the genetic makeup of organisms.
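
To make the idea concrete, here is a toy sketch of ancestral state reconstruction using Fitch parsimony. This is a simpler technique than the Bayesian approach used with BEAST in the next section and is shown only for illustration. The tree, tip names, and nucleotide states are made up, and only the bottom-up pass of the algorithm is implemented.

```python
def fitch(tree, tip_states):
    """Return (state_set, changes) for the root of a binary tree.

    tree is either a tip name (str) or a pair (left, right) of subtrees.
    state_set holds the most parsimonious candidate states at that node;
    changes counts the minimum number of state changes in the subtree.
    """
    if isinstance(tree, str):
        return {tip_states[tree]}, 0
    left_set, left_changes = fitch(tree[0], tip_states)
    right_set, right_changes = fitch(tree[1], tip_states)
    shared = left_set & right_set
    if shared:
        # The children agree on at least one state: no extra change needed.
        return shared, left_changes + right_changes
    # The children disagree: keep both options and count one change.
    return left_set | right_set, left_changes + right_changes + 1


# Made-up four-tip tree ((A,B),(C,D)) with one nucleotide per tip.
tree = (("A", "B"), ("C", "D"))
tip_states = {"A": "G", "B": "G", "C": "G", "D": "T"}

root_states, changes = fitch(tree, tip_states)
print(f"candidate root states: {root_states}, minimum changes: {changes}")
```

In this toy case the inferred ancestral state is G, with a single change needed on the branch leading to D. Bayesian methods go further by attaching posterior probabilities to each candidate state.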


### 4.6 Utilize Bayesian Inference Methods with BEAST for Ancestral State Reconstruction
Bayesian inference methods are powerful for reconstructing ancestral states and understanding evolutionary dynamics.

**Using BEAST for Ancestral State Reconstruction:**

1. Installing BEAST:
- Install BEAST using conda:

In [None]:
!conda install -c bioconda beast

- You can check the installation and list the BEAGLE resources available to BEAST with:

In [2]:
!beast -beagle_info


        BEAST v1.10.4 Prerelease #bc6cbd9, 2002-2018
       Bayesian Evolutionary Analysis Sampling Trees
                 Designed and developed by
   Alexei J. Drummond, Andrew Rambaut and Marc A. Suchard
                              
               Department of Computer Science
                   University of Auckland
                  alexei@cs.auckland.ac.nz
                              
             Institute of Evolutionary Biology
                  University of Edinburgh
                     a.rambaut@ed.ac.uk
                              
              David Geffen School of Medicine
           University of California, Los Angeles
                     msuchard@ucla.edu
                              
                Downloads, Help & Resources:
                  	http://beast.community
                              
Source code distributed under the GNU Lesser General Public License:
          	http://github.com/beast-dev/beast-mcmc
                              
      

2. Launching BEAUti:
- Find the path to the BEAUti software:

In [5]:
!find $CONDA_PREFIX -name "beauti"

/home/anushuyabaidya/anaconda3/envs/phylo/bin/beauti


- Open BEAUti by running the path returned above from your command line.
    - Example:
        - /path/to/bin/beauti

3. Using BEAUti:
- In BEAUti, go to File > Import and load your aligned_sequences.fasta.
- Set up the parameters for your analysis and generate the BEAST XML file.
- Save the configuration as seq_config.xml.

4. Running BEAST:
- Run BEAST with the configuration file:


In [None]:
!beast seq_config.xml
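
Once the run finishes, BEAST writes its sampled parameter values to a tab-separated log file whose name is set in BEAUti. The sketch below summarizes such a log by discarding an initial burn-in and averaging the remaining samples. It assumes comment lines start with `#` and the first column is named `state`. The function name `posterior_means`, the 20% burn-in, and the sample values are illustrative assumptions.

```python
import csv
from io import StringIO
from statistics import mean


def posterior_means(handle, burnin_fraction=0.1):
    """Return the mean of each logged column after discarding the burn-in."""
    # Drop blank lines and '#' comment lines before reading the table.
    rows = [line for line in handle if line.strip() and not line.startswith("#")]
    reader = csv.DictReader(rows, delimiter="\t")
    samples = list(reader)
    kept = samples[int(len(samples) * burnin_fraction):]
    columns = [name for name in reader.fieldnames if name != "state"]
    return {name: mean(float(s[name]) for s in kept) for name in columns}


# Made-up log contents; use open(...) on your own log file for real data.
sample_log = (
    "# example log, values are made up\n"
    "state\tposterior\tlikelihood\n"
    "0\t-1200.0\t-1100.0\n"
    "1000\t-1010.0\t-1000.0\n"
    "2000\t-1000.0\t-990.0\n"
    "3000\t-1005.0\t-995.0\n"
    "4000\t-1005.0\t-995.0\n"
)
summary = posterior_means(StringIO(sample_log), burnin_fraction=0.2)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

This only gives point summaries; checking convergence and effective sample sizes still requires a dedicated tool or further analysis.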