# Biopython

To install Biopython (this only needs to be done once):
1. Go to File > New > Terminal
2. Copy-paste and run: `conda install -c conda-forge biopython`

In [None]:
# conda install -c conda-forge biopython

## FASTA files
**DOCUMENTATION:** https://biopython.org/wiki/SeqIO

### Basic example of loading a FASTA file

In [None]:
# Importing the relevant package
from Bio import SeqIO

In [None]:
# File name and path
fname = "./input/EHD_nucleotide.richFasta"

In [None]:
with open(fname) as handle:
    for record in SeqIO.parse(handle, "fasta"):
        # Print the sequence ID and the first 50 bases
        print(record.id, record.seq[:50])
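
To try the same loop without the input file, here is a minimal sketch that parses FASTA text held in memory. The two records in `fasta_text` are hypothetical and exist only for illustration. It shows that `record.id` is the first word of the header line, while `record.description` holds the whole header.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical two-record FASTA text, used only for illustration
fasta_text = """>seq1 first example record
ATGCGTACGTTAGC
>seq2 second example record
GGGTTTAAACCC
"""

# SeqIO.parse returns an iterator; list() collects all records
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))

for record in records:
    # ID, sequence length, and full header line
    print(record.id, len(record.seq), record.description)
```

`SeqIO.parse` also accepts a file name directly, so the `with open(...)` wrapper is optional when reading from disk.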

### Creating dictionary of sequences with IDs
A Python dictionary is a convenient way to hold the records of a FASTA file in memory. By default, `SeqIO.to_dict` uses each record's ID as the key.

In [None]:
with open(fname) as handle:
    record_dict = SeqIO.to_dict(SeqIO.parse(handle, "fasta"))

type(record_dict)
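
As a small sketch of how the keys behave, the example below builds dictionaries from hypothetical in-memory FASTA text (the names `fasta_text`, `dup_text`, `demo_dict`, and `upper_dict` are made up for illustration). It uses the optional `key_function` argument to choose a different key, and shows that duplicate IDs raise a `ValueError`.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical FASTA text, used only for illustration
fasta_text = ">seq1 demo\nATGC\n>seq2 demo\nGGCC\n"

# Default behaviour: keys are the record IDs
demo_dict = SeqIO.to_dict(SeqIO.parse(StringIO(fasta_text), "fasta"))
print(sorted(demo_dict))

# key_function lets you derive the key from each record
upper_dict = SeqIO.to_dict(
    SeqIO.parse(StringIO(fasta_text), "fasta"),
    key_function=lambda rec: rec.id.upper(),
)
print(sorted(upper_dict))

# Two records sharing one ID cannot be stored under the same key
dup_text = ">seq1\nATGC\n>seq1\nGGCC\n"
try:
    SeqIO.to_dict(SeqIO.parse(StringIO(dup_text), "fasta"))
    duplicate_error = None
except ValueError as err:
    duplicate_error = str(err)
print("Duplicate IDs rejected:", duplicate_error is not None)
```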

In [None]:
# Look at one dictionary entry: what type is it, and what attributes does it have?
print(type(record_dict["NM_079608.2"]))

dir(record_dict["NM_079608.2"])

In [None]:
record_dict["NM_079608.2"].description

In [None]:
# Display the full record (any record ID from the file works)
record_dict["NM_079608.2"]

In [None]:
record_dict["NM_079608.2"].letter_annotations
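
`letter_annotations` holds per-letter data such as base qualities, which FASTA files do not carry, so for a FASTA record it is expected to be empty. The sketch below contrasts a FASTA record with a FASTQ record. Both are hypothetical in-memory examples, not taken from the input file.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical single-record inputs, used only for illustration
fasta_text = ">read1\nACGT\n"
fastq_text = "@read1\nACGT\n+\nIIII\n"

fasta_record = next(SeqIO.parse(StringIO(fasta_text), "fasta"))
fastq_record = next(SeqIO.parse(StringIO(fastq_text), "fastq"))

# FASTA has no per-base information, so nothing is stored here
print(dict(fasta_record.letter_annotations))

# FASTQ stores one Phred quality score per base
qualities = fastq_record.letter_annotations["phred_quality"]
print(qualities)
```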

## Phylogenetics

In [None]:
from Bio import Phylo

# Read a single tree from a Newick file
tree = Phylo.read("./input/simple.dnd", "newick")

In [None]:
print(tree)

In [None]:
# Mark the tree as rooted, then draw it (Phylo.draw requires matplotlib)
tree.rooted = True
Phylo.draw(tree)
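
If matplotlib or the input file is not available, a tree can still be explored in plain text. This is a minimal sketch using a hypothetical Newick string (`newick_text` is an assumption for illustration, not the contents of the input file). It lists the leaf names and prints an ASCII rendering with `Phylo.draw_ascii`.

```python
from io import StringIO

from Bio import Phylo

# Hypothetical Newick string, used only for illustration
newick_text = "(((A,B),(C,D)),(E,F,G));"
demo_tree = Phylo.read(StringIO(newick_text), "newick")

# Terminal clades are the leaves of the tree
leaf_names = [leaf.name for leaf in demo_tree.get_terminals()]
print(leaf_names)
print("Number of leaves:", demo_tree.count_terminals())

# Text-only drawing; does not need matplotlib
Phylo.draw_ascii(demo_tree)
```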