# 4. Introduction to Biopython

Biopython is a set of freely available tools for biological computation written in Python. It is a distributed collaborative effort to develop Python libraries and applications which address the needs of current and future work in bioinformatics. Before diving into your (biological) data and analyzing it with complex self-written scripts, it makes sense to search the [Biopython documentation](http://biopython.org/DIST/docs/tutorial/Tutorial.html) for out-of-the-box solutions that Biopython already provides.

<img src="img/logo_biopython.PNG" width="500" height="300"/>


In the next few chapters, we'll learn some of Biopython's most frequently used functionalities. Before Biopython can be imported, it has to be installed. Installing the complete package can be done:
- Using Anaconda's environments and searching for the package, or
- Directly in a Notebook, by uncommenting the `pip install` line in the following cell:

In [1]:
# pip install biopython 

# Import the Biopython library
import Bio

If this runs without an error, then you're probably good to go. You can verify this by asking Python which version you have installed.

In [2]:
# Check the version to confirm the installation (v1.74 here)
print(Bio.__version__)

1.74
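
If you would rather have the notebook report a missing installation than fail with a traceback, the check can be wrapped in a small helper. This is only a sketch: `biopython_status` is a hypothetical name chosen for this example and is not part of Biopython.

```python
import importlib.util


def biopython_status():
    """Return a short message saying whether Biopython can be imported.

    Hypothetical helper for illustration; not part of Biopython itself.
    """
    # find_spec returns None when the top-level package "Bio" is not installed
    if importlib.util.find_spec("Bio") is None:
        return "Biopython not found; install it with: pip install biopython"
    import Bio
    return "Biopython " + Bio.__version__ + " is installed"


print(biopython_status())
```

The printed version number depends on your own environment, so it may differ from the 1.74 shown above.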


Of course, it makes sense to import only the classes, functions or submodules of Biopython that you need, which makes them easier to use. For example, if you want to work with sequences, you can import the `Seq` object in the following way. This will allow you to work directly with the `Seq` (sequence) object.

In [5]:
from Bio.Seq import Seq

In [12]:
help(Seq)

Help on class Seq in module Bio.Seq:

class Seq(builtins.object)
 |  Seq(data, alphabet=Alphabet())
 |  
 |  Read-only sequence object (essentially a string with an alphabet).
 |  
 |  Like normal python strings, our basic sequence object is immutable.
 |  This prevents you from doing my_seq[5] = "A" for example, but does allow
 |  Seq objects to be used as dictionary keys.
 |  
 |  The Seq object provides a number of string like methods (such as count,
 |  find, split and strip), which are alphabet aware where appropriate.
 |  
 |  In addition to the string like sequence, the Seq object has an alphabet
 |  property. This is an instance of an Alphabet class from Bio.Alphabet,
 |  for example generic DNA, or IUPAC DNA. This describes the type of molecule
 |  (e.g. RNA, DNA, protein) and may also indicate the expected symbols
 |  (letters).
 |  
 |  The Seq object also provides some biological methods, such as complement,
 |  reverse_complement, transcribe, back_transcribe and translate 
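
To make the help text above more concrete, here is a minimal sketch of the string-like and biological methods it mentions. The DNA sequence is made up for illustration, and the optional `alphabet` argument is left out so that the example should also run on newer Biopython releases, where `Bio.Alphabet` is no longer available.

```python
from Bio.Seq import Seq

# A short, made-up coding sequence used only for illustration
my_seq = Seq("ATGGCCATTGTAATGGGCCGCTGA")

# String-like behaviour: length and counting
print(len(my_seq))          # 24
print(my_seq.count("G"))    # 8

# Biological methods listed in the help text
print(my_seq.complement())          # TACCGGTAACATTACCCGGCGACT
print(my_seq.reverse_complement())  # TCAGCGGCCCATTACAATGGCCAT
print(my_seq.transcribe())          # AUGGCCAUUGUAAUGGGCCGCUGA
print(my_seq.translate())           # MAIVMGR*
```

Each of these methods returns a new `Seq` object rather than changing `my_seq`, which fits the immutability described in the help text.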

Biopython has many ways of working with sequence data. Today we'll have a look at the following components:
- Working with sequences: `Seq` and `Alphabet`,
- Sequence annotations: `SeqRecord` objects,
- Reading, writing and parsing files: `SeqIO`,
- Querying NCBI: `Entrez`,
- BLAST from within Python