Skip to content
Fetching contributors…
Cannot retrieve contributors at this time
226 lines (155 sloc) 8.12 KB
Biopython README file
The Biopython Project is an international association of developers of freely
available Python tools for computational molecular biology.
Our website provides an online resource for modules,
scripts, and web links for developers of Python-based software for life
science research. This is hosted by the Open Bioinformatics Foundation or
O|B|F,, who also host BioPerl etc.
This Biopython package is made available under generous terms. Please
see the LICENSE file for further details.
If you use Biopython in work contributing to a scientific publication, we ask
that you cite our application note (below) or one of the module specific
publications (listed on our website):
Cock, P.J.A. et al. Biopython: freely available Python tools for computational
molecular biology and bioinformatics. Bioinformatics 2009 Jun 1; 25(11) 1422-3 pmid:19304878
For the impatient
To build and install Biopython, download and unzip the source code, go to this
directory at the command line, and type:
python build
python test
sudo python install
System Requirements
- Python 2.5, 2.6 or 2.7, see
See below for notes about testing with Python 3.
- NumPy, see (optional, but strongly recommended)
This package is only used in the computationally-oriented modules.
It is required for Bio.Cluster, Bio.PDB and a few other modules. If you
think you might need these modules, then please install NumPy first BEFORE
installing Biopython. The older Numeric library is no longer supported in
- ReportLab, see (optional)
This package is only used in Bio.Graphics, so if you do not need this
functionality, you will not need to install this package. You can install
it later if needed.
- matplotlib, see (optional)
The Bio.Phylo uses this package to plot phylogenetic trees. As with
ReportLab, you can install this at any time to enable the plotting
- networkx, see (optional) and
pygraphviz or pydot, see and (optional)
These packages are used for certain niche functions in Bio.Phylo.
Again, they are only needed to enable these functions and can be installed
later if needed.
- psycopg2, see (optional) or
PyGreSQL (pgdb), see (optional)
These packages are used by BioSQL to access a PostgreSQL database.
- MySQLdb, see (optional)
This package is used by BioSQL to access a MySQL database.
In addition there are a number of useful third party tools you may wish to
install such as standalone NCBI BLAST, EMBOSS or ClustalW.
Python 3.x
Note we do not intend to support Python 3.0, please use Python 3.1 or later.
Python 3 support is still incomplete (and we do not yet officially support
using Biopython with Python 3), but the majority of modules are largely
functional. In order to be able to run Biopython with Python 3 it is first
converted using the 2to3 script. This is done automatically by the
script. Use `python3 build` etc (see installation notes below).
**Make sure that Python is installed correctly**
Installation should be as simple as going to the biopython source code
directory, and typing:
python build
python test
sudo python install
If you need to do additional configuration, e.g. changing the base
directory, please type `python`, or see the documentation for
Biopython includes a suite of regression tests to check if everything is
running correctly. To do the tests, go to the biopython source code directory
and type:
python test
Do not panic if you see messages warning of skipped tests:
test_DocSQL ... skipping. Install MySQLdb if you want to use Bio.DocSQL.
This most likely means that a package is not installed. You can
ignore this if it occurs in the tests for a module that you were not
planning on using. If you did want to use that module, please install
the required dependency and re-run the tests.
Source Code Repository
Like most of the Open Bioinformatics Foundation (OBF) projects, Biopython
currently uses git for distributed source code control, hosted by github
Until late September 2009, Biopython source code was kept in CVS running
on the OBF servers. These will remain online for a while as "read only"
static backups of everything up to and including Biopython 1.52.
Experimental code
Biopython 1.61 introduced a new warning, Bio.BiopythonExperimentalWarning,
which will be used to mark any experimental code included in the otherwise
stable Biopython releases. Such 'beta' level code is ready for wider
testing, but still likely to change, and should only be tried by early
adopters in order to give feedback via the biopython-dev mailing list.
We'd expect such experimental code to reach stable status within one or two
releases, at which point our normal policies about trying to preserve
backwards compatibility would apply.
While we try to ship a robust package, bugs inevitably pop up. If you are
having problems that might be caused by a bug in Biopython, it is possible
that it has already been identified. Update to the latest release if you are
not using it already, and retry. If the problem persists, please search our
bug database at and our
mailing lists to see if it has already been reported (and hopefully fixed),
and if not please do report the bug. We can't fix problems we don't know
about ;)
If you suspect the problem lies within a parser, it is likely that the data
format has changed and broken the parsing code. (The text BLAST and GenBank
formats seem to be particularly fragile.) Thus, the parsing code in
Biopython is sometimes updated faster than we can build Biopython releases.
You can get the most recent parser by pulling the relevant files (e.g. the
ones in Bio.SeqIO or Bio.Blast) from our git repository. However, be careful
when doing this, because the code in github is not as well-tested as released
code, and may contain new dependencies.
Finally, you can send a bug report to the bug database or the mailing list at In the bug report, please let us know:
1. Which operating system and hardware (32 bit or 64 bit) you are using
2. Python version
3. Biopython version (or git version/date)
4. Traceback that occurs (the full error message)
And also ideally:
5. Example code that breaks
6. A data file that causes the problem
Contributing, Bug Reports
Biopython is run by volunteers from all over the world, with many types of
backgrounds. We are always looking for people interested in helping with code
development, web-site management, documentation writing, technical
administration, and whatever else comes up.
If you wish to contribute, please visit the web site
and join our mailing list:
Distribution Structure
README -- This file.
NEWS -- Release notes and news
LICENSE -- What you can do with the code.
CONTRIB -- An (incomplete) list of people who helped Biopython in
one way or another.
DEPRECATED -- Contains information about modules in Biopython that are
removed or no longer recommended for use, and how to update
code that uses those modules. -- Tells distutils what files to distribute -- Installation file.
Bio/ -- The main code base code.
BioSQL/ -- Code for using Biopython with BioSQL databases.
Doc/ -- Documentation.
Scripts/ -- Miscellaneous, possibly useful, standalone scripts
Tests/ -- Regression testing code
Something went wrong with that request. Please try again.