Generic Genome Browser Installation
GBrowse is distributed as binary packages for Windows and Macintosh OS
X, and as source code for Unix systems. For binary installations, please
see the online instructions located at
The remainder of this documentation contains instructions for a
source code (manual) build.
Source Code (Manual) Build
GBrowse runs on top of several software packages. These must be
installed and configured before you can run GBrowse. Most preconfigured
Linux systems will have some of these packages installed already.
A) MySQL -- <>
The MySQL database is a fast open source relational database that is
widely used for web applications. For small projects (a few
thousand annotated features), you can skip installing MySQL and
use an in-memory database instead.
B) Apache Web Server -- <>
The Apache web server is the industry standard open source web
server for Unix and Windows systems.
C) Perl 5.005 -- <>
The Perl language is widely used for web applications. Version 5.6
is preferred, but 5.00503 or higher will work.
D) Standard Perl modules -- <>
The following Perl modules must be installed for GBrowse to work.
They can be found on the Comprehensive Perl Archive Network (CPAN):
CGI (2.56 or higher)
GD (2.07 or higher)
CGI::Session (4.03 or higher)
DBI (any version)
DBD::mysql (any version)
Digest::MD5 (any version)
Text::Shellwords (any version)
Class::Base (any version)
E) BioPerl version 1.6 or higher -- <>
Or 'bioperl-live'.
F) Bio::Graphics 1.97 or higher
Bio::Graphics used to be part of BioPerl but has been broken out as
a separate package that can be installed from CPAN.
Optional modules:
F) XML::Parser, XML::Writer, XML::Twig, XML::DOM
If these modules are present, the "Sequence Dumper" plugin will be
able to produce GAME and BSML output. They can be downloaded from
To load remote third-party annotations. Available from CPAN.
H) Bio::Das
To display remote annotations using the Distributed Annotation
System. The current version is available at
Needed by gbrowse_moby to fetch and display data from MOBY
providers. Available from; obtain via anonymous cvs
until it is released. Directions are at
To save images as publication-quality editable images in Scalable
Vector Graphics (SVG) format. Available from CPAN.
K) Bio::SCF File::Temp io-lib(v1.7+)
Needed by the trace glyph which can parse SCF files and display the
trace graph. The io-lib library can be downloaded from
e_id=108243 which is part of the Staden Package
Once the prerequisites are installed, download the most recent version
of the Generic-Genome-Browser source code from:
This will give you a .tar.gz file, which must be uncompressed and
unpacked. Then run the following commands (in brief):
perl Makefile.PL
make test (optional)
make install UNINST=1
This will install the software in the default location under
/usr/local/apache. See "Details" to change this, or to install gbrowse
into your home directory. The 'UNINST=1' will ensure that older versions
of the Perl modules being installed are removed, to help prevent
conflicts.
To further configure GBrowse, see CONFIGURE_HOWTO. To run GBrowse on top
of Oracle and PostgreSQL databases see ORACLE_AND_POSTGRESQL. To run on
top of a BioSQL database, see BIOSQL_ADAPTER_HOWTO.
The browser consists of a CGI script named "gbrowse", a Perl module that
handles some of the gory details, a small number of static image files,
and a configuration directory that contains configuration files for each
data source. The correct locations of the CGI script, configuration
directory and static files depend on how Apache was installed on your
system, which varies from operating system to operating system, and are
controlled by the following installation options:
CGI script: /usr/local/apache/cgi-bin/gbrowse
Static images: /usr/local/apache/htdocs/gbrowse
Config files: /usr/local/apache/conf/gbrowse.conf
The module: -standard site-specific Perl library location-
You can change the location of the installation by passing
Makefile.PL one or more NAME=VALUE pairs, like so:
perl Makefile.PL CONF=/etc HTDOCS=/home/html
This will cause the configuration files to be installed in
/etc/gbrowse.conf and the static files to be installed in
Fortunately, this isn't usually necessary. The Makefile.PL script
attempts to guess the appropriate directory locations for your system,
but sometimes you will have to specify them manually. For example, if
you are on an unusual system, where the Apache installation uses
/opt/www/html for HTML files, /opt/run/cgi-bin for CGI scripts, and
/etc/httpd/conf for the configuration files, you should specify the
following configuration:
perl Makefile.PL HTDOCS=/opt/www/html \
                 CONF=/etc/httpd/conf \
                 CGIBIN=/opt/run/cgi-bin
As a convenience, you can use the configuration option APACHE, in which
case the configuration, static and CGI files will be placed into
APACHE/conf, APACHE/htdocs and APACHE/cgi-bin respectively, where APACHE
is the location you specified on the command line:
perl Makefile.PL APACHE=/home/www
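As a rough illustration of how a single APACHE prefix maps onto the three
install locations, the following shell sketch prints the paths implied by
the layout described above. The function name gbrowse_layout is
hypothetical and is not part of the GBrowse distribution; Makefile.PL does
the real path selection.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with GBrowse): print where the files
# would land for a given APACHE prefix, following the documented layout.
gbrowse_layout() {
    apache=$1
    echo "Config files: $apache/conf/gbrowse.conf"
    echo "Static files: $apache/htdocs/gbrowse"
    echo "CGI script: $apache/cgi-bin/gbrowse"
}

# Example: the prefix used in the command above.
gbrowse_layout /home/www
```

Running the sketch before the real install is a cheap way to check that
the prefix you intend to pass points where you expect.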
Note that the configuration files are always placed in a subdirectory
named gbrowse.conf. You cannot change this. Similarly, the static files
are placed in a directory named gbrowse. The install script will detect
if there are already configuration files in the selected directory and
not overwrite them if so. The same applies to the cascading stylesheet
file (gbrowse.css) located in the gbrowse subdirectory. However, neither
the GIF files in the "buttons" subdirectory nor the plugin modules in
the gbrowse.conf/plugins directory are checked before being overwritten,
so be careful to copy your versions somewhere safe if you have modified
them.
The DO_XS flag, if true (perl Makefile.PL DO_XS=1), will compile a small
C subroutine for nucleotide alignments. This will vastly improve the
performance of the gbrowse_details script when displaying alignments. To
use this feature, you will need a C compiler.
You can always manually move the files around after install. See
CONFIGURE_HOWTO for details.
When installing the static files, the install script also creates an
empty directory named "tmp". This directory is set to be world writable
so that the GBrowse server can use it to manage temporary image files
that it creates on the fly. If you would prefer not to have a world
writable directory on your system, simply change the ownership and
permissions to allow the web server account to write into it. The
directory is located in HTDOCS/gbrowse/tmp by default.
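One way to tighten these permissions is sketched below. It assumes that
the web server account can be made the owner of the directory. The
function name restrict_tmp is hypothetical, and the account name you pass
(for example "nobody") depends on your system. Changing ownership normally
requires root, so the chown step is skipped when no owner is given.

```shell
#!/bin/sh
# restrict_tmp DIR [OWNER]
# Hypothetical helper: remove world-write access from the GBrowse tmp
# directory, optionally handing it to the web server account first.
restrict_tmp() {
    dir=$1
    owner=${2:-}
    if [ -n "$owner" ]; then
        # Usually needs root privileges.
        chown "$owner" "$dir" || return 1
    fi
    # Owner may read, write and traverse; everyone else may only read
    # and traverse.
    chmod 755 "$dir"
}
```

A typical call, run as root, would be
"restrict_tmp /usr/local/apache/htdocs/gbrowse/tmp nobody".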
The first time you run Makefile.PL, a file named GGB.def will be created
containing your file path settings. When Makefile.PL is run again, it
will ask you whether you wish to reuse the settings stored in the file.
Read this section only if you are on a Unix system and do not have root
privileges. You will need to configure Apache to run out of your home
directory. One way to do this is to install Apache from source code and
to specify your home directory when you first configure it:
% cd apache_x.xx.xx
% ./configure --prefix=$HOME/apache
% make
% make install
This will place Apache into your home directory under ~/apache. You
should then edit ~/apache/conf/httpd.conf and replace the directive:
Listen 80
with
Listen 8000
so that Apache will listen for connections to the unprivileged port 8000
rather than the usual port 80. If you also see a "Port 80" directive,
change it to read "Port 8000." You will now be able to talk to Apache
using URLs like
You may not need to install Apache from scratch if your Unix
distribution already has Apache installed. What you will do is to create
an Apache directory tree in your home directory and then start Apache
using command-line arguments that tell it to start up from the home
directory rather than its default system-wide directory.
Create an Apache directory and its subdirectories using the following
series of commands:
% cd ~
% mkdir apache
% mkdir apache/conf
% mkdir apache/logs
% mkdir apache/htdocs
% mkdir apache/cgi-bin
Now copy the system-wide httpd.conf into ~/apache/conf. You may need to
search around a bit to find out where the system-wide httpd.conf lives
(try running the command "locate httpd.conf"):
% cp /etc/httpd/conf/httpd.conf ~/apache/conf
Now open up ~/apache/conf/httpd.conf with a text editor and add the
following four directives, replacing $HOME with the full path to your
home directory (for example "/home/fred"):
Listen 8000
ServerRoot $HOME/apache
DocumentRoot $HOME/apache/htdocs
You should search the httpd.conf file for older versions of these
directives, and delete them if they are there. If you see a Port
directive, change it to read "Port 8000".
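If you prefer to script the port change rather than edit the file by hand,
something along the following lines may work. This is a minimal sketch:
the function name set_port is hypothetical, and it only rewrites simple
"Listen <number>" and "Port <number>" lines. Address-qualified forms such
as "Listen 12.34.56.78:80" and commented-out lines are left untouched.

```shell
#!/bin/sh
# set_port CONF PORT
# Hypothetical helper: point the Listen and Port directives in a private
# copy of httpd.conf at an unprivileged port.
set_port() {
    conf=$1
    port=$2
    # Edit into a temporary file so a failed sed never truncates CONF.
    tmp=$(mktemp) || return 1
    sed -e "s/^Listen[[:space:]]\{1,\}[0-9]\{1,\}[[:space:]]*\$/Listen $port/" \
        -e "s/^Port[[:space:]]\{1,\}[0-9]\{1,\}[[:space:]]*\$/Port $port/" \
        "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return $status
}
```

For the layout used in this section the call would be
"set_port ~/apache/conf/httpd.conf 8000".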
Somewhere in httpd.conf there will be a ScriptAlias directive, as well
as a <Directory> section that refers to "cgi-bin". Delete the
ScriptAlias directive and the entire <Directory> section through to the
</Directory> line. Replace both these sections with the following:
ScriptAlias /cgi-bin/ "cgi-bin/"
<Location "/cgi-bin">
  AllowOverride None
  Options None
  Order allow,deny
  Allow from all
</Location>
You can now start Apache from the command line using the "apachectl"
command:
% /usr/sbin/apachectl -d ~/apache -k start
If Apache starts successfully, then this command will return silently.
Otherwise, it will print an error message. More error messages may be
found in ~/apache/logs/error_log.
To confirm that Apache is running from your home directory, create a
file named index.html and copy it into ~/apache/htdocs. You should then
be able to open a browser, connect to http://localhost:8000/, and see
the index.html file that you just created.
Now you can build and install gbrowse with the following incantation:
% cd Generic-Genome-Browser-X.XX
% perl Makefile.PL APACHE=~/apache LIB=~/lib BIN=~/bin NONROOT=1
% make
% make install
When you are prompted to load gbrowse using http://localhost/gbrowse,
use http://localhost:8000/gbrowse instead.
The installation procedure will create a small in-memory database of
yeast chromosome 1 for you to play with. To try the browser out, use
your favorite browser to open:
Try searching for "I" (the name of the first chromosome of yeast), or a
gene such as NUT21 or TCF3. Then try searching for "membrane
For your interest, the feature and DNA files for this database are
located in the Apache document root at gbrowse/databases/yeast_chr1. The
configuration file is in the web server configuration directory under
More configuration information and a short tutorial are located at:
This step takes you through populating the database with the full yeast
genome. You can skip this step if you use the in-memory database for
small projects (see section 6).
Remember as well that there are other database possibilities. For example
you could also use BioSQL (MySQL, Postgres, Oracle) or Chado (Postgres).
This example uses MySQL as it is relatively easy to set up.
mysql -uroot -ppassword -e 'create database yeast'
mysql -uroot -ppassword -e 'grant all privileges on yeast.* to me@localhost'
mysql -uroot -ppassword -e 'grant file on *.* to me@localhost'
mysql -uroot -ppassword -e 'grant select on yeast.* to nobody@localhost'
-d yeast sample_data/yeast_data.gff
Note: This section refers to the user account under which Apache runs
as "nobody" because that is the most common case. However, many
systems use a different user account. Mac OSX uses "www", Fedora Core
uses "apache" and Ubuntu uses "www-data." In the instructions that
follow, replace 'nobody' with the appropriate Apache account name.
You will need an installation of MySQL for this section. Using the mysql
command line, create a database (called "yeast" in the synopsis above),
and ensure that you have update and file privileges on it. The example
above assumes that you have a username of "me" and that you will allow
updates from the local machine only. It also gives all privileges to
"me". You may be comfortable with a more restricted set of privileges,
but be sure to provide at least SELECT, UPDATE and INSERT privileges.
You will need to provide the administrator's name and correct password
for these commands to succeed.
In addition, grant the "nobody" user the SELECT privilege. The web
server usually runs as nobody, and must be able to make queries on the
database. Modify this as needed if the web server runs under a different
account.
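The more restricted privilege set mentioned above could be generated with
a small shell sketch like the one below. The function name grant_sql is
hypothetical. The statements use the minimum privileges named in the text
(SELECT, INSERT and UPDATE for the loading account, SELECT for the web
server account). It is an assumption that this is enough for your loader:
a script that creates or drops tables will need more. On recent MySQL
releases the accounts may also have to be created with CREATE USER before
GRANT succeeds.

```shell
#!/bin/sh
# grant_sql DB OWNER WEBUSER
# Hypothetical helper: print the SQL for a restricted privilege set.
# Nothing is executed here; pipe the output into the mysql client.
grant_sql() {
    db=$1
    owner=$2
    webuser=$3
    cat <<EOF
CREATE DATABASE $db;
GRANT SELECT, INSERT, UPDATE ON $db.* TO '$owner'@'localhost';
GRANT FILE ON *.* TO '$owner'@'localhost';
GRANT SELECT ON $db.* TO '$webuser'@'localhost';
EOF
}
```

To apply the statements, pipe them to the client as the administrator,
for example "grant_sql yeast me nobody | mysql -uroot -p".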
The next step is to load the database with data. This is accomplished by
loading the database from a tab-delimited file containing the genomic
annotations in GFF format. The Bioperl distribution comes with three
tools for loading Bio::DB::GFF databases:
This will incrementally load a database, optionally initializing it
if it does not already exist. This script will work correctly even
if the MySQL server is located on another host.
This Perl script will initialize a new Bio::DB::GFF database with a
fresh schema, deleting anything that was there before. It will then
load the file. Only suitable for use the very first time you create
a database, or when you want to start from scratch! The bulk loader
is as much as 10x faster than, but does not work in
the situation in which the MySQL database is running on a remote
This will incrementally load a database. On UNIX systems, it will
activate a fast loader that makes the speed almost the same as the
bulk loader. Be careful, though, because this is an experimental
piece of software.
You will find these scripts in the Bioperl distribution, in the
subdirectory scripts/Bio-DB-GFF. If you requested that Bioperl scripts
be installed during installation, they will also be found in your
command path.
For testing purposes, this distribution includes a GFF file with yeast
genome annotations. The file can be found in the test_data subdirectory.
If the load is successful, you should see a message indicating that
13298 features were successfully loaded.
Provided that the yeast load was successful, you may now run "make
test". This invokes a small test script that tests that the database is
accessible by the "nobody" user and that the basic feature retrieval
functions are working.
You may also wish to load the yeast DNA, so that you can test the
three-frame translation and GC content features of the browser. Because
of its size, the file containing the complete yeast genome is
distributed separately and can be downloaded from:
Load the file with this command: -d yeast -fasta yeast.fasta.gz </dev/null
You should now be able to browse the yeast genome. Type the following
URL into your favorite browser:
This will display the genome browser instructions and a search field.
Type in "III" to start searching chromosome III, or search for "glucose"
to find a bunch of genes that are involved in glucose metabolism.
*IF YOU GET AN ERROR* examine the Apache server error log (depending on
how Apache was installed, it may be located in /usr/local/apache/logs/,
/var/log/httpd/, /var/log/apache, or elsewhere). Usually there will be
an informative error message in the error log. The most common problems
are MySQL password or permission errors.
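Because the log location varies, a small search helper can save some
hunting. The sketch below checks a list of candidate directories and
prints the first error log it finds. The function name find_error_log is
hypothetical. The file name error_log matches the one used elsewhere in
this document; error.log is an assumption about distributions that name
the file differently.

```shell
#!/bin/sh
# find_error_log DIR...
# Hypothetical helper: print the path of the first Apache error log
# found in the given directories; return non-zero if none exists.
find_error_log() {
    for dir in "$@"; do
        for name in error_log error.log; do
            if [ -f "$dir/$name" ]; then
                echo "$dir/$name"
                return 0
            fi
        done
    done
    return 1
}
```

A typical use is
"tail -n 20 "$(find_error_log /usr/local/apache/logs /var/log/httpd /var/log/apache)"",
run from an account that is allowed to read the log.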
7. GFF3 Loading
An increasing number of model organism databases are distributing genome
annotation in GFF3 format. An example of this format can be found at SGD
s_verevisiae.gff. Although these files will load into the standard
Bio::DB::GFF database, some of the features of GFF3, such as the ability
to represent multiple alternative splice forms as a single gene, will be
lost. We suggest instead that you use a Bio::DB::SeqFeature::Store
database.
Here is a quick recipe.
Get a gff3 file (available from SGD, WormBase, FlyBase and many other
sites) and save it as genome.gff3. Then launch the mysql command-line
client and run commands similar to these (be sure to replace the example
user names with correct ones as described earlier).
mysql -uroot -ppassword -e 'create database genomegff3'
mysql -uroot -ppassword -e 'grant all privileges on genomegff3.* to me@localhost'
mysql -uroot -ppassword -e 'grant select on genomegff3.* to nobody@localhost'
-d genomegff3 -f -c genome.gff3
Create a GBrowse config file by copying one of the existing examples,
and modify the top lines to read like the following:
db_adaptor = Bio::DB::SeqFeature::Store
db_args    = -adaptor DBI::mysql
             -dsn dbi:mysql:database=genomegff3
             -user nobody
The database should now be browsable. For more details, see
Sample genome feature tables for the major model organisms and human can
be found at in the downloads section, but they are
increasingly out of date. Please go to the individual model organism
databases' web sites to find the GFF or GFF3-format files you need. A
few notable sites are:
WormBase (C. elegans)
tables/ >
SGD (S. cerevisiae)
FlyBase (D. melanogaster)
In addition, the bin/ subdirectory of the GBrowse distribution contains
a series of scripts to convert annotation files in various formats into
GFF2 or GFF3 format. For example, the script will
convert gene models in Table Browser format files from
<> into GFF3 format. will download and load sequence annotation files in
GenBank format from NCBI. The sample configuration file 08.genbank.conf
(located in contrib/conf_files) is appropriate for data loaded with
To display the DNA sequence and to run sequence-dependent glyphs such as
the three-frame translation, you will need to load the DNA as well as
the annotations. The DNA must be formatted as a series of one or more
FASTA-format files in which each entry in the file corresponds to a
top-level sequence such as a chromosome pseudomolecule. You can then run
the or script using the -fasta
argument. For example, if the yeast genome is contained in a FASTA file
named yeast.fa, you would run the command: -d yeast -fasta yeast.fa sample/yeast_data.gff
Alternatively, you may put several FASTA files into a directory, and
provide the directory name as the argument to -fasta.
(The yeast DNA is too large to be included in this distribution, but you
can get a copy of it from <>)
Run " -h" to see usage instructions.
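Before loading, it can help to confirm that the entry names in the FASTA
files match the reference sequence names used in the first column of the
GFF file. The sketch below is one way to list both. The function names
fasta_ids and gff_refs are hypothetical, and the sketch assumes that the
sequence ID is the text between ">" and the first whitespace on each
FASTA header line.

```shell
#!/bin/sh
# fasta_ids FILE...
# Hypothetical helper: print the ID of every FASTA entry. Files ending
# in .gz are decompressed on the fly.
fasta_ids() {
    for f in "$@"; do
        case $f in
            *.gz) gzip -dc "$f" ;;
            *)    cat "$f" ;;
        esac
    done | sed -n 's/^>\([^[:space:]]*\).*/\1/p'
}

# gff_refs FILE
# Hypothetical helper: print the distinct reference sequence names
# (column 1) of a tab-delimited GFF file, skipping comment lines.
gff_refs() {
    awk -F '\t' '!/^#/ && NF >= 8 { print $1 }' "$1" | sort -u
}
```

Comparing the output of "fasta_ids yeast.fa" with that of "gff_refs" on
your annotation file shows quickly whether any chromosome is missing its
DNA or is named differently in the two files.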
Newer versions of GFF (the so-called "GFF2.5" and "GFF3" formats)
include the DNA at the bottom of the file, following the sequence
annotations. If you are loading one of these GFF files, the DNA will be
recognized automatically and loaded by any of the loaders.
See the file doc/pod/CONFIGURE_HOWTO.pod for information on how to
create new databases from scratch, add new browser tracks, and how to
get the browser to dump the DNA from the region currently under display.
Three factors are major contributors to the length of time it takes to
load a gbrowse page:
1 Loading the Perl interpreter and parsing BioPerl and all the other
Perl libraries that gbrowse uses.
2 Query speed on the database
3 The conversion at the Perl layer of database data into BioPerl
objects for rendering.
To improve (1), we recommend that you install the mod_perl module for
Apache (<>). If you configure an Apache::Registry directory and place
gbrowse inside it (rather than in the default cgi-bin directory), the
overhead of loading Perl and its libraries is eliminated, noticeably
increasing the performance of the script.
Be aware that there is a bad interaction between the Apache::DBI module
(often used to speed up database accesses) and Bio::DB::GFF. This will
cause the GFF dumper plugin to fail intermittently. GBrowse does not
need Apache::DBI to achieve performance increases under mod_perl and it
is suggested that you disable Apache::DBI. If you cannot do this, then
you should remove the file from the gbrowse.conf/plugins
Database query performance (2) is also a major factor. If you are using
MySQL as the backend, you will see dramatic performance increases by
increasing the amount of memory available to the key buffer, sort
buffer, table cache and other in-memory data structures. We suggest that
you replace the default MySQL configuration file (usually stored in
/etc/my.cnf) with one of the large-memory sample configuration files
provided in the support-files subdirectory of the MySQL distribution. Of
course, if you tell MySQL to use more memory than you have, then
performance will degrade again.
Finally, there is a slowdown when gbrowse converts the results of
database SQL queries into renderable biological objects. This becomes
particularly noticeable when there are lots of multi-segment objects to
be displayed. You can work around this slowdown by using semantic
zooming (see CONFIGURE_HOWTO). Otherwise, there's not much that can be
done about this short of buying a faster machine. The GMOD team is
working hard to reduce this performance hit.
Whenever you are running a server-side Web script using information
provided by a web client, there is a risk that maliciously-formatted
data provided by the user will trick the server-side script into
performing some unintentional action, such as modifying a file on the
server. Perl's "taint" checks are designed to catch places in the code
where such malicious data could cause harm, and GBrowse has been tested
extensively with these taint checks activated.
Because of taint checks' noticeable impact on performance, they have
been turned off in the distributed version of gbrowse. If you wish to
reactivate the extra checking (at the expense of a performance hit), go
to the file "gbrowse" located in the Web scripts directory and edit the
top line of the file to read:
#!/usr/bin/perl -w -T
The -T switch turns on taint checks.
If you are running GBrowse under mod_perl, add the following line to the
httpd.conf configuration file:
PerlTaintCheck On
This will affect all mod_perl scripts globally.
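For the plain CGI case, the edit to the top line can also be scripted.
The following is a minimal sketch: the function name enable_taint is
hypothetical, and it assumes that the first line of the script is a Perl
"#!" line. It appends -T once and leaves the rest of the file unchanged.

```shell
#!/bin/sh
# enable_taint SCRIPT
# Hypothetical helper: append -T to the perl "#!" line of SCRIPT unless
# it is already present.
enable_taint() {
    script=$1
    first=$(head -n 1 "$script")
    case $first in
        '#!'*perl*' -T'*) return 0 ;;   # taint checks already enabled
        '#!'*perl*) ;;                  # perl script without -T
        *) echo "not a perl script: $script" >&2; return 1 ;;
    esac
    tmp=$(mktemp) || return 1
    # Write the new first line followed by the untouched remainder, then
    # copy back in place so the file keeps its permissions.
    { printf '%s -T\n' "$first"; tail -n +2 "$script"; } > "$tmp" \
        && cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return $status
}
```

With the default layout the call would be
"enable_taint /usr/local/apache/cgi-bin/gbrowse".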
The gbrowse_img CGI script is a stripped-down version of gbrowse which
just generates images. It is suitable for incorporating into <img> tags
in order to make a thumbnail of a region of interest. The thumbnail can
then be linked to the full-featured gbrowse. Here is an example of how
this works using the WormBase site:
<a href="">
<img src=";width=200">
</a>
This will generate a 200-pixel inline image of the region. Clicking on
the image will link to the fully-navigable gbrowse script.
You can also use gbrowse_img to superimpose temporary features (like
BLAST hits) on the existing genome features.
If the script is called without CGI arguments, it will generate usage
instructions. Select <> to see this
internal documentation.
GBrowse has a plugin architecture which makes it easy for third-party
developers to expand its functionality. The plugins are Perl .pm files
located in the directory gbrowse.conf/plugins/. To install plugins,
simply copy them into this directory. To uninstall, remove them.
If you wish to install your own or third party plugins, it is suggested
that you create a separate directory outside the gbrowse.conf/ hierarchy
in which to store them and then to indicate the location of these
plugins using the plugin_path setting:
plugin_path = /usr/local/gbrowse_plugins
This setting should be somewhere in the [GENERAL] section of the
relevant gbrowse configuration file.
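Setting up such a directory takes only a few commands. The sketch below
copies plugin modules into a private directory and prints the matching
configuration line. The function name install_plugins is hypothetical,
and the destination directory is whatever location you choose.

```shell
#!/bin/sh
# install_plugins DEST FILE...
# Hypothetical helper: copy plugin modules (.pm files) into DEST and
# print the plugin_path line to add to the [GENERAL] section.
install_plugins() {
    dest=$1
    shift
    mkdir -p "$dest" || return 1
    for plugin in "$@"; do
        case $plugin in
            *.pm) cp "$plugin" "$dest"/ || return 1 ;;
            *)    echo "skipping $plugin: not a .pm file" >&2 ;;
        esac
    done
    echo "plugin_path = $dest"
}
```

For example, "install_plugins /usr/local/gbrowse_plugins MyPlugin.pm"
(where MyPlugin.pm stands for your own module) prints the setting shown
above.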
Sample configuration number 5 ("05.embl.conf") corresponds to a
pass-through proxy for Genbank. At least in theory, if you enter a
landmark that isn't recognized, gbrowse will go to EMBL using the
bioperl BioFetch facility, parse the record, and enter it into the local
database. This allows you to browse arbitrary Genbank/EMBL/Refseq
This functionality is not well supported, but here is a recipe for
giving it a try:
Create a local database named "embl" and initialize it this way:
Set up permissions for this database so that "nobody@localhost" has
Initialize the database for use with this command:
% -c -d embl
If you need to use a proxy to access remote web sites, uncomment the
-proxy line in the conf file, and adjust the URL of the proxy as
needed.
Go to <http://localhost/cgi-bin/gbrowse/embl>. Search for a Genbank or
embl accession number, such as CEF58D5
As GBrowse runs, it creates temporary image files in the gbrowse tmp
directory (typically HTDOCS/gbrowse/tmp). These image files are
relatively small, but if you run GBrowse for a long time they may begin
consuming significant amounts of disk space. The following Unix shell
commands will remove old image files and unused directories:
cd HTDOCS/gbrowse/tmp
find . -type f -atime +20 -print -exec rm {} \;
find . -type d -empty -depth -exec rmdir {} \;
Be sure to replace HTDOCS with the path to your web server HTML document
root directory. You might want to run this command under cron, but be
sure that the user that the cron job runs under has the proper
permissions. You may need to install it in root's cron script.
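For use under cron, the two find commands can be wrapped in a small
function that takes the directory as an argument instead of relying on
the current directory. This is a sketch rather than a tested production
script. The function name clean_gbrowse_tmp is hypothetical, and the
20-day default simply mirrors the commands above. It relies on the same
find features (-atime, -empty) as the original commands.

```shell
#!/bin/sh
# clean_gbrowse_tmp DIR [DAYS]
# Hypothetical helper: remove files not accessed for more than DAYS days
# (default 20), then remove any directories left empty.
clean_gbrowse_tmp() {
    dir=$1
    days=${2:-20}
    # Refuse to run if the directory is missing, rather than cleaning
    # some other location by accident.
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    find "$dir" -type f -atime +"$days" -exec rm -f {} \;
    # -depth visits children before parents, so nested empty directories
    # are removed bottom-up; the top-level directory itself is kept.
    find "$dir" -depth -type d -empty ! -path "$dir" -exec rmdir {} \;
}
```

If the function is saved in a script that ends with a call such as
"clean_gbrowse_tmp /usr/local/apache/htdocs/gbrowse/tmp", a crontab
entry can run that script once a day.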
The balloon tooltip effect requires the balloon javascript files and the
background balloon images files. You can download them from:
Copy balloon.js, yahoo-dom-event.js and prototype.js into the folder:
where $HTDOCS is the path to your apache HTML files. For example, if
your apache installation path is D:/apache, copy those js files into:
And then copy all balloon image files into the folder:
information on how to use this feature.
Please report bugs to the GMOD project bug tracking system at
<>. EMail
support is available by sending requests for help to
Have fun!
Lincoln Stein & the GMOD team