The goal of ChromoZoom is to make genome browsing online as effortless as navigating the world on Google Maps, while retaining superior data density and customizability, modeled on the capabilities of the UCSC Genome Browser and IGV.
All data is drawn directly in the browser using canvas and SVG, similar to the approach of igv.js and pileup.js. There are a few substantial differences, though:
- We placed a premium on fast navigation. You can zoom with the mousewheel and "throw" the display, just like Google Maps.
- You don't need to install software to a server or embed code into a webpage to use ChromoZoom. Simply visit chromozoom.org, which is designed as a first-class genome browsing experience for nearly all of UCSC's tracks and genomes.
- It's easy to create and load custom genomes using the IGB Quickload format.
ChromoZoom is free for academic, nonprofit, and personal use. The source code is licensed under the GNU Affero General Public License v3. In a nutshell, this license means that you are free to copy, redistribute, and modify the source code, but you are expected to provide the source for any code derived from ChromoZoom to anybody that receives the modified code or uses it over a computer network (e.g. as a web application). ChromoZoom is not free for commercial use. For commercial licensing, please contact the Roth laboratory.
To host ChromoZoom or run the UCSC track scraper, you need either macOS or Linux. Windows users should use our virtual environment instead.
The web interface should work in any recent version of a modern HTML5-capable web browser (Chrome, Firefox, Safari, IE ≥11).
To serve the ChromoZoom web interface
Out of the box, ChromoZoom serves a web interface that can display data on top of genome layouts crossloaded from UCSC, or data in IGB Quickload directories. You will need:
- PHP 5.x + Apache (or another webserver that can run PHP scripts)
- Note that magic quotes must be disabled.
- libcurl bindings for PHP (on macOS, this is included in the default PHP install)
- To support any of the binary track and genome formats, you will need the following on your
$PATH, which during setup will be symlinked into a new directory in this repo called
tabix, a generic indexer for TAB-delimited genome position files
samtools, utilities for viewing the Sequence Alignment/Map (SAM) and Binary SAM (BAM) formats
- The following Jim Kent binaries:
Place a checkout of this repo somewhere in your webserver's DOCROOT. To set up the aforementioned symlinks to binaries, run
rake check from the command line at the root of the repo. Files under
index.php will need to be executable by the webserver. Access
index.php from a web browser to view the ChromoZoom interface.
Note: To support HTTPS URLs for VCF/tabix or BAM files, you will need to compile
libcurl support. See below for details.
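Before running rake check, it can help to see which of the required tools are already on your $PATH. The following is only a sketch: the missing_binaries function is a made-up helper, not part of ChromoZoom, and it checks only the two tools named explicitly above.

```shell
# Report which of the named programs are missing from $PATH.
# Prints one line per missing program and returns 1 if any are missing.
# (missing_binaries is a hypothetical helper, not part of ChromoZoom.)
missing_binaries() {
    rc=0
    for name in "$@"; do
        if ! command -v "$name" >/dev/null 2>&1; then
            echo "missing: $name"
            rc=1
        fi
    done
    return "$rc"
}

# tabix and samtools are the two tools named above; extend the list as needed.
missing_binaries tabix samtools || echo "Install the tools listed as missing, then run rake check."
```

This does not replace rake check, which also creates the symlinks; it only gives a quick view of what is installed.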
To scrape data from UCSC
We provide a pipeline to convert data from genomes hosted at UCSC into highly efficient binary formats that make it simple to serve thousands of annotation tracks from flatfiles. This is the strategy used for chromozoom.org.
The script is at
UCSC_tracks/get_tracks.py. See the README.md in that directory for full instructions on how to run the track scraper. You can target the scraper to specific UCSC genome assemblies using the
Running in virtual environment
Using virtualization, ChromoZoom can run easily on any system. VirtualBox and Vagrant must be installed. To set up your environment, run the following:
$ cd path/to/this/repo
$ vagrant up
Once set up, you can access ChromoZoom at
In addition to the above, you'll need node.js and two npm packages:
$ npm install -g browserify watchify
$ git clone https://github.com/rothlab/chromozoom.git
$ cd chromozoom
$ rake check
This will tell you if you're missing any of the previously mentioned binaries needed for hosting ChromoZoom or running the UCSC track scraper. You should then serve this directory from Apache + PHP (symlinking into your existing webroot usually works) and access
js/, you need to recompile the scripts in
build/. When developing, use
$ rake watchify
which will open three screen sessions and continuously recompile debug-friendly versions of the scripts (quit by pressing Ctrl + A, then type
:quit + Enter.) To compile minified scripts for production, use
$ rake browserify
which also runs right before you commit code to git, since
rake check installs a pre-commit hook (see
None of the following components is strictly necessary for running ChromoZoom; however, they add useful capabilities, such as improved searching and track format support. Both of these upgrades were used for our main instance at chromozoom.org.
bigBedSearch, which allows prefix searching of bigBed fields
- HTTPS support for
The bigBed format can include extra B+ tree indices in the very last section of the file, which ChromoZoom can then use to search for features by the text content of various fields in the uncompressed BED data. For example, if you want to search a gene track for gene names matching a certain prefix, these indices make such a search practical even if the track itself is large and hosted somewhere else on the web.
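To make the idea of a prefix query concrete, here is a naive sketch that scans an uncompressed, tab-delimited BED file and prints the features whose name field (column 4 in standard BED) begins with a given prefix. The function name bed_prefix_search is hypothetical, and this linear scan is not how bigBedSearch works: the B+ tree indices exist precisely so that no scan over the whole file is needed.

```shell
# Print every BED line whose name field (column 4) starts with a prefix.
# Usage: bed_prefix_search PREFIX FILE
# (hypothetical helper for illustration; it reads the whole file, which is
# exactly the cost that an indexed lookup avoids)
bed_prefix_search() {
    awk -F '\t' -v prefix="$1" 'index($4, prefix) == 1' "$2"
}
```

For instance, `bed_prefix_search BRCA genes.bed` would list features named BRCA1, BRCA2, and so on, assuming a local file genes.bed with names in the fourth column.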
I've created a binary that enables these prefix queries, which you can install if you have
$ git clone https://github.com/powerpak/bigBedSearch.git
$ cd bigBedSearch
$ make
This should produce a
bigBedSearch executable that you can copy to ChromoZoom's
bin/ directory so the web frontend can use it.
If you want HTTPS to work, either make sure
/usr/include/openssl is available, or specify the equivalent SSL_DIR as an environment variable.
You can also use that source tree to produce customized versions of
bigWigSummary, if UCSC's binaries weren't compiled in the way you prefer. (e.g., HTTPS doesn't always seem to work in UCSC's macOS binaries.)
HTTPS support for
Current release versions for
tabix don't support HTTPS, but
libcurl is being merged into the next planned release so that this is possible. To get these features now, follow these instructions, which are largely cribbed from this answer on BioStars, with a major change being that libcurl was already merged into the development branch for htslib.
You'll first need to have
ncurses with development headers. On Macs, run
brew install autoconf and you should already have the rest if you have Xcode. On most Linux distros, these are all easily found in your distribution's package manager.
Get the development version of htslib and set up the configure script:
$ git clone https://github.com/samtools/htslib.git
$ cd htslib/
$ autoconf
If the last step fails with something about m4 macros, try being more forceful with
autoreconf --install. Then configure with libcurl support and compile:
$ ./configure --enable-libcurl
$ make
(Side note. To get this to compile with a slightly older
libcurl, such as the moderately ancient version 7.19.7 on certain high-performance computing nodes, you may have to remove the case statement about
Once it works, you'll find
tabix in this directory, along with
htsfile (which is like
file, for sequencing formats), both with HTTPS support. Test that it's working with
$ ./htsfile https://hostname.example.com/path/to/some.bam
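If the HTTPS test fails, one thing worth ruling out is that the build silently skipped libcurl. The sketch below checks whether a binary is dynamically linked against libcurl, using ldd on Linux or otool -L on macOS. The links_libcurl function is a hypothetical helper, and a statically linked libcurl would not show up in this check, so treat a negative result as a hint rather than proof.

```shell
# Succeed if the given executable is dynamically linked against libcurl.
# Uses ldd where available (Linux), otherwise otool -L (macOS).
# Returns 2 if neither inspection tool is installed.
# (links_libcurl is a hypothetical helper for illustration.)
links_libcurl() {
    if command -v ldd >/dev/null 2>&1; then
        ldd "$1" 2>/dev/null | grep -q 'libcurl'
    elif command -v otool >/dev/null 2>&1; then
        otool -L "$1" 2>/dev/null | grep -q 'libcurl'
    else
        return 2
    fi
}

# Run from the htslib build directory after make.
for tool in ./htsfile ./tabix; do
    if links_libcurl "$tool"; then
        echo "$tool: linked against libcurl"
    else
        echo "$tool: libcurl not detected"
    fi
done
```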
All good? Then get the source release for
$ cd ..
$ curl -LO https://github.com/samtools/samtools/releases/download/1.2/samtools-1.2.tar.bz2
$ tar xjvf samtools-1.2.tar.bz2
$ cd samtools-1.2
Although this includes htslib 1.2.1, you want to point it to the development version you just installed:
$ rm -rf htslib-1.2.1
$ ln -s ../htslib htslib-1.2.1
$ make LDLIBS+=-lcurl LDLIBS+=-lcrypto
You should find
samtools in this directory. Test it against some BAM file on an HTTPS server, and if you get back SAM data you're in good shape:
$ ./samtools view https://hostname.example.com/path/to/some.bam 1:1-10000
(Note that this will spit out a
.bai file into the current directory, which you can safely delete afterward.)
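If you would rather not clean up stray index files by hand, one option is to run the test from a throwaway directory. This is a minimal sketch, assuming mktemp -d is available; in_scratch_dir is a hypothetical helper, not something samtools or ChromoZoom provides.

```shell
# Run a command inside a temporary directory, then delete that directory.
# Any files the command writes to its working directory (such as a
# downloaded .bai index) are removed along with it.
# (in_scratch_dir is a hypothetical helper for illustration.)
in_scratch_dir() {
    scratch=$(mktemp -d) || return 1
    rc=0
    ( cd "$scratch" && "$@" ) || rc=$?
    rm -rf "$scratch"
    return "$rc"
}

# Example (placeholder URL, as above). The samtools path must be absolute
# because the command runs from a different directory:
# in_scratch_dir "$PWD/samtools" view https://hostname.example.com/path/to/some.bam 1:1-10000
```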