Interactive visualizations for differential expression
mccalluc Move the commandline help to the bottom of the README (#222)
Latest commit f5b3e46 Sep 21, 2018


A heatmap-scatterplot built with Dash by Plotly. From the command line it can be started on localhost or AWS, or it can be run from the Refinery GUI.


Getting Started


If you have Docker installed, and data available at public URLs, this is the easiest way to get started:

$ V=v0.1.5
$ FIXTURES=$V/fixtures/good/data
$ docker run --name heatmap --detach --publish 8888:80 \
  -e "FILE_URLS=$FIXTURES/counts.csv $FIXTURES/counts-copy.csv.gz" \

Then visit http://localhost:8888.

To supply multiple URLs, separate them with spaces in the value of the environment variable, as in the example. Besides providing count data with FILE_URLS, you can also specify DIFF_URLS for differential expression data and META_URLS for metadata.
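
To make the space-separated convention concrete, here is a minimal sketch of how such a value could be split into individual URLs. The helper name `urls_from_env` and the example.com URLs are hypothetical and are not taken from this project; only the variable names FILE_URLS, DIFF_URLS, and META_URLS come from the text above, and the container's own parsing may differ.

```python
import os


def urls_from_env(name, environ=None):
    """Return the URLs held in a space-separated environment variable.

    Hypothetical helper for illustration only; it is not part of this
    project, and the container may parse these variables differently.
    """
    environ = os.environ if environ is None else environ
    # str.split() with no argument splits on any run of whitespace and
    # returns an empty list for an empty or missing value.
    return environ.get(name, "").split()


if __name__ == "__main__":
    # Made-up URLs standing in for real, publicly reachable data files.
    fake_env = {
        "FILE_URLS": "https://example.com/counts.csv "
                     "https://example.com/counts-copy.csv.gz",
    }
    for name in ("FILE_URLS", "DIFF_URLS", "META_URLS"):
        print(name, urls_from_env(name, fake_env))
```

Because the separator is whitespace, a URL that itself contains a space would need to be percent-encoded before being placed in the variable.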

From Source

Check out the project and install dependencies:

  # Requires Python3:
$ python --version
$ git clone
$ cd heatmap-scatter-dash
$ pip install -r requirements-freeze.txt

Then run it locally:

$ cd context

  # Generate a random matrix:
$ ./ --demo 100 10 5

  # Load data from disk:
$ ./ --files ../fixtures/good/data/counts.csv \
                  --diffs ../fixtures/good/data/stats-* \
                  --meta ../fixtures/good/data/metadata.csv

and visit http://localhost:8050/.

Input file format

For input, a variety of tabular data formats are supported: CSV, TSV, GCT, or any of those zipped. In the example below, r1, r2, etc. are genes and c1, c2, etc. are conditions. Optionally, name1, name2, etc. (shown as code1 and code2 in the example) are human-readable names for the genes:

gene c1 c2 c3 c4 code1 code2
r1 1 2 3 4 a b
r2 5 6 7 8 c b
r3 3 5 7 9 c d

More examples are available.
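
As a rough illustration of this layout, the sketch below reads a comma-separated rendering of the example and separates the numeric condition columns from the remaining text columns. The comma delimiter, the function names, and the rule "a column is numeric if every cell parses as a number" are all assumptions made for the example; they are not a description of how this tool reads its input.

```python
import csv
import io

# The example table from above, written out as CSV (assumed delimiter).
EXAMPLE_CSV = """\
gene,c1,c2,c3,c4,code1,code2
r1,1,2,3,4,a,b
r2,5,6,7,8,c,b
r3,3,5,7,9,c,d
"""


def is_number(text):
    """Return True if the cell can be parsed as a float."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def split_table(text):
    """Split a table into numeric values and text labels, keyed by row id.

    The first column is treated as the identifier. A column counts as
    numeric only if every cell in it parses as a float (an assumption).
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    numeric_cols = [
        i for i in range(1, len(header))
        if all(is_number(row[i]) for row in body)
    ]
    values, labels = {}, {}
    for row in body:
        row_id = row[0]
        values[row_id] = {header[i]: float(row[i]) for i in numeric_cols}
        labels[row_id] = {
            header[i]: row[i]
            for i in range(1, len(header)) if i not in numeric_cols
        }
    return values, labels


if __name__ == "__main__":
    values, labels = split_table(EXAMPLE_CSV)
    print(values["r1"])
    print(labels["r1"])
```

Keying both dictionaries by the identifier in the first column mirrors the idea that rows from several files can be matched up by gene.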


One bash script handles all our tests:

  • Python unit tests
  • Python style tests (flake8 and isort)
  • interaction tests
  • Docker container build and launch

A few more dependencies are required for this to work locally:

  # Install Docker first, then:
$ pip install -r requirements-dev.txt
$ npm install cypress --save-dev


Successful GitHub tags and PRs will prompt Travis to push the built image to Docker Hub. To release a new version number:

$ git tag v0.0.x && git push origin --tags


There are a few notes on implementation decisions and lessons learned.

The online help can be previewed to get a better sense of the operational details.

Command line usage:

$ python -h
usage: [-h] (--demo ROWS COLS META | --files CSV [CSV ...])
                     [--diffs CSV [CSV ...]] [--metas CSV [CSV ...]]
                     [--most_variable_rows ROWS] [--html_table]
                     [--truncate_table N] [--port PORT]
                     [--p_value_re RE [RE ...]] [--log_fold_re RE [RE ...]]
                     [--profile [DIR]] [--html_error] [--debug]
                     [--api_prefix PREFIX]

Light-weight visualization for differential expression

optional arguments:
  -h, --help            show this help message and exit
  --demo ROWS COLS META
                        Generates a random matrix with the number of rows and
                        columns specified. In addition, "META" determines the
                        number of mock metadata fields to associate with each
  --files CSV [CSV ...]
                        Read CSV or TSV files. Identifiers should be in the
                        first column and multiple files will be joined on
                        identifier. Gzip files are also handled.
  --diffs CSV [CSV ...]
                        Read CSV or TSV files containing differential
                        expression data.
  --metas CSV [CSV ...]
                        Read CSV or TSV files containing metadata: Row labels
                        should match column headers of the raw data.
  --most_variable_rows ROWS
                        For the heatmap, we first sort by row variance, and
                        then take the number of rows specified here. Defaults
                        to 500.
  --html_table          The default is to use pre-formatted text for the
                        tables. HTML tables are available, but are twice as
                        slow.
  --truncate_table N    Truncate the table to the first N rows. Table
                        rendering is often a bottleneck. Default is not to
                        truncate.
  --port PORT           Specify a port to run the server on. Defaults to 8050.

  These parameters will probably only be of interest to developers, and/or
  they are used when the tool is embedded in Refinery.

  --p_value_re RE [RE ...]
                        Regular expressions which column headers will be
                        checked against to identify p-values. Defaults to
                        ['p.*value', 'padj', 'fdr'].
  --log_fold_re RE [RE ...]
                        Regular expressions which column headers will be
                        checked against to identify fold-change values.
                        Defaults to ['\\blog[^a-z]'].
  --profile [DIR]       Saves a profile for each request in the specified
                        directory, "/tmp" by default. Profiles can be viewed
                        with snakeviz.
  --html_error          If there is a configuration error, instead of exiting,
                        start the server and display an error page.
  --debug               Run the server in debug mode: The server will restart
                        in response to any code changes, and some hidden
                        fields will be shown.
  --api_prefix PREFIX   Prefix for API URLs.
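
Two of the options above describe small algorithms: --p_value_re and --log_fold_re pick columns by matching regular expressions against headers, and --most_variable_rows keeps the rows with the highest variance. The sketch below illustrates both ideas. The function names are hypothetical, and the use of case-insensitive `re.search` and of population variance are assumptions; the tool itself may match and rank differently. Only the default patterns are taken from the help text.

```python
import re
from statistics import pvariance

# Default patterns as listed in the help text above.
P_VALUE_RES = ['p.*value', 'padj', 'fdr']
LOG_FOLD_RES = [r'\blog[^a-z]']


def matching_headers(headers, patterns):
    """Return the headers matched by at least one of the patterns.

    Assumption: a case-insensitive search anywhere in the header.
    """
    compiled = [re.compile(p, re.IGNORECASE) for p in patterns]
    return [h for h in headers if any(c.search(h) for c in compiled)]


def most_variable_rows(rows, n=500):
    """Return the ids of the n rows with the highest variance.

    `rows` maps a row id to a list of numbers. Population variance is
    an assumption; any consistent variance measure gives a ranking.
    """
    ranked = sorted(rows, key=lambda k: pvariance(rows[k]), reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    # Hypothetical column headers from a differential expression file.
    headers = ["gene", "baseMean", "log2FoldChange", "pvalue", "padj", "FDR"]
    print(matching_headers(headers, P_VALUE_RES))
    print(matching_headers(headers, LOG_FOLD_RES))

    # Made-up counts: r2 is constant, so it ranks last.
    rows = {"r1": [1, 2, 3, 4], "r2": [5, 5, 5, 5], "r3": [0, 10, 0, 10]}
    print(most_variable_rows(rows, n=2))
```

Note that the character after "log" must not be a lowercase letter for the default fold-change pattern to match, so a header such as "logistic" would be skipped under these assumptions.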