Genome U-Plot sample implementation

The Genome U-Plot is a JavaScript tool for visualizing chromosomal abnormalities in the human genome using a U-shaped layout.

Whole Genome U-Plot. Visible are the 24 human chromosomes arranged in a U-shape, the cytobands, the chromosome junctions and the copy number variations (CNVs). The axes at the bottom right of the graph apply to the chromosomes on the right side of the plot.

Node

Node.js is an open-source, cross-platform JavaScript runtime environment for developing a diverse variety of server tools and applications.

This project relies on Node throughout, so it must be installed first. Please visit the download page for macOS or Windows binaries, or the package manager installations page for Linux distributions.

In this project we used Node.js v6.10.0 LTS.

Node Version Management Tools

If you need the flexibility to use multiple versions of Node, check out NVM or Windows NVM.

NPM

NPM is the default package manager for Node and is installed automatically alongside it. Package managers are used to install and manage packages (modules of code that you or someone else wrote). This project depends on many packages, but it installs them with Yarn, another package manager, rather than with NPM.

Yarn

Yarn is a Node.js package manager which is much faster than NPM, has offline support, and fetches dependencies more predictably.

To install Yarn

Use NPM and run:

> $ npm install --global yarn

To install the project dependencies

Start a command shell, change directory to the directory of the project and install the project dependencies using:

> $ yarn install

To run the project

Use:

> $ yarn start

Using a modern browser visit:

http://localhost:8000/GenomePlot.html?sampleId=LNCAP

Data Visualization

A sample (LNCAP) with all required files is provided in the public/data directory:

LNCAP/LNCAP_alts_comprehensive.csv  (Sample Rearrangements)
LNCAP/LNCAP_cnvIntervals.csv        (Sample Copy Number Variation - Intervals)
LNCAP/LNCAP_genomePlot_cnv30.json   (Sample Copy Number Variation - Raw Frequency)
LNCAP/LNCAP_visualization.json      (Sample Definition)

To run the application against a different sample (e.g. MY_SAMPLE), create a directory and file structure matching the one above, replacing LNCAP with MY_SAMPLE in the directory name and in every file name. Then replace the sample name in the sampleId URL parameter of the app.
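The naming convention can be sketched as a small helper that lists the files a new sample would need and the URL to open. This is only an illustration: the function names sampleFiles and sampleUrl are hypothetical, and the paths simply mirror the LNCAP layout listed above with the sample name swapped in.

```javascript
// Hypothetical helper (not part of the project): lists the files expected
// for a sample, assuming the same layout as the bundled LNCAP sample.
function sampleFiles(sampleId) {
    const dir = "public/data/" + sampleId + "/";
    return {
        rearrangements: dir + sampleId + "_alts_comprehensive.csv",
        cnvIntervals: dir + sampleId + "_cnvIntervals.csv",
        cnvRawFrequency: dir + sampleId + "_genomePlot_cnv30.json",
        definition: dir + sampleId + "_visualization.json"
    };
}

// Builds the URL to visit, using the host and port from "To run the project".
function sampleUrl(sampleId) {
    return "http://localhost:8000/GenomePlot.html?sampleId=" +
        encodeURIComponent(sampleId);
}

console.log(sampleFiles("MY_SAMPLE"));
console.log(sampleUrl("MY_SAMPLE"));
```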

Reference file

A Human Genome Assembly GRCh38 cytobands reference file is provided with the visualization (public/reference/cytobands/hg38/cytoBand.json). If you want to use your own, you may download and uncompress a definition file from ftp://hgdownload.cse.ucsc.edu/goldenPath/hg38/database/cytoBand.txt.gz. You must then convert the file to JSON of the following form:

[
    {
        "chrom": "chr1",
        "chromStart": 0,
        "chromEnd": 2300000,
        "gieStain": "gneg",
        "name": "p36.33"
    }, {
        "chrom": "chr1",
        "chromStart": 2300000,
        "chromEnd": 5300000,
        "gieStain": "gpos25",
        "name": "p36.32"
    },
    ...
]
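The conversion could look roughly like the sketch below. It assumes the uncompressed cytoBand.txt is tab-separated with the columns in the UCSC order chrom, chromStart, chromEnd, name, gieStain; the function name cytoBandTextToJson is hypothetical and not part of the project. The UCSC file may also contain entries for sequences other than the 24 chromosomes, and whether those need to be filtered out is not covered here.

```javascript
// Hypothetical converter: turns the text of a UCSC cytoBand.txt file into
// an array of objects in the form shown above.
// Assumed column order: chrom, chromStart, chromEnd, name, gieStain.
function cytoBandTextToJson(text) {
    return text
        .split(/\r?\n/)
        .filter((line) => line.trim() !== "")   // skip empty lines
        .map((line) => {
            const fields = line.split("\t");
            return {
                chrom: fields[0],
                chromStart: Number(fields[1]),
                chromEnd: Number(fields[2]),
                gieStain: fields[4],
                name: fields[3]
            };
        });
}

// Two rows matching the example above, written in the tab-separated layout.
const cytoBandSample =
    "chr1\t0\t2300000\tp36.33\tgneg\n" +
    "chr1\t2300000\t5300000\tp36.32\tgpos25\n";

console.log(JSON.stringify(cytoBandTextToJson(cytoBandSample), null, 4));
```

In practice the text would be read from the downloaded file (for example with Node's fs.readFileSync) and the resulting string written to a .json file.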

Sample Definition

A sample-specific JSON file must be provided (as in LNCAP/LNCAP_visualization.json):

{
    "fileFormatVersion": 1,
    "altsComprehensive": "sampleId_alts_comprehensive.csv",
    "cnvBinned30KJson": "sampleId_genomePlot_cnv30.json",
    "cnvIntervals": "sampleId_cnvIntervals.csv"
}
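As an illustration, the definition for a new sample could be generated instead of written by hand. The sketch below assumes that sampleId in the template stands for the actual sample name (as the LNCAP file names suggest); buildSampleDefinition is a hypothetical helper, not something the project provides.

```javascript
// Hypothetical helper: builds the sample definition object for a sample,
// assuming "sampleId" in the template is a placeholder for the sample name.
function buildSampleDefinition(sampleId) {
    return {
        fileFormatVersion: 1,
        altsComprehensive: sampleId + "_alts_comprehensive.csv",
        cnvBinned30KJson: sampleId + "_genomePlot_cnv30.json",
        cnvIntervals: sampleId + "_cnvIntervals.csv"
    };
}

// JSON.stringify emits quoted keys, so the output is valid JSON that can be
// saved as MY_SAMPLE/MY_SAMPLE_visualization.json.
console.log(JSON.stringify(buildSampleDefinition("MY_SAMPLE"), null, 4));
```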

Sample Rearrangements

To visualize chromosomal rearrangements, a CSV file is required (as in LNCAP/LNCAP_alts_comprehensive.csv), and it must supply the following columns of integers:

Nassoc,chrA,chrB,posA,posB

where Nassoc is the number (integer) of fragments supporting the event.
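Before loading a new sample it can help to check that the CSV has these columns and that every value is an integer. The following is a minimal sketch of such a check, not the parser the application uses; parseRearrangements is a hypothetical name and the two data rows are made-up values for illustration only.

```javascript
// Hypothetical validator for the rearrangements CSV described above.
// It requires the five named columns and rejects non-integer cells.
const REQUIRED_COLUMNS = ["Nassoc", "chrA", "chrB", "posA", "posB"];

function parseRearrangements(csvText) {
    const lines = csvText.split(/\r?\n/).filter((l) => l.trim() !== "");
    const header = lines[0].split(",").map((h) => h.trim());

    REQUIRED_COLUMNS.forEach((name) => {
        if (header.indexOf(name) === -1) {
            throw new Error("Missing column: " + name);
        }
    });

    return lines.slice(1).map((line, i) => {
        const cells = line.split(",");
        const row = {};
        REQUIRED_COLUMNS.forEach((name) => {
            const cell = (cells[header.indexOf(name)] || "").trim();
            if (cell === "" || !Number.isInteger(Number(cell))) {
                throw new Error("Row " + (i + 1) + ": " + name + " is not an integer");
            }
            row[name] = Number(cell);
        });
        return row;
    });
}

// Made-up rows, for illustration only.
const rearrangementsCsv =
    "Nassoc,chrA,chrB,posA,posB\n" +
    "12,1,5,1500000,42000000\n" +
    "3,7,7,100000,900000\n";

console.log(parseRearrangements(rearrangementsCsv));
```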

Sample Copy Number Variation

To visualize copy number, two files of a specific format must be supplied. The first is a JSON file (as in LNCAP/LNCAP_genomePlot_cnv30.json) containing the raw frequency data from a 30000 bin moving window.

The second file contains the copy number state information. It is a CSV file (as in LNCAP/LNCAP_cnvIntervals.csv) that must supply the following columns:

chr,start,end,cnvState,nrd

where cnvState is one of 1 (loss), 2 (normal) or 3 (gain) and nrd is a floating point value corresponding to the Normalized Read Depth score that provides a quantitative measure of how far the CNV deviates from the calculated normal level (nrd = 2.0).
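To make the column meanings concrete, the sketch below reads one row of such a file, maps cnvState to its label and reports how far nrd lies from the normal level of 2.0. The helper name describeCnvInterval is hypothetical, the example rows are made up, and the chr column is kept as a string because its exact format is not specified here.

```javascript
// State codes as documented above: 1 (loss), 2 (normal), 3 (gain).
const CNV_STATE_LABELS = { 1: "loss", 2: "normal", 3: "gain" };
const NORMAL_NRD = 2.0;

// Hypothetical helper: interprets one data row of the cnvIntervals CSV,
// assuming the column order chr,start,end,cnvState,nrd.
function describeCnvInterval(line) {
    const cells = line.split(",").map((c) => c.trim());
    const cnvState = Number(cells[3]);
    const label = CNV_STATE_LABELS[cnvState];
    if (label === undefined) {
        throw new Error("Unknown cnvState: " + cells[3]);
    }
    const nrd = Number(cells[4]);
    return {
        chr: cells[0],
        start: Number(cells[1]),
        end: Number(cells[2]),
        cnvState: cnvState,
        label: label,
        nrd: nrd,
        deviationFromNormal: nrd - NORMAL_NRD
    };
}

// Made-up rows, for illustration only.
console.log(describeCnvInterval("1,1000000,2000000,3,2.5"));
console.log(describeCnvInterval("2,500000,750000,1,1.5"));
```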

Variant Call Format (VCF) file Support

To run the application against a sample that is stored in a VCF file, we provide an R script, vcftoUplot.R, which resides in the public/data directory. The script was tested with R-3.3.3 and requires the R package VariantAnnotation, which will be installed automatically if not present. The script takes a VCF file as input (tested with VCF v4.1 and v4.2) and produces the file structure hierarchy required by the Genome U-Plot to visualize the sample. Remember to use your sample name in the sampleId URL parameter of the app.

To run `vcftoUplot.R`

Given a VCF sample file NA12878.vcf (provided in the public/data directory), run:

Rscript vcftoUplot.R NA12878.vcf

This will produce the following directory hierarchy:

NA12878/
├── NA12878_alts_comprehensive.csv
└── NA12878_visualization.json

Then, using a modern browser visit:

http://localhost:8000/GenomePlot.html?sampleId=NA12878

Note: For this particular example you should use the "Filter on # of Frags" GUI option to reduce the number of visualized chromosomal abnormalities. You can also uncheck the "Line width to # Frags" option to decouple the line thickness from the number of fragments supporting the event.

Note II: The Human Genome Assembly GRCh38 is assumed.
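Conceptually, filtering on the number of fragments keeps only the events whose Nassoc value reaches a chosen threshold. The sketch below illustrates that idea only; it is not taken from the application's code, the name filterByFragments is hypothetical, the events are made up, and whether the GUI treats the threshold as inclusive is an assumption.

```javascript
// Hypothetical illustration of a fragment-count filter: keeps events
// supported by at least minFrags fragments (inclusive threshold assumed).
function filterByFragments(events, minFrags) {
    return events.filter((event) => event.Nassoc >= minFrags);
}

// Made-up events, for illustration only.
const exampleEvents = [
    { Nassoc: 2, chrA: 1, chrB: 5, posA: 1500000, posB: 42000000 },
    { Nassoc: 8, chrA: 7, chrB: 7, posA: 100000, posB: 900000 },
    { Nassoc: 15, chrA: 3, chrB: 12, posA: 250000, posB: 7300000 }
];

console.log(filterByFragments(exampleEvents, 5));
```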

Commercial use

If you want to use Genome U-Plot in commercial settings, please contact us.

How to cite

Gaitatzes AG, Johnson SH, Smadbeck JB and Vasmatzis G. Genome U-Plot: a whole genome visualization. Bioinformatics, 2017 Dec 21. https://doi.org/10.1093/bioinformatics/btx829
