Rick Kim edited this page Jan 9, 2019 · 27 revisions

Open-CRAVAT

Open-CRAVAT is a Python package that performs genomic variant interpretation, including variant impact, annotation, and scoring. Open-CRAVAT is similar to the original web-based CRAVAT, but it can be installed locally and is easy to integrate into bioinformatics pipelines. Open-CRAVAT also has a modular architecture with a wide variety of analysis modules that can be selected and installed based on the needs of a given study. The modules are made available via the CRAVAT Store and are developed both by the CRAVAT team and by the broader variant analysis community. Open-CRAVAT is a product of the Karchin Lab at Johns Hopkins University in collaboration with In Silico Solutions, with funding provided by the National Cancer Institute's ITCR program.

Overview

[Figure: CRAVAT overview]

Open-CRAVAT is a modular Python package that is available from the PyPI repository and installed with pip. It takes a file of genomic variants as input. The most common input format is a VCF file, but other formats are supported. The Open-CRAVAT package includes three programs:

  • cravat runs variant analysis
  • cravat-admin configures cravat, including getting modules from the CRAVAT Store
  • cravat-view is used to interactively visualize, sort, filter, and explore cravat results

The type of analysis performed by cravat depends on which annotators have been installed. The CRAVAT Store contains all of the available annotators and can be browsed with the cravat-admin tool. In the near future, we will have a graphical CRAVAT Store in the cravat-view program. Some annotators include large reference databases, so they take time to install and use considerable disk space. Open-CRAVAT provides several output formats, including text reports, Excel spreadsheets, and a SQLite database of results used by cravat-view.
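Because the results are stored in an ordinary SQLite database, they can be inspected with standard tools. The following is a minimal sketch, not part of Open-CRAVAT itself, that lists the tables in a results file using only Python's built-in sqlite3 module. The function name list_tables is hypothetical, and no assumption is made here about which tables Open-CRAVAT actually creates.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in a SQLite file, sorted by name.

    Note: sqlite3.connect creates an empty file if db_path does not exist,
    so pass the path of a results database that is already there.
    """
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling list_tables with the path of the SQLite file produced by a run returns a plain list of table names, which is a reasonable starting point before writing any queries.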

Open-CRAVAT Processing

[Figure: CRAVAT components]

When the cravat program is run, it executes a series of modules required for variant analysis. First, the appropriate converter is run to parse the input variant file. Next, a mapper module determines the transcripts and associated genes affected by each variant, including protein impact. Then cravat runs all of the requested and installed annotation modules. After all annotation is complete, an aggregator program collects and collates the results into a SQLite database. Finally, reporter modules are run to produce the requested format of results.
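The order of the stages can be pictured with a small toy model. This is only a conceptual sketch under stated assumptions: every function and name in it (run_pipeline, toy_converter, toy_mapper, and so on) is hypothetical, records are plain dictionaries, and an in-memory list stands in for the SQLite database that the real aggregator writes. It does not reflect how Open-CRAVAT modules are actually implemented.

```python
def run_pipeline(raw_lines, converter, mapper, annotators, reporters):
    """Toy model of the stage order: convert, map, annotate, aggregate, report."""
    # 1. Converter: parse each input line into a variant record.
    variants = [converter(line) for line in raw_lines]
    # 2. Mapper: attach gene-level information to each variant.
    mapped = [mapper(v) for v in variants]
    # 3. Annotators: each one runs independently over the mapped variants.
    annotations = {
        name: [annotate(v) for v in mapped] for name, annotate in annotators.items()
    }
    # 4. Aggregator: merge all annotator results into one record per variant.
    aggregated = []
    for i, variant in enumerate(mapped):
        record = dict(variant)
        for name, results in annotations.items():
            record[name] = results[i]
        aggregated.append(record)
    # 5. Reporters: each one renders the aggregated records in its own format.
    return {name: report(aggregated) for name, report in reporters.items()}


# Toy stand-ins for each stage (all hypothetical, for illustration only).
def toy_converter(line):
    chrom, pos, ref, alt = line.split()
    return {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt}


def toy_mapper(variant):
    return dict(variant, gene="GENE_PLACEHOLDER")


toy_annotators = {"alt_length": lambda v: len(v["alt"])}
toy_reporters = {"text": lambda records: "\n".join(str(r) for r in records)}

reports = run_pipeline(
    ["chr1 100 A T"], toy_converter, toy_mapper, toy_annotators, toy_reporters
)
print(reports["text"])
```

The point of the sketch is that annotators never depend on each other: each sees only the mapped variants, and the aggregation step is the single place where their results are combined.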

Available Annotators

As of 9/24/2018, Open-CRAVAT has the following annotators available, with more on the way.

  • Cancer Genome Census
  • Cancer Genome Landscape
  • CHASMplus
  • ClinVar
  • COSMIC
  • COSMIC Gene
  • dbSNP
  • Denovo-DB
  • ESP6500
  • gnomAD
  • Gene Ontology
  • GRASP-GWAS
  • HGVS Format
  • MuPIT
  • MutPred
  • NCBI Gene
  • ncRNA
  • NDEx
  • PhD-SNPg
  • Pseudogene
  • PubMed
  • Repeat Sequences
  • REVEL
  • TARGET
  • Annotator template
  • 1000 Genomes
  • VEST

Quick Start

To install Open-CRAVAT, you need Python 3.5 or newer. Installation takes two steps: installing the Open-CRAVAT pip package, and installing the base components, which are essential for Open-CRAVAT to operate.

For Mac OS: We recommend installing Python 3 with the installer provided at python.org rather than by any other method. After installing Python 3, open a new terminal and use it to run the commands below. Terminal sessions that were already open before Python 3 was installed will not work properly with open-cravat commands.

For Ubuntu: The pip3 provided by apt does not install executables properly. We recommend the following steps before proceeding. If pip3 has already been installed with apt, remove it:

sudo apt remove python-pip

Then download and run the pip installer:

wget https://bootstrap.pypa.io/get-pip.py
sudo python3 get-pip.py
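To confirm that the interpreter meets the Python 3.5 requirement, a check along the following lines can be run with the same python3 that pip3 belongs to. This is a minimal sketch; the helper name python_is_supported is hypothetical and is not part of Open-CRAVAT.

```python
import sys

# Minimum version stated in this guide.
MIN_VERSION = (3, 5)


def python_is_supported(version_info=None):
    """Return True if the given (or the running) interpreter is at least 3.5."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only (major, minor) so that patch releases do not matter.
    return tuple(version_info[:2]) >= MIN_VERSION


print("Python", sys.version.split()[0], "supported:", python_is_supported())
```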

Install Python Package

pip3 install open-cravat

Install Base Components

cravat-admin install-base

One of the base components, the hg38 gene mapper, can take about 15 minutes to install.

Test Run

Open-CRAVAT is now ready to use. With the base components installed, Open-CRAVAT can annotate variants with genes and sequence ontology. If you want more annotation, you can pick and choose additional annotation modules, as shown in the "Install Annotators" section.

If you want, you can test if Open-CRAVAT is working properly using a built-in test input. Make the test input file with

cravat-admin make-example-input .

which will create example_input in the current working directory. If you want to create the file in another directory, replace "." with the path to the directory.

Then, run Open-CRAVAT analysis on the test input file with

cravat ./example_input

If example_input is not in the current working directory, use the full path to example_input instead of just "example_input". The run will create several files, all with the prefix "example_input.". The final result file is example_input.xlsx, which can be opened with Excel.
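To see exactly which files a run produced, the shared prefix can be used to collect them. The sketch below is a hypothetical helper, not part of Open-CRAVAT; it assumes only what is stated above, namely that every output file name starts with the input file name followed by a dot, and it looks in the directory that holds the input file.

```python
from pathlib import Path


def find_outputs(input_path):
    """List files next to input_path whose names start with '<input name>.'."""
    input_path = Path(input_path)
    prefix = input_path.name + "."
    return sorted(
        p for p in input_path.parent.iterdir() if p.name.startswith(prefix)
    )
```

For instance, find_outputs("example_input") returns the files in the current working directory whose names begin with "example_input.", excluding the input file itself.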

Install Annotators

You can search for annotators to install with the command

cravat-admin ls -a

This also tells you which annotators you have installed.

To install a new one:

cravat-admin install <annotator name>

For example: cravat-admin install clinvar

Depending on the size of the data for the annotator, it may take some time to download and install.
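When several annotators are needed, the install command can be repeated from a short script. The following is a rough sketch that simply wraps the cravat-admin install command shown above with Python's subprocess module. The helper names install_command and install_all are hypothetical, and the sketch assumes the command runs without needing interactive input, which is not confirmed here.

```python
import subprocess


def install_command(annotator):
    """Build the argument list for 'cravat-admin install <annotator name>'."""
    return ["cravat-admin", "install", annotator]


def install_all(annotators, run=subprocess.run):
    """Install each annotator in turn, stopping at the first failure.

    The 'run' parameter exists only so the function can be exercised
    without actually calling cravat-admin.
    """
    for name in annotators:
        # check=True raises CalledProcessError on a non-zero exit status.
        run(install_command(name), check=True)
```

A call such as install_all(["clinvar"]) is equivalent to the single command in the example above. Names for other annotators should be taken from the annotator listing rather than guessed.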

Run Analysis

To run your analysis, type:

cravat <input file>

This command has various command line options, which you can see by typing cravat -h. By default, it will create text, Excel, and SQLite output in the current directory and will run all of the installed annotators. Command line options can be used to select specific output or to run a subset of the installed annotators.
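Since cravat is an ordinary command line program, it can be called from a pipeline script. The sketch below runs it once per input file with default options and records each exit status. The helper name run_batch is hypothetical, and no command line options are passed because this page does not list them; consult the help output of cravat before adding any.

```python
import subprocess


def run_batch(input_files, run=subprocess.run):
    """Run 'cravat <input file>' for each file and map each path to its exit code.

    The 'run' parameter exists only so the function can be exercised
    without actually calling cravat.
    """
    results = {}
    for path in input_files:
        completed = run(["cravat", str(path)])
        results[str(path)] = completed.returncode
    return results
```

A non-zero value in the returned dictionary marks an input whose run did not finish successfully, so a pipeline can decide whether to continue or stop.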

Open-CRAVAT Documentation Pages

Method Developers

For developers of variant interpretation methods, the following pages describe how to package your method as an Open-CRAVAT annotation module and how to publish it so that it is available to all Open-CRAVAT users.
