Merge commit '2b8a884e60d499b9bf603d6d6cb54459e22f53ab' as 'svtools/bin/svtyper'
ernfrid committed Mar 12, 2017
2 parents d79d96f + 2b8a884 commit 80a856e
Showing 26 changed files with 6,244 additions and 0 deletions.
40 changes: 40 additions & 0 deletions svtools/bin/svtyper/.gitignore
@@ -0,0 +1,40 @@
# Compiled source #
###################
*.com
*.class
*.dll
*.exe
*.o
*.so

# Packages #
############
# it's better to unpack these files and commit the raw source
# git has its own built in compression methods
*.7z
*.dmg
*.gz
*.iso
*.jar
*.rar
*.tar
*.zip

# Logs and databases #
######################
*.log
*.sql
*.sqlite

# OS generated files #
######################
*~
\#*\#
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
Icon?
ehthumbs.db
Thumbs.db
54 changes: 54 additions & 0 deletions svtools/bin/svtyper/.travis.yml
@@ -0,0 +1,54 @@
language: python
python:
- "2.7"

install:
- if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
    wget https://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh;
  else
    wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh;
  fi
- bash miniconda.sh -b -p $HOME/miniconda
- export PATH="$HOME/miniconda/bin:$PATH"
- hash -r
- conda config --set always_yes yes --set changeps1 no
- conda update -q conda
# Useful for debugging any issues with conda
- conda info -a

- deps='pysam'
- conda create -q -c bioconda -n test-environment python=$TRAVIS_PYTHON_VERSION $deps
- source activate test-environment

script:
- cd test && ./test.sh
- python test_svtyper.py

# # command to install dependencies
# install:
# # We do this conditionally because it saves us some downloading if the
# # version is the same.
# - if [[ "$TRAVIS_PYTHON_VERSION" == "2.7" ]]; then
# wget https://repo.continuum.io/miniconda/Miniconda-latest-Linux-x86_64.sh -O miniconda.sh;
# else
# wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O miniconda.sh;
# fi
# - bash miniconda.sh -b -p $HOME/miniconda
# - export PATH="$HOME/miniconda/bin:$PATH"
# - hash -r
# - conda config --set always_yes yes --set changeps1 no
# - conda update -q conda
# # Useful for debugging any issues with conda
# - conda info -a

# - deps='libgfortran pip nose coverage statsmodels numpy pandas scipy'
# - conda create -q -n test-environment python=$TRAVIS_PYTHON_VERSION $deps
# - source activate test-environment

# # command to run tests
# script:
# # FIXME If these were modules, this shouldn't be necessary
# nosetests --all-modules --traverse-namespace --with-coverage --cover-inclusive --with-id -v
# after_success:
# - pip install coveralls
# - coveralls
22 changes: 22 additions & 0 deletions svtools/bin/svtyper/LICENSE
@@ -0,0 +1,22 @@
The MIT License (MIT)

Copyright (c) 2014 Colby Chiang

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

68 changes: 68 additions & 0 deletions svtools/bin/svtyper/README.md
@@ -0,0 +1,68 @@
SVTyper
=======
[![GitHub license](https://img.shields.io/badge/license-MIT-blue.svg)](https://raw.githubusercontent.com/hall-lab/svtyper/master/LICENSE)
[![Build Status](https://travis-ci.org/hall-lab/svtyper.svg?branch=master)](https://travis-ci.org/hall-lab/svtyper)

Bayesian genotyper for structural variants

## Example

```
svtyper \
-i sv.vcf \
-B sample.bam \
-l sample.bam.json \
> sv.gt.vcf
```

## Overview

SVTyper performs breakpoint genotyping of structural variants (SVs) using whole genome sequencing data. Users must supply a VCF file of sites to genotype (which may be generated by [LUMPY](https://github.com/arq5x/lumpy-sv)) as well as a BAM/CRAM file of Illumina paired-end reads aligned with [BWA-MEM](https://github.com/lh3/bwa). SVTyper assesses discordant and concordant reads from paired-end and split-read alignments to infer genotypes at each site. Algorithm details and benchmarking are described in [Chiang et al., 2015](http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3505.html).

![NA12878 heterozygous deletion](etc/het.png?raw=true "NA12878 heterozygous deletion")
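
To make the idea of inferring a genotype from read support more concrete, here is a minimal Python 3 sketch. It is not SVTyper's actual code. It assumes a simple binomial model in which each read at a site supports either the reference or the alternate allele, and it scores the three diploid genotypes under a flat prior. The function names and the expected alternate-read fractions (0.001, 0.5, 0.9) are illustrative assumptions, not values taken from this repository.

```python
from math import comb

# Hypothetical expected fraction of alternate-supporting reads under each
# genotype (0/0, 0/1, 1/1). These values are assumptions for illustration.
ALT_FRACTION = (0.001, 0.5, 0.9)
GENOTYPES = ("0/0", "0/1", "1/1")


def genotype_posteriors(ref_reads, alt_reads, alt_fraction=ALT_FRACTION):
    """Return posterior probabilities for (0/0, 0/1, 1/1) under a flat prior."""
    n = ref_reads + alt_reads
    # Binomial likelihood of seeing alt_reads alternate observations out of n.
    likelihoods = [
        comb(n, alt_reads) * p ** alt_reads * (1 - p) ** ref_reads
        for p in alt_fraction
    ]
    total = sum(likelihoods)
    # With a flat prior, the posterior is the normalized likelihood.
    return [value / total for value in likelihoods]


def call_genotype(ref_reads, alt_reads):
    """Return the most probable genotype and its posterior probability."""
    posteriors = genotype_posteriors(ref_reads, alt_reads)
    best = max(range(len(GENOTYPES)), key=lambda i: posteriors[i])
    return GENOTYPES[best], posteriors[best]


if __name__ == "__main__":
    # Balanced support for both alleles favors a heterozygous call.
    print(call_genotype(ref_reads=10, alt_reads=10))
```

In this toy model the read counts stand in for the concordant (reference-supporting) and discordant or split (alternate-supporting) observations described above. How a real genotyper weights and counts those observations is a design decision that this sketch does not try to reproduce.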

## Installation

Requirements
- Python 2.7 or newer
- Pysam 0.8.1 or newer

Clone the repository
```
git clone git@github.com:hall-lab/svtyper.git
```

Test the installation
```
cd svtyper/test
../svtyper \
-i example.vcf \
-B NA12878.target_loci.sorted.bam \
-l NA12878.bam.json \
> test.vcf
```

## Troubleshooting

Many common issues are related to abnormal insert size distributions in the BAM file. SVTyper provides methods to assess and visualize the characteristics of sequencing libraries.

Running SVTyper with the `-l` flag creates a JSON file with essential metrics on a BAM file. SVTyper samples the first N reads from the file (1 million by default) to parse the libraries, read groups, and insert size histograms. This can be done without a VCF file.
```
svtyper \
-B my.bam \
-l my.bam.json
```

The [lib_stats.R](scripts/lib_stats.R) script produces insert size histograms from the JSON file
```
scripts/lib_stats.R my.bam.json my.bam.json.pdf
```
![Insert size histogram](etc/my.bam.json.png?raw=true "Insert size histogram")
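
For a quick check without R, the same JSON can be summarized in Python 3. The sketch below is hedged: it assumes the layout implied by `lib_stats.R`, namely a top-level object keyed by sample, each holding a `libraryArray` whose entries carry a `library_name` and a `histogram` mapping insert size to read-pair count. The function name `summarize_libraries` and the sample data are hypothetical, and the real file may contain additional fields.

```python
from math import sqrt


def summarize_libraries(doc):
    """Return (sample, library, read_pairs, mean, sd) for each library."""
    rows = []
    for sample_name, sample in doc.items():
        for lib in sample["libraryArray"]:
            # JSON object keys are strings, so convert insert sizes to int.
            hist = {int(size): count for size, count in lib["histogram"].items()}
            pairs = sum(hist.values())
            if pairs == 0:
                continue  # nothing to summarize for an empty histogram
            mean = sum(size * count for size, count in hist.items()) / pairs
            variance = sum(
                count * (size - mean) ** 2 for size, count in hist.items()
            ) / pairs
            rows.append((sample_name, lib["library_name"], pairs, mean, sqrt(variance)))
    return rows


if __name__ == "__main__":
    # Hypothetical data in the assumed layout; not real SVTyper output.
    example = {
        "sample1": {
            "libraryArray": [
                {"library_name": "lib1", "histogram": {"300": 2, "400": 2}}
            ]
        }
    }
    for row in summarize_libraries(example):
        print(row)
```

To run this on a real file, load it with `json.load` and pass the resulting dictionary to `summarize_libraries`. A mean or standard deviation far from what the library preparation should produce is a hint that the insert size distribution is abnormal.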


## Citation

C Chiang, R M Layer, G G Faust, M R Lindberg, D B Rose, E P Garrison, G T Marth, A R Quinlan, and I M Hall. SpeedSeq: ultra-fast personal genome analysis and interpretation. Nat Meth 12, 966–968 (2015). doi:10.1038/nmeth.3505.

http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3505.html
Binary file added svtools/bin/svtyper/etc/het.png
Binary file added svtools/bin/svtyper/etc/hom_alt.png
Binary file added svtools/bin/svtyper/etc/hom_ref.png
Binary file added svtools/bin/svtyper/etc/my.bam.json.pdf
Binary file not shown.
Binary file added svtools/bin/svtyper/etc/my.bam.json.png
1 change: 1 addition & 0 deletions svtools/bin/svtyper/requirements.txt
@@ -0,0 +1 @@
pysam
48 changes: 48 additions & 0 deletions svtools/bin/svtyper/scripts/lib_stats.R
@@ -0,0 +1,48 @@
#!/usr/bin/env Rscript

# print usage
usage <- function() {
    cat(
'usage: lib_stats.R <input JSON> <output PDF>
lib_stats.R
    author: Colby Chiang (colbychiang@wustl.edu)
    description: Plot library information for a BAM file
positional arguments:
    <input JSON>    JSON file of library info, created by SVTyper
    <output PDF>    path to output PDF
')
}

# Draw a histogram from a text file
args <- commandArgs(trailingOnly=TRUE)
file <- args[1]
filename <- basename(args[1])
output <- args[2]

# Check input args
if (is.na(file) || is.na(output)) {
    usage()
    quit(save='no', status=1)
}

# install R packages if needed
options(repos=structure(c(CRAN="http://cran.wustl.edu")))
list.of.packages <- c("jsonlite")
new.packages <- list.of.packages[!(list.of.packages %in% installed.packages()[,"Package"])]
if(length(new.packages)) install.packages(new.packages)
library(jsonlite)

bam <- fromJSON(file)

pdf(output, h=6, w=8)
for (sample in bam) {
    ins <- as.data.frame(t(sample$libraryArray$histogram))

    for (i in 1:ncol(ins)) {
        plot(as.numeric(row.names(ins)), ins[,i], type='h', xlab='Insert size (bp)', ylab='Number of read-pairs', main=paste0('LB: ', sample$libraryArray$library_name[i], '\nprevalence: ', round(100*sample$libraryArray$prevalence[i],1), "%"), col='steelblue', bty='l')
        abline(v=sample$libraryArray$mean[i], col='red', lty=1)
        legend('topright', c(paste0('mean: ', round(sample$libraryArray$mean[i],1)), paste0('sd: ', round(sample$libraryArray$sd[i],1))), lty=c(1, 0), col=c('red'))
    }
}
dev.off()
