Center for Integrated Diagnostics at Mass General Hospital NGS tools
virtualenv --python=/path/to/python3 venv3
source venv3/bin/activate
python setup.py install
If you receive an error pertaining to lzma.h, you may need to disable lzma and run python setup.py install again (this has been seen on macOS Mojave):
export HTSLIB_CONFIGURE_OPTIONS=--disable-lzma
python setup.py install
macOS High Sierra introduced a fork-safety check that can crash programs that use multiprocessing. If you see an error such as:
objc[49174]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called.
objc[49174]: +[__NSPlaceholderDate initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.
set the following environment variable:
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
Scans a VCF file and combines multiple nearby SNPs and indels into a single genomic event.
The pickle library DOES NOT work with the cli library, which is what allows calling celltics directly. To run with multiple processes, call the script itself:
python /path/to/celltics/tools/vargroup.py -i <input> -o <output>
VarGrouper runs with multiple processes by default. Use --debug or set threads to 1 (-t 1) to avoid multiprocessing.
This will produce a warning if invoked with the cli library.
celltics vargroup --input-file sorted_variants.vcf --output-file grouped_variants.vcf --ref-seq hg19.fasta -t 1
or (no warning message)
python celltics/tools/vargroup.py --input-file sorted_variants.vcf --output-file grouped_variants.vcf --ref-seq hg19.fasta -t 1
celltics vargroup --input-file sorted_variants.vcf --output-file grouped_variants.vcf --bam-file sorted_alignment.bam --ref-seq hg19.fasta -t 1
or (no warning message)
python celltics/tools/vargroup.py --input-file sorted_variants.vcf --output-file grouped_variants.vcf --bam-file sorted_alignment.bam --ref-seq hg19.fasta -t 1
If a reference sequence is not supplied, the UCSC hg19 API is queried (http://genome.ucsc.edu/). Variants are grouped if they are within a certain distance of each other and occur on the same reads. For more advanced options, run celltics vargroup --help.
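The distance half of the grouping rule can be illustrated with a short sketch. This is not the VarGrouper implementation: the Variant record, the group_nearby function, and the max_distance default are all hypothetical names and values chosen for illustration, and the check that grouped variants occur on the same reads (which needs the BAM file) is left out entirely. It assumes the input is already sorted by chromosome and position.

```python
from typing import List, NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (not the celltics data model)."""
    chrom: str
    pos: int  # 1-based position, as in the VCF POS column
    ref: str
    alt: str


def group_nearby(variants: List[Variant], max_distance: int = 2) -> List[List[Variant]]:
    """Group sorted variants whose start positions are within max_distance bases.

    Assumption: max_distance is an illustrative threshold, not the tool's default.
    """
    groups: List[List[Variant]] = []
    for variant in variants:
        last = groups[-1][-1] if groups else None
        if (
            last is not None
            and last.chrom == variant.chrom
            and variant.pos - last.pos <= max_distance
        ):
            # Close enough to the previous variant: extend the current group.
            groups[-1].append(variant)
        else:
            # New chromosome or too far away: start a new group.
            groups.append([variant])
    return groups


if __name__ == "__main__":
    calls = [
        Variant("chr1", 100, "A", "T"),
        Variant("chr1", 101, "C", "G"),
        Variant("chr1", 250, "G", "A"),
        Variant("chr2", 101, "T", "C"),
    ]
    for group in group_nearby(calls):
        print([f"{v.chrom}:{v.pos}{v.ref}>{v.alt}" for v in group])
```

In this sketch only the first two chr1 calls end up in the same group; the others are too far apart or on a different chromosome.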
Errors are not very informative when masked by the Python multiprocessing module. Run vargroup with --debug to get more informative error messages.
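Why a single-process run is easier to debug can be seen in a generic sketch of a serial fallback. This is an illustration of the general pattern, not code from celltics: run_jobs and square are hypothetical names, and the sketch assumes nothing about how vargroup dispatches its work internally.

```python
import multiprocessing
from typing import Callable, Iterable, List


def square(value: int) -> int:
    """Stand-in worker, defined at module level so pickle can locate it."""
    return value * value


def run_jobs(func: Callable[[int], int], items: Iterable[int], threads: int = 1) -> List[int]:
    """Run func over items serially when threads <= 1, otherwise in a process pool."""
    work = list(items)
    if threads <= 1:
        # Serial path: any exception is raised directly in the calling process.
        return [func(item) for item in work]
    # Parallel path: func and each item are pickled and sent to worker processes.
    with multiprocessing.Pool(processes=threads) as pool:
        return pool.map(func, work)


if __name__ == "__main__":
    print(run_jobs(square, range(5), threads=1))
    print(run_jobs(square, range(5), threads=2))
```

The parallel path only works when the worker function and its arguments can be pickled, which is the kind of constraint the note about the pickle library and the cli library refers to.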
- added multiprocessing
- converted from python 2.7 to python 3