pandasVCF

VCF parser using the Python pandas library for interactive analysis

Update: March 9 2016

VCF header parsing requires the `tabix -H` command, which is broken in tabix version 1.2.X. Please update your tabix version to 1.3.
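To illustrate what header parsing through `tabix -H` can look like, here is a minimal sketch. It is not pandasVCF's own code: the function names `parse_header_lines` and `read_vcf_header` are hypothetical, and the sketch assumes a bgzip-compressed VCF and a working `tabix` binary on the `PATH`.

```python
import subprocess


def parse_header_lines(header_text):
    """Split raw VCF header text into '##' meta lines and sample names.

    Hypothetical helper for illustration, not part of pandasVCF.
    """
    meta_lines = []
    samples = []
    for line in header_text.splitlines():
        if line.startswith("##"):
            meta_lines.append(line)
        elif line.startswith("#CHROM"):
            # The first nine columns are fixed by the VCF format
            # (CHROM through FORMAT); anything after them is a sample name.
            columns = line.lstrip("#").split("\t")
            samples = columns[9:]
    return meta_lines, samples


def read_vcf_header(vcf_path):
    """Run `tabix -H` on vcf_path and parse the header it prints.

    Assumes tabix is installed; raises CalledProcessError if it fails.
    """
    result = subprocess.run(
        ["tabix", "-H", vcf_path],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_header_lines(result.stdout)
```

Keeping the parsing separate from the `tabix` call means the parsing step can be checked on a plain string, without tabix installed.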

Update: August 21 2015

pandasVCF handles both multi-sample and single-sample VCF files. Please see ipynb/ for usage. pandasVCFmulti and pandasVCFsingle are now deprecated.

Update: February 12 2015

pandasVCFmulti now handles both multi-sample and single-sample VCF files. Please see http://nbviewer.ipython.org/github/erscott/pandasVCF/blob/master/ipynb/multi_sample_ex.ipynb for usage. pdVCFsingle.py is now deprecated and will be removed in the near future.

Command line support will be added in the near future.

Update: Nov 10 2014

pdVCFsingle.py can now parse a dataframe with a single individual, either from a multi-sample VCF or a single-sample VCF. Missing genotype calls marked with '.' are dropped when add_variant_annotations is called. 100,000 variants are parsed in ~10 sec.
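As a rough illustration of dropping missing genotype calls for one individual, the sketch below uses plain pandas. It does not reproduce `add_variant_annotations`. The helper `drop_missing_genotypes`, the sample name `NA12878`, and the inline VCF records are made up for the example, and it assumes the genotype (GT) is the first colon-separated field of the sample column.

```python
import io

import pandas as pd

# Tiny made-up single-sample VCF body, for illustration only.
VCF_TEXT = (
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tNA12878\n"
    "1\t100\t.\tA\tG\t50\tPASS\t.\tGT\t0/1\n"
    "1\t200\t.\tC\tT\t50\tPASS\t.\tGT\t./.\n"
    "1\t300\t.\tG\tA\t50\tPASS\t.\tGT\t.\n"
    "1\t400\t.\tT\tC\t50\tPASS\t.\tGT\t1/1\n"
)


def drop_missing_genotypes(df, sample):
    """Return rows of df whose genotype for `sample` is not missing.

    Hypothetical helper: treats '.', './.' and '.|.' as missing calls.
    """
    # Take the first colon-separated field of the sample column as GT.
    genotype = df[sample].str.split(":").str[0]
    missing = genotype.isin([".", "./.", ".|."])
    return df[~missing].reset_index(drop=True)


# Read every column as a string and keep '.' as a literal value.
df = pd.read_csv(
    io.StringIO(VCF_TEXT), sep="\t", dtype=str, keep_default_na=False
)
called = drop_missing_genotypes(df, "NA12878")
print(called[["POS", "NA12878"]])
```

Only the two records with called genotypes (positions 100 and 400) remain in `called`.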
