A VCF filter for WGS data to filter FPs without losing too many TPs

genomicsITER/FPfilter

 
 

#FPfilter

  • FPfilter is a false-positive-specific filter for VCF files from whole-genome sequencing (WGS). It has been shown to be more false-positive specific than the GATK hard filter.
##Publications

##Test Datasets

  • The test vcf file "PE150-example.vcf" is provided in the script folder.
##Setup

  • To run FPfilter, you need Anaconda installed on your Linux system.
  • Then get FPfilter from binstar (now anaconda.org) and create its running environment with:
  • conda create -c yuxiang fpfilter -n $env_name #(Recommendation: use FPF with version number as the env_name.)
  • Once the environment is set up correctly, conda prints how to activate it. For example: source activate $env_name
##Input Data Format

  • A regular VCF file, such as one produced by GATK.
##How to Run

  • Before running FPfilter, activate the FPfilter environment:
  • source activate $env_name #(The one used in the setup step)
  • FPfilter runs with a single command: "FPfilter -v vcf-file -p FPfilter_path".
  • For details of all parameters, see the help information (-h) of your FPfilter version, which has the latest updates.
  • *Note1: the output directory is the folder in which the vcf file is located. To avoid runtime errors, make that folder the current working folder before running FPfilter.*
  • *Note2: the FPfilter_path is generally path_of_miniconda2/envs/env_name/lib/FPfilter/. You can find path_of_miniconda2 by running "which FPfilter" and changing "bin" to "lib" in the resulting path.*
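
Notes 1 and 2 can be combined into a small helper. The following is a minimal sketch, not part of FPfilter: the function names derive_fpfilter_path and run_fpfilter_in_vcf_dir are hypothetical, and passing only the file name to -v after changing into the vcf folder is an assumption based on Note1.

```shell
# Hypothetical helpers; only the FPfilter command and its -v/-p options
# come from this README.

# Note2: take the path printed by "which FPfilter", swap the trailing
# "bin" folder for "lib", and append "FPfilter/".
derive_fpfilter_path() (
    env_dir=$(dirname "$(dirname "$1")")   # .../envs/env_name
    printf '%s/lib/FPfilter/\n' "$env_dir"
)

# Note1: run FPfilter from the folder that contains the vcf file.
# The function body is a subshell, so the caller's working folder
# is left unchanged.
run_fpfilter_in_vcf_dir() (
    vcf=$1
    exe=$(command -v FPfilter) || {
        echo "FPfilter not found; activate the environment first" >&2
        exit 1
    }
    scripts=$(derive_fpfilter_path "$exe")
    cd "$(dirname "$vcf")" || exit 1
    # Assumption: the file name alone is enough once we are in its folder.
    FPfilter -v "$(basename "$vcf")" -p "$scripts"
)
```

With the environment activated, a call such as run_fpfilter_in_vcf_dir /data/run1/PE150-example.vcf (the /data/run1 folder is a placeholder) would write the output next to the input file.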
##Parameters
###All the parameters are required.

  • -h help
  • -v The vcf file as input to be filtered *[No default value]
  • -p The path of FPfilter scripts *[No default value]
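
Because neither -v nor -p has a default value, it can help to check both before launching a run. The sketch below is only an illustration: check_fpfilter_args is a hypothetical helper, not an FPfilter command, and it assumes that -v points at a regular file and -p at a folder.

```shell
# Hypothetical pre-flight check; FPfilter itself does not provide it.
# Usage: check_fpfilter_args <vcf-file> <FPfilter_path>
check_fpfilter_args() (
    vcf=${1-}
    scripts=${2-}
    if [ -z "$vcf" ] || [ -z "$scripts" ]; then
        echo "usage: check_fpfilter_args <vcf-file> <FPfilter_path>" >&2
        exit 2
    fi
    # -v must name an existing file.
    [ -f "$vcf" ] || { echo "vcf file not found: $vcf" >&2; exit 1; }
    # -p must name an existing folder (assumed layout: lib/FPfilter/).
    [ -d "$scripts" ] || { echo "scripts folder not found: $scripts" >&2; exit 1; }
)
```

The function returns 0 when both values look usable, so it can guard the real call: check_fpfilter_args vcf-file FPfilter_path && FPfilter -v vcf-file -p FPfilter_path.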
##Output
###Final Output Structure

  • ${input_vcf}_TP_after_filtered.vcf: the "TP" (true positive) locations that passed the filter.
  • ${input_vcf}_filtered_FP.vcf: the "false positive" locations that were filtered out.
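
A quick way to see how a run split the calls is to count the records in each output file. This is a sketch under stated assumptions: count_vcf_records and summarize_fpfilter_output are hypothetical helpers, and the only VCF property relied on is that header lines start with "#". The two file paths are passed in explicitly rather than guessed.

```shell
# Hypothetical helpers for inspecting FPfilter output; not part of FPfilter.

# Count variant records: non-empty lines that do not start with '#'.
count_vcf_records() {
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c '^[^#]' "$1" || true
}

# Print how many records were kept and how many were filtered out.
# $1: the *_TP_after_filtered.vcf file, $2: the *_filtered_FP.vcf file
summarize_fpfilter_output() {
    echo "kept=$(count_vcf_records "$1") removed=$(count_vcf_records "$2")"
}
```

The sum of the two counts can then be compared with the record count of the input vcf as a sanity check.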