Lifts over a Structural Variation VCF file from one reference build to another.

lgmgeo/liftoverSV

liftoverSV:

Lifts over a Structural Variation VCF file from one reference build to another


COMMAND LINE USAGE

   $LIFTOVERSV/bin/liftoverSV -I $INPUT_FILE -C $CHAIN_FILE -O $OUTPUT_FILE

OPTIONS

--BCFTOOLS,-F <File>          The bcftools path
                              See https://samtools.github.io/bcftools/howtos/install.html
                              Default: "bcftools"

--BEDTOOLS,-B <File>          The bedtools path
                              See https://bedtools.readthedocs.io/en/latest/content/installation.html
                              Default: "bedtools"

--CHAIN,-C <File>             The liftover chain file
                              See https://genome.ucsc.edu/goldenPath/help/chain.html for a description of chain files
                              See http://hgdownload.soe.ucsc.edu/downloads.html#terms for where to download chain files.
                              Required

--help,-h <Boolean>           Display the help message
                              Default value: false. Possible values: {true, false}

--INPUTFILE,-I <File>         The SV VCF input file
                              Required

--LIFTOVER,-L <File>          The UCSC Liftover tool path
                              Default: "liftOver"

--OUTPUTFILE,-O <File>        The liftover SV VCF output file
                              Required

--PERCENT,-P <float>          Maximum authorized relative change in length for a lifted SV (e.g. with 0.05, the original and lifted SVLEN must differ by less than 5%)
                              Default value: 0.05

--REFFASTASEQ,-R <File>       The reference sequence (fasta) for the TARGET genome build (i.e., the new one after the liftover)
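
As a rough illustration of how the options above combine, the following Python sketch assembles a liftoverSV command line from the documented short flags. The helper name `build_liftoversv_command`, its parameters and the file names are hypothetical and are not part of liftoverSV; only the flags listed above are used.

```python
import os


def build_liftoversv_command(input_vcf, chain_file, output_vcf,
                             percent=None, ref_fasta=None, install_dir=None):
    """Return the argument list for a liftoverSV run (hypothetical helper)."""
    # $LIFTOVERSV is the installation directory shown in the usage line above.
    if install_dir is None:
        install_dir = os.environ.get("LIFTOVERSV", "")
    cmd = [
        os.path.join(install_dir, "bin", "liftoverSV"),
        "-I", input_vcf,    # --INPUTFILE (required)
        "-C", chain_file,   # --CHAIN (required)
        "-O", output_vcf,   # --OUTPUTFILE (required)
    ]
    if percent is not None:
        cmd += ["-P", str(percent)]   # --PERCENT (default 0.05)
    if ref_fasta is not None:
        cmd += ["-R", ref_fasta]      # --REFFASTASEQ
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`; building a list rather than a single string avoids shell quoting issues with file paths.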

How to cite?

Please cite the following DOI if you use this tool in your research:
DOI

liftoverSV:

  • Lift over #CHROM and POS

  • Lift over INFO/END and INFO/SVEND
    => drop the SV if:

    • Case1: one position (start or end) is lifted while the other is not
    • Case2: one position (start or end) maps to a different chromosome than the other
    • Case3: "lifted start" > "lifted end"
    • Case4: the distance between the two lifted positions changes significantly (the original and lifted SVLENs differ by more than 5%, the default value of --PERCENT)
  • Lift over INFO/SVLEN, INFO/SVSIZE (for deletion, duplication, insertion and inversion)

  • The structured ##contig header lines list all the contig IDs (additional optional attributes are not included)
    e.g. ##contig=<ID=chr22>
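
The four drop cases listed above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the tool's actual implementation: the function name `drop_reason`, the `(chrom, pos)` tuple representation, and the exact formula used for the length comparison (relative change of end minus start) are all assumptions.

```python
def drop_reason(start, end, lifted_start, lifted_end, max_percent=0.05):
    """Return the name of the drop case that applies, or None to keep the SV.

    start, end: (chrom, pos) tuples on the source build.
    lifted_start, lifted_end: (chrom, pos) tuples on the target build,
    or None when the position could not be lifted.
    """
    # Case 1: one position is lifted while the other is not.
    # Assumption: an SV with neither position lifted is dropped as well.
    if lifted_start is None or lifted_end is None:
        return "case1"
    # Case 2: the two positions land on different chromosomes.
    if lifted_start[0] != lifted_end[0]:
        return "case2"
    # Case 3: the lifted start is located after the lifted end.
    if lifted_start[1] > lifted_end[1]:
        return "case3"
    # Case 4: the length changes by more than max_percent (assumed formula).
    old_len = end[1] - start[1]
    new_len = lifted_end[1] - lifted_start[1]
    if old_len > 0 and abs(new_len - old_len) / old_len > max_percent:
        return "case4"
    return None
```

For example, an SV spanning chr1:100-200 that lifts to chr1:1100-1220 goes from a length of 100 to 120, a 20% change, and would be dropped under Case4 with the default threshold.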

Requirements

a) The UCSC Liftover tool (required)

The UCSC Liftover tool needs to be locally installed.
https://hgdownload.cse.ucsc.edu/admin/exe/linux.x86_64/liftOver

b) bcftools

The “bcftools” toolset needs to be locally installed to sort the VCF output file.

c) bedtools (will be required in a future release)

The “bedtools” toolset will need to be locally installed to lift over sequences (e.g. ACGGTTG]chr1:12569863]).
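
Before a run, it can be convenient to confirm that the external tools are reachable. The sketch below uses Python's standard `shutil.which` to look up the default executable names given in the OPTIONS section ("liftOver" and "bcftools"); the helper name `find_missing_tools` is hypothetical, and bedtools is left out of the defaults because it is not required yet.

```python
import shutil


def find_missing_tools(tools=("liftOver", "bcftools")):
    """Return the tool names that cannot be found on the PATH."""
    # shutil.which returns None when no matching executable is found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found on the PATH.")
```

If a tool is installed outside the PATH, its location can instead be given to liftoverSV with the corresponding option (--LIFTOVER, --BCFTOOLS or --BEDTOOLS).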

SV VCF format: Documentation

cf https://samtools.github.io/hts-specs/VCFv4.4.pdf
See section 3: "INFO keys used for structural variants"

Features requested for a future release

  • Check INFO/CIPOS and INFO/CIEND
    => POS - CIPOS > 0
    => END + CIEND < chrom_length

  • Lift over INFO/MEINFO and INFO/METRANS

  • Lift over INFO/HOMLEN and INFO/HOMSEQ

  • Lift over ALT when described with square bracket notation. For example, G]17:198982] or ]chr1:3000]A
    cf variantextractor for the notation rules
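
The requested INFO/CIPOS and INFO/CIEND check could look like the Python sketch below. It is not implemented in liftoverSV; the function name `ci_within_bounds` is hypothetical, and the code assumes the usual VCF convention in which CIPOS and CIEND are (lower, upper) offsets around POS and END, the lower offset being zero or negative (e.g. CIPOS=-10,10). Under that convention, "POS - CIPOS > 0" is read here as POS plus the lower offset remaining positive.

```python
def ci_within_bounds(pos, end, cipos, ciend, chrom_length):
    """Check that the confidence intervals stay inside the chromosome.

    cipos, ciend: (lower_offset, upper_offset) pairs, as in CIPOS=-10,10.
    """
    # The lowest possible start must remain a valid 1-based position.
    lowest_start = pos + cipos[0]
    # The highest possible end must remain before the chromosome end.
    highest_end = end + ciend[1]
    return lowest_start > 0 and highest_end < chrom_length
```

For instance, with POS=5 and CIPOS=-10,10 the lowest possible start is -5, so the check fails.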
