wharvey31/denovo_sv_validation

SV Validation Pipeline

This is a Snakemake project for validating de novo structural variants (SVs). The Snakefile is located in the workflow directory.

Setup

The pipeline starts from a pedigree file (e.g. .test/pedigree.tab) containing the IDs of the parents and child. From there, it can take raw reads, assemblies aligned to a reference, intersection files generated by SVPOP (https://github.com/EichlerLab/svpop), and region bed files to determine de novo variants. The config structure is explained in config/README.md.

It uses a combination of SUBSEQ calls (extracting reads and comparing their lengths) and multi-sequence alignments to determine whether variants proposed to be de novo are supported by other callers or are seen in the parents.
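The length-comparison idea can be illustrated with a minimal sketch. This is not the pipeline's own code: the function names, the signed `sv_len` convention (negative for deletions, positive for insertions), the `tolerance` fraction, and the `min_child_reads` threshold are all assumptions made for illustration.

```python
def supports_sv(read_len, ref_len, sv_len, tolerance=0.5):
    """Return True if a read's extracted subsequence differs in length from the
    reference window by roughly sv_len (hypothetical convention: negative for
    deletions, positive for insertions)."""
    if sv_len == 0:
        return False
    observed = read_len - ref_len
    same_direction = observed * sv_len > 0
    close_enough = abs(observed - sv_len) <= tolerance * abs(sv_len)
    return same_direction and close_enough


def looks_de_novo(child_lens, parent_lens, ref_len, sv_len, min_child_reads=2):
    """Flag a candidate as de novo-like when enough child reads support the
    length change and no parental read does (thresholds are assumptions)."""
    child_support = sum(supports_sv(n, ref_len, sv_len) for n in child_lens)
    parent_support = sum(supports_sv(n, ref_len, sv_len) for n in parent_lens)
    return child_support >= min_child_reads and parent_support == 0
```

Here `child_lens` and `parent_lens` stand for the lengths of the read segments extracted over the candidate region; how the pipeline actually extracts and scores them is described by its own rules under workflow.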

This is important because some variants escape detection by traditional callers, and manual inspection of the reads can be tedious and impractical for large callsets.

Input Bed File

The input bed file must have a header and be of the format:

#CHROM	POS	END	ID	SVTYPE
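Before running the pipeline, it can help to check that the bed file carries the expected header. The following is a minimal sketch, assuming the file is tab-separated and that the first five columns must match the header above exactly; the function name `read_sv_bed` is hypothetical and not part of the pipeline.

```python
import csv

EXPECTED_HEADER = ["#CHROM", "POS", "END", "ID", "SVTYPE"]


def read_sv_bed(handle):
    """Parse a tab-separated bed file with the documented header.

    Returns a list of dicts, one per variant, with POS and END as integers.
    Raises ValueError if the header or a row does not match the format.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None or header[:5] != EXPECTED_HEADER:
        raise ValueError(f"expected header {EXPECTED_HEADER}, got {header}")
    records = []
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        if len(row) < 5:
            raise ValueError(f"line {line_no}: expected 5 columns, got {len(row)}")
        record = dict(zip(EXPECTED_HEADER, row))
        record["POS"] = int(record["POS"])
        record["END"] = int(record["END"])
        records.append(record)
    return records
```

A file handle opened in text mode can be passed directly, for example `read_sv_bed(open("candidates.bed"))`, where `candidates.bed` is a placeholder name.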

Output

The output is a TSV containing the bed ID and a summary of the validation metrics for each step, as well as a final validation call in the VAL_DNSV column.
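As a quick way to inspect the results, the final calls can be tallied from the output TSV. This sketch assumes only that the file has a header row containing a VAL_DNSV column; it makes no assumption about which values that column takes, and the function name is hypothetical.

```python
import csv
from collections import Counter


def summarize_validation(handle, column="VAL_DNSV"):
    """Count how often each value appears in the final validation column."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in output header")
    return Counter(row[column] for row in reader)


# Usage (the file name is a placeholder):
# with open("validation_output.tsv") as fh:
#     print(summarize_validation(fh))
```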
