andreasKroepelin/SNP_Evaluation

SNP Evaluation

This tool lets you have a deeper look at your sequenced (bacterial) genomes.

The idea is that you load a VCF file as well as a SNP Table and then choose, from a wide range of options, which properties of the SNPs you are interested in. The tool then lets you explore these properties in detail.

The SNP Table can be obtained by using the Multi VCF Analyzer.

Known issues

Lesley Sitter developed a script to fix an issue with this tool:

This tool is used to correct VCF files generated by UnifiedGenotyper (UG) from references containing N's. UG ignores N's and does not create an entry for them in the .vcf file. Tools like SNP Evaluation that rely on a complete record of all positions will produce frameshifts when this happens. To correct for this, the script adds dummy lines in areas where UG did not generate VCF entries.
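The gap-filling idea described above can be sketched in a few lines. This is a minimal Python illustration, not the actual R script: the function names `fill_missing_positions` and `make_dummy_record` are hypothetical, the layout of the placeholder record (REF set to `N`, all other fields `.`) is an assumption, and the sketch assumes a single-reference VCF sorted by position.

```python
def make_dummy_record(chrom, pos):
    """Build a placeholder VCF record for a position that has no entry.

    The column values are an assumption for illustration; the real
    script may write its dummy lines differently.
    """
    # Eight fixed VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO
    return "\t".join([chrom, str(pos), ".", "N", ".", ".", ".", "."])


def fill_missing_positions(vcf_lines):
    """Return VCF lines with placeholder records inserted for skipped positions."""
    out = []
    prev_pos = 0  # VCF positions are 1-based, so 0 means "nothing seen yet"
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            out.append(line)  # header and blank lines pass through unchanged
            continue
        fields = line.split("\t")
        chrom, pos = fields[0], int(fields[1])
        # Insert one placeholder per position skipped since the last record.
        for missing in range(prev_pos + 1, pos):
            out.append(make_dummy_record(chrom, missing))
        out.append(line)
        prev_pos = pos
    return out


example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "ref\t1\t.\tA\t.\t50\t.\t.",
    "ref\t4\t.\tG\tT\t60\t.\t.",
]
fixed = fill_missing_positions(example)
```

In this example, positions 2 and 3 have no record, so two placeholder lines are inserted between the records for positions 1 and 4. Positions missing after the last record cannot be filled this way, because the sketch has no information about the genome length.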

Cases that are not handled:

  • Multi-reference VCF files, as SNP Evaluation cannot handle these due to the frameshift
  • N's at the end of a genome, as the script does not know how long the genome is. The header contains metadata on this, but it is not standardized.
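Before running the correction, it may help to check whether a VCF file falls into the first unhandled case. The following Python sketch is an assumption-laden illustration, not part of the repository: the helper names `reference_names` and `is_multi_reference` are hypothetical, and a file is treated as multi-reference when its records carry more than one distinct value in the CHROM column.

```python
def reference_names(vcf_lines):
    """Return the distinct CHROM values in order of first appearance."""
    names = []
    for line in vcf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom = line.split("\t", 1)[0]
        if chrom not in names:
            names.append(chrom)
    return names


def is_multi_reference(vcf_lines):
    """True if the records refer to more than one reference sequence."""
    return len(reference_names(vcf_lines)) > 1


single = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "ref1\t1\t.\tA\t.\t50\t.\t.",
    "ref1\t2\t.\tC\t.\t50\t.\t.",
]
multi = single + ["ref2\t1\t.\tG\t.\t50\t.\t."]

print(is_multi_reference(single))
print(is_multi_reference(multi))
```

To check a file on disk, the same function can be given an open file handle, for example `with open(path) as handle: is_multi_reference(handle)`, where `path` is whatever VCF file you want to inspect.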

Using the script should be easy: invoke Rscript with the script location, followed by the VCF files you want to correct.

Example: Rscript /path/to/VCF_N_corrector.R file1.vcf file2.vcf

The output file uses the original name with "Ncorrected." added to mark it as the corrected file, and it is saved in the same location as the original VCF.

You can find this script in this repository, as well as some slides by Lesley Sitter that explain the problem in more detail.
