Skip to content

weikevinhc/k-seek

Repository files navigation

k-seek

Version 4. de novo identification and quantification of satellite DNA from illumina whole genome sequence data.

k.seek.pl identifies and quantifies satellite DNAs (tandemly repeating simple sequences) where the base motif is <= 20bp and generates two outputs .rep and .total. The former contains the the fastq reads from which the repeat was called with an additional 5th line documenting the the called repeat and the number of occurences. The latter contains the summation for all the called repeats.

Usage: perl k.seek.pl input.fastq output_basename

k.compiler.pl will concatenate multiple .total files in a folder into a table where each row is a sample and each column a repeat. Different phases and reverse compliments of the same repeat (e.g AAC, ACA, CAA, TTG, etc) will be summed. The .toal files need to be put into a folder without other files. The output will be .rep.compiled

Usage: perl k.compile.pl folder/ output_basename

For reference see: https://www.ncbi.nlm.nih.gov/pubmed/25512552

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages