find likely coding segments in DNA using composition-normalised hexamer tables
License
richarddurbin/hexamer
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
hexamer and hextable -------------------- hextable makes files of statistics that hexamer uses to scan for likely coding regions.The principle is to use 6mers, but to avoid deriving any information from base composition. I therefore normalise the frequencies of each 6mer by dividing by the total frequency of all 6mers with the same base composition. The input of hextable is a fasta file of coding sequences in frame. The -o file output is an ascii list of 4096 floating point numbers giving log likelihood ratio scores in bits. The output on stdout is a summary of the information content of the table, indicating how disriminative it is likely to be. The output of hexamer is in GFF format (http://www.sanger.ac.uk/Users/rd/gff.html). Type "make" to build the programs, and "make clean" to remove them. Example usage: hextable -o worm.hex worm.coding hexamer -T 20 worm.hex AH6.dna NB these programs assume all a,c,g,t. n's found in sequences are converted to c. Richard Durbin (rd@sanger.ac.uk) 9/95-4/98 PS 30/3/99 The original version of hexamer had some initialisation bugs, which have been fixed today.
About
find likely coding segments in DNA using composition-normalised hexamer tables
Resources
License
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published