Martin Asser Hansen edited this page Oct 2, 2015 · 5 revisions

Biopiece: calc_bit_scores

Description

calc_bit_scores calculates a bit score for each column of the aligned sequences in the stream, based on the residue counts in that column. The bit scores are calculated using Shannon's general formula for uncertainty, as documented at:

http://www.ccrnp.ncifcrf.gov/~toms/paper/hawaii/latex/node5.html

The maximum bit score is 2 and 4 for nucleotide and protein sequences, respectively.
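
Written out, the calculation for one alignment column can be sketched as below. The symbols are assumptions introduced here for illustration: n(b, l) is the count of residue b in column l, N is the number of sequences in the alignment, and R_max is the maximum bit score stated above (2 for nucleotides). Dividing by N and leaving gaps out of the sum is inferred from the example output in the Examples section, not taken from the calc_bit_scores source.

```latex
\begin{align}
f(b, l) &= \frac{n(b, l)}{N} \\
H(l)    &= -\sum_{b} f(b, l) \, \log_2 f(b, l) \\
R(l)    &= R_{\max} - H(l)
\end{align}
```

For a nucleotide column holding C, C, C, T, C this gives H = -(0.8 log2 0.8 + 0.2 log2 0.2) ≈ 0.72 and R ≈ 2 - 0.72 = 1.28.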

Usage

... | calc_bit_scores [options]

Options

[-?         | --help]               #  Print full usage description.
[-I <file!> | --stream_in=<file!>]  #  Read input from stream file   -  Default=STDIN
[-O <file>  | --stream_out=<file>]  #  Write output to stream file   -  Default=STDOUT
[-v         | --verbose]            #  Verbose output.

Examples

Consider the following alignment in the file aln.fna in FASTA format:

>test5
---TAACAGGCACT
>test2
-----GAATCGACT
>test1
--CTAGCTTCGACT
>test3
ACGAAACTAGCATC
>test4
----AGCATCGACT

To calculate the bit scores from the above alignment, read it in with read_fasta and pipe the stream through calc_bit_scores:

read_fasta -i aln.fna | calc_bit_scores | write_tab -x

1.54    1.54    1.07    1.01    1.74    1.03    1.28    1.03    0.63    1.03    1.03    2.00    1.28    1.28
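
The scores above can be reproduced by hand with a short Python sketch. It is not the calc_bit_scores implementation: the function name `bit_scores` and its parameters are hypothetical, and the gap handling (gaps add no term to the sum, but frequencies are still divided by the total number of sequences) is an assumption chosen because it matches the output shown.

```python
import math
from collections import Counter


def bit_scores(seqs, max_bits=2.0, gap_chars="-"):
    """Return one bit score per alignment column.

    Assumptions (inferred from the documented output, not from the tool):
    - max_bits is 2.0 for nucleotide alignments,
    - gap characters contribute no term to the uncertainty,
    - frequencies are counts divided by the total number of sequences.
    """
    n = len(seqs)
    scores = []
    for column in zip(*seqs):  # iterate over alignment columns
        counts = Counter(c.upper() for c in column if c not in gap_chars)
        # Shannon uncertainty of the column, in bits
        uncertainty = -sum((k / n) * math.log2(k / n) for k in counts.values())
        scores.append(max_bits - uncertainty)
    return scores


# The aln.fna alignment from this page, in file order
alignment = [
    "---TAACAGGCACT",  # test5
    "-----GAATCGACT",  # test2
    "--CTAGCTTCGACT",  # test1
    "ACGAAACTAGCATC",  # test3
    "----AGCATCGACT",  # test4
]

print("\t".join(f"{s:.2f}" for s in bit_scores(alignment)))
```

With these assumptions, the printed line contains the same fourteen values as the calc_bit_scores output above.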

See also

read_fasta

write_tab

create_weight_matrix

plot_seqlogo

Author

Martin Asser Hansen - Copyright (C) - All rights reserved.

mail@maasha.dk

August 2007

License

GNU General Public License version 2

http://www.gnu.org/copyleft/gpl.html

Help

calc_bit_scores is part of the Biopieces framework.

http://www.biopieces.org
