This program shows some examples of how to use the scores in pyMSA.


Our multiple sequence alignment is represented as a list of (identifier, sequence) pairs. In this example we create the structure manually, but you could also use the `read_fasta_file_as_list_of_pairs()` function from `pymsa.util`.

In [2]:
msa = [("id1", "ACTG"), ("id2", "A-T-")]
print(msa)

[('id1', 'ACTG'), ('id2', 'A-T-')]
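To make the list-of-pairs structure concrete, here is a minimal sketch that builds the same list from FASTA-formatted text in plain Python. The function `parse_fasta` and the variable names are hypothetical and written only for illustration; this is not how pyMSA's own reader is implemented, and it keeps the whole header line as the identifier.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (identifier, sequence) pairs.

    Hypothetical helper for illustration only, not part of pyMSA.
    """
    pairs = []
    identifier, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if identifier is not None:
                pairs.append((identifier, "".join(chunks)))
            identifier, chunks = line[1:].strip(), []
        elif identifier is not None:
            # Sequence data may span several lines; collect the pieces.
            chunks.append(line)
    if identifier is not None:
        pairs.append((identifier, "".join(chunks)))
    return pairs


fasta_text = ">id1\nACTG\n>id2\nA-T-\n"
msa_from_fasta = parse_fasta(fasta_text)
print(msa_from_fasta)
```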


In [3]:
sequences = list(pair[1] for pair in msa)
print(sequences)

['ACTG', 'A-T-']


Note that every MSA needs to fulfill one requirement:
- All the sequences in an MSA must be aligned (same length).
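The requirement above can be checked before scoring. The following is a small sketch in plain Python; `is_aligned` is a hypothetical helper written for this example, not a pyMSA function.

```python
def is_aligned(sequences):
    """Return True if all sequences have the same length.

    Hypothetical helper for illustration only, not part of pyMSA.
    """
    # Collect the distinct lengths; an aligned MSA has at most one.
    return len({len(seq) for seq in sequences}) <= 1


print(is_aligned(["ACTG", "A-T-"]))  # both sequences have length 4
print(is_aligned(["ACTG", "A-T"]))   # lengths 4 and 3 differ
```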

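pyMSA implements several score methods, including `SumOfPairs`, `Entropy`, `Star`, `PercentageOfNonGaps` and `PercentageOfTotallyConservedColumns`. In this example we will use `SumOfPairs()`.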

In [4]:
from pymsa.score import Score, SumOfPairs, Entropy, Star, \
    PercentageOfNonGaps, PercentageOfTotallyConservedColumns
from pymsa.substitutionmatrix import PAM250, Blosum62 

substitution_matrix = Blosum62(gap_penalty=-8, gap_character='-')
score_method = SumOfPairs(substitution_matrix=substitution_matrix)
result = score_method.compute(sequences)

print("MSA score using Sum of Pairs: ", result)

MSA score using Sum of Pairs:  -7


As we can see, `SumOfPairs()` takes a weight matrix (also called a _substitution matrix_). pyMSA implements two of them: PAM250 (the default) and Blosum62.
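To see where the score of -7 comes from, the sum of pairs can be recomputed by hand: every column contributes the substitution score of each pair of characters in it, and a residue paired with a gap contributes the gap penalty. The sketch below is plain Python and does not use pyMSA. It hard-codes only the two BLOSUM62 entries this toy alignment needs (A/A = 4, T/T = 5), and scoring a gap-gap pair as 0 is an assumption made here that does not affect this example, since no column contains two gaps.

```python
from itertools import combinations

GAP = "-"
GAP_PENALTY = -8

# Only the BLOSUM62 entries needed for this toy alignment.
BLOSUM62_SUBSET = {("A", "A"): 4, ("T", "T"): 5}


def pair_score(a, b):
    """Score one pair of characters taken from the same column."""
    if a == GAP and b == GAP:
        return 0  # assumption for this sketch; never reached in the example
    if a == GAP or b == GAP:
        return GAP_PENALTY
    # The matrix is symmetric, so look the pair up in sorted order.
    return BLOSUM62_SUBSET[tuple(sorted((a, b)))]


def sum_of_pairs(sequences):
    """Add up the pair scores over every column of the alignment."""
    total = 0
    for column in zip(*sequences):
        for a, b in combinations(column, 2):
            total += pair_score(a, b)
    return total


sequences = ["ACTG", "A-T-"]
total = sum_of_pairs(sequences)
print("Sum of pairs computed by hand:", total)  # 4 - 8 + 5 - 8 = -7
```

Column by column, the pairs are A/A (4), C/gap (-8), T/T (5) and G/gap (-8), which add up to the -7 reported above.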