Integer overflow bug in AbstractMatrixAligner #202

Closed
josemduarte opened this issue Nov 6, 2014 · 0 comments · Fixed by #203
Labels
bug Bugs and bugfixes

Comments

@josemduarte
Contributor

There seems to be an integer overflow bug in AbstractMatrixAligner: the type used for the max and min scores is short. Whenever a score exceeds the range of a short (anything above 32767), it overflows and wraps around to a negative value.
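As a minimal, standalone illustration of this kind of wrap-around (not the BioJava code itself): the class `ShortAccumulation` and its methods are hypothetical names, and the per-position score of 5 and the lengths are made-up inputs. Note that in Java a compound assignment such as `+=` on a short narrows the result silently, so the compiler gives no warning.

```java
class ShortAccumulation {

    // Sums matchScore over `length` positions in a short accumulator.
    static short sumAsShort(int length, int matchScore) {
        short total = 0;
        for (int i = 0; i < length; i++) {
            // Compound assignment implicitly casts the int result back to short,
            // so the value wraps once it passes Short.MAX_VALUE (32767).
            total += matchScore;
        }
        return total;
    }

    // Same sum, but accumulated in an int.
    static int sumAsInt(int length, int matchScore) {
        int total = 0;
        for (int i = 0; i < length; i++) {
            total += matchScore;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumAsShort(6000, 5)); // 30000, still fits in a short
        System.out.println(sumAsShort(7000, 5)); // -30536, because 35000 wraps around
        System.out.println(sumAsInt(7000, 5));   // 35000
    }
}
```
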

Code to reproduce:

import org.biojava3.alignment.NeedlemanWunsch;
import org.biojava3.alignment.SimpleGapPenalty;
import org.biojava3.alignment.SubstitutionMatrixHelper;
import org.biojava3.alignment.template.SubstitutionMatrix;
import org.biojava3.core.sequence.DNASequence;
import org.biojava3.core.sequence.compound.AmbiguityDNACompoundSet;
import org.biojava3.core.sequence.compound.NucleotideCompound;

public class Test {

    public static void main (String[] args) throws Exception {

        SubstitutionMatrix<NucleotideCompound> matrix = SubstitutionMatrixHelper.getNuc4_4();
        SimpleGapPenalty gap = new SimpleGapPenalty();

        String str1 =
        "AGATATATCTGAAGCTTAAAGGGCAGTGACAATGGCTGGCTCGGTTAACGGGAATCATAGTGCTGTAGGACCTGGTATAAATTATGAGACGGTGTCTCAAGTGGATGAGTTCTGTAAAGCACTTAGAGGGAAAAGGCCGATCCATAGTATTTTGATAGCTAACAATGGAATGGCGGCTGTGAAGTTTATACGTAGTGTCAGAACATGGGCTTATGAAACATTTGGTACGGAAAAAGCCATATTGTTGGTGGGGATGGCAACCCCTGAAGACATGCGGATCAATGCGGAGCATATCAGAATCGCTGATCAGTTTGTTGAGGTTCCCGGAGGAACCAACAATAACAATTATGCTAACGTTCAGCTGATTGTGGAGATGGCTGAAGTAACACGCGTGGATGCAGTTTGGCCTGGTTGGGGTCATGCATCTGAAAACCCCGAATTACCTGATGCCCTAGATGCAAAAGGAATCATATTTCTTGGTCCTCCAGCATCTTCAATGGCAGCACTGGGAGATAAGATTGGTTCTTCGTTGATTGCACAAGCTGCTGATGTACCCACTCTGCCATGGAGTGGTTCCCATGTTAAAATACCTCCTAATAGCAACTTGGTAACCATCCCAGAGGAGATCTACCGGCAAGCATGTGTCTACACAACTGAAGAAGCGATTGCTAGCTGTCAAGTTGTCGGTTACCCAGCAATGATCAAAGCATCGTGGGGTGGTGGTGGTAAAGGAATCAGGAAGGTTCATAATGATGATGAGGTTAGGGCTCTATTCAAGCAAGTTCAGGGTGAGGTCCCAGGCTCACCAATATTCATAATGAAGGTTGCGTCACAGAGTCGGCATCTAGAGGTCCAGCTGCTCTGTGACAAGCATGGAAATGTTTCAGCTCTGCATAGCCGTGATTGTAGCGTCCAGAGAAGACATCAAAAGATCATAGAGGAGGGTCCAATTACTGTGGCTCCGCCAGAAACTGTCAAGAAACTTGAACAAGCAGCTAGAAGGTTGGCTAAGAGTGTTAACTATGTTGGAGCTGCTACTGTTGAGTATCTCTACAGTATGGACACTGGGGAGTACTACTTCTTAGAGCTTAACCCTCGCTTACAGGTTGAGCATCCTGTCACTGAGTGGATTGCCGAGATAAATCTTCCTGCTGCCCAAGTTGCTGTGGGGATGGGAATTCCTCTCTGGCAAATCCCTGAGATAAGACGGTTCTATGGAATAGAACATGGTGGAGGTTATGATTCTTGGCGAAAAACATCTGTTGTAGCCTTCCCTTTTGATTTTGATAAAGCTCAATCTATAAGGCCAAAAGGTCATTGTGTGGCTGTACGTGTGACAAGTGAGGATCCTGATGACGGGTTCAAACCAACCAGCGGTAGAGTTCAGGAGTTGAGTTTTAAGAGCAAGCCAAATGTGTGGGCGTACTTCTCTGTCAAGTCTGGTGGAGGCATCCACGAGTTCTCGGATTCCCAGTTTGGACATGTTTTTGCATTTGGGGAATCCAGAGCCCTGGCGATAGCGAATATGGTTCTTGGGCTAAAAGAAATTCAGATCCGTGGAGAAATTAGGACTAACGTTGACTACACGATCGACCTTTTACATGCTTCTGATTACCGTGATAACAAAATTCACACTGGTTGGTTGGATAGTAGGATTGCTATGCGGGTCAGAGCTGAGAGGCCTCCATGGTATCTCTCTGTTGTCGGCGGAGCTCTCTATAAAGCATCAGCGACCAGTGCTGCTGTGGTTTCAGATTACGTTGGTTATCTGGAGAAGGGGCAAATCCCTCCAAAGCATATATCTCTTGTACATTCTCAAGTGTCTCTGAATATTGAAGGAAGTAAATATACGATTGATGTAGTCCGGGGTGGATCAGGAACCTACAGGCTAAGAATGAACAAGTCAGAAGTGGTAGCAGAAATACACACTCTACGTGATGGAGGTCTGTTGATGCAGTTGGATGGCAAAAGCCATGTGATATATGCAGAGG
AAGAAGCTGCAGGAACTCGTCTTCTCATTGATGGAAGAACTTGTTTGCTACAGAATGACCACGATCCATCAAAGTTAATGGCTGAGACACCGTGCAAGTTGATGAGGTATTTGATTTCCGACAACAGCAATATTGACGCTGATACGCCTTATGCCGAAGTTGAGGTCATGAAGATGTGCATGCCACTTCTTTCACCTGCTTCAGGAGTTATCCATTTTAAAATGTCTGAAGGACAAGCCATGCAGGCTGGTGAACTTATAGCCAATCTTGATCTTGATGATCCTTCTGCTGTAAGAAAGGCCGAACCCTTCCATGGAAGTTTCCCAAGATTAGGGCTTCCAACTGCAATATCCGGTAGAGTTCATCAGAGATGTGCCGCAACATTAAATGCTGCACGCATGATTCTTGCTGGCTATGAGCATAAAGTAGATGAGGTTGTTCAAGACTTACTTAATTGCCTTGATAGCCCTGAACTCCCATTTCTTCAGTGGCAAGAGTGCTTTGCAGTTCTGGCGACACGACTACCTAAAAATCTCAGGAACATGCTAGAATCAAAGTATAGGGAATTTGAGAGTATTTCCAGAAACTCTTTGACCACCGATTTCCCTGCCAAACTTTTAAAAGGCATTCTTGAGGCACATTTATCTTCTTGTGATGAGAAAGAGAGAGGTGCCCTTGAAAGGCTCATTGAACCATTGATGAGCCTTGCAAAATCTTATGAAGGTGGTAGAGAAAGTCATGCCCGTGTTATTGTTCATTCTCTCTTTGAAGAATATCTATCAGTAGAAGAATTATTCAATGATAACATGCTGGCTGATGTTATAGAACGCATGCGTCAGCTATACAAGAAAGATCTGTTGAAAATTGTGGATATAGTGCTCTCACACCAGGGCATAAAAAACAAAAACAAACTCGTTCTCCGGCTCATGGAGCAGCTTGTTTACCCTAATCCTGCTGCTTACAGAGATAAACTTATTCGATTCTCAACACTTAACCATACTAACTACTCTGAGTTGGCGCTCAAGGCGAGTCAATTACTTGAACAGACCAAACTAAGTGAGCTTCGTTCAAACATTGCTAGAAGCCTTTCAGAGTTAGAAATGTTTACAGAGGACGGAGAAAATATGGATACTCCCAAGAGGAAAAGTGCCATTAATGAAAGAATAGAAGATCTTGTAAGCGCATCTTTAGCTGTTGAAGACGCTCTCGTGGGACTATTTGACCATAGCGATCACACACTTCAAAGACGGGTTGTTGAGACTTATATTCGCAGATTATACCAGCCCTACGTCGTTAAAGATAGCGTGAGGATGCAGTGGCACCGTTCTGGTCTTCTTGCTTCCTGGGAGTTCCTAGAGGAGCATATGGAAAGAAAAAACATTGGCTTAGACGATCCCGACACATCTGAAAAAGGATTGGTTGAGAAGCGTAGTAAGAGAAAATGGGGGGCTATGGTTATAATCAAATCTTTGCAGTTTCTTCCAAGTATAATAAGTGCAGCATTGAGAGAAACAAAGCACAACGACTATGAAACTGCCGGAGCTCCTTTATCTGGCAATATGATGCACATTGCTATTGTGGGCATCAACAACCAGATGAGTCTGCTTCAGGACAGTGGGGATGAAGACCAAGCTCAGGAAAGAGTAAACAAGTTGGCCAAAATTCTTAAAGAGGAAGAAGTGAGTTCAAGCCTCTGTTCTGCCGGTGTTGGTGTAATCAGCTGTATAATTCAGCGAGATGAAGGACGAACACCCATGAGACATTCTTTCCATTGGTCGTTGGAGAAACAGTATTATGTAGAAGAGCCGTTGCTGCGTCATCTTGAACCTCCTCTGTCCATTTACCTTGAGTTGGATAAGCTGAAAGGATACTCAAATATACAATATACGCCTTCTCGAGATCGTCAATGGCATCTGTATACTGTTACAGACAAGCCAGTGCCAATCAAGAGGATGTTCCTGAGATCTCTTGTTCGACAGGCTACAATGAACGATGGATTT
ATATTGCAGCAAGGGCAGGATAAGCAGCTTAGCCAAACACTGATCTCCATGGCGTTTACGTCGAAATGTGTTCTGAGGTCTTTGATGGATGCCATGGAGGAACTGGAACTGAATGCCCATAATGCTGCAATGAAACCAGATCACGCACATATGTTTCTTTGCATATTGCGTGAGCAGCAGATAGATGATCTTGTGCCTTTCCCCAGGAGAGTTGAAGTGAATGCGGAGGATGAAGAAACTACAGTTGAAATGATCTTAGAAGAAGCAGCACGAGAGATACATAGATCTGTTGGAGTGAGAATGCATAGGTTGGGCGTGTGCGAGTGGGAAGTGCGGCTGTGGTTGGTGTCCTCTGGACTGGCATGTGGTGCTTGGAGGGTTGTGGTTGCAAACGTGACAGGCCGTACATGCACTGTCCACATATACCGAGAAGTTGAAACTCCTGGAAGAAACAGTTTAATCTACCACTCAATAACCAAGAAGGGACCTTTGCATGAAACACCAATCAGTGATCAATATAAGCCCCTGGGATATCTCGACAGGCAACGTTTAGCAGCAAGGAGGAGTAACACTACTTATTGCTATGACTTCCCGTTGGCATTTGGGACAGCCTTGGAACTGTTGTGGGCATCACAACACCCAGGAGTTAAGAAACCATATAAGGATACTCTGATCAATGTTAAAGAGCTTGTATTCTCAAAACCAGAAGGTTCTTCGGGTACATCTCTAGATCTGGTTGAAAGACCACCCGGTCTCAACGACTTTGGAATGGTTGCCTGGTGCCTAGATATGTCGACCCCAGAGTTTCCTATGGGGCGGAAACTTCTCGTGATTGCGAATGATGTCACCTTCAAAGCTGGTTCTTTTGGTCCTAGAGAGGACGCGTTTTTCCTTGCTGTTACTGAACTCGCTTGTGCCAAGAAGCTTCCCTTGATTTACTTGGCAGCAAATTCTGGTGCCCGACTTGGGGTTGCTGAAGAAGTCAAAGCCTGCTTCAAAGTTGGATGGTCGGATGAAATTTCCCCTGAGAATGGTTTTCAGTATATATACCTAAGCCCTGAAGACCACGAAAGGATTGGATCATCTGTCATTGCCCATGAAGTAAAGCTCTCTAGTGGGGAAACTAGGTGGGTGATTGATACGATCGTTGGCAAAGAAGATGGTATTGGTGTAGAGAACTTAACAGGAAGTGGGGCCATAGCGGGTGCTTACTCAAAGGCATACAATGAAACTTTTACTTTAACCTTTGTTAGTGGAAGAACGGTTGGAATTGGTGCTTATCTTGCCCGCCTAGGTATGCGGTGCATACAGAGACTTGATCAGCCGATCATCTTGACTGGCTTCTCTACACTCAACAAGTTACTTGGGCGTGAGGTCTATAGCTCTCACATGCAACTGGGTGGCCCGAAAATCATGGGCACAAATGGTGTTGTTCATCTTACAGTCTCAGATGATCTTGAAGGCGTATCAGCAATTCTCAACTGGCTCAGCTACATTCCTGCTTACGTGGGTGGTCCTCTTCCTGTTCTTGCCCCTTTAGATCCACCGGAGAGAATTGTGGAGTATGTCCCAGAGAACTCTTGCGACCCACGAGCGGCTATAGCTGGGGTCAAAGACAATACCGGTAAATGGCTTGGAGGTATCTTTGATAAAAATAGTTTCATTGAGACTCTTGAAGGCTGGGCAAGGACGGTAGTGACTGGTAGAGCCAAGCTCGGGGGAATACCCGTTGGAGTTGTTGCAGTTGAGACACAGACTGTCATGCAGATCATCCCAGCCGATCCTGGACAGCTTGACTCTCATGAAAGAGTGGTTCCGCAAGCAGGGCAAGTCTGGTTTCCTGATTCAGCGGCCAAGACTGCTCAAGCGCTTATGGATTTCAACCGGGAAGAGCTTCCATTGTTTATCCTAGCGAACTGGAGAGGGTTTTCAGGTGGGCAGAGAGATCTTTTCGAAGGAATACTTCAGGCAGGTTCAACTATAGTAGAAAATCTGAGAAC
CTATCGTCAGCCAGTGTTTGTGTACATCCCAATGATGGGAGAGCTGCGCGGTGGAGCGTGGGTTGTTGTTGACAGCCAGATAAATTCGGATTATGTTGAAATGTATGCTGATGAAACAGCTCGTGGAAATGTGCTTGAGCCAGAAGGGACAATAGAGATAAAATTTAGAACAAAAGAGCTATTAGAGTGCATGGGAAGGTTGGACCAGAAGCTAATCAGTCTGAAAGCAAAACTGCAAGATGCCAAGCAAAGCGAGGCCTATGCAAACATCGAGCTTCTCCAGCAACAGATTAAAGCCCGAGAGAAACAGCTTTTACCAGTTTATATCCAAATCGCCACCAAATTTGCAGAACTTCATGACACTTCCATGAGAATGGCTGCAAAGGGAGTGATCAAAAGTGTTGTGGAATGGAGCGGCTCGCGGTCCTTCTTCTACAAAAAGCTCAATAGGAGAATCGCTGAGAGCTCTCTTGTGAAAAACGTAAGAGAAGCATCTGGAGACAACTTAGCATATAAATCTTCAATGCGTCTGATTCAGGATTGGTTCTGCAACTCTGATATTGCAAAGGGGAAAGAAGAAGCTTGGACAGACGACCAAGTGTTCTTTACATGGAAGGACAATGTTAGTAACTACGAGTTGAAGCTGAGCGAGTTGAGAGCGCAGAAACTACTGAACCAACTTGCAGAGATTGGGAATTCCTCAGATTTGCAAGCTCTGCCACAAGGACTTGCTAATCTTCTAAACAAGGTGGAGCCGTCGAAAAGAGAAGAGCTGGTGGCTGCTATTCGAAAGGTCTTGGGTTGACTGA";
        String str2 =
        "TAAAGTCTTCGATATCAGTCAACCCAAGACCTTTCGAATAGCAGCCACCAGCTCTTCTCTTTTCGACGGCTCCACCTTGTTTAGAAGATTAGCAAGTCCTTGTGGCAGAGCTTGCAAATCTGAGGAATTCCCAATCTCTGCAAGTTGGTTCAGTAGTTTCTGCGCTCTCAACTCGCTCAGCTTCAACTCGTAGTTACTAACATTGTCCTTCCATGTAAAGAACACTTGGTCGTCTGTCCAAGCTTCTTCTTTCCCCTTTGCAATATCAGAGTTGCAGAACCAATCCTGAATCAGACGCATTGAAGATTTATATGCTAAGTTGTCTCCAGATGCTTCTCTTACGTTTTTTACAAGAGAGCTCTCAGCGATTCTCCTATTGAGCTTTTTGTAGAAGAAGGACCGCGAGCCGCTCCATTCCACAACACTTTTGATCACTCCCTTTGCAGCCATTCTCATGGAAGTGTCATGAAGTTCTGCAAATTTGGTGGCGATTTGGATATAAACTGGTAAAAGCTGTTTCTCTCGGGCTTTAATCTGTTGCTGGAGAAGCTCGATGTTTGCATAGGCCTCGCTTTGCTTGGCATCTTGCAGTTTTGCTTTCAGACTGATTAGCTTCTGGTCCAACCTTCCCATGCACTCTAATAGCTCTTTTGTTCTAAATTTTATCTCTATTGTCCCTTCTGGCTCGAGCACATTTCCACGAGCTGTTTCATCAGCATACATTTCAACATAATCCGAATTTATCTGGCTGTCAACAACAACCCACGCTCCACCGCGCAGCTCTCCCATCATTGGGATGTACACAAACACTGGCTGACGATAGGTTCTCAGATTTTCTACTATAGTTGAACCTGCCTGAAGTATTCCTTCGAAAAGATCTCTCTGCCCACCTGAAAACCCTCTCCAGTTCGCTAGGATAAACAATGGAAGCTCTTCCCGGTTGAAATCCATAAGTGCTTGAGCAGTCTTGGCCGCTGAATCAGGAAACCAGACTTGCCCTGCTTGCGGAACCACTCTTTCATGAGAGTCAAGCTGTCCAGGATCGGCTGGGATGATCTGCATGACAGTCTGTGTCTCAACTGCAACAACTCCAACGGGTATTCCCCCGAGCTTGGCTCTACCAGTCACTACCGTCCTTGCCCAGCCTTCAAGAGTCTCAATGAAACTATTTTTATCAAAGATACCTCCAAGCCATTTACCGGTATTGTCTTTGACCCCAGCTATAGCCGCTCGTGGGTCGCAAGAGTTCTCTGGGACATACTCCACAATTCTCTCCGGTGGATCTAAAGGGGCAAGAACAGGAAGAGGACCACCCACGTAAGCAGGAATGTAGCTGAGCCAGTTGAGAATTGCTGATACGCCTTCAAGATCATCTGAGACTGTAAGATGAACAACACCATTTGTGCCCATGATTTTCGGGCCACCCAGTTGCATGTGAGAGCTATAGACCTCACGCCCAAGTAACTTGTTGAGTGTAGAGAAGCCAGTCAAGATGATCGGCTGATCAAGTCTCTGTATGCACCGCATACCTAGGCGGGCAAGATAAGCACCAATTCCAACCGTTCTTCCACTAACAAAGGTTAAAGTAAAAGTTTCATTGTATGCCTTTGAGTAAGCACCCGCTATGGCCCCACTTCCTGTTAAGTTCTCTACACCAATACCATCTTCTTTGCCAACGATCGTATCAATCACCCACCTAGTTTCCCCACTAGGGAGCTTTACTTCATGGGCAATGACAGATGATCCAATCCTTTCGTGGTCTTCAGGGCTTAGGTATATATACTGAAAACCATTCTCAGGGGAAATTTCATCCGACCATCCAACTTTGAAGCAGGCTTTGACTTCTTCAGCAACCCCAAGTCGGGCACCAGAATTTGCTGCCAAGTAAATCAAGGGAAGCTTCTTGGCACAAGCGAGTTCAGTAACAGCAAGGAAAAACGCGTCCTCTCTAGGACCAAAAGAACCAGCTTTGAAGGTGACATCATTCGCAA
TCACGAGAAGTTTCCGCCCCATAGGAAACTCTGGGGTCGACATATCTAGGCACCAGGCAACCATTCCAAAGTCGTTGAGACCGGGTGGTCTTTCAACCAGATCTAGAGATGTACCCGAAGAACCTTCTGGTTTTGAGAATACAAGCTCTTTAACATTGATCAGAGTATCCTTATATGGTTTCTTAACTCCTGGGTGTTGTGATGCCCACAACAGTTCCAAGGCTGTCCCAAATGCCAACGGGAAGTCATAGCAATAAGTAGTGTTACTCCTCCTTGCTGCTAAACGTTGCCTGTCGAGATATCCCAGGGGCTTATATTGATCACTGATTGGGGTTTCATGCAAAGGTCCCTTCTTGGTTATTGAGTGGTAGATTAAACTGTTTCTTCCAGGAGTTTCAACTTCTCGGTATATGTGGACAGTGCATGTACGGCCTGTCACGTTTGCAACCACAACCCTCCAAGCACCACATGCCAGTCCAGAGGACACCAACCACAGCCGCACTTCCCACTCGCACACGCCCAACCTATGCATTCTCACTCCAACAGATCTATGTATCTCTCGTGCTGCTTCTTCTAAGATCATTTCAACTGTAGTTTCTTCATCCTCCGCATTCACTTCAACTCTCCTGGGGAAAGGCACAAGATCATCTATCTGCTGCTCACGCAATATGCAAAGAAACATATGTGCGTGATCTGGTTTCATTGCAGCATTATGGGCATTCAGTTCCAGTTCCTCCATGGCATCCATCAAAGACCTCAGAACACATTTCGACGTAAACGCCATGGAGATCAGTGTTTGGCTAAGCTGCTTATCCTGCCCTTGCTGCAATATAAATCCATCGTTCATTGTAGCCTGTCGAACAAGAGATCTCAGGAACATCCTCTTGATTGGCACTGGCTTGTCTGTAACAGTATACAGATGCCATTGACGATCTCGAGAAGGCGTATATTGTATATTTGAGTATCCTTTCAGCTTATCCAACTCAAGGTAAATGGACAGAGGAGGTTCAAGATGACGCAGCAACGGCTCTTCTACATAATACTGTTTCTCCAACGACCAATGGAAAGAATGTCTCATGGGTGTTCGTCCTTCATCTCGCTGAATTATACAGCTGATTACACCAACACCGGCAGAACAGAGGCTTGAACTCACTTCTTCCTCTTTAAGAATTTTGGCCAACTTGTTTACTCTTTCCTGAGCTTGGTCTTCATCCCCACTGTCCTGAAGCAGACTCATCTGGTTGTTGATGCCCACAATAGCAATGTGCATCATATTGCCAGATAAAGGAGCTCCGGCAGTTTCATAGTCGTTGTGCTTTGTTTCTCTCAATGCTGCACTTATTATACTTGGAAGAAACTGCAAAGATTTGATTATAACCATAGCCCCCCATTTTCTCTTACTACGCTTCTCAACCAATCCTTTTTCAGATGTGTCGTGATCGTCTAAGCCAATGTTTTTTCTTTCCATATGCTCCTCTAGGAAATCCCAGGAAGCAAGAAGACCAGAACGGTGCCACTGCATCCTCACGCTATCTTTAACGACGTAGGGCTGGTATAATCTGCGAATATAAGTCTCAACAACCCGTCTTTGAAGTGTGTGATCGCTATGGTCAAATAGTCCCACGAGAGCGTCTTCAACAGCTAAAGATGCGCTTACAAGATCTTCTATTCTTTCATTAATGGCACTTTTCCTCTTGGGAGTATCCATATTTTCTCCGTCCTCTGTAAACATTTCTAACTCTGAAAGGCTTCTAGCAATGTTTGAACGAAGCTCACTTAGTTTGGTCTGTTCAAGTAATTGACTCGCCTTGAGCGCCAACTCAGAGTAGTTAGTATGGTTAAGTGTTGAGAATCGAATAAGTTTATCTCTGTAAGCAGCAGGATTAGGGTAAACAAGCTGCTCCATGAGCCGGAGAACGAGTTTGTTTTTGTTTTTTATGCCCTGGTGTGAGAGCACTATATCCACAATTTTCAACAGATCTTTCTTGTATAGCTGACGC
ATGCGTTCTATAACATCAGCCAGCATGTTATCATTGAATAATTCTTCTACTGATAGATATTCTTCAAAGAGAGAATGAACAATAACACGGGCATGACTTTCTCTACCACCTTCATAAGATTTTGCAAGGCTCATCAATGGTTCAATGAGCCTTTCAAGGGCACCTCTCTCTTTCTCATCACAAGAAGATAAATGTGCCTCAAGAATGCCTTTTAAAAGTTTGGCAGGGAAATCGGTGGTCAAAGAGTTTCTGGAAATACTCTCAAATTCCCTATACTTTGATTCTAGCATGTTCCTGAGATTTTTAGGTAGTCGTGTCGCCAGAACTGCAAAGCACTCTTGCCACTGAAGAAATGGGAGTTCAGGGCTATCAAGGCAATTAAGTAAGTCTTGAACAACCTCATCTACTTTATGCTCATAGCCAGCAAGAATCATGCGTGCAGCATTTAATGTTGCGGCACATCTCTGATGAACTCTACCGGATATTGCAGTTGGAAGCCCTAATCTTGGGAAACTTCCATGGAAGGGTTCGGCCTTTCTTACAGCAGAAGGATCATCAAGATCAAGATTGGCGATAAGTTCACCAGCCTGCATGGCTTGTCCTTCAGACATTTTAAAATGGATAACTCCTGAAGCAGGTGAAAGAAGTGGCATGCACATCTTCATGACCTCAACTTCGGCATAAGGCGTATCAGCGTCAATATTGCTGTTGTCAGAAACCAAATACCTCATCAACTTGCACGGTGTCTCAGCCATTAACTTTGATGGATCATGGTCATTCTGTAGCAAACAAGTTCTTCCATCAATGAGAAGACGAGTTCCTGCAGCTTCTTCCTCTGCATATATCACATGGCTTTTGCCATCCAACTGCATCAACAGACCTCCATCACGTAGAGTGTGTATTTCTGCTACCACTTCTGACTTGTTCATTCTTAGCCTGTAGGTTCCTGATCCACCCCGGACTACATCAATCGTATATTTACTTCCTTCAATATTCAGAGACACTTGAGAATGCACAAGAGATATATGCTTTGGGGGAATTTGCCCCTTTTCTAGATAGCCAACGTAATCCGAAACTACAGCAGAACTGGTCGTAGATGCTTTATAAAGAGCCCCACCGACTACAGAGAGATACCATGGAGGTCTCTCTGCTCTGACCCGCATAGCAATCCTACTGTCCAACCAACCAGTGTGTATTTTGTTTTCCCGGTAATCAGAAGCATGTAGAAGGTCGATCGTGTAGTCAACGTTAGTCCTAATTTCTCCACGGATCTGAATTTCTTTTAGCCCAAGAACCATATTCGCTATCGCCAGGGCTCTGGATTCCCCAAATGCAAAAACATGTCCAAACTGGGAATCCGAGAACTCGTGGATGCCTCCACCAGACTTGACAGAGAAGTACGCCCACACATTTGGCTTGCTCTTAAAACTCAACTCCTGAACTCTACCGCTGGTTGGTTTGAACCCGTCATCAGGATCCTCACTTGTCACACGTACAGCCACACAATGACCTTTTGGCCTTATAGATTGAGCTTTATCAAAATCAAAAGGGAAGGCTACAACAGATGTTTTTCGCCAAGAATCATAACCTCCACCATGTTCTATTCCATAGAACCGTCTTATCTCAGGGATTTGCCAGAGAGGAATTCCCATCCCCACAGCAACTTGGGCAGCAGGAAGATTTATCTCGGCAATCCACTCAGTGACAGGATGCTCAACCTGTAAGCGAGGGTTAAGCTCTAAGAAGTAGTACTCCCCAGTGTCCATACTGTAGAGATACTCAACAGTAGCAGCTCCAACATAGTTAACACTCTTAGCCAACCTTCTAGCTGCTTGTTCAAGTTTCTTGACAGTTTCTGGCGGAGCCACAGTAATTGGACCCTCCTCTATGATCTTTTGATGTCTTCTCTGGACGCTACAATCACGGCTATGCAGAGCTGAAACATTTCCATGCTTGTCACAGAGCAGCTGGACCTCTAGATGCCGACTCTGTGACGCAAC
CTTCATTATGAATATTGGTGAGCCTGGGACCTCACCCTGAACTTGCTTGAATAGAGCCCTAACCTCATCATCATTATGAACCTTCCTGATTCCTTTACCACCACCACCCCACGATGCTTTGATCATTGCTGGGTAACCGACAACTTGACAGCTAGCAATCGCTTCTTCAGTTGTGTAGACACATGCTTGCCGGTAGATCTCCTCTGGGATGGTTACCAAGTTGCTATTAGGAGGTATTTTAACATGGGAACCACTCCATGGCAGAGTGGGTACATCAGCAGCTTGTGCAATCAACGAAGAACCAATCTTATCTCCCAGTGCTGCCATTGAAGATGCTGGAGGACCAAGAAATATGATTCCTTTTGCATCTAGGGCATCAGGTAATTCGGGGTTTTCAGATGCATGACCCCAACCAGGCCAAACTGCATCCACGCGTGTTACTTCAGCCATCTCCACAATCAGCTGAACGTTAGCATAATTGTTATTGTTAGTTCCTCCGGGAACCTCAACAAACTGATCAGCGATTCTGATATGCTCCGCATTGATCCGCATGTCTTCAGGGGTTGCCATCCCCACCAACAATATGGCTTTTTCCGTACCAAATGTTTCATAAGCCCATGTTCTGACACTACGTATAAACTTCACAGCCGCCATTCCATTGTTAGCTATCAAAATACTATGGATCGGCCTTTTCCCTCTAAGTGCTTTA";
        System.out.println("Lengths: "+str1.length()+" "+str2.length());


        DNASequence target = new DNASequence(str1,
                AmbiguityDNACompoundSet.getDNACompoundSet());

        DNASequence query = new DNASequence(str2,
                AmbiguityDNACompoundSet.getDNACompoundSet());

        NeedlemanWunsch aligner = new NeedlemanWunsch(query, target, gap, matrix);

        System.out.println("getScore: " + aligner.getScore());
        System.out.println("getMaxScore: " + aligner.getMaxScore());
        System.out.println("getMinScore: " + aligner.getMinScore());
        System.out.println("getSimilarity: " + aligner.getSimilarity());
    }
}

The output for that is:

Lengths: 6800 6700
getScore: 3440.0
getMaxScore: -31536.0
getMinScore: -13520.0
getSimilarity: -0.9413854351687388

The max score has clearly overflowed, and as a result the similarity value is wrong as well.
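The reported numbers can be reproduced with a small sketch, under two assumptions that are not confirmed by the issue text: that the true max score is 5 × 6800 = 34000 (a match score of 5 per position of the longer sequence), and that similarity is computed as (score − min) / (max − min). The class `SimilarityCheck` and its methods are hypothetical and are not part of BioJava.

```java
class SimilarityCheck {

    // Narrowing cast, as happens when an int score is stored in a short field.
    static short storeAsShort(int value) {
        return (short) value;
    }

    // Assumed normalisation: position of score between min and max.
    static double similarity(double score, double min, double max) {
        return (score - min) / (max - min);
    }

    public static void main(String[] args) {
        int score = 3440;      // getScore from the output above
        int min = -13520;      // getMinScore from the output above
        int trueMax = 5 * 6800; // assumed true max score: 34000

        short storedMax = storeAsShort(trueMax);
        System.out.println(storedMax); // -31536, the reported getMaxScore

        // With the overflowed max, the denominator is negative: roughly -0.9414
        System.out.println(similarity(score, min, storedMax));

        // With the max kept as an int, the value lands between 0 and 1: roughly 0.357
        System.out.println(similarity(score, min, trueMax));
    }
}
```

That both the overflowed max score and the reported similarity fall out of these assumptions suggests that only the max score overflowed here, while the min score of -13520 still fits in a short.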

@josemduarte josemduarte added the bug Bugs and bugfixes label Nov 6, 2014
josemduarte added a commit to josemduarte/biojava that referenced this issue Nov 7, 2014
josemduarte added a commit to josemduarte/biojava that referenced this issue Nov 7, 2014
@josemduarte josemduarte mentioned this issue Nov 7, 2014
josemduarte added a commit that referenced this issue Nov 12, 2014
josemduarte added a commit to josemduarte/biojava that referenced this issue Jan 16, 2015
Alignment was still not working properly for very large sequences:
when tested with the sequence of PDB 2y9r (102 residues) against UniProt
Q8WZ42 (34350 residues), a null pointer exception occurred.
The key change in this patch is in AbstractMatrixAligner:
score has to be initialized to Integer.MIN_VALUE, not to
Short.MIN_VALUE.
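Why the initial value matters can be sketched with a generic "keep the best score seen so far" loop. This is a hypothetical illustration, not the AbstractMatrixAligner code: the class `SentinelDemo`, the method `argMax`, and the sample scores are invented, and whether this is the exact mechanism behind the null pointer exception is an assumption.

```java
class SentinelDemo {

    // Returns the index of the largest score, or -1 if no score beat the sentinel.
    static int argMax(int[] scores, int sentinel) {
        int best = sentinel;
        int bestIndex = -1;
        for (int i = 0; i < scores.length; i++) {
            if (scores[i] > best) {
                best = scores[i];
                bestIndex = i;
            }
        }
        return bestIndex;
    }

    public static void main(String[] args) {
        // Int scores that are all below Short.MIN_VALUE (-32768),
        // as can happen with heavily gapped alignments of long sequences.
        int[] scores = {-40000, -35000, -50000};

        // Sentinel too high: nothing is ever selected.
        System.out.println(argMax(scores, Short.MIN_VALUE));   // -1

        // Sentinel at the bottom of the int range: the best candidate is found.
        System.out.println(argMax(scores, Integer.MIN_VALUE)); // 1
    }
}
```

If the caller then looks up a result for an index or object that was never set, a failure such as a null pointer exception is a plausible outcome.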