# Branch and Bound
## Specification
- Name: Template for Problem 16, 17, 18 in Rosalind
- Name your notebooks as: problem16.ipynb, problem17.ipynb, problem18.ipynb
- options: none
- input: filename passed as first parameter to main
- output: a text file. Using print(..., file=someFileObject) is a handy way to write it once you have opened someFileObject as a text file. I find it handy to name the output file by appending ".out" to the string named infile, for example rosalind4.txt.out.
- Rosalind Problem Names:
    - Find Substrings of a Genome Encoding a Given Amino Acid String
    - Generate the Theoretical Spectrum of a Cyclic Peptide
    - Find a Cyclic Peptide with Theoretical Spectrum Matching an Ideal Spectrum

As always, include an Inspection Intro Markdown that describes your specific algorithm at the beginning of the notebook, and another Inspection Results markdown at the end of the notebook that documents: your inspection team, the findings of the team, and your resolution of those findings.

Please submit your three notebooks, an example of one of the Rosalind files that you ran and passed, and the output that your program generated as a text file.

## Description
These are drawn from material presented in Ch. 4 of Compeau and Pevzner, with a focus on the Branch and Bound algorithm.
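As a point of reference for the second problem in the list, here is a minimal sketch of a cyclic theoretical spectrum, the quantity a branch-and-bound peptide search compares its candidates against. The names `INTEGER_MASS` and `cyclic_spectrum` are hypothetical, and the table is the usual integer-mass table for the 20 amino acids; this is an illustration under those assumptions, not the required solution.

```python
# Integer masses of the 20 standard amino acids (assumed standard table).
INTEGER_MASS = {
    'G': 57, 'A': 71, 'S': 87, 'P': 97, 'V': 99, 'T': 101, 'C': 103,
    'I': 113, 'L': 113, 'N': 114, 'D': 115, 'K': 128, 'Q': 128,
    'E': 129, 'M': 131, 'H': 137, 'F': 147, 'R': 156, 'Y': 163, 'W': 186,
}


def cyclic_spectrum(peptide):
    """Return the sorted masses of all subpeptides of a cyclic peptide."""
    masses = [INTEGER_MASS[aa] for aa in peptide]
    n = len(masses)
    # Doubling the list lets a slice wrap around the end of the cycle.
    doubled = masses + masses
    spectrum = [0]
    if n:
        spectrum.append(sum(masses))  # the whole peptide, counted once
    for length in range(1, n):
        for start in range(n):
            spectrum.append(sum(doubled[start:start + length]))
    return sorted(spectrum)


print(cyclic_spectrum("LEQN"))
```

A peptide of length n contributes n subpeptides for each length from 1 to n - 1, plus the empty peptide and the full peptide, so the spectrum has n(n - 1) + 2 entries.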

## Hints
1) Make sure to write your code around a well-defined class, and instantiate an object of that class in your main() function.
Here is a template to consider.
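A minimal sketch of the structure this hint describes, assuming a placeholder class: `ProblemSolver` and its `solve` method are hypothetical names, and the echo logic stands in for a real algorithm. The output file is named by appending ".out" to the input filename, as suggested in the specification.

```python
class ProblemSolver:
    """Hypothetical placeholder class; replace solve() with the real algorithm."""

    def __init__(self, lines):
        self.lines = lines

    def solve(self):
        # Placeholder logic: return the input lines unchanged.
        return list(self.lines)


def main(inFile=None):
    """Read inFile, run the solver, and write the results to inFile + '.out'."""
    with open(inFile) as fh:
        lines = [line.strip() for line in fh if line.strip()]

    solver = ProblemSolver(lines)  # the object is instantiated inside main()

    with open(inFile + ".out", "w") as out:
        for result in solver.solve():
            print(result, file=out)
```

Calling `main('rosalind4.txt')` would then write its results to rosalind4.txt.out.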

## Inspection Intro

To solve this problem we:
1. Determine the length of the DNA sequence that encodes the target peptide by multiplying the peptide length by 3, because each amino acid is encoded by 3 bases. Call this length N.
2. Iterate over the input DNA sequence, scanning each of the three reading frames in chunks of size N. Translate each chunk and its reverse complement into a peptide and check whether either one matches the target.
3. Save the matching chunks and output them after the whole DNA sequence has been scanned.
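The steps above can be sketched in a few standalone functions. This is an illustration rather than the notebook's implementation: the names `CODON_TABLE`, `translate`, and `peptide_encoding` are hypothetical, stop codons are written as `*` instead of `-`, and the window slides one base at a time, which visits the same windows as scanning the three reading frames but in a different order.

```python
from itertools import product

# Compact standard genetic code: codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string over {A, C, G, T}."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate DNA whose length is a multiple of 3 into amino acids."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def peptide_encoding(dna, peptide):
    """Return every length-N window of dna that encodes peptide on either strand."""
    n = 3 * len(peptide)  # step 1: N = 3 * peptide length
    matches = []
    for start in range(len(dna) - n + 1):  # step 2: scan every window
        window = dna[start:start + n]
        if translate(window) == peptide:
            matches.append(window)
        if translate(reverse_complement(window)) == peptide:
            # A reverse-strand hit still reports the forward-strand substring.
            matches.append(window)
    return matches  # step 3: matches are collected, then reported


hits = peptide_encoding(
    "ATGGCCATGGCCCCCAGAACTGAGATCAATAGTACCCGTATTAACGGGTGA", "MA"
)
print("\n".join(hits))
```

In this example the second hit, GGCCAT, does not encode MA directly; its reverse complement ATGGCC does, which is why both strands have to be translated.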

In [6]:
def reverseComplement(seq: str) -> str:
    """Generates reverse complement for given sequence.

    Args:
        seq (str): Kmer of alphabet {A,G,C,T}

    Returns:
        str: Reverse complement of input seq
    """
    return seq.translate(str.maketrans("AGCT", "TCGA"))[::-1]

class Peptides:
    """
    Class Peptides for solving Rosalind #16

    Parameters: 
        seq: str - The DNA string
    """
    def __init__(self, seq):
        self.seq = seq
        self.revSeq = reverseComplement(seq)
        self.rnaCodonTable = {
        # RNA codon table
        # U
        'UUU': 'F', 'UCU': 'S', 'UAU': 'Y', 'UGU': 'C',  # UxU
        'UUC': 'F', 'UCC': 'S', 'UAC': 'Y', 'UGC': 'C',  # UxC
        'UUA': 'L', 'UCA': 'S', 'UAA': '-', 'UGA': '-',  # UxA
        'UUG': 'L', 'UCG': 'S', 'UAG': '-', 'UGG': 'W',  # UxG
        # C
        'CUU': 'L', 'CCU': 'P', 'CAU': 'H', 'CGU': 'R',  # CxU
        'CUC': 'L', 'CCC': 'P', 'CAC': 'H', 'CGC': 'R',  # CxC
        'CUA': 'L', 'CCA': 'P', 'CAA': 'Q', 'CGA': 'R',  # CxA
        'CUG': 'L', 'CCG': 'P', 'CAG': 'Q', 'CGG': 'R',  # CxG
        # A
        'AUU': 'I', 'ACU': 'T', 'AAU': 'N', 'AGU': 'S',  # AxU
        'AUC': 'I', 'ACC': 'T', 'AAC': 'N', 'AGC': 'S',  # AxC
        'AUA': 'I', 'ACA': 'T', 'AAA': 'K', 'AGA': 'R',  # AxA
        'AUG': 'M', 'ACG': 'T', 'AAG': 'K', 'AGG': 'R',  # AxG
        # G
        'GUU': 'V', 'GCU': 'A', 'GAU': 'D', 'GGU': 'G',  # GxU
        'GUC': 'V', 'GCC': 'A', 'GAC': 'D', 'GGC': 'G',  # GxC
        'GUA': 'V', 'GCA': 'A', 'GAA': 'E', 'GGA': 'G',  # GxA
        'GUG': 'V', 'GCG': 'A', 'GAG': 'E', 'GGG': 'G'  # GxG
    }
        self.dnaCodonTable = {key.replace('U', 'T'): value for key, value in self.rnaCodonTable.items()}
    def translateDnaToAA(self, dna):
        """
        Translates DNA into amino acids, one codon (3 bases) at a time.
        """
        aa = ""
        x = 0 
        while x < len(dna):
            s = dna[x:x+3]
            a = self.dnaCodonTable[s]
            aa += a 
            x += 3
        return aa
    def findOccurences(self, peptide):
        """
        Finds occurrences of the input peptide in the input sequence,
        checking both the forward strand and its reverse complement.
        """
        found = []
        
        peptideSeqLen = len(peptide)*3
        # iterate over each of the three reading frames
        for frame in range(3):
            p = frame
            # stop once a full-length window no longer fits in the sequence
            while p + peptideSeqLen <= len(self.seq):
                # generate candidate seq
                candidateSeq = self.seq[p: p+peptideSeqLen]
                # translate the window and its reverse complement
                revCandidateSeq = reverseComplement(candidateSeq)
                forwardPep = self.translateDnaToAA(candidateSeq)
                reversePep = self.translateDnaToAA(revCandidateSeq)
                # check peptides against target
                if forwardPep == peptide:
                    found.append(candidateSeq)
                # a reverse-strand match still reports the forward-strand substring
                if reversePep == peptide:
                    found.append(candidateSeq)
                p += 3
        
        return found



def main(inFile = None):
    '''
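    Read the DNA string and target peptide from inFile, then print each
    matching substring to stdout and write it to the output file.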
    '''
    with open(inFile) as fh:
        lines = [l.strip() for l in fh.readlines()]
        seq = lines[0]
        targetPeptide = lines[1]
    
    peptide = Peptides(seq)
    with open("cmirchandani_rosalind16_out.txt", "w") as f:
        for p in peptide.findOccurences(targetPeptide):
            print(p)
            print(p, file=f)
    
    
    
if __name__ == "__main__":
    main(inFile = 'rosalind_ba4b.txt') 

TGTCGGTAGGCTGCCCCATGATTG
CAATCATGGGGATCTTTGCCTACG
AGTCGGAAGTGAACCCCATGATTG
CAAAGTTGGGGCTCGTTACCGACG
GGTTGGAAGAGAACCCCAAGATTG
CAGTCGTGGGGTTCACTACCAACA
CGTTGGCAGGGAGCCCCAACTCTG
AGTAGGCAGCGAACCCCATGACTG
CAGTCCTGGGGATCTCTACCCACT
CAGTCGTGGGGGTCCTTGCCAACG
GGTAGGTAGAGAGCCCCATGATTG
CAGAGTTGGGGAAGTCTACCAACG
TGTGGGCAGGGAACCCCAGCTCTG
GGTAGGAAGACTTCCCCAGGATTG
CGTCGGCAGCGATCCCCAGCTCTG
TGTGGGTAAGGATCCCCAACTTTG
TGTGGGCAGAGATCCCCAACTTTG
CAGTCATGGGGTTCACTCCCGACG
CAATCCTGGGGCTCGCTCCCGACT
GGTTGGTAGCGAACCCCAGCTTTG


## Inspection Results

Inspection group: Gabe P., Jodie J.

- Added print to stdout as well as file
- Added more inline comments to explain code 
- Removed unused code