# Branch and Bound
## Specification
- Name: Template for Problem 16, 17, 18 in Rosalind
- Name your notebooks as: problem16.ipynb, problem17.ipynb, problem18.ipynb
- options: none
- input: filename passed as first parameter to main
- output: a text file. (Using print(..., file=someFileObject) is a handy way to do this after you have opened someFileObject as a text file.) I find it handy to name these files by concatenating the string named infile with ".out", for example rosalind4.txt.out.
- Rosalind Problem Names:
    - Find Substrings of a Genome Encoding a Given Amino Acid String
    - Generate the Theoretical Spectrum of a Cyclic Peptide
    - Find a Cyclic Peptide with Theoretical Spectrum Matching an Ideal Spectrum
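
The output convention described above can be sketched as follows. This is only an illustration, not a required template: the helper name `write_result` and the sample peptide strings are hypothetical, and the demonstration runs in a temporary directory so that no real files are touched.

```python
import os
import tempfile


def write_result(infile, lines):
    """Write each item of `lines` to `infile + ".out"` and return that path.

    `write_result` is a hypothetical helper name used for illustration only.
    """
    outfile = infile + ".out"
    with open(outfile, "w") as out:
        for line in lines:
            print(line, file=out)  # print() appends the newline
    return outfile


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    infile = os.path.join(tmp, "rosalind4.txt")
    outfile = write_result(infile, ["113-128-186", "186-128-113"])
    with open(outfile) as f:
        written = f.read()
    out_name = os.path.basename(outfile)

print(out_name)
```

Deriving the output name from the input name keeps each result next to the dataset that produced it, which helps when several Rosalind files are run in one session.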

As always, include an Inspection Intro Markdown that describes your specific algorithm at the beginning of the notebook, and another Inspection Results markdown at the end of the notebook that documents: your inspection team, the findings of the team, and your resolution of those findings.

Please submit your three notebooks, an example of one of the Rosalind files that you ran and passed, and the output that your program generated as a text file.

## Description
These are drawn from material presented in Ch. 4 of Compeau and Pevzner, with a focus on the Branch and Bound algorithm.

## Hints
1) Make sure to write your code with a well-defined class that you instantiate an object from in your main() function.
Here is a template to consider.

## Inspection Intro

To solve this problem:
1. Generate a list of candidate peptides, stored as a list of lists of amino acid masses.
2. Check whether each candidate peptide matches the target spectrum, or whether it could still lead to a match because its linear spectrum is a subset of the target spectrum (that is, it is consistent).
3. If it matches, keep it as a result and remove it from the candidates. If it does not match and is not consistent, remove it from the candidates. If it is consistent, keep it in the candidates list.
4. Repeat until no more candidates exist. 
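
The four steps above can be sketched as a small standalone script. This is a minimal sketch under stated assumptions rather than the notebook's implementation: the function names are hypothetical, the consistency check here compares multisets with `collections.Counter` (a stricter bound than plain membership), and the alphabet is restricted to masses that occur in the spectrum. The toy spectrum belongs to a cyclic peptide made of the masses 113, 128 and 186.

```python
from collections import Counter

# The 18 distinct integer amino acid masses (I/L and K/Q share a mass).
MASSES = [57, 71, 87, 97, 99, 101, 103, 113, 114,
          115, 128, 129, 131, 137, 147, 156, 163, 186]


def linear_spectrum(peptide):
    """Masses of all contiguous subpeptides of a linear peptide, plus 0."""
    prefix = [0]
    for mass in peptide:
        prefix.append(prefix[-1] + mass)
    spec = [0]
    for i in range(len(peptide)):
        for j in range(i + 1, len(peptide) + 1):
            spec.append(prefix[j] - prefix[i])
    return sorted(spec)


def cyclic_spectrum(peptide):
    """Masses of all subpeptides of a cyclic peptide, plus 0 and the total."""
    n = len(peptide)
    doubled = peptide + peptide
    spec = [0, sum(peptide)]
    for start in range(n):
        for length in range(1, n):
            spec.append(sum(doubled[start:start + length]))
    return sorted(spec)


def is_consistent(peptide, target_counts):
    """True if the linear spectrum is a sub-multiset of the target spectrum."""
    counts = Counter(linear_spectrum(peptide))
    return all(target_counts[m] >= c for m, c in counts.items())


def cyclopeptide_sequencing(spectrum):
    spectrum = sorted(spectrum)
    target_counts = Counter(spectrum)
    parent_mass = spectrum[-1]
    # Assumption: only masses present in the spectrum can appear in a match.
    alphabet = [m for m in MASSES if m in target_counts]
    candidates, matches = [[]], []
    while candidates:                                  # step 4: repeat
        survivors = []
        for peptide in candidates:
            for mass in alphabet:
                grown = peptide + [mass]               # step 1: branch
                if sum(grown) == parent_mass:          # step 3: test, then drop
                    if cyclic_spectrum(grown) == spectrum:
                        matches.append(grown)
                elif is_consistent(grown, target_counts):  # step 2: bound
                    survivors.append(grown)
        candidates = survivors
    return matches


found = cyclopeptide_sequencing([0, 113, 128, 186, 241, 299, 314, 427])
print(sorted("-".join(map(str, p)) for p in found))
```

Collecting survivors into a fresh list on each round avoids rebuilding the candidate list every time a single peptide is discarded.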

def reverseComplement(seq: str) -> str:
    """Generates reverse complement for given sequence.

    Args:
        seq (str): Kmer of alphabet {A,G,C,T}

    Returns:
        str: Reverse complement of input seq
    """
    return seq.translate(str.maketrans("AGCT", "TCGA"))[::-1]

class Peptides:
    """
    Class Peptides for solving Rosalind #18

    Parameters: 
        spectrum: str - The input peptide spectrum
    """
    def __init__(self, spectrum):
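        # split() with no argument tolerates repeated or trailing whitespace
        self.s = spectrum.split()
        # keep the spectrum sorted: later code compares it against sorted
        # theoretical spectra and reads the parent mass from the end
        self.spectrum = sorted(map(int, self.s))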
        self.massTable = {
            'G': 57, 'A': 71, 'S': 87,
            'P': 97, 'V': 99, 'T': 101,
            'C': 103, 'I': 113, 'L': 113,
            'N': 114, 'D': 115, 'K': 128,
            'Q': 128, 'E': 129, 'M': 131,
            'H': 137, 'F': 147, 'R': 156,
            'Y': 163, 'W': 186
        }
    
    def cylcoSpectrum(self, seq):
        """
        Generates the cyclospectrum given a list of masses of a cyclic peptide.
        Parameters:
            seq: list[int] - List of masses of a cyclic peptide
        Returns:
            out: list[int] - Sorted list of the cyclospectrum of the input.
        """
        seqCycle = seq*2
        out = [0, sum(seq)]

        for i in range(len(seq)):
            for j in range(1, len(seq)):
                subseq = seqCycle[i:i+j]
                out.append(sum(subseq))
        return sorted(out)

    def linearSpectrum(self, peptide):
        """
        Generates the linear spectrum given a list of masses of a linear peptide.
        Parameters:
            peptide: list[int] - List of masses of a linear peptide
        Returns:
            spec: list[int] - Sorted list of the linear spectrum of the input.
        """
        spec = [0]
        for x in range(1,len(peptide)):
            for i in range(len(peptide)):
                if i+x <= len(peptide):
                    spec.append(sum(peptide[i:i+x]))
        spec.append(sum(peptide))
        
        return sorted(spec)
     
    def isConsistent(self, peptide):
        """
        Determines if candidate linear peptide is consistent with input spectrum

        Parameters:
            peptide: list[int] - List of masses of a linear peptide
        Returns:
            bool - True if every mass in the peptide's linear spectrum appears in the target spectrum (multiplicity is not checked).
        """
        linSpec = self.linearSpectrum(peptide)
        for p in linSpec:
            if p not in self.spectrum:
                return False
        return True
    
    def expand(self, peps):
        """
        Expands the peptides in input by 1 amino acid mass

        Parameters:
            peps: list[list[int]] - The current list of candidate peptides

        Returns:
            newPeps: list[list[int]] - All peptides in peps, but with 1 amino acid added.
        """
        newPeps = []
        for item in peps:
            for mass in set(self.massTable.values()):
                newPeps.append(item + [mass])
        return newPeps
    
    def CylcopeptideSequencing(self):
        """
        Determines every amino acid string Peptide such that Cyclospectrum(Peptide) = Spectrum (if such a string exists).
        """
        result = set()
        peptides = [[]]
        
        # the largest mass in the spectrum is the mass of the whole peptide
        parentMass = max(self.spectrum)

        # only search while we have peptides to test
        while peptides:
            survivors = []
            for peptide in self.expand(peptides):  # branching step, add new peptides
                # check if mass of current peptide is equal to the parent mass
                if sum(peptide) == parentMass:
                    # a full-mass peptide leaves the search whether or not it
                    # matches; record it only if the cyclospectrums match
                    if self.cylcoSpectrum(peptide) == self.spectrum:
                        result.add(tuple(peptide))
                # bounding step: keep only peptides consistent with the spectrum
                elif self.isConsistent(peptide):
                    survivors.append(peptide)
            peptides = survivors
             
        return result
  
    
def main(inFile = None):
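    '''
    Read a space-separated spectrum from the first line of inFile, run
    cyclopeptide sequencing, and write the matching peptides (as
    dash-separated masses) to stdout and to cmirchandani_rosalind18_out.txt.
    '''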
    
    with open(inFile, "r") as f:
        line = f.readline().strip()
    
    peptide = Peptides(line)
    masses = peptide.CylcopeptideSequencing()
    output = ["-".join(map(str,p)) for p in masses]
    
    
    with open("cmirchandani_rosalind18_out.txt", "w") as out:
        print(*output, sep=" ")
        print(*output, sep=" ", file=out)
    
    
    
    
if __name__ == "__main__":
    main(inFile = 'rosalind_ba4e.txt') 

97-101-87-156-71-103-114-97-115 97-114-103-71-156-87-101-97-115 71-103-114-97-115-97-101-87-156 101-97-115-97-114-103-71-156-87 103-71-156-87-101-97-115-97-114 115-97-101-87-156-71-103-114-97 87-101-97-115-97-114-103-71-156 156-87-101-97-115-97-114-103-71 156-71-103-114-97-115-97-101-87 114-97-115-97-101-87-156-71-103 114-103-71-156-87-101-97-115-97 71-156-87-101-97-115-97-114-103 103-114-97-115-97-101-87-156-71 115-97-114-103-71-156-87-101-97 101-87-156-71-103-114-97-115-97 97-115-97-114-103-71-156-87-101 97-115-97-101-87-156-71-103-114 87-156-71-103-114-97-115-97-101
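
A cyclic peptide of length n can be read as up to 2n linear strings (n starting positions, two directions), so 18 strings for a 9-mer is what a single cyclic peptide would produce. One way to sanity-check output like the above is to collapse rotations and reversals to a canonical form. The sketch below is illustrative only: the helper name `canonical` is hypothetical, and it checks three of the eighteen strings rather than all of them.

```python
def canonical(peptide):
    """Smallest tuple among all rotations of the peptide and of its reverse.

    `canonical` is a hypothetical helper name used for illustration only.
    """
    forward = list(peptide)
    backward = forward[::-1]
    variants = []
    for seq in (forward, backward):
        for k in range(len(seq)):
            variants.append(tuple(seq[k:] + seq[:k]))
    return min(variants)


# Three of the eighteen peptides printed above.
reported = [
    "97-101-87-156-71-103-114-97-115",
    "97-114-103-71-156-87-101-97-115",
    "71-103-114-97-115-97-101-87-156",
]
classes = {canonical([int(m) for m in p.split("-")]) for p in reported}
print(len(classes))  # number of distinct cyclic peptides among the three
```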


## Inspection Results

Inspection group: Gabe P., Jodie J.

- Added print to stdout as well as file
- Added more inline comments to explain code 
- Used map() to convert input to ints, and to make output str