# PROBLEM 17

Inspection Intro:
    Most of the problem was straightforward except for one part, which took up quite a bit of time. What I especially struggled with was finding a way to split the peptide into linear fragments. Initially my working solution was to use itertools and simply call combinations() to get my fragments, but after consulting the textbook I found that this doesn't work: combinations() picks every possible selection of characters from the provided string, including characters that are not next to each other in the peptide, which exceeded the 12 possible fragments from the provided example LEQN/NQEL.
    What I needed was a way to "cycle" over the given string. At first I wanted to MacGyver together a way to check the indices, subtract those values, and add another string on top (something along those lines), but a simpler solution was to simulate the cycling itself by doubling the string, that is, concatenating it with itself. The final solution iterates over the doubled string in a nested for loop: the outer loop runs over every starting position in the original sequence (x), and the inner loop runs over every fragment length from 1 to len - 1 (y). Each fragment is the slice [x:x+y], so a slice is never empty and never longer than len - 1 amino acids (3 for LEQN). Capping the length this way also means a fragment never wraps all the way around the peptide, so the same stretch of the peptide is never produced twice. I don't want to think anymore.
    
    LEQNLEQN

    x:0, y:1          x:0, y:3
    LEQNLEQN[0:1] ... LEQNLEQN[0:3]
    (L)               (LEQ)

    x:1, y:1          x:1, y:3
    LEQNLEQN[1:2] ... LEQNLEQN[1:4]
    (E)               (EQN)

    x:2, y:1          x:2, y:3
    LEQNLEQN[2:3] ... LEQNLEQN[2:5]   LE *QNL* EQN
    (Q)               (QNL)

    x:3, y:1          x:3, y:3
    LEQNLEQN[3:4] ... LEQNLEQN[3:6]   LEQ *NLE* QN
    (N)               (NLE)
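
To make the difference concrete, here is a small sketch (not part of the original solution) that compares `itertools.combinations` with contiguous windows over the doubled string for LEQN. The helper name `cyclic_windows` is made up for this illustration.

```python
from itertools import combinations

peptide = "LEQN"

# combinations() keeps the original order but allows gaps between the chosen
# characters, and it never wraps around the end of the string.
pairs = ["".join(c) for c in combinations(peptide, 2)]
print(pairs)  # ['LE', 'LQ', 'LN', 'EQ', 'EN', 'QN']


def cyclic_windows(s):
    """Hypothetical helper: contiguous fragments of a cyclic string, lengths 1..len(s)-1."""
    doubled = s + s
    return [doubled[x:x + y] for x in range(len(s)) for y in range(1, len(s))]


windows = cyclic_windows(peptide)
print(len(windows))  # 12
print([w for w in windows if len(w) == 2])  # ['LE', 'EQ', 'QN', 'NL']
```

Note that `LQ`, `LN` and `EN` are not real subpeptides because their residues are not adjacent, while the wrap-around fragment `NL` is missing from the `combinations()` output entirely.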
    

In [24]:
amino_to_mass = {
    "G":57, "A":71, "S":87, "P":97, "V":99,
    "T":101, "C":103, "I":113, "L":113, "N":114, 
    "D":115, "K":128, "Q":128, "E":129, "M":131,
    "H":137, "F":147, "R":156, "Y":163, "W":186
}

def peptide_mass_calc(peptide):
    '''returns the mass of a peptide as the sum of the masses of its amino acids'''
    mass = 0
    for amino in peptide:
        # characters that are not in amino_to_mass are silently skipped
        if amino in amino_to_mass:
            mass += amino_to_mass[amino]

    return mass


def cyclic_fragment(string):
    '''splits a provided string into fragments in a cyclic fashion,
        that is, every contiguous fragment of length 1 to len(string) - 1
        starting at every position'''
    fragments = []

    str_len = len(string)
    string_cycle = string + string  # LEQN + LEQN = LEQNLEQN, so slices can wrap past the end
    
    for x in range(str_len):
        for y in range(1,str_len):    
            fragments.append(string_cycle[x:x+y])
    
    return fragments


def cyclospectrum(peptide):
    '''returns the sorted masses of all possible subpeptides in a cyclic peptide'''
    # 0 (the empty peptide) and the full peptide mass are always in the spectrum
    spectrum = [0, peptide_mass_calc(peptide)]

    for fragment in cyclic_fragment(peptide):
        spectrum.append(peptide_mass_calc(fragment))

    return sorted(spectrum)

In [25]:
cheese = cyclospectrum('NRTDYDPCWFEAV')

print(' '.join(str(mass) for mass in cheese))

0 71 97 99 101 103 114 115 115 129 147 156 163 170 186 200 200 212 213 216 257 270 276 278 278 284 289 299 315 333 347 369 371 372 375 379 386 393 413 436 440 446 462 470 478 486 490 494 501 533 533 535 541 560 565 569 585 591 593 632 636 648 649 650 656 662 664 670 694 716 733 735 746 747 748 764 777 779 785 811 817 819 832 848 849 850 861 863 880 902 926 932 934 940 946 947 948 960 964 1003 1005 1011 1027 1031 1036 1055 1061 1063 1063 1095 1102 1106 1110 1118 1126 1134 1150 1156 1160 1183 1203 1210 1217 1221 1224 1225 1227 1249 1263 1281 1297 1307 1312 1318 1318 1320 1326 1339 1380 1383 1384 1396 1396 1410 1426 1433 1440 1449 1467 1481 1481 1482 1493 1495 1497 1499 1525 1596
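
As a quick sanity check (a sketch, not part of the original notebook), the same doubled-string approach can be run on the small example NQEL, where the spectrum is easy to verify by hand. A cyclic peptide of length n has n(n-1) proper subpeptides, plus 0 and the full mass, so the spectrum should have n(n-1)+2 entries: 14 for NQEL, and 158 for the 13-residue peptide above. The names `MASS` and `cyclic_spectrum` are illustrative, and `MASS` only covers the four residues needed here.

```python
# Integer masses for just the residues used here (same values as amino_to_mass).
MASS = {"N": 114, "Q": 128, "E": 129, "L": 113}


def cyclic_spectrum(peptide):
    """Illustrative re-implementation: sorted masses of all cyclic subpeptides."""
    n = len(peptide)
    doubled = peptide + peptide  # lets slices wrap past the end of the peptide
    masses = [0, sum(MASS[a] for a in peptide)]
    for start in range(n):
        for length in range(1, n):
            masses.append(sum(MASS[a] for a in doubled[start:start + length]))
    return sorted(masses)


spectrum = cyclic_spectrum("NQEL")
print(spectrum)
# [0, 113, 114, 128, 129, 227, 242, 242, 257, 355, 356, 370, 371, 484]
print(len(spectrum) == 4 * 3 + 2)  # True
```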
