**Inferring mRNA from Protein**

For positive integers a and n, a modulo n (written a mod n in shorthand) is the remainder when a is divided by n. For example, 29 mod 11 = 7 because 29 = 11 × 2 + 7.

Modular arithmetic is the study of addition, subtraction, multiplication, and division with respect to the modulo operation. We say that a and b are congruent modulo n if a mod n = b mod n; in this case, we use the notation a ≡ b mod n.

Two useful facts in modular arithmetic are that if a ≡ b mod n and c ≡ d mod n, then a + c ≡ b + d mod n and a × c ≡ b × d mod n. To check your understanding of these rules, you may wish to verify these relationships for a = 29, b = 73, c = 10, d = 32, and n = 11.
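
The suggested check can be done in a few lines of Python. This is only a quick sketch; the variable names `sum_left`, `sum_right`, `prod_left`, and `prod_right` are made up for illustration and do not appear elsewhere in this notebook.

```python
a, b, c, d, n = 29, 73, 10, 32, 11

# Premises: a and b leave the same remainder, and so do c and d.
assert a % n == b % n == 7
assert c % n == d % n == 10

# Sums and products of congruent numbers are again congruent.
sum_left, sum_right = (a + c) % n, (b + d) % n
prod_left, prod_right = (a * c) % n, (b * d) % n

print(sum_left, sum_right)    # 6 6
print(prod_left, prod_right)  # 4 4
```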

As you will see in this exercise, some Rosalind problems will ask for a (very large) integer solution modulo a smaller number to avoid the computational pitfalls that arise with storing such large numbers.
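
The multiplication rule is what makes this practical: reducing after every multiplication gives the same remainder as reducing the full product once at the end. The sketch below illustrates this on a made-up list of factors (`factors`, `full`, and `running` are hypothetical names, and the list is not taken from any dataset).

```python
from math import prod

MOD = 1_000_000

# Hypothetical factors, repeated so that the full product becomes large.
factors = [6, 4, 3, 2, 1, 6] * 50

# Option 1: build the full product, then reduce once.
full = prod(factors)

# Option 2: reduce after every multiplication, so the running
# value never exceeds MOD * max(factors).
running = 1
for factor in factors:
    running = (running * factor) % MOD

assert running == full % MOD
print(full.bit_length())     # size of the unreduced product in bits
print(running.bit_length())  # at most 20 bits, since running < 1,000,000
```

Python integers have arbitrary precision, so the first option does not overflow, but the intermediate value keeps growing with the length of the input.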

In [1]:
codon_number = {'A':4,
                'C':2,
                'D':2,
                'E':2,
                'F':2,
                'G':4,
                'H':2,
                'I':3,
                'K':2,
                'L':6,
                'M':1,
                'N':2,
                'P':4,
                'Q':2,
                'R':6,
                'S':6,
                'T':4,
                'V':4,
                'W':1,
                'Y':2}
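
As a sanity check, the counts above can be derived from the standard genetic code instead of being typed by hand. The sketch below assumes the usual compact encoding of the codon table, with codons ordered by the bases U, C, A, G at each position and `*` marking a stop codon. The names `AMINO_ACIDS`, `derived_counts`, and `stop_count` are introduced here for illustration only.

```python
from collections import Counter

# Standard genetic code: one letter per codon, codons ordered
# UUU, UUC, UUA, UUG, UCU, ... , GGG ('*' marks a stop codon).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Counting letters gives the number of codons per amino acid.
derived_counts = Counter(AMINO_ACIDS)
stop_count = derived_counts.pop("*")

print(stop_count)  # 3
print(dict(sorted(derived_counts.items())))
```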

In [7]:
filepath = "/mnt/c/Data/ROSALIND_download/rosalind_mrna.txt"
with open(filepath) as f:
    # Join all lines into a single protein string, dropping trailing newlines
    sequnece = "".join(line.rstrip() for line in f)

sequnece

'MENLQQIAFIQEYPVWAECYYRFIGGRNYKCMIDGTKSTVQVLVNSIVRMWCHHVSNGKGYVMRVILVWPIYTFHEFHVCNHELIFITLTCPKQIPKMQKTYMKQSPIHWFEEGMWKYDLHQYQCDCDGDMKTEGPCHLYIPHRLPMWRCSYQIFTGRMAQPAIFVCECLLKVLRGWHGMDGIPSKPCPTAQKDPGNSLSFKAHASFTQMSKITPWTSIQQFKRMYQTPQMMPKAIQIYRAQHGHAIFEVCELVEGPTDAKANFRTGWHADLKPDSNECAYLKWQEPVLALPTILDRMCELCKDKPRHGRRKSCWHPNLRFDDQRMYWCGPKYCQGWFPHLKPLAKNGEIEVICQDAWVMVLDDGPTFVAFEWCPRKYDFRPESHYNSVCRSKTPEWPAFAFMYDWPFCNFTYHDIFIPLVPHGLGNMPGNAVAMFLYQATSDHAGFCSCWSTMLCRHNKLHSPTRLFVILENWVTETPEQSMIPSGAWPSSMGHGKFPLNSSYQQFHTMCEVDSFGGFYTKEWTLMICPHTEVCFVTECHFWYELQAPFITEAAPNLLVDWYGSGVATQAPKVSTVYRFNKWKFELYVFQPMDDFHSLKVWPNLIYRMWVQIYEFEDIITVILELSTSVLTCTWDMGVVFTHIYECIEDFNFKWVPNFKVDVANRVHAMCFWHGSCNMMFIAPFPLWGDERPWDNTCFQHSWFEYYFLSLKSGKPLGLPLHECDWFRPATGGPHDWWHGLDQAKGCITGGPAPFFLACYFCDDKPEQQWGWKCHIMHQDVTMLAVHYMCYLMPHAMRDDPGPFLHSHQARWCKWDYSSETEGTMNHTIDPDRTISPMFHFACGCNEWCSRAFVETGDVLIIIMIWVSPANCCEKEWHCLCSKYMQLPAPIVKTGSRVIQGHDADNWMAKWPRFKCMAGYEKSVMHSTKGEHCTMRAQITVKWEDQHHRYVMQTWKENSGTRTTIWTQIEHSPSEKCLQARKSTVWLSAHPTSYIMQES

In [8]:
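# Number of possible codons for each amino acid in the protein
seq_num = [codon_number[i] for i in sequnece]
# The mRNA must end with one of the 3 stop codons (UAA, UAG, UGA)
seq_num.append(3)
seq_num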

[1,
 2,
 2,
 6,
 2,
 2,
 3,
 4,
 2,
 3,
 2,
 2,
 2,
 4,
 4,
 1,
 4,
 2,
 2,
 2,
 2,
 6,
 2,
 3,
 4,
 4,
 6,
 2,
 2,
 2,
 2,
 1,
 3,
 2,
 4,
 4,
 2,
 6,
 4,
 4,
 2,
 4,
 6,
 4,
 2,
 6,
 3,
 4,
 6,
 1,
 1,
 2,
 2,
 2,
 4,
 6,
 2,
 4,
 2,
 4,
 2,
 4,
 1,
 6,
 4,
 3,
 6,
 4,
 1,
 4,
 3,
 2,
 4,
 2,
 2,
 2,
 2,
 2,
 4,
 2,
 2,
 2,
 2,
 6,
 3,
 2,
 3,
 4,
 6,
 4,
 2,
 4,
 2,
 2,
 3,
 4,
 2,
 1,
 2,
 2,
 4,
 2,
 1,
 2,
 2,
 6,
 4,
 3,
 2,
 1,
 2,
 2,
 2,
 4,
 1,
 1,
 2,
 2,
 2,
 6,
 2,
 2,
 2,
 2,
 2,
 2,
 2,
 2,
 4,
 2,
 1,
 2,
 4,
 2,
 4,
 4,
 2,
 2,
 6,
 2,
 3,
 4,
 2,
 6,
 6,
 4,
 1,
 1,
 6,
 2,
 6,
 2,
 2,
 3,
 2,
 4,
 4,
 6,
 1,
 4,
 2,
 4,
 4,
 3,
 2,
 4,
 2,
 2,
 2,
 6,
 6,
 2,
 4,
 6,
 6,
 4,
 1,
 2,
 4,
 1,
 2,
 4,
 3,
 4,
 6,
 2,
 4,
 2,
 4,
 4,
 4,
 2,
 2,
 2,
 4,
 4,
 2,
 6,
 6,
 6,
 2,
 2,
 4,
 2,
 4,
 6,
 2,
 4,
 2,
 1,
 6,
 2,
 3,
 4,
 4,
 1,
 4,
 6,
 3,
 2,
 2,
 2,
 2,
 6,
 1,
 2,
 2,
 4,
 4,
 2,
 1,
 1,
 4,
 2,
 4,
 3,
 2,
 3,
 2,
 6,
 4,
 2,
 2,
 4,
 2,
 4,
 3,
 2,
 2,
 4,


In [None]:
MOD = 1_000_000  # the answer is reported modulo 1,000,000

total_num = 1
for i in seq_num:
    # Reduce after every multiplication so the running product stays small
    total_num = (total_num * i) % MOD

total_num

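## I used to think that computing the mod at every step was inefficient. A mod is still an operation, and doing it every time seemed like a waste of resources.
## That does not seem to be the case. A mod on small numbers is cheap; letting the numbers grow excessively large puts a much heavier load on the computation.
## Simply computing the mod at every step is probably better for both performance and readability.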

179456
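
For reuse, the steps above can be wrapped in a single function. This is a minimal sketch, not a definitive implementation: `count_mrna`, `CODON_COUNTS`, and `STOP_CODONS` are hypothetical names, and the table repeats the codon counts defined earlier in a compact form. It is checked against the sample dataset from the Rosalind problem statement, where the protein `MA` yields 12.

```python
# Codons per amino acid, same values as the codon_number table above.
CODON_COUNTS = dict(zip(
    "ACDEFGHIKLMNPQRSTVWY",
    [4, 2, 2, 2, 2, 4, 2, 3, 2, 6, 1, 2, 4, 2, 6, 6, 4, 4, 1, 2],
))
STOP_CODONS = 3  # UAA, UAG, UGA


def count_mrna(protein, modulus=1_000_000):
    """Count the mRNA strings that translate to `protein`, modulo `modulus`."""
    # Start with the choices for the stop codon, then multiply in
    # the choices for each amino acid, reducing at every step.
    total = STOP_CODONS % modulus
    for amino_acid in protein:
        total = (total * CODON_COUNTS[amino_acid]) % modulus
    return total


print(count_mrna("MA"))  # 12 = 1 (M) * 4 (A) * 3 (stop)
```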