## **1. Secret Message**

Doni and Dini like to share messages with each other. But they know that Dono always tries to read their messages, so they decide to encrypt them.

To encrypt a message, Doni and Dini create a mapping table, e.g.:

| From | To |
|------|----|
| A    | C  |
| B    | V  |
| C    | R  |

... and so on.

When Doni or Dini wants to send a message, they replace each character in the message with the corresponding character in the mapping table. For example, if they want to send the word `ABC`, they replace it with `CVR`.

To decrypt the message, Doni or Dini reverses the mapping table: `C` becomes `A`, `V` becomes `B`, and `R` becomes `C`. So the message `CVR` is decrypted to `ABC`.

Dono knows about this, so he tries to decrypt the message. But he doesn't know the mapping table. So how can he do it?
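The scheme can be sketched in a few lines of Python. This is only an illustration: the `mapping` dictionary holds just the three entries shown in the table, and the helper names `encrypt` and `decrypt` are made up for this example.

```python
# Partial mapping table from the example; a real table would cover all 26 letters.
mapping = {"A": "C", "B": "V", "C": "R"}

# Encryption table: plaintext letter -> ciphertext letter.
encrypt_table = str.maketrans(mapping)
# Decryption table: swap keys and values to reverse the mapping.
decrypt_table = str.maketrans({to: frm for frm, to in mapping.items()})

def encrypt(text: str) -> str:
    # Each character is looked up once; letters missing from the table stay as-is.
    return text.translate(encrypt_table)

def decrypt(text: str) -> str:
    return text.translate(decrypt_table)

print(encrypt("ABC"))  # CVR
print(decrypt("CVR"))  # ABC
```

Reversing the table only works because every letter maps to a different letter, so no two plaintext letters share the same ciphertext letter.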

### **Clue**

Dono knows that the message is in Bahasa Indonesia, so he looks up the most common characters in Bahasa Indonesia. He finds that the most common character is `A`, followed by `N`, `I`, `E`, `T`, and so on. So he assumes that the most common character in the encrypted message stands for `A` and replaces it with `A`. Then he finds the second most common character and replaces it with `N`, and so on.

Here is the frequency table of Bahasa Indonesia characters that Dono found:

```python
letter_freq = [('A', 19.64), ('N', 9.87), ('I', 8.28), ('E', 8.20), ('T', 5.40), 
               ('M',  5.12), ('K', 4.80), ('S', 4.64), ('U', 4.56), ('D', 4.20), 
               ('R',  4.04), ('O', 3.23), ('P', 3.10), ('B', 2.92), ('H', 2.76), 
               ('G',  2.48), ('L', 2.40), ('Y', 1.88), ('J', 0.91), ('C', 0.64), 
               ('F',  0.36), ('W', 0.24), ('V', 0.16), ('Z', 0.04), ('X', 0.02), ('Q', 0.01)
            ]
```
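The source of Dono's table is not stated. As a rough sketch, a table of the same shape could be built from any sample text by counting letters and converting the counts to percentages. This assumes the values above are percentages. The function name `letter_percentages` and the sample sentence are made up for illustration.

```python
from collections import Counter

def letter_percentages(sample: str) -> list[tuple[str, float]]:
    # Keep only the letters A-Z, ignoring case, spaces and punctuation.
    letters = [c for c in sample.upper() if "A" <= c <= "Z"]
    total = len(letters)
    if total == 0:
        return []
    # most_common() orders the letters from most to least frequent.
    return [(letter, round(100 * n / total, 2))
            for letter, n in Counter(letters).most_common()]

# A tiny made-up sample; a useful table needs a much larger text.
print(letter_percentages("SAYA MAKAN NASI"))
```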

### **Ruleset**

Of course, the character frequencies in the message will not exactly match those of Bahasa Indonesia. But Dono assumes they will be close enough. So he simply sorts the characters in the message by frequency, from most to least common, and decrypts each one to the Bahasa Indonesia character at the same index in the frequency table.

So, `ZZZZBBBCC` will be decrypted to `AAAANNNII`.
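A minimal sketch of this rank-matching rule, assuming the input contains only letters and no two letters share a count. The name `decrypt_by_rank` is hypothetical, and `reference_order` is simply the letters of Dono's table written out in order.

```python
from collections import Counter

# Letters of Dono's frequency table, most common first.
reference_order = "ANIETMKSUDROPBHGLYJCFWVZXQ"

def decrypt_by_rank(ciphertext: str) -> str:
    # most_common() sorts the cipher letters from most to least frequent.
    ranked = [letter for letter, _ in Counter(ciphertext).most_common()]
    # Pair the i-th most frequent cipher letter with the i-th reference letter.
    table = str.maketrans(dict(zip(ranked, reference_order)))
    return ciphertext.translate(table)

print(decrypt_by_rank("ZZZZBBBCC"))  # AAAANNNII
```

Here `Z` (4 times) ranks first and becomes `A`, `B` (3 times) becomes `N`, and `C` (2 times) becomes `I`.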

#### **Tie**

He found that _sometimes_ some characters in the message have the same frequency. For example, in the message `ZZZABAB`, both `A` and `B` appear twice. **In such cases, he decrypts the tied characters to `_` (to indicate that he doesn't know the character). So `ZZZABAB` will be decrypted to `AAA____`.**
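One way to detect ties is to count how many letters share each count value. The sketch below assumes letters-only input. The name `decrypt_with_ties` is hypothetical, and `reference_order` is the letter order of Dono's table.

```python
from collections import Counter

# Letters of Dono's frequency table, most common first.
reference_order = "ANIETMKSUDROPBHGLYJCFWVZXQ"

def decrypt_with_ties(ciphertext: str) -> str:
    ranked = Counter(ciphertext).most_common()
    # Number of distinct letters that share each count; more than one means a tie.
    letters_per_count = Counter(count for _, count in ranked)
    mapping = {}
    for rank, (letter, count) in enumerate(ranked):
        # Tied letters become '_'; the others take the reference letter at their rank.
        mapping[letter] = "_" if letters_per_count[count] > 1 else reference_order[rank]
    return ciphertext.translate(str.maketrans(mapping))

print(decrypt_with_ties("ZZZABAB"))  # AAA____
```

Tied letters still occupy their ranks, so a letter that comes after a tie keeps its own position in the reference order.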

#### **Non-Alphabet Characters**

**It turns out that non-alphabet characters are NOT encrypted.** So `ZZZ!@#$%^&*()90` is decrypted to `AAA!@#$%^&*()90`. All letters are uppercase.
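In Python this rule has two practical consequences, sketched below: only `A` to `Z` should be counted, and `str.translate` already leaves characters alone when they are missing from the table. The helper `count_letters` and the one-entry table are made up for this example.

```python
from collections import Counter

def count_letters(text: str) -> Counter:
    # Only uppercase A-Z take part in the frequency count.
    return Counter(c for c in text if "A" <= c <= "Z")

encrypted = "ZZZ!@#$%^&*()90"
print(count_letters(encrypted))  # Counter({'Z': 3})

# Hypothetical one-entry table: the most common letter, Z, stands for A.
table = str.maketrans({"Z": "A"})
# str.translate leaves any character that is missing from the table unchanged.
decrypted = encrypted.translate(table)
print(decrypted)  # AAA!@#$%^&*()90
```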

### **Challenge**

**Given an encrypted message and a letter frequency table, please help Dono decrypt the message.**

### **Solution**

Solving the challenge involves several string manipulation steps. In summary:

1. Build a letter frequency list for the input message.
2. Build the decryption table by matching the frequency order of letters in the message against the frequency order of letters in Bahasa Indonesia.
3. Translate the message using the decryption table from step 2.

In [1]:
message = """
AP RLGNL MGHQCGNQCPLD CGKGRPG AP NTZIGK JQNG HGZALZI, WTZTMPNP HTCZGDG HLAP NTZIGK HTJTCYG JTCGR DTZUPWNGJGZ GP NTCUGZIIPK RTWGZYGZI DGRG. APG DTZVTHLNZVG 'GZNQ PZNTMTYTZ'. NTNGWP, VGZI NPAGJ APJTNGKLP QMTK HLAP GAGMGK HGKOG RTMGDG WCQRTR WTDHLGNGZ, APG NGZWG RTZIGYG DTZLDWGKJGZ JQWP JT AGMGD DTRPZ GZNQ.

JTNPJG HLAP DTZIGJNPFJGZ GZNQ LZNLJ WTCNGDG JGMPZVG, KGRPMZVG... GZTK. GMPK-GMPK DTZVGWG ATZIGZ "KGMQ, NLGZJL HLAP," RTWTCNP VGZI APKGCGWJGZ, GZNQ DGMGK HTCJGNG, "KGMQ, RTMGDGN WGIP! DGL JQWP MGIP?"

HLAP NTCJTYLN AGZ DTZUQHG DTDHTCP PZRNCLJRP MGPZ. "GZNQ, NQMQZI KPNLZI HTCGWG KGRPM AGCP ALG APNGDHGK ALG!" NGWP YGOGHGZ GZNQ DGMGK, "DGGF, RGVG RTAGZI RPHLJ DTZVPGWJGZ JQWP. DGL VGZI DGZPR GNGL WGKPN?"

RTDGJPZ FCLRNCGRP, HLAP HTCNGZVG, "GZNQ, GWG VGZI NTCYGAP ATZIGZDL? JTZGWG JGDL NTCQHRTRP ATZIGZ JQWP?" GZNQ ATZIGZ WQMQRZVG DTZYGOGH, "DGGF, RGVG KGZVG DTCGRG MTHPK KPALW ATZIGZ JQWP. JQWP GAGMGK JQAT AGRGC RGVG."

KGCP-KGCP HTCPJLNZVG WTZLK ATZIGZ NPZIJGK MGJL GZTK GZNQ. APG RTCPZI JGMP DTDHTCP CTJQDTZAGRP UGFT NTCHGPJ AP HGZALZI JTNPJG APNGZVG RQGM DGNTDGNPJG, GNGL DTZVTAPGJGZ CTRTW JQWP RGGN APDPZNG LZNLJ DTZTCYTDGKJGZ HGKGRG. MGHQCGNQCPLD RTCPZI JGMP NTCUPLD GCQDG JQWP DTRJPWLZ NPAGJ GAG JQWP VGZI RTAGZI APRTALK.

WGAG RLGNL KGCP, HLAP DTDLNLRJGZ LZNLJ DTDHGOG GZNQ JT RTDPZGC GP PZNTCZGRPQZGM. APG HTCWPJPC, "DLZIJPZ WGCG GKMP AP RGZG HPRG DTDHGZNL." ZGDLZ, RTMGDG WCTRTZNGRP, GMPK-GMPK DTZYGOGH WTCNGZVGGZ AGCP GLAPTZR, GZNQ DGMGK DTZGOGCJGZ JQWP JTWGAG RTDLG VGZI KGAPC.

RTQCGZI WCQFTRQC AGCP YTWGZI, WCQF. RGNQ, DTZATJGNP HLAP RTNTMGK WCTRTZNGRP. "RGVG JPCG RGVG NGKL DGRGMGKZVG," JGNGZVG RGDHPM DTZLZYLJJGZ IGDHGC DTRPZ ATZIGZ ZQAG JQWP. "GP GZAG NGDWGJZVG NTMGK 'DTCGRGJGZ' JQWP."

HLAP, ATZIGZ DLJG DTDTCGK, DTZIGJLP JTRGMGKGZZVG. WCQF. RGNQ, ATZIGZ RTZVLD CGDGK, DTZVGCGZJGZ GIGC HLAP DTCTRNGCN GZNQ RTNTMGK DTDHTCRPKJGZZVG. ZGDLZ, GAG DGRGMGK. GZNQ, ATZIGZ JTUPZNGGZZVG WGAG JQWP, DTZQMGJ LZNLJ APDGNPJGZ.

"RGVG NPAGJ DGL! RGVG HLNLK JQWP!" JGNG GZNQ ATZIGZ ZGAG DTZATRGJ. HLAP DTZUQHG RTIGMG UGCG, DLMGP AGCP DTDHTCPZVG 'WGMRL' JQWP KPZIIG DTZYGZYPJGZ MPHLCGZ JT WTCJTHLZGZ JQWP, NTNGWP GZNQ NTNGW DTZQMGJ.

GJKPCZVG, HLAP DTDPMPJP PAT. "GZNQ, HGIGPDGZG YPJG RGVG DTDHLGNJGZDL STCRP 'DGZLRPG' AGCP JQWP? JGL HPRG DTCGRGJGZ JQWP JGWGZ RGYG NGZWG KGCLR DPZLD."

GZNQ, VGZI WTZGRGCGZ, RTNLYL. HLAP HTJTCYG JTCGR AGZ DTZUPWNGJGZ RTZRQC VGZI AGWGN DTDHTCPJGZ RTZRGRP DTZPJDGNP JQWP NGZWG KGCLR DTDPZLDZVG. AGZ NTCZVGNG, PNL HTCKGRPM!

ATZIGZ RTZRQC HGCLZVG, GZNQ DTZYGAP GP VGZI MTHPK TFPRPTZ, ZGDLZ NTNGW ATZIGZ UPCP JKGR JTUPZNGGZZVG WGAG JQWP. APG HGKJGZ DTDHLGN RTHLGK WCQICGD APDGZG RTDLG GP AP ALZPG HPRG "DTZPJDGNP" JQWP SPCNLGM.

NPAGJ MGDG JTDLAPGZ, GZNQ DTZYGAP NTCJTZGM AP RTMLCLK ALZPG. HLJGZ KGZVG RTHGIGP GP VGZI UGZIIPK, NTNGWP YLIG RTHGIGP ALNG HTRGC JQWP SPCNLGM.

AGZ HTIPNLMGK, AP NTZIGK RTDLG JTDGYLGZ NTJZQMQIP AGZ JTUTCAGRGZ HLGNGZ, RTHLGK UPZNG RTATCKGZG WGAG JQWP DGDWL DTZUPWNGJGZ CTSQMLRP. AGZ NTZNL RGYG, HLAP AGZ GZNQ RTMGML DTZPJDGNP RTUGZIJPC JQWP HTCRGDG RTNPGW WGIPZVG. RTHLGK NGZAG WTCRGKGHGNGZ GZNGCG DGZLRPG AGZ GP.
"""

In [13]:
# First, strip non-alphabetical characters from the message before counting frequencies.
# There are two ways to do this.

# Method 1: cleaning the message without regex
# Collect every non-alphanumeric character among the first 256 code points
other_chars  = ''.join(c for c in map(chr, range(256)) if not c.isalnum())
# Translation table that turns each of those characters into a space
map_table    = str.maketrans(other_chars, ' '*len(other_chars))
# Note: isalnum() keeps digits too; both methods agree here (see the counts below)
text_cleaned = message.lower().translate(map_table).strip().replace(' ','')
print(len(text_cleaned))

# Method 2: cleaning the message with regex (USED FOR SOLUTION)
from collections import Counter
import re

# List of every uppercase letter in the message
text_cleaned = re.findall(r"[A-Z]", message)
print(len(text_cleaned))

2500
2500


In [15]:
# Create letter frequencies mapping from reference
ref_letter_freq = [('A', 19.64), ('N', 9.87), ('I', 8.28), ('E',  8.20), ('T', 5.40), ('M', 5.12),
                      ('K',  4.80), ('S', 4.64), ('U', 4.56), ('D',  4.20), ('R', 4.04), ('O', 3.23),
                      ('P',  3.10), ('B', 2.92), ('H', 2.76), ('G',  2.48), ('L', 2.40), ('Y', 1.88),
                      ('J',  0.91), ('C', 0.64), ('F', 0.36), ('W',  0.24), ('V', 0.16), ('Z', 0.04),
                      ('X',  0.02), ('Q', 0.01)
                      ]

def CreateDecryptDict(message: str, ref_letter_freq: list[tuple]) -> dict:
    # Count letter frequencies in the message, most common first
    c = Counter(re.findall(r"[A-Z]", message))
    msg_char_freq = c.most_common()
    # Count how many letters share each frequency value (more than one means a tie)
    char_freq_counter = Counter(freq for _, freq in msg_char_freq)

    decryption_dict = {}
    for idx, (char, freq) in enumerate(msg_char_freq):
        # If the frequency is tied, replace the letter with '_'
        if char_freq_counter[freq] > 1:
            decryption_dict[char] = "_"
        else:
            # Otherwise use the reference letter at the same rank
            decryption_dict[char] = ref_letter_freq[idx][0]

    return decryption_dict
  
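# Create the decryption table from the dictionary
decryption_table = str.maketrans(CreateDecryptDict(message, ref_letter_freq))
# Note: map_table is the Method 1 cleanup table from the previous cell, not the decryption table
print(map_table)
print(message.translate(decryption_table))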

{0: 32, 1: 32, 2: 32, 3: 32, 4: 32, 5: 32, 6: 32, 7: 32, 8: 32, 9: 32, 10: 32, 11: 32, 12: 32, 13: 32, 14: 32, 15: 32, 16: 32, 17: 32, 18: 32, 19: 32, 20: 32, 21: 32, 22: 32, 23: 32, 24: 32, 25: 32, 26: 32, 27: 32, 28: 32, 29: 32, 30: 32, 31: 32, 32: 32, 33: 32, 34: 32, 35: 32, 36: 32, 37: 32, 38: 32, 39: 32, 40: 32, 41: 32, 42: 32, 43: 32, 44: 32, 45: 32, 46: 32, 47: 32, 58: 32, 59: 32, 60: 32, 61: 32, 62: 32, 63: 32, 64: 32, 91: 32, 92: 32, 93: 32, 94: 32, 95: 32, 96: 32, 123: 32, 124: 32, 125: 32, 126: 32, 127: 32, 128: 32, 129: 32, 130: 32, 131: 32, 132: 32, 133: 32, 134: 32, 135: 32, 136: 32, 137: 32, 138: 32, 139: 32, 140: 32, 141: 32, 142: 32, 143: 32, 144: 32, 145: 32, 146: 32, 147: 32, 148: 32, 149: 32, 150: 32, 151: 32, 152: 32, 153: 32, 154: 32, 155: 32, 156: 32, 157: 32, 158: 32, 159: 32, 160: 32, 161: 32, 162: 32, 163: 32, 164: 32, 165: 32, 166: 32, 167: 32, 168: 32, 169: 32, 171: 32, 172: 32, 173: 32, 174: 32, 175: 32, 176: 32, 177: 32, 180: 32, 182: 32, 183: 32, 184: 32,