## Challenge 1
### Convert hex to base64

I am given a hex (base16) string, so I convert it to bytes and then encode those bytes as base64.

In [1]:
import base64
def hexToB64(hexstr:str)->str:
    b64Bytes = base64.b64encode((bytes.fromhex(hexstr)))
    return b64Bytes.decode()

In [2]:
## Check
hexstr = "49276d206b696c6c696e6720796f757220627261696e206c696b65206120706f69736f6e6f7573206d757368726f6f6d"
base64str = "SSdtIGtpbGxpbmcgeW91ciBicmFpbiBsaWtlIGEgcG9pc29ub3VzIG11c2hyb29t"
print(hexToB64(hexstr)==base64str)

True


## Challenge 2
### Fixed XOR
We are given two hex strings of equal length and have to compute their bitwise XOR.

In [3]:
def xorCalc(str1:str,str2:str)->str:
    str1bytes = bytes.fromhex(str1)
    str2bytes = bytes.fromhex(str2)
    xorbytes = bytes([a^b for a,b in zip(str1bytes,str2bytes)])
    return xorbytes.hex()

In [4]:
# Check
hexstr1 = "1c0111001f010100061a024b53535009181c"
hexstr2 = "686974207468652062756c6c277320657965"
xorres = "746865206b696420646f6e277420706c6179"

print(xorCalc(hexstr1,hexstr2)==xorres)

True
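
One thing to watch in `xorCalc`: `zip` stops at the shorter input, so inputs of different lengths are silently truncated. Below is a minimal sketch of a stricter variant that works on bytes and rejects mismatched lengths. The name `fixed_xor` is hypothetical and not part of the notebook.

```python
def fixed_xor(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length (hypothetical helper)."""
    if len(a) != len(b):
        # zip() would silently drop the extra bytes, so fail loudly instead.
        raise ValueError("inputs must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


left = bytes.fromhex("1c0111001f010100061a024b53535009181c")
right = bytes.fromhex("686974207468652062756c6c277320657965")
result = fixed_xor(left, right)
print(result.hex())
```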


## Challenge 3
### Single-byte XOR cipher

A hex-encoded string has been XORed against a single character (the key). I assume the key was repeated to match the length of the message.
A character is `8 bits = 1 byte` and a hex digit is `4 bits`, so the key has to be repeated `len(hexstr)/2` times. If `len(hexstr)` is odd, I prepend a `0` hex digit (four zero bits) to the hex string.

In [5]:
def xor_with_character(hexstr:str,char:str)->str:
    '''Return the decoded text, or False if it contains non-printable characters'''
    if len(hexstr)%2 != 0: hexstr = '0'+hexstr
    charbytes = bytes(char*(len(hexstr)//2),encoding='ascii')
    hexStrBytes = bytes.fromhex(hexstr)
    xored_result_int = [a^b for a,b in zip(hexStrBytes,charbytes)]
    for i in xored_result_int:
        if (not chr(i).isprintable()) and i!=10 and i!=32:return False
    return bytes(xored_result_int).decode()

In [6]:
ciphertext = '1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736'

for i in range(ord('A'),ord('Z')+1):
    print(xor_with_character(ciphertext,chr(i))," <= result for character ",chr(i))

False  <= result for character  A
False  <= result for character  B
False  <= result for character  C
False  <= result for character  D
False  <= result for character  E
False  <= result for character  F
\pptvqx?R\8l?svtz?~?opjq{?py?}~|pq  <= result for character  G
False  <= result for character  H
False  <= result for character  I
Q}}y{|u2_Q5a2~{yw2s2b}g|v2}t2psq}|  <= result for character  J
False  <= result for character  K
False  <= result for character  L
Vzz~|{r5XV2f5y|~p5t5ez`{q5zs5wtvz{  <= result for character  M
False  <= result for character  N
Txx|~yp7ZT0d7{~|r7v7gxbys7xq7uvtxy  <= result for character  O
Kggcafo(EK/{(dacm(i(xg}fl(gn(jikgf  <= result for character  P
Jffb`gn)DJ.z)e`bl)h)yf|gm)fo)khjfg  <= result for character  Q
False  <= result for character  R
Hdd`bel+FH,x+gb`n+j+{d~eo+dm+ijhde  <= result for character  S
False  <= result for character  T
Nbbfdcj-@N*~-adfh-l-}bxci-bk-olnbc  <= result for character  U
Maaeg`i.CM)}.bgek.o.~a{`j.ah.loma`  <= result for char

So for the character `X` we get the decrypted message `Cooking MC's like a pound of bacon`. To rank candidates automatically, I score each one against the English letter frequencies listed on Wikipedia using the Bhattacharyya coefficient (an idea borrowed from `pranavmaneriker`).
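
For reference, the Bhattacharyya coefficient between two discrete distributions $p$ and $q$ over the letters is

$$
BC(p, q) = \sum_{c \in \{a, \dots, z\}} \sqrt{p_c \, q_c}
$$

In the function below, $p_c$ is the reference frequency of letter $c$ and $q_c = n_c / N$, where $n_c$ is the number of times $c$ occurs in the candidate and $N$ is the length of the whole candidate string. Because $N$ also counts spaces and punctuation, $q$ does not sum to 1 for normal text, so even genuine English scores somewhat below 1.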

In [7]:
import math

def calculate_bc(eng_str:str)->float:
    '''Return a value between 0 and 1 measuring how closely the letter frequencies of eng_str match English'''
    char_freq = {
        'a': 0.08166999999999999,
        'b': 0.01492,
        'c': 0.02782,
        'd': 0.04253,
        'e': 0.12702,
        'f': 0.02228,
        'g': 0.02015,
        'h': 0.06094,
        'i': 0.06966,
        'j': 0.00153,
        'k': 0.00772,
        'l': 0.04025,
        'm': 0.02406,
        'n': 0.06749,
        'o': 0.07507,
        'p': 0.01929,
        'q': 0.00095,
        'r': 0.05987,
        's': 0.06326999999999999,
        't': 0.09055999999999999,
        'u': 0.02758,
        'v': 0.00978,
        'w': 0.0236,
        'x': 0.0015,
        'y': 0.01974,
        'z': 0.00074
    }
    str_chr_freq = {a:0 for a in char_freq}
    eng_str = eng_str.lower()

    for c in eng_str:
        if c in str_chr_freq:
            str_chr_freq[c] += 1
    
    BC = 0
    for i in char_freq: BC += math.sqrt(char_freq[i]*str_chr_freq[i]/len(eng_str))
    return BC

In [8]:
def sorted_xor_ciphers(ciphertext):
    poss_soln = []
    for c in range(128):
        try:
            decoded_str = xor_with_character(ciphertext,chr(c))
            if not decoded_str: continue
            poss_soln.append((chr(c),decoded_str,calculate_bc(decoded_str)))
        except UnicodeDecodeError:
            continue
    poss_soln.sort(key=lambda x:x[2],reverse=True)
    return poss_soln
print(sorted_xor_ciphers(ciphertext)[0])

('X', "Cooking MC's like a pound of bacon", 0.7102457869430675)
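
The same search can be sketched directly on bytes, which avoids the hex padding and the ASCII decoding step. This is a minimal sketch, not the method used above: `single_byte_xor` and `crude_score` are hypothetical names, and the score is a much cruder stand-in for the Bhattacharyya coefficient (it only counts common lowercase letters and spaces).

```python
def single_byte_xor(data: bytes, key: int) -> bytes:
    """XOR every byte of data with the same key byte."""
    return bytes(b ^ key for b in data)


# Assumption: these bytes are a rough proxy for "looks like English".
COMMON = b"etaoin shrdlu"


def crude_score(candidate: bytes) -> int:
    """Count bytes that are a common lowercase letter or a space."""
    return sum(b in COMMON for b in candidate)


data = bytes.fromhex(
    "1b37373331363f78151b7f2b783431333d78397828372d363c78373e783a393b3736"
)
# Try all 256 key bytes and keep the one whose output scores highest.
best_key = max(range(256), key=lambda k: crude_score(single_byte_xor(data, k)))
print(chr(best_key), single_byte_xor(data, best_key))
```

Trying all 256 byte values rather than only `range(128)` costs nothing here and also covers non-ASCII keys.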


## Challenge 4
### Find the XOR Ciphered string

One of the hex strings in the file `1.in` has been enciphered the same way as above. I decipher every string with every candidate key and sort all the results by their BC score.

In [9]:
with open('1.in','r') as f:
    input_str = f.read().strip().splitlines()

output_str = []
for s in input_str:
    s = s.strip()
    output_str.extend(sorted_xor_ciphers(s))

output_str.sort(key=lambda x:x[2],reverse=True)
print(output_str[0:5])

[('5', 'Now that the party is jumping\n', 0.769583058499731), ('b', 'wpUs{gnns F8b-aT2*zRAh{Gj>xuED', 0.6548860350981618), ('t', 'afCemqxxe6P.t;wB$<lDW~mQ|(ncSR', 0.627097811672793), ('m', '[$LrLi;hsDbv!5 j$OaEfG=>rKDl=S', 0.6220477507384194), ('c', 'vqTrzfoor!G9c,`U3+{S@izFk?ytDE', 0.6060819316694385)]


For key = `5` we get `Now that the party is jumping\n` as our deciphered text.

## Challenge 5
### Implement repeating-key XOR
In repeating-key XOR, you sequentially apply each byte of the key. With the key `ICE`, the first byte of plaintext is XORed against `I`, the next against `C`, the next against `E`, then `I` again for the 4th byte, and so on.

In [10]:
def repeating_key_xor(eng_str:str,key:str):
    # Newlines are kept on purpose: they are part of the plaintext and are XORed too.
    xorbytes = []
    for i,ch in enumerate(eng_str):
        keyIndex = key[(i%len(key))]
        xorbyte = hex(ord(ch)^ord(keyIndex))[2:]
        if len(xorbyte)%2 != 0 : xorbyte = "0"+xorbyte
        xorbytes.append(xorbyte)
    res = "".join(xorbytes)
    return res

In [11]:
eng_txt = '''Burning 'em, if you ain't quick and nimble
I go crazy when I hear a cymbal'''

ciphertext = "0b3637272a2b2e63622c2e69692a23693a2a3c6324202d623d63343c2a26226324272765272a282b2f20430a652e2c652a3124333a653e2b2027630c692b20283165286326302e27282f"

print(repeating_key_xor(eng_txt,"ICE")==ciphertext)

True
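
A shorter way to write the same operation is to work on bytes and let `itertools.cycle` repeat the key. This is a minimal sketch; the name `repeating_key_xor_bytes` is hypothetical. Since XOR is its own inverse, the same function also decrypts.

```python
from itertools import cycle


def repeating_key_xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with the key, repeating the key as often as needed."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


plaintext = (
    b"Burning 'em, if you ain't quick and nimble\n"
    b"I go crazy when I hear a cymbal"
)
encrypted = repeating_key_xor_bytes(plaintext, b"ICE")
print(encrypted.hex())

# Applying the same key again restores the plaintext.
decrypted = repeating_key_xor_bytes(encrypted, b"ICE")
```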


## Challenge 6
### Breaking Repeating Key Cipher
I first implement the edit (Hamming) distance function, which returns the number of differing bits between two strings.

In [12]:
def ascii_to_binary(str1:str):
    res = ""
    for c in str1:
        res += bin(ord(c))[2:].zfill(8)
    return res

def hamming_distance(str1:str,str2:str)->int:
    '''Here str1 and str2 are bit strings'''
    res = 0
    for a,b in zip(str1,str2):
        if a!=b:res+=1
    return res

#Testing
print(hamming_distance(ascii_to_binary('this is a test'),ascii_to_binary('wokka wokka!!!'))==37)

True
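
The bit strings are not strictly needed: XORing two bytes leaves a 1 exactly where they differ, so counting the 1 bits of the XOR gives the same distance. A minimal sketch, assuming equal-length inputs (`hamming_bytes` is a hypothetical name):

```python
def hamming_bytes(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    # x ^ y has a 1 bit in every position where x and y differ.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


distance = hamming_bytes(b"this is a test", b"wokka wokka!!!")
print(distance)
```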


Now on to the remaining steps. The base64-encoded ciphertext is in the file `2.in`. I first read it, then define some helper functions, and then estimate the key size.

In [36]:
import binascii
with open('2.in') as f:
    ciphertext_b64 = f.read().strip().splitlines()

def b64_to_b16(b64str):
    return binascii.b2a_hex(binascii.a2b_base64(b64str)).decode()

def b16_to_b2(hexstr):
    '''Return the data as a string of '0' and '1' characters, 8 per byte'''
    res = ""
    if len(hexstr)%2 != 0: hexstr = '0'+hexstr
    hexbytes = bytes.fromhex(hexstr)
    for i in hexbytes: res+= bin(i)[2:].zfill(8)
    return res

ciphertext_b16 = [b64_to_b16(a) for a in ciphertext_b64]
ciphertext_b2 = [b16_to_b2(a) for a in ciphertext_b16]
ciphertext_joined_b2 = "".join(ciphertext_b2)

key_hamming_distances = []

for keysize in range(2,100):
    keylen = 8*keysize
    s1 = ciphertext_joined_b2[0:keylen]
    s2 = ciphertext_joined_b2[keylen:2*keylen]
    s3 = ciphertext_joined_b2[2*keylen:3*keylen]
    s4 = ciphertext_joined_b2[3*keylen:4*keylen]
    avgHamming = ((hamming_distance(s1,s2)+hamming_distance(s3,s4))/2)/keysize
    key_hamming_distances.append((keysize,avgHamming))

key_hamming_distances.sort(key=lambda x:x[1])
print(key_hamming_distances[:10])


[(2, 2.0), (5, 2.5), (3, 2.6666666666666665), (13, 2.730769230769231), (87, 2.810344827586207), (58, 2.8275862068965516), (11, 2.909090909090909), (29, 2.9310344827586206), (31, 2.967741935483871), (48, 2.96875)]
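
The estimate above uses only the first four blocks, which makes it noisy (the true key size 29 ends up eighth in the list). A common refinement is to average the normalized distance over every adjacent pair of blocks. Below is a minimal sketch of that variant on bytes. The function names are hypothetical, and the toy input is key material only, not the real ciphertext.

```python
def hamming_bytes(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


def normalized_keysize_score(data: bytes, keysize: int) -> float:
    """Average Hamming distance of adjacent blocks, divided by keysize."""
    # Only full blocks are used so every comparison covers keysize bytes.
    blocks = [data[i:i + keysize]
              for i in range(0, len(data) - keysize + 1, keysize)]
    pairs = list(zip(blocks, blocks[1:]))
    if not pairs:
        raise ValueError("need at least two full blocks")
    total = sum(hamming_bytes(a, b) for a, b in pairs)
    return total / len(pairs) / keysize


# Toy input: an exactly periodic byte string with period 3.
toy = b"ICE" * 40
scores = {k: normalized_keysize_score(toy, k) for k in range(2, 7)}
# Key sizes 3 and 6 score 0.0 here because the toy input repeats every 3 bytes.
print(sorted(scores.items(), key=lambda item: item[1]))
```

Multiples of the true key size also score well, which is the same effect that makes 87 and 58 appear next to 29 in the list above.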


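So the normalized Hamming distance is smallest for `KEYSIZE in [2,5,3,13,87,58,11,29,31]`, listed in increasing order of distance. Following the steps, I split the ciphertext into blocks of `keysize*8` bits and transpose them, so that each transposed block holds the bytes that were XORed with the same key byte. I modified the `xor_with_character` function so that it returns `False` whenever the decoded string is not readable, which is why key sizes without a readable solution print only their size.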

In [37]:
for keysize in [2,5,3,13,87,58,11,29,31]:
    blocksize = keysize*8
    blocks = []
    for i in range(len(ciphertext_joined_b2)):
        if i%blocksize == 0: blocks.append("")
        blocks[i//blocksize] += ciphertext_joined_b2[i]

    # transposed_blocks[j] collects the j-th byte of every block, i.e. all bytes XORed with the same key byte
    transposed_blocks = [""]*(blocksize//8)
    for i in range(len(blocks)):
        block = blocks[i]
        for j in range(len(block)//8):
            transposed_blocks[j] += block[8*j:8*(j+1)]
    print("Size: ",keysize)
    cracked = []
    for trans_block in transposed_blocks:
        hex_block = hex(int(trans_block,2))[2:]
        if len(hex_block)%2 != 0: hex_block = "0"+hex_block
        try:
            cracked.append(sorted_xor_ciphers(hex_block)[0])
        except IndexError:
            pass
    
    try:
        key = ""
        text = [""]*len(cracked[0][1])
        for l in cracked:
            key += l[0]
            for i in range(len(l[1])):
                text[i]+=l[1][i]

        print("Key = ",key)
        print()
        print("Decoded message: \n", "".join(text))
        
        print("***")
    except IndexError:
        pass


Size:  2
Size:  5
Size:  3
Size:  13
Size:  87
Key =  Terminator X: Bring the noiseTerminator X: Bring the noiseTerminator X: Bring the noise

Decoded message: 
 I'm back and I'm ringin' the b ll 
A rockin' os the mike while the fly giHls yellh
In acstasy in the back of me 
Wellnthat's my DJ Deihay cuttin' all them Z's 
mittin' 'ard dnd the girlies goin' crazy 
Vauilla's on the m
ke, man I'm not lazy. 

I'y lettint my  rug kick in 
It controls my mo'th and I begin !To just let it flow, let me concepws gob
My posse's to the side yellinn, Go Vanilla Go  

Smooth 'cause that's th' way I  ill re 
And if you don't give a damn, then 
Why youistarin' at me 
So get off scause Ihcontool the stage 
There's no dissin' allowed 
I'm hn my own phase 
The girlie  sa y taey l ve me and that is ok 
And I cao dance better tean any kid n' play 

Stageo2 -- Yef theione ya' wanna listen to 
It's -ff my head so l t the beat play through 
Sk I can eunk rt up and make it sound good 
1i2-3 Yo -- Knockcon some 

For keysize = 29 I get the key = "Terminator X: Bring the noise" (the 87-character key printed above is this key repeated three times), and the decoded string matches the decrypted file in Challenge 7 below, so I take it to be correct.
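
When the ciphertext is held as bytes, the transposition step reduces to a strided slice: column `i` is every `keysize`-th byte starting at offset `i`. A minimal sketch with a hypothetical helper name:

```python
from typing import List


def transpose(data: bytes, keysize: int) -> List[bytes]:
    """Group the bytes that were XORed with the same key byte."""
    # data[i::keysize] takes bytes i, i + keysize, i + 2 * keysize, ...
    return [data[i::keysize] for i in range(keysize)]


columns = transpose(b"abcdefgh", 3)
print(columns)  # [b'adg', b'beh', b'cf']
```

Each column can then be solved as a single-byte XOR cipher, and the winning key bytes, taken in column order, form the repeating key.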

## Challenge 7
### AES in ECB mode
 The Base64-encoded content in `3.in` has been encrypted via AES-128 in ECB mode under the key
"YELLOW SUBMARINE".
(case-sensitive, without the quotes; exactly 16 characters; I like "YELLOW SUBMARINE" because it's exactly 16 bytes long, and now you do too).
Decrypt it. You know the key, after all. 

In [22]:
key = "YELLOW SUBMARINE"
from Crypto.Cipher import AES
cipher = AES.new(bytes(key,encoding='ascii'),AES.MODE_ECB)
with open('3.in') as f:
    data = f.read().strip()

base16data = b64_to_b16(data)
decrypted = (cipher.decrypt(bytes.fromhex(base16data))).decode()
print(decrypted)

I'm back and I'm ringin' the bell 
A rockin' on the mike while the fly girls yell 
In ecstasy in the back of me 
Well that's my DJ Deshay cuttin' all them Z's 
Hittin' hard and the girlies goin' crazy 
Vanilla's on the mike, man I'm not lazy. 

I'm lettin' my drug kick in 
It controls my mouth and I begin 
To just let it flow, let my concepts go 
My posse's to the side yellin', Go Vanilla Go! 

Smooth 'cause that's the way I will be 
And if you don't give a damn, then 
Why you starin' at me 
So get off 'cause I control the stage 
There's no dissin' allowed 
I'm in my own phase 
The girlies sa y they love me and that is ok 
And I can dance better than any kid n' play 

Stage 2 -- Yea the one ya' wanna listen to 
It's off my head so let the beat play through 
So I can funk it up and make it sound good 
1-2-3 Yo -- Knock on some wood 
For good luck, I like my rhymes atrocious 
Supercalafragilisticexpialidocious 
I'm an effect and that you can bet 
I can take a fly girl and make her wet. 


## Challenge 8
### Detect AES in ECB mode
The file `4.in` contains a bunch of hex-encoded ciphertexts.
One of them has been encrypted with ECB.
Detect it.
Remember that the problem with ECB is that it is stateless and deterministic; the same 16 byte plaintext block will always produce the same 16 byte ciphertext. 

I split each line into chunks of 32 nibbles (16 bytes, one AES block) and count the chunks in a dictionary keyed by chunk. I then print every line in which some chunk occurs more than once.

In [3]:
with open('4.in') as f:
    data = f.read().strip().splitlines()

for i,line in enumerate(data,start=1):
    poss = {}
    for r in range(len(line)//32):
        part = line[32*r:32*(r+1)]
        if part not in poss: poss[part] = 0
        poss[part] += 1
    for key in poss:
        if poss[key]>1:
            print(f'line {i} is solution\n{line}')

line 133 is solution
d880619740a8a19b7840a8a31c810a3d08649af70dc06f4fd5d2d69c744cd283e2dd052f6b641dbf9d11b0348542bb5708649af70dc06f4fd5d2d69c744cd2839475c9dfdbc1d46597949d9c7e82bf5a08649af70dc06f4fd5d2d69c744cd28397a93eab8d6aecd566489154789a6b0308649af70dc06f4fd5d2d69c744cd283d403180c98c8f6db1f2a3f9c4040deb0ab51b29933f2c123c58386b06fba186a
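
The same check can be written on decoded bytes with a set: if a ciphertext has fewer distinct 16-byte blocks than blocks in total, some block repeats. This is a minimal sketch with a hypothetical function name and synthetic inputs instead of the lines from `4.in`.

```python
def count_repeated_blocks(ciphertext: bytes, blocksize: int = 16) -> int:
    """Return how many blocks duplicate an earlier block."""
    blocks = [ciphertext[i:i + blocksize]
              for i in range(0, len(ciphertext), blocksize)]
    # A set keeps one copy of each distinct block.
    return len(blocks) - len(set(blocks))


# Synthetic example: the first and third blocks are identical.
ecb_like = bytes(range(16)) + bytes(range(16, 32)) + bytes(range(16))
all_distinct = bytes(range(48))

print(count_repeated_blocks(ecb_like))      # 1
print(count_repeated_blocks(all_distinct))  # 0
```

Applied to the file, each line would first be decoded with `bytes.fromhex(line)`, and the line with the highest count is the ECB candidate.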
