# Transposition Cipher

A transposition cipher rearranges the positions of the plain text characters, one block at a time. For example, if we choose a key length of 5 and pick the key -

(3, 4, 1, 5, 2)

Then, in each block of 5 plain text characters, the 1st character goes to position 3, the 2nd to position 4, and so on.
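
As a small illustration of this rule, the sketch below applies the key (3, 4, 1, 5, 2) to a single 5-character block. The helper name `permute_block` and the sample block `HELLO` are made up for this example. Note that the implementation further down reads the key in the opposite direction (each key entry names the source position rather than the destination), which is an equally valid convention.

```python
def permute_block(block, key):
    """Send the i-th character of block to position key[i] (1-based)."""
    out = [''] * len(block)
    for i, target in enumerate(key):
        out[target - 1] = block[i]
    return ''.join(out)

key = (3, 4, 1, 5, 2)
print(permute_block('HELLO', key))  # LOHEL
```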

## Implementation

First, we define an encoding function that reduces the plain text to a 26-letter alphabet by converting all letters to upper case and discarding every other character. It also appends the padding character Z until the length is a multiple of M.

We also define M = 5, which will be the key length.

In [154]:
# key length
M = 5

In [155]:
def encode(string):
    result = ''
    for letter in string:
        if letter.isalpha():
            result += letter.upper()
    padding_length = (M - len(result)) % M
    return result + padding_length * 'Z'

# remove padding characters
def decode(string):
    return string.rstrip('Z')
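
One consequence of using Z as the padding character is worth seeing concretely: `decode` cannot tell padding apart from a genuine trailing Z. The sketch below repeats the two functions with the block length passed as a parameter `m` (an assumption made only so the block runs on its own) and shows the effect on a sample word.

```python
def encode(string, m=5):
    # keep letters only, upper-cased, then pad with 'Z' to a multiple of m
    letters = ''.join(ch.upper() for ch in string if ch.isalpha())
    return letters + 'Z' * (-len(letters) % m)

def decode(string):
    # strips every trailing 'Z', whether it was padding or not
    return string.rstrip('Z')

padded = encode('Jazz')
print(padded)          # JAZZZ
print(decode(padded))  # JA
```

For the sample message used in this notebook this does not matter, because the plain text does not end in Z.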

Let's declare a plain text that we want to encrypt.

In [156]:
P = 'Enemy might attack tonight. Stay alert!'

Encoding this string, we get -

In [157]:
T = encode(P)
print(T)

ENEMYMIGHTATTACKTONIGHTSTAYALERTZZZ


Since we only use 26 characters, we declare Zp as the ring of integers modulo 26.

In [158]:
Zp = Integers(26)
print(Zp)

Ring of integers modulo 26


We now generate a key, which is a random permutation of the numbers from 1 to M, represented as a sequence of letters (A for 1, B for 2, and so on).

In [159]:
characters = [chr(i + ord('A')) for i in range(M)]
key = ''
while len(characters) > 0:
    i = randint(0, len(characters) - 1)
    key += characters[i]
    del characters[i]

In [160]:
print('Key -', key)

Key - CEDBA
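
The loop above calls `randint` without importing it, which appears to rely on the Sage session providing it. Outside Sage, roughly the same kind of key could be drawn with the standard library. The sketch below uses `random.sample`; the function name `random_key` is hypothetical.

```python
import random

def random_key(m):
    """Return a random permutation of the first m capital letters as a string."""
    letters = [chr(i + ord('A')) for i in range(m)]
    return ''.join(random.sample(letters, m))

key = random_key(5)
print('Key -', key)
```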


In [161]:
def transpositioncipher(text, cipher_key):
    cipher = ''
    for i in range(0, len(text), M):
        # iterate over each block
        block = text[i:i+M]
        new_block = ['A' for _ in range(M)]
        for j in range(M):
            k = ord(cipher_key[j]) - ord('A')
            new_block[j] = block[k]
        for c in new_block:
            cipher += c
    return cipher
    

def transpositiondecipher(cipher_text, cipher_key):
    text = ''
    for i in range(0, len(cipher_text), M):
        # iterate over each block
        block = cipher_text[i:i+M]
        new_block = ['A' for _ in range(M)]
        for j in range(M):
            k = ord(cipher_key[j]) - ord('A')
            new_block[k] = block[j]
        for c in new_block:
            text += c
    return text
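
To see what the two loops do on a single block, the sketch below restates them in compact form and traces the first block of the sample message under the key CEDBA. The names `encrypt_block` and `decrypt_block` are made up for this illustration.

```python
def encrypt_block(block, cipher_key):
    # output position j takes the character at the index named by key letter j
    return ''.join(block[ord(c) - ord('A')] for c in cipher_key)

def decrypt_block(block, cipher_key):
    # undo the shuffle: the character at position j goes back to index k
    out = [''] * len(block)
    for j, c in enumerate(cipher_key):
        out[ord(c) - ord('A')] = block[j]
    return ''.join(out)

print(encrypt_block('ENEMY', 'CEDBA'))  # EYMNE
print(decrypt_block('EYMNE', 'CEDBA'))  # ENEMY
```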
    

Now we test the cipher and decipher algorithms by encrypting and decrypting the text P.

In [162]:
T = encode(P)
C = transpositioncipher(T, key)
D = transpositiondecipher(C, key)
D_stripped = decode(D)
print(f'Given text - "{P}"')
print(f'Encoded - {T}')
print(f'Key - {key}')
print(f'Cipher text - {C}')
print(f'Decipher text - {D}')
print(f'Stripped text - {D_stripped}')

Given text - "Enemy might attack tonight. Stay alert!"
Encoded - ENEMYMIGHTATTACKTONIGHTSTAYALERTZZZ
Key - CEDBA
Cipher text - EYMNEGTHIMTCATAOINTKTTSHGAELYAZZZTR
Decipher text - ENEMYMIGHTATTACKTONIGHTSTAYALERTZZZ
Stripped text - ENEMYMIGHTATTACKTONIGHTSTAYALERT


## Test Against Built-in Cipher

Now we can test the result against the built-in transposition cipher (TranspositionCryptosystem) in SageMath.

In [163]:
A = TranspositionCryptosystem(AlphabeticStrings(), M)
E = A.encoding(encode(P))
# sagemath cryptosystem expects key as an array
K = [ord(c) - ord('A') + 1 for c in key]
print(f'Text - {P}')
print(f'Encoded - {E}')
print(f'Key - {key}')
C_test = A.enciphering(K, E)

# calculate the inverse key for K
Ki = [1] * len(K)
for i in range(len(K)):
    Ki[K[i] - 1] = i + 1
print(Ki)

# enciphering with the inverse key is the same as deciphering with the key
D_test = A.enciphering(Ki, C_test)
# convert to python string
C_test = str(C_test)
D_test = str(D_test)
D_test_stripped = decode(D_test)
print(f'Cipher text - {C_test}')
print(f'Decipher text - {D_test}')
print(f'Stripped text - {D_test_stripped}')

Text - Enemy might attack tonight. Stay alert!
Encoded - ENEMYMIGHTATTACKTONIGHTSTAYALERTZZZ
Key - CEDBA
[5, 4, 1, 3, 2]
Cipher text - EYMNEGTHIMTCATAOINTKTTSHGAELYAZZZTR
Decipher text - ENEMYMIGHTATTACKTONIGHTSTAYALERTZZZ
Stripped text - ENEMYMIGHTATTACKTONIGHTSTAYALERT
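
The inverse-key computation can be checked in isolation. The sketch below is plain Python with a hypothetical helper `invert`; it inverts the 1-based key that corresponds to CEDBA and confirms that inverting twice returns the original key.

```python
def invert(perm):
    """Invert a 1-based permutation given as a list of integers."""
    inverse = [0] * len(perm)
    for position, value in enumerate(perm, start=1):
        inverse[value - 1] = position
    return inverse

K = [ord(c) - ord('A') + 1 for c in 'CEDBA']
print(K)                  # [3, 5, 4, 2, 1]
print(invert(K))          # [5, 4, 1, 3, 2]
print(invert(invert(K)))  # [3, 5, 4, 2, 1]
```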


Comparing the built in cipher result with our implementation -

In [164]:
print('Results \t Implementation \t Built-in')
print('-' * 80)
print(f'Cipher Text \t {C} \t {C_test}')
print(f'Decipher Text \t {D} \t {D_test}\n')
if C_test == C and D_test == D:
    print('Implementation is CORRECT')
else:
    print('Implementation is INCORRECT')

Results 	 Implementation 	 Built-in
--------------------------------------------------------------------------------
Cipher Text 	 EYMNEGTHIMTCATAOINTKTTSHGAELYAZZZTR 	 EYMNEGTHIMTCATAOINTKTTSHGAELYAZZZTR
Decipher Text 	 ENEMYMIGHTATTACKTONIGHTSTAYALERTZZZ 	 ENEMYMIGHTATTACKTONIGHTSTAYALERTZZZ

Implementation is CORRECT


## Cryptanalysis

### Brute Force

A brute-force attack is practical against this transposition cipher because the key space is small: a key of length M has only M! possible values (120 for M = 5). We can generate all possible keys by generating all permutations of the numbers from 1 to M.

In [165]:
from itertools import permutations
char_set = [chr(i + ord('A')) for i in range(M)]
key_list = permutations(char_set)
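
The sketch below counts the candidate keys and illustrates two practical points: the count grows factorially with the block length, and `permutations` returns a one-shot iterator, so it has to be recreated (or converted to a list) if the keys are needed more than once.

```python
from itertools import permutations
from math import factorial

M = 5
char_set = [chr(i + ord('A')) for i in range(M)]

key_iter = permutations(char_set)
print(len(list(key_iter)))  # 120
print(len(list(key_iter)))  # 0, the iterator is already exhausted
print(factorial(M))         # 120
print(factorial(10))        # 3628800
```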

We now need a list of English words, so that we can detect English words in the brute-forced decipher text.

A good list of the 3000 most used English words is here -
https://github.com/aneeshsharma/EnglishWords/raw/main/common3000.txt

We download the list of words and convert it to a Python list.

In [166]:
import requests
url = 'https://github.com/aneeshsharma/EnglishWords/raw/main/common3000.txt'

words_file = requests.get(url, allow_redirects=True)
# write the downloaded bytes to disk; the file is closed automatically
with open('words.txt', 'wb') as words_file_obj:
    words_file_obj.write(words_file.content)

In [167]:
with open('words.txt') as words_file_obj:
    words = [word.upper() for word in words_file_obj.read().split()]

In [168]:
print(f'Number of words in dictionary - {len(words)}')

Number of words in dictionary - 3000


A function can be defined to find all substrings of a string that are among the 3000 most common English words. The number of matches gives us a rough measure of how likely the string is to be an English sentence.

In [169]:
# function to find english words in a string according to word list
def find_words(string):
    l = len(string)
    found = []
    for i in range(l):
        for j in range(i, l):
            word = string[i:j+1]
            if len(word) <= 1:
                continue
            if word in words:
                found.append(word)
    return found
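
A quick way to see what `find_words` returns is to run it against a tiny word list. The sketch below uses a made-up three-word set in place of the downloaded list and passes it as a parameter (both are assumptions, made so the example is self-contained). A `set` is used because membership tests on it are faster than on a list.

```python
def find_words(string, word_set):
    """Return every substring of length >= 2 that appears in word_set."""
    found = []
    for i in range(len(string)):
        for j in range(i + 2, len(string) + 1):
            if string[i:j] in word_set:
                found.append(string[i:j])
    return found

sample_words = {'ENEMY', 'MY', 'MIGHT'}  # hypothetical mini dictionary
print(find_words('ENEMYMIGHT', sample_words))  # ['ENEMY', 'MY', 'MIGHT']
print(find_words('EYMNEGTHIM', sample_words))  # []
```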

Now we decipher the encrypted text with every key in the list and count the English words found in each result. The more words detected, the more likely it is that the key is correct.

In [170]:
keys = {}
max_words = 0
for candidate in key_list:
    candidate = ''.join(candidate)
    candidate_text = decode(transpositiondecipher(C, candidate))
    found = find_words(candidate_text)
    if len(found) > 1:
        if len(found) > max_words:
            max_words = len(found)
        keys[candidate] = len(found)

print('Key \t\t Likelihood')
for likely_key in keys:
    print(f'{likely_key} \t\t {keys[likely_key]}')

Key 		 Likelihood
ABCDE 		 5
ABCED 		 6
ABDCE 		 2
ABDEC 		 6
ABEDC 		 3
ACBDE 		 4
ACBED 		 6
ACDBE 		 4
ACDEB 		 8
ACEBD 		 2
ACEDB 		 3
ADBCE 		 5
ADBEC 		 6
ADCBE 		 2
ADCEB 		 6
ADEBC 		 2
ADECB 		 4
AEBCD 		 6
AEBDC 		 7
AECBD 		 2
AECDB 		 6
AEDBC 		 4
AEDCB 		 4
BACDE 		 5
BACED 		 5
BADEC 		 5
BAEDC 		 3
BCADE 		 4
BCAED 		 6
BCDAE 		 5
BCDEA 		 5
BCEAD 		 3
BCEDA 		 3
BDACE 		 7
BDAEC 		 7
BDCAE 		 8
BDCEA 		 5
BDEAC 		 5
BDECA 		 5
BEACD 		 5
BEADC 		 6
BECAD 		 4
BECDA 		 6
BEDAC 		 5
BEDCA 		 3
CABDE 		 8
CABED 		 8
CADBE 		 4
CADEB 		 7
CAEBD 		 3
CAEDB 		 3
CBADE 		 2
CBAED 		 4
CBDAE 		 5
CBDEA 		 5
CBEAD 		 3
CBEDA 		 2
CDABE 		 5
CDAEB 		 4
CDBAE 		 8
CDBEA 		 4
CDEAB 		 4
CDEBA 		 6
CEABD 		 4
CEADB 		 3
CEBAD 		 5
CEBDA 		 7
CEDAB 		 5
CEDBA 		 10
DABCE 		 7
DABEC 		 5
DACBE 		 6
DACEB 		 7
DAEBC 		 3
DAECB 		 5
DBACE 		 5
DBAEC 		 3
DBCAE 		 9
DBCEA 		 7
DBEAC 		 5
DBECA 		 4
DCABE 		 5
DCAEB 		 3
DCBAE 		 5
DCBEA 		 2
DCEAB 		 4
DCEBA 		 5
DEABC 		 4
DEACB 		 4
DE

Now that we have a list of keys and their likelihood of being correct, we can display the candidate plain texts that are most likely to be correct.

In [173]:
text_list = [[] for _ in range(max_words + 1)]
for likely_key in keys:
    count = keys[likely_key]
    text_list[count].append(decode(transpositiondecipher(C, likely_key)))

print('Most likely strings -')
for text in text_list[max_words]:
    print(f'{text}')

Most likely strings -
ENEMYMIGHTATTACKTONIGHTSTAYALERT


Hence, we get a string/list of strings that are most likely to be the correct plain text.
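
The whole attack can also be condensed into a short, self-contained sketch. It assumes M = 5, the cipher text from the run above, and a made-up six-word dictionary in place of the downloaded list. The function names are hypothetical, and candidates are ranked with `max` rather than with the bucket list used above.

```python
from itertools import permutations

M = 5  # block length, as in the notebook

def decipher(cipher_text, key):
    """Undo the block shuffle for a key written as letters, e.g. 'CEDBA'."""
    out = []
    for i in range(0, len(cipher_text), M):
        block = cipher_text[i:i + M]
        plain = [''] * M
        for j, c in enumerate(key):
            plain[ord(c) - ord('A')] = block[j]
        out.append(''.join(plain))
    return ''.join(out).rstrip('Z')

def score(text, word_set):
    """Count substrings of text (length >= 2) that are dictionary words."""
    return sum(text[i:j] in word_set
               for i in range(len(text))
               for j in range(i + 2, len(text) + 1))

# hypothetical mini dictionary standing in for the 3000-word list
word_set = {'ENEMY', 'MIGHT', 'ATTACK', 'TONIGHT', 'STAY', 'ALERT'}
C = 'EYMNEGTHIMTCATAOINTKTTSHGAELYAZZZTR'  # cipher text from the run above

candidates = [''.join(p) for p in permutations('ABCDE')]
best_key = max(candidates, key=lambda k: score(decipher(C, k), word_set))
print(best_key)               # CEDBA
print(decipher(C, best_key))  # ENEMYMIGHTATTACKTONIGHTSTAYALERT
```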