### Challenge 19: Break fixed-nonce CTR mode using substitutions

[Back to Index](CryptoPalsWalkthroughs_Cobb.ipynb)

In [1]:
# %% Initialize

import cryptopals as cp
import base64 as b64
from numpy.random import randint
import numpy as np

<div class="alert alert-block alert-info">

Take your CTR encrypt/decrypt function and fix its nonce value to 0. Generate a random AES key.

</div>

In [2]:
key = bytes(list(randint(0, 256, 16)))
nonce = [0]*8
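
As a reminder of why a fixed nonce matters, here is a minimal sketch of the counter-block layout described in the Cryptopals CTR challenge: a 64-bit little-endian nonce followed by a 64-bit little-endian block counter. The helper name `ctr_counter_block` is hypothetical and is not part of the `cryptopals` module used in this notebook; that `cp.AESEncrypt` builds its blocks exactly this way is an assumption.

```python
import struct

def ctr_counter_block(nonce: int, block_index: int) -> bytes:
    """Hypothetical helper: 16-byte CTR input block (LE nonce || LE counter)."""
    return struct.pack('<QQ', nonce, block_index)

# With the nonce fixed at 0, every message starts from the same counter
# blocks, so encrypting them under one key gives the same keystream each time.
blocks_msg_a = [ctr_counter_block(0, i) for i in range(3)]
blocks_msg_b = [ctr_counter_block(0, i) for i in range(3)]
print(blocks_msg_a == blocks_msg_b)
print(blocks_msg_a[1].hex())
```

Since the inputs to AES are identical across messages, so are its outputs, which is what the attack below relies on.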

<div class="alert alert-block alert-info">
    
In successive encryptions (not in one big running CTR stream), encrypt each line of the base64 decodes of the following, producing multiple independent ciphertexts:

```
SSBoYXZlIG1ldCB0aGVtIGF0IGNsb3NlIG9mIGRheQ==
Q29taW5nIHdpdGggdml2aWQgZmFjZXM=
RnJvbSBjb3VudGVyIG9yIGRlc2sgYW1vbmcgZ3JleQ==
RWlnaHRlZW50aC1jZW50dXJ5IGhvdXNlcy4=
SSBoYXZlIHBhc3NlZCB3aXRoIGEgbm9kIG9mIHRoZSBoZWFk
T3IgcG9saXRlIG1lYW5pbmdsZXNzIHdvcmRzLA==
T3IgaGF2ZSBsaW5nZXJlZCBhd2hpbGUgYW5kIHNhaWQ=
UG9saXRlIG1lYW5pbmdsZXNzIHdvcmRzLA==
QW5kIHRob3VnaHQgYmVmb3JlIEkgaGFkIGRvbmU=
T2YgYSBtb2NraW5nIHRhbGUgb3IgYSBnaWJl
VG8gcGxlYXNlIGEgY29tcGFuaW9u
QXJvdW5kIHRoZSBmaXJlIGF0IHRoZSBjbHViLA==
QmVpbmcgY2VydGFpbiB0aGF0IHRoZXkgYW5kIEk=
QnV0IGxpdmVkIHdoZXJlIG1vdGxleSBpcyB3b3JuOg==
QWxsIGNoYW5nZWQsIGNoYW5nZWQgdXR0ZXJseTo=
QSB0ZXJyaWJsZSBiZWF1dHkgaXMgYm9ybi4=
VGhhdCB3b21hbidzIGRheXMgd2VyZSBzcGVudA==
SW4gaWdub3JhbnQgZ29vZCB3aWxsLA==
SGVyIG5pZ2h0cyBpbiBhcmd1bWVudA==
VW50aWwgaGVyIHZvaWNlIGdyZXcgc2hyaWxsLg==
V2hhdCB2b2ljZSBtb3JlIHN3ZWV0IHRoYW4gaGVycw==
V2hlbiB5b3VuZyBhbmQgYmVhdXRpZnVsLA==
U2hlIHJvZGUgdG8gaGFycmllcnM/
VGhpcyBtYW4gaGFkIGtlcHQgYSBzY2hvb2w=
QW5kIHJvZGUgb3VyIHdpbmdlZCBob3JzZS4=
VGhpcyBvdGhlciBoaXMgaGVscGVyIGFuZCBmcmllbmQ=
V2FzIGNvbWluZyBpbnRvIGhpcyBmb3JjZTs=
SGUgbWlnaHQgaGF2ZSB3b24gZmFtZSBpbiB0aGUgZW5kLA==
U28gc2Vuc2l0aXZlIGhpcyBuYXR1cmUgc2VlbWVkLA==
U28gZGFyaW5nIGFuZCBzd2VldCBoaXMgdGhvdWdodC4=
VGhpcyBvdGhlciBtYW4gSSBoYWQgZHJlYW1lZA==
QSBkcnVua2VuLCB2YWluLWdsb3Jpb3VzIGxvdXQu
SGUgaGFkIGRvbmUgbW9zdCBiaXR0ZXIgd3Jvbmc=
VG8gc29tZSB3aG8gYXJlIG5lYXIgbXkgaGVhcnQs
WWV0IEkgbnVtYmVyIGhpbSBpbiB0aGUgc29uZzs=
SGUsIHRvbywgaGFzIHJlc2lnbmVkIGhpcyBwYXJ0
SW4gdGhlIGNhc3VhbCBjb21lZHk7
SGUsIHRvbywgaGFzIGJlZW4gY2hhbmdlZCBpbiBoaXMgdHVybiw=
VHJhbnNmb3JtZWQgdXR0ZXJseTo=
QSB0ZXJyaWJsZSBiZWF1dHkgaXMgYm9ybi4=
```
    
(This should produce 40 short CTR-encrypted ciphertexts.)

</div>

In [3]:
s_list = ['SSBoYXZlIG1ldCB0aGVtIGF0IGNsb3NlIG9mIGRheQ==',
          'Q29taW5nIHdpdGggdml2aWQgZmFjZXM=',
          'RnJvbSBjb3VudGVyIG9yIGRlc2sgYW1vbmcgZ3JleQ==',
          'RWlnaHRlZW50aC1jZW50dXJ5IGhvdXNlcy4=',
          'SSBoYXZlIHBhc3NlZCB3aXRoIGEgbm9kIG9mIHRoZSBoZWFk',
          'T3IgcG9saXRlIG1lYW5pbmdsZXNzIHdvcmRzLA==',
          'T3IgaGF2ZSBsaW5nZXJlZCBhd2hpbGUgYW5kIHNhaWQ=',
          'UG9saXRlIG1lYW5pbmdsZXNzIHdvcmRzLA==',
          'QW5kIHRob3VnaHQgYmVmb3JlIEkgaGFkIGRvbmU=',
          'T2YgYSBtb2NraW5nIHRhbGUgb3IgYSBnaWJl',
          'VG8gcGxlYXNlIGEgY29tcGFuaW9u',
          'QXJvdW5kIHRoZSBmaXJlIGF0IHRoZSBjbHViLA==',
          'QmVpbmcgY2VydGFpbiB0aGF0IHRoZXkgYW5kIEk=',
          'QnV0IGxpdmVkIHdoZXJlIG1vdGxleSBpcyB3b3JuOg==',
          'QWxsIGNoYW5nZWQsIGNoYW5nZWQgdXR0ZXJseTo=',
          'QSB0ZXJyaWJsZSBiZWF1dHkgaXMgYm9ybi4=',
          'VGhhdCB3b21hbidzIGRheXMgd2VyZSBzcGVudA==',
          'SW4gaWdub3JhbnQgZ29vZCB3aWxsLA==',
          'SGVyIG5pZ2h0cyBpbiBhcmd1bWVudA==',
          'VW50aWwgaGVyIHZvaWNlIGdyZXcgc2hyaWxsLg==',
          'V2hhdCB2b2ljZSBtb3JlIHN3ZWV0IHRoYW4gaGVycw==',
          'V2hlbiB5b3VuZyBhbmQgYmVhdXRpZnVsLA==',
          'U2hlIHJvZGUgdG8gaGFycmllcnM/',
          'VGhpcyBtYW4gaGFkIGtlcHQgYSBzY2hvb2w=',
          'QW5kIHJvZGUgb3VyIHdpbmdlZCBob3JzZS4=',
          'VGhpcyBvdGhlciBoaXMgaGVscGVyIGFuZCBmcmllbmQ=',
          'V2FzIGNvbWluZyBpbnRvIGhpcyBmb3JjZTs=',
          'SGUgbWlnaHQgaGF2ZSB3b24gZmFtZSBpbiB0aGUgZW5kLA==',
          'U28gc2Vuc2l0aXZlIGhpcyBuYXR1cmUgc2VlbWVkLA==',
          'U28gZGFyaW5nIGFuZCBzd2VldCBoaXMgdGhvdWdodC4=',
          'VGhpcyBvdGhlciBtYW4gSSBoYWQgZHJlYW1lZA==',
          'QSBkcnVua2VuLCB2YWluLWdsb3Jpb3VzIGxvdXQu',
          'SGUgaGFkIGRvbmUgbW9zdCBiaXR0ZXIgd3Jvbmc=',
          'VG8gc29tZSB3aG8gYXJlIG5lYXIgbXkgaGVhcnQs',
          'WWV0IEkgbnVtYmVyIGhpbSBpbiB0aGUgc29uZzs=',
          'SGUsIHRvbywgaGFzIHJlc2lnbmVkIGhpcyBwYXJ0',
          'SW4gdGhlIGNhc3VhbCBjb21lZHk7',
          'SGUsIHRvbywgaGFzIGJlZW4gY2hhbmdlZCBpbiBoaXMgdHVybiw=',
          'VHJhbnNmb3JtZWQgdXR0ZXJseTo=',
          'QSB0ZXJyaWJsZSBiZWF1dHkgaXMgYm9ybi4=']

# Encrypt each line independently with the same key and the same nonce.
ciphertexts = []
for msg in s_list:
    ciphertexts.append(cp.AESEncrypt(b64.b64decode(msg), key, 'CTR', nonce))

<div class="alert alert-block alert-info">

Because the CTR nonce wasn't randomized for each encryption, each ciphertext has been encrypted against the same keystream. This is very bad.

Understanding that, like most stream ciphers (including RC4, and obviously any block cipher run in CTR mode), the actual "encryption" of a byte of data boils down to a single XOR operation, it should be plain that:

```CIPHERTEXT-BYTE XOR PLAINTEXT-BYTE = KEYSTREAM-BYTE```
<br><br>
And since the keystream is the same for every ciphertext:

```CIPHERTEXT-BYTE XOR KEYSTREAM-BYTE = PLAINTEXT-BYTE``` (i.e., "you don't say!")

Attack this cryptosystem piecemeal: guess letters, use expected English language frequency to validate guesses, catch common English trigrams, and so on.
    
<div class="alert alert-block alert-warning">

### Don't overthink it.
        
Points for automating this, but part of the reason I'm having you do this is that I think this approach is suboptimal.
        
</div>

</div>
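
To make the consequence concrete, here is a small sketch that does not use AES at all. It substitutes a random byte string for the shared keystream (an assumption made only so the example is self-contained) and shows that XORing two ciphertexts cancels the keystream, and that a correct plaintext guess reveals the keystream at those positions. The helper `xor_bytes` is hypothetical.

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

# Stand-in for the AES-CTR keystream; any reused keystream behaves the same way.
keystream = os.urandom(32)

p1 = b'I have met them at close of day'
p2 = b'Coming with vivid faces'
c1 = xor_bytes(p1, keystream)
c2 = xor_bytes(p2, keystream)

# The keystream cancels: c1 XOR c2 == p1 XOR p2.
print(xor_bytes(c1, c2) == xor_bytes(p1, p2))

# A correct plaintext guess exposes the keystream at those positions...
recovered = xor_bytes(c1, p1)
print(recovered == keystream[:len(p1)])

# ...which then decrypts the other ciphertext.
print(xor_bytes(c2, recovered))
```

The attacker never needs the AES key: recovering the keystream is enough to read every message encrypted under it.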

In [4]:
# Characters we expect in English prose; a guess scores a point for each
# ciphertext byte it decrypts into one of these.
english_chars = b'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ .,'
max_len = 40  # number of keystream positions scored; no line is longer
key_scores = np.zeros((max_len, 256))

for ct in ciphertexts:
    for byte_idx in range(min(len(ct), max_len)):
        for key_byte_guess in range(256):
            # Score each keystream-byte guess by how many ciphertexts
            # decrypt to an English character at this position.
            if (ct[byte_idx] ^ key_byte_guess) in english_chars:
                key_scores[byte_idx, key_byte_guess] += 1

# Keep the best-scoring guess for each keystream position.
key_stream = bytearray(max_len)
for ii in range(max_len):
    key_stream[ii] = key_scores[ii, :].argmax()

# Decrypt every ciphertext with the recovered keystream.
for ct in ciphertexts:
    for ii in range(len(ct)):
        pt_byte = key_stream[ii] ^ ct[ii]
        print(f'{chr(pt_byte)}', end='')
    print()

I have met them at close of day
Coming with vivid faces
From counter or desk among grey
Eighteenth-century houses.
I have passed with a nod of the  egy
Or polite meaningless words,
Or have lingered awhile and said
Polite meaningless words,
And thought before I had done
Of a mocking tale or a gibe
To please a companion
Around the fire at the club,
Being certain that they and I
But lived where motley is worn:
All changed, changed utterly:
A terrible beauty is born.
That woman's days were spent
In ignorant good will,
Her nights in argument
Until her voice grew shrill.
What voice more sweet than hers
When young and beautiful,
She rode to harriers?
This man had kept a school
And rode our winged horse.
This other his helper and friend
Was coming into his force;
He might have won fame in the en,,
So sensitive his nature seemed,
So daring and sweet his thought.
This other man I had dreamed
A drunken, vain-glorious lout.
He had done most bitter wrong
To some who are near my heart,
Yet I number 

Well, that was mostly right. Some characters near the ends of the longest lines are wrong because very few ciphertexts are that long, so at those positions the scoring system was making its guess from only a handful of decryptions.
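
One way to repair the tail by hand, in the spirit of the "guess letters" hint, is to guess how a long line ends and derive the keystream bytes from that guess. The sketch below is self-contained: it uses a random stand-in keystream and a deliberately corrupted estimate rather than the notebook's real data, and `patch_keystream` is a hypothetical helper. The guess that the fifth line ends in "head" is an assumption read off the garbled output above.

```python
import os

def patch_keystream(key_stream: bytearray, ciphertext: bytes,
                    guess: bytes, offset: int) -> None:
    """Hypothetical helper: overwrite keystream bytes using a plaintext guess.

    Assumes ciphertext[offset:offset + len(guess)] is the encryption of `guess`.
    """
    for i, g in enumerate(guess):
        key_stream[offset + i] = ciphertext[offset + i] ^ g

# Toy setup: a random stand-in for the true keystream.
true_stream = os.urandom(40)
line = b'I have passed with a nod of the head'
ct = bytes(p ^ k for p, k in zip(line, true_stream))

# Pretend the frequency scoring got the last four positions wrong.
estimate = bytearray(true_stream)
for i in range(len(line) - 4, len(line)):
    estimate[i] ^= 0xFF

# Guess that the line ends in "head" and patch those keystream bytes.
patch_keystream(estimate, ct, b'head', len(line) - 4)
print(bytes(e ^ c for e, c in zip(estimate, ct)))
```

In the notebook, the equivalent call would be something like `patch_keystream(key_stream, ciphertexts[4], b'head', 32)`, assuming the ciphertext elements index as integers as they do in the cell above. Each patched position also corrects every other line long enough to reach it.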

[Back to Index](CryptoPalsWalkthroughs_Cobb.ipynb)