#### Challenge 51: Compression Ratio Side-Channel Attacks

[Back to Index](CryptoPalsWalkthroughs_Cobb.ipynb)

In [1]:
from Crypto.Random import random
import cryptopals as cp
import zlib

<div class="alert alert-block alert-info">   

Internet traffic is often compressed to save bandwidth. Until recently, this included HTTPS headers, and it still includes the contents of responses.

Why does that matter?

Well, if you're an attacker with:

1. Partial plaintext knowledge and
2. Partial plaintext control and
3. Access to a compression oracle

You've got a pretty good chance to recover any additional unknown plaintext.

What's a compression oracle? You give it some input and it tells you how well the full message compresses, i.e. the length of the resultant output.

This is somewhat similar to the timing attacks we did way back in set 4 in that we're taking advantage of incidental side channels rather than attacking the cryptographic mechanisms themselves.

Scenario: you are running a MITM attack with an eye towards stealing secure session cookies. You've injected malicious content allowing you to spawn arbitrary requests and observe them in flight. (The particulars aren't terribly important, just roll with it.)
</div>
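
To see the leak in isolation before any encryption is involved, here is a minimal sketch using only `zlib`. The names `SECRET_HEADER` and `compressed_length` are mine, not part of the challenge, and the two guesses are picked by hand to make the effect obvious.

```python
import zlib

# Hypothetical request fragment containing a secret the attacker cannot see.
SECRET_HEADER = b"Cookie: sessionid=TmV2ZXIgcmV2ZWFsIHRoZSBXdS1UYW5nIFNlY3JldCE=\n"

def compressed_length(attacker_text: bytes) -> int:
    """Length of the compressed message: secret header plus attacker-controlled text."""
    return len(zlib.compress(SECRET_HEADER + attacker_text))

# Both guesses have the same length, but only the first repeats bytes
# that already appear in the secret header.
matching_guess = b"sessionid=TmV2ZXIgcmV2ZWFs"
wrong_guess    = b"sessionid=Qx7pLm93KdzRt0Ua"

print(compressed_length(matching_guess), compressed_length(wrong_guess))
```

The matching guess can be encoded as a single back-reference into the header, so it should come out noticeably shorter. With a one-character difference the gap is much smaller, which is where the difficulty in this challenge comes from.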

    
<div class="alert alert-block alert-info">   
    
So! Write this oracle:

`oracle(P) -> length(encrypt(compress(format_request(P))))`

Format the request like this:

```
POST / HTTP/1.1
Host: hapless.com
Cookie: sessionid=TmV2ZXIgcmV2ZWFsIHRoZSBXdS1UYW5nIFNlY3JldCE=
Content-Length: ((len(P)))
((P))
```
    
(Pretend you can't see that session id. You're the attacker.)

Compress using zlib or whatever.

Encryption... is actually kind of irrelevant for our purposes, but be a sport. Just use some stream cipher. Dealer's choice. Random key/IV on every call to the oracle.

And then just return the length in bytes.
    
</div>

In [2]:
TRUE_SESSION_ID = 'TmV2ZXIgcmV2ZWFsIHRoZSBXdS1UYW5nIFNlY3JldCE='

In [3]:
def challenge51_oracle(P):
    
    key = random.Random.get_random_bytes(32)
    IV  = random.Random.get_random_bytes(8)
    
    request = 'POST / HTTP/1.1\n' \
              'Host: hapless.com\n' \
              'Cookie: sessionid=TmV2ZXIgcmV2ZWFsIHRoZSBXdS1UYW5nIFNlY3JldCE=\n' \
              'Content-Length: ((' + str(len(P)) + '))\n' \
              '((' + P + '))'
    
    #print(request)
    c_request = zlib.compress(request.encode())
    e_request = cp.AESEncrypt(c_request, key, 'CTR', IV)
    
    return(len(e_request))

<div class="alert alert-block alert-info">   
    
Now, the idea here is to leak information using the compression library. A payload of `sessionid=T` should compress just a little bit better than, say, `sessionid=S`.

There is one complicating factor. The DEFLATE algorithm operates in terms of individual bits, but the final message length will be in bytes. Even if you do find a better compression, the difference may not cross a byte boundary. So that's a problem.

You may also get some incidental false positives.

But don't worry! I have full confidence in you.

Use the compression oracle to recover the session id.

I'll wait.
    
</div>
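
The byte-boundary problem is just rounding, and the false-positive problem suggests only trusting a guess when it is the single best one. A small sketch of both ideas follows; `bytes_needed` and `unique_minimum` are illustrative helpers of my own, and the bit counts are made-up numbers rather than measurements.

```python
def bytes_needed(bits: int) -> int:
    """Number of whole bytes needed to hold a bit stream of the given length."""
    return (bits + 7) // 8

def unique_minimum(scores: dict):
    """Return the single lowest-scoring candidate, or None if the minimum is tied."""
    lowest = min(scores.values())
    winners = [candidate for candidate, score in scores.items() if score == lowest]
    return winners[0] if len(winners) == 1 else None

# A 5-bit saving is invisible when both streams round up to the same byte count ...
print(bytes_needed(806), bytes_needed(801))
# ... while a 1-bit saving shows up when it happens to cross a byte boundary.
print(bytes_needed(801), bytes_needed(800))

# A tie at the minimum means the oracle has not told us anything yet.
print(unique_minimum({'T': 100, 'S': 101}), unique_minimum({'T': 100, 'S': 100}))
```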

Some Resources:

1. IACR paper: [Compression and Information Leakage of Plaintext](https://iacr.org/archive/fse2002/23650264/23650264.pdf)   
2. Wikipedia article on [CRIME Attack](https://en.wikipedia.org/wiki/CRIME)     
3. Wikipedia article on [BREACH Attack](https://en.wikipedia.org/wiki/BREACH)   
4. Blackhat [slides on BREACH Attack](https://media.blackhat.com/us-13/US-13-Prado-SSL-Gone-in-30-seconds-A-BREACH-beyond-CRIME-Slides.pdf)   
5. [Paper on BREACH Attack](http://breachattack.com/resources/BREACH%20-%20SSL,%20gone%20in%2030%20seconds.pdf)   
6. [Thomas Pornin Blog Post](https://security.stackexchange.com/questions/19911/crime-how-to-beat-the-beast-successor/19914#19914) that predated the CRIME presentation   
7. [POC Code](https://gist.github.com/stamparm/3698401) by xorninja

This is going to be an _adaptive chosen-plaintext attack_: each query to the oracle depends on the characters recovered so far, and the side-channel information leaked by the compressed length is used to infer the unknown session id.

We know:

- The format of the request and all of its contents, with the exception of the `sessionid` value.
- The session id is a 44-character base64-encoded string.
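
The adaptive loop implied by these two facts can be sketched against an idealized oracle. Everything here is an assumption made for illustration: `make_ideal_oracle` is a hypothetical, noise-free stand-in that rewards each correctly guessed leading character by exactly one unit, which the real compression oracle does not do.

```python
ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/='

def make_ideal_oracle(secret: str):
    """Hypothetical noise-free oracle: every matched leading character saves one unit."""
    def oracle(guess: str) -> int:
        matched = 0
        for s, g in zip(secret, guess):
            if s != g:
                break
            matched += 1
        return 100 - matched
    return oracle

def recover(oracle, alphabet: str, length: int) -> str:
    """Greedy adaptive attack: extend the known prefix by the best-scoring character."""
    known = ''
    for _ in range(length):
        scores = {c: oracle(known + c) for c in alphabet}
        known += min(scores, key=scores.get)
    return known

secret = 'TmV2ZXIg'   # a short stand-in, not the real cookie
print(recover(make_ideal_oracle(secret), ALPHABET, len(secret)))   # prints TmV2ZXIg
```

Against the real oracle the scores are quantized to bytes and occasionally misleading, so the attempts below spend most of their effort on breaking ties rather than on the loop itself.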



In [96]:
# My first attempt -- assumes no false positives.
# Guess two characters at a time and keep the pair with the shortest ciphertext.
base_64_chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+/='
guess = ''

for chunk in range(22):                 # 44 characters, recovered two at a time
    scores = {}
    for char1 in base_64_chars:
        print('.', end='')              # progress indicator
        for char2 in base_64_chars:
            this_guess = char1 + char2
            # The '~' filler keeps the payload length (and so Content-Length)
            # the same for every chunk.
            scores[this_guess] = challenge51_oracle('Cookie: sessionid=' + guess + this_guess + '~~'*(22-chunk))
    guess += min(scores, key=scores.get)
    print('Guess so far: ', guess)
    
print()
print(f"My Guess: {guess}")
print(f"Actual:   {TRUE_SESSION_ID}")
      
assert guess == TRUE_SESSION_ID
      
print("Congrats -- you own my sessionid")

.................................................................Guess so far:  Tm
.................................................................Guess so far:  TmV2
.................................................................Guess so far:  TmV2ZX
.................................................................Guess so far:  TmV2ZXIg
.................................................................Guess so far:  TmV2ZXIgcm
.................................................................Guess so far:  TmV2ZXIgcmV2
.................................................................Guess so far:  TmV2ZXIgcmV2ZW
.................................................................Guess so far:  TmV2ZXIgcmV2ZWFs
.................................................................Guess so far:  TmV2ZXIgcmV2ZWFsIH
.................................................................Guess so far:  TmV2ZXIgcmV2ZWFsIHRo
.................................................................Guess so far:  

---

<div class="alert alert-block alert-info">  
Got it? Great.

Now swap out your stream cipher for CBC and do it again.

</div>
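
CBC makes the byte-boundary problem worse: the ciphertext length now moves in whole blocks. Here is a sketch of the arithmetic, assuming 16-byte AES blocks and PKCS#7 padding (I have not checked that this is exactly what `cp.AESEncrypt` does, so treat that as an assumption). `cbc_length` and `filler_to_boundary` are illustrative names.

```python
BLOCK_SIZE = 16  # AES block size in bytes

def cbc_length(compressed_length: int) -> int:
    """Ciphertext length under CBC with PKCS#7 padding (IV not counted)."""
    return (compressed_length // BLOCK_SIZE + 1) * BLOCK_SIZE

def filler_to_boundary(compressed_length: int) -> int:
    """Extra bytes needed to bring the compressed length up to a multiple of BLOCK_SIZE."""
    return (-compressed_length) % BLOCK_SIZE

# Anything from 96 to 111 compressed bytes gives the same 112-byte ciphertext,
# so a one-byte saving is usually hidden.
print(cbc_length(100), cbc_length(111))

# Sitting exactly on a block boundary, a one-byte saving drops a whole block.
aligned = 100 + filler_to_boundary(100)
print(cbc_length(aligned), cbc_length(aligned - 1))
```

In practice the filler has to be incompressible, and even then one added byte does not always add exactly one byte to the compressed output, so the alignment has to be found by trial and error.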

In [5]:
from numpy.random import randint

def challenge51_oracle_CBC(P):
    # Same oracle as before, with AES-CBC in place of the stream cipher.
    # CBC ciphertext is a whole number of 16-byte blocks, so the returned
    # length should only change in steps of 16.
    
    key = random.Random.get_random_bytes(32)
    IV  = random.Random.get_random_bytes(16)

    request = 'POST / HTTP/1.1\n' \
              'Host: hapless.com\n' \
              'Cookie: sessionid=TmV2ZXIgcmV2ZWFsIHRoZSBXdS1UYW5nIFNlY3JldCE=\n' \
              'Content-Length: ((' + str(len(P)) + '))\n' \
              '((' + P + '))'
    
    c_request = zlib.compress(request.encode())
    e_request = cp.AESEncrypt(c_request, key, 'CBC', IV, True)
    
    return len(e_request)

TRUE_SESSION_ID = 'TmV2ZXIgcmV2ZWFsIHRoZSBXdS1UYW5nIFNlY3JldCE='

In [87]:
# Trying the algorithm described in the CRIME presentation

KNOWN_PRE_TEXT = 'POST / HTTP/1.1\n' \
                 'Host: hapless.com\n' \
                 'Cookie: sessionid='

guess = ''
for byte_idx in range(44):
    
    junk_length = 32700
    char_found = False

    while not char_found:
        
        good_guesses = []
        print('.', end='')
        for guess_char in base_64_chars:
            print('.', end='')
            this_guess = KNOWN_PRE_TEXT + guess + guess_char
            # Random ASCII junk, built as a str so it can be concatenated with the guess
            junk = ''.join(chr(int(c)) for c in randint(36, 128, junk_length))

            # Compare the guess placed after the junk with the guess placed before it
            guess1 = junk + this_guess
            guess1_len = challenge51_oracle_CBC(guess1)

            guess2 = this_guess + junk
            guess2_len = challenge51_oracle_CBC(guess2)

            if guess1_len != guess2_len:
                good_guesses.append(guess_char)
                
        # Accept a character only when exactly one candidate stands out
        if len(good_guesses) == 1:
            char_found = True
            guess += good_guesses[0]
            print(guess)
        else:
            junk_length += 1
            
        if junk_length > 2**16:
            raise RuntimeError('No unique candidate found before junk_length exceeded 2**16')

........................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................

KeyboardInterrupt: 

In [14]:
# Another attempt against the CBC oracle: repeat each candidate character and
# grow the padding until one candidate gives a uniquely short ciphertext.
base_64_chars = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789+/='

guess_so_far = ''

KNOWN_PRE_TEXT = 'POST / HTTP/1.1\n' \
                 'Host: hapless.com\n' \
                 'Cookie: sessionid='

KNOWN_POST_TEXT = '\nContent-Length: (('

for byte_idx in range(44):

    scores = {}
    good_guess_found = False
    pad_length = 0
    while not good_guess_found:
            
        for guess_char in base_64_chars:
    
            padding = guess_char * pad_length
            print('.', end='')
            n_bytes_left = 44 - byte_idx 
            
            # Mimic the real request: known prefix, recovered characters, the
            # candidate repeated over the unknown positions, then filler and
            # the text that follows the cookie.
            P = KNOWN_PRE_TEXT + guess_so_far + guess_char*n_bytes_left + \
                '~'*n_bytes_left + KNOWN_POST_TEXT + padding
            
            scores[guess_char] = challenge51_oracle_CBC(P)
        
        best_guess_candidate = min(scores, key=scores.get)
        average_score = sum(scores.values()) // len(scores)
        
        # Accept only a candidate that is below average and is the unique minimum
        if (scores[best_guess_candidate] < average_score) and \
           (list(scores.values()).count(scores[best_guess_candidate]) == 1):
            good_guess_found = True
        else:
            pad_length += 1
            if pad_length > 500:
                raise RuntimeError('No unique candidate found with pad_length <= 500')
            print(f"Pad Length: {pad_length}")
        
    guess_so_far += best_guess_candidate
    print(guess_so_far)

print()
print(f"My Guess: {guess_so_far}")
print(f"Actual:   {TRUE_SESSION_ID}")
      
assert guess_so_far == TRUE_SESSION_ID
      
print("Congrats -- you own my sessionid")

.................................................................Pad Length: 1
.................................................................Pad Length: 2
.................................................................Pad Length: 3
.................................................................Pad Length: 4
.................................................................Pad Length: 5
.................................................................Pad Length: 6
.................................................................Pad Length: 7
.................................................................Pad Length: 8
.................................................................Pad Length: 9
.................................................................Pad Length: 10
.................................................................Pad Length: 11
.................................................................Pad Length: 12
.................................................

AssertionError: 
