Note: most links below are just links to files within this repo for your convenience.
Set 1: Basics
Trivial introduction to hex parsing to get you going. I implemented b64decode out of curiosity.
Introduction to XOR between strings. Implementation in xor.py.
Ahh. Crypto! To break it, loop through all 256 possible bytes to find which of the keys produces the decrypted text that looks the most like English.
frequency.py has code to score how much a string looks like English (testing how close to a certain distribution our data is).
xor.py has code to encrypt/decrypt using a single-byte XOR cipher.
Very similar to challenge #3. Break all ciphertexts, pick the one with the best "English score".
Simply cycle the key when XOR-ing with the plaintext.
Interesting! Breaking it amounts to breaking a single-byte XOR cipher for the 1st character of every repeated key, then for the 2nd, the 3rd, etc.
Guessing the keysize is done by picking the keysize that minimizes the normalized hamming distance (
distance / keysize) between "blocks" of the repeated key. This is because we can expect 2 different blocks encrypted with the right keysize to have similar bit patterns (e.g. matching characters), so minimizing the normalized hamming distance gives us the keysize that produces the most similar blocks. We need to normalize because e.g. a keysize of 4 would have 2 times as many bits as a keysize of 2.
Intro to using the pycrypto API. aes.py has the code that handles AES encryption/decryption.
ECB mode encrypts the blocks independently. This means that two identical blocks of plaintext at different positions will produce the same ciphertext block. All that we need to do is find the ciphertext that has the most (or any, really) duplicate ciphertext blocks, which is very unlikely with 16 bytes of otherwise random data.
Set 2: Block crypto
We pad the plaintext so that it has a length that is a multiple of the blocksize (e.g. 16). If it already is, we add a full block of padding. The padding byte to use is equal to the length of the padding.
pad('0123456789') = '0123456789' + '666666' pad('') = '\x10' * 16
Wikipedia's image on the matter is enough of a description to implement this. Basically, every encrypted block is XORed with the previous block (or the IV if this is the first block). The result is that 2 identical plaintext blocks will no longer automatically encrypt to the same ciphertext block (compared to ECB).
When using ECB, two identical plaintext blocks will encrypt to the same ciphertext block.
Therefore, a block that contains duplicate plaintext blocks will contain duplicate ciphertext blocks once encrypted.
Our oracle checks if the ciphertext contains duplicate blocks. If it does, we consider the ciphertext to be encrypted using ECB. Otherwise, we consider that it used CBC.
Detecting the blocksize is relatively easy. We start by noting
len(ciphertext). We continue adding prefix bytes until we notice a change in
len(ciphertext). When that happens, it means that we have crossed a block boundary and we know the blocksize to be
len(after) - len(start).
Using the ECB property that we exploited earlier (identical plaintext blocks encrypt to identical ciphertext blocks), we can bruteforce individual bytes at the end of blocks.
For example, assume that we want to bruteforce the (unknown) plaintext:
hello this is a test.
At first, we want to isolate the first character (
h, but we don't know that yet) at the end of a block, through a prefix of
As. Like this:
We get the encrypted result of this (call this
C0). We can try all possible bytes that could follow our prefix, like the following:
AAAAAAAAAAAAAAAa AAAAAAAAAAAAAAAb AAAAAAAAAAAAAAAc ...
Eventually, we will get an encrypted block that will match
C0, and thus we will know the first byte of our plaintext.
We can bruteforce the second byte (
e) similarly. We start by getting
Then by trying all possible bytes:
AAAAAAAAAAAAAAha AAAAAAAAAAAAAAhb AAAAAAAAAAAAAAhc ...
Until we find a block that encrypts to
C1. And so on, for all bytes in the first block.
For the following blocks, the idea is the same, but a little more thought must be put into where we get our
Ciand what should be put as a prefix.
Ciis merely the index of the block we already are at with the next byte that we want to bruteforce. The reason is quite simple: with our padding, we will only be pushing that byte at the end of its current block, not produce a next block.
The padding only has to be enough to put the next bruteforced byte at the end of a block. When bruteforcing, however, the bytes before the tried byte must be the last 15 (blocksize - 1) known bytes, to fit
This is all clearer through an example:
plaintext: hello this is a (...unknown....) |---block 0----||---block 1----| bruteforcing: t (unknown yet) We will pad with As: padded: AAAAAAAAAAAAAAAhello this is a t(...unknown....) |---block 0----||---block 1----||---block 2----| Ci = block 1 Bruteforce tries: ello this is a a ello this is a b ello this is a c ...
We repeat the steps up until the point where we reach the last block. Then, we will eventually hit the padding.
If the padding happens to be 1 byte, we will successfully find it (this is just like bruteforcing a normal byte that just happens to be
If the padding is anything else, we will first successfully "bruteforce" a
0x01in the plaintext (because we will be padding the plaintext until it is only missing a byte, so it will be padded with
0x01). Then, we will try to bruteforce the next byte, but fail. The reason is that the padding will now be
[0x02, 0x02], while we are trying
[0x01, 0x??]. Once that happens, we know that we have successfully bruteforced the whole plaintext. We can remove our undesirable bruteforced
0x01padding and enjoy reading our plaintext.
Using the same exploitable ECB property as before, we craft multiple messages and pick-and-choose the parts that we want.
email=aaaaaaaaaa (A0) adminPPPPPPPPPPP (A1) (P is the padding that will be needed as a final block) &uid=10&role=use (A2) rPPPPPPPPPPPPPPP (A3)
email=aaaaaaaaaa (B0) aaa&uid=10&role= (B1) userPPPPPPPPPPPP (B2)
By crafting the ciphertext
B0 + B1 + A1, we get:
email=aaaaaaaaaa (B0) aaa&uid=10&role= (B1) adminPPPPPPPPPPP (A1)
Which grants us admin access to log in.
This attack is the same as the challenge #12, but with some required initial work, offsets and padding to apply.
We first need to find out how long the prefix is. We do this by generating 2 blocks of fixed data (e.g.
[0xA] * 32) and gradually increasing the size of the fixed data until we find 2 neighbour duplicate blocks in the ciphertext. This indicates that we have fully padded the last block of the prefix and that we have produced two blocks of our own input after that. To make sure that we don't just have identical blocks because the prefix happened to end with our fixed value (therefore fooling us into thinking that we have padded 1 more byte than we really have), we can try with 2 different fixed values, e.g.
 * 32and
 * 32.
Then, one can use the index where the duplicate blocks begin to find where the first block after the prefix starts. With that information, we can find the amount of padding that was required to pad the prefix to a multiple of blocksize through
len(fixed_data) - 2 * blocksize. The length of the prefix is then
index of first of the duplicates - padding length.
With the length of the prefix, we just use our algorithm from challenge #12, but prefixing our input with some padding to pad the prefix to a blocksize multiple. We also need to offset any index in the produced ciphertext by the amount of blocks in the prefix.
We get the last byte of the plaintext as
nand make sure that the last
nbytes of the plaintext are equal to
n. If it is not the case, raise an exception.
Start out by encrypting a normal token for a block of 16 bytes. This will be where we will inject our crafted block. Call this encrypted block
We want to inject
target = ";admin=true;abc=".
We know that the plaintext block following our input is:
next_plain = ";comment2=%20lik".
When decrypting with CBC, the following is done:
next_plain = next_block_pre_xor ^ current_block
We can calculate
next_block_pre_xor = next_plain ^ current_block.
next_block_pre_xor ^ crafted_blockto yield
target, so we choose:
crafted_block = target ^ next_block_pre_xor.
Then, all we need is to swap
crafted_blockto get admin access. The decryption of
current_blockwill yield scrambled plaintext, but it is not a problem since it only modifies
Set 3: Block & stream crypto
If we have an oracle that tells us if the padding on a ciphertext is valid or not, we are able to recover the plaintext.
Keep in mind that decryption happens like this:
block_pre_xor xxxxxxxxxxxxxxxx ^ prev_block pppppppppppppppp = plaintext tttttttttttttttt
We can play with the
prev_blockto search for a byte that results in valid padding:
block_pre_xor xxxxxxxxxxxxxxx? ^ prev_block pppppppppppppppB = (we bruteforce 'B') plaintext ttttttttttttttt1
Once we have the
prev_blockbyte that gives valid padding, we can deduce
Once we know the last
Nbytes of the
pre_xor, we can use that information to craft the ending of the padding in just the right way to bruteforce the
N+1th byte (starting from the right).
With all the
pre_xorbytes recovered from a block, we can XOR them with out real
prev_blockto find the plaintext.
However, we must keep the following tricky thing in mind:
The last block (and other plaintext blocks, if we're unlucky) can produce two valid
pre_xorbytes, instead of just one: E.g. block
...55555We'll find these 2 valid
- the one that produces
- the one that produces
When this happens, we can just alter the 2nd last byte to produce junk, so that only the padding ending with a
1will pass: E.g. changing
- the one that produces
There isn't much to say here... We generate a keystream with:
aes(key, nonce + counter)
counterare 64-bits integers encoded in little-endian and
counteris the amount of blocks generated so far.
We reuse our
english_testfunction and pick the key that has the best score. In some cases, we pick a key that is not as good as some others because we notice (by analysis of the plaintexts obtained) that the key is not the right one.
As mentioned by the challenge description, this is a bit manual indeed.
Very similar to the previous challenge, except that we can heavily reuse code that we've written before.
This task surprisingly took a little longer than just copying Wikipedia's pseudocode. I spent some time wondering why Python's random would yield different numbers (and state!) from me, so I investigated.
I compared with C++'s std::mt19937, which yielded the same results as me. I couldn't find posts complaining about this specifically so I took a look at CPython's source. It turns out that the underlying C module seeds in a different way from what we see on Wikipedia. It assumes that the seed could be greater than 32-bits and just uses all of the bits of the seed in a different procedure. The result: it ends up with a different state from us.
Just to make sure I looked at how
numpy's random and it does generate the same state as us when seeding. Therefore, the tests for this one use values that I extracted with C++'s implementation.
Pretty straightforward. Bruteforce all sensible seeds (a couple of seconds in the past) and pick the one that matches.
The state is directly coming from the next output, so it is really easy to recover the next state from a given output.
Essentially, we need to look at
624transformations of the state to fully recover the MT state:
state = [untemper(output) for output in outputs[-624:]]
untemperneeds to reverse the right and left shift operations.
Let's look at a case with 8-bits:
y = 10101010
If we're shifting right by 3 bits, we'll do the following:
10101010 y 00010101 y >> 3 10111111 y ^ y >> 3
We notice that the first 3 bits of the result will match the first 3 bits of the original
y. Then, those original 3 bits will be used (once shifted) to xor with the original
y. By using the known first 3 bits, we can recover the next 3 bits of
yby doing this:
10111111 y ^ y >> 3 00010100 known_bits >> 3 10101011 (y ^ y >> 3) ^ (known_bits >> 3)
We can then recover the whole
ythis way. The left shift is very similar, but we
&with a constant before xoring.
This challenge was very similar to challenge 22. Basically, we bruteforce the seed space to find one that gives an expected decryption/token.
Set 4: Stream crypto and randomness
Here we're doing a terrible mistake: we are reusing our keystream on a different plaintext. Solution? We provide
00000000...and it gets directly xored with the keystream (yielding the keystream). We xor that with the original ciphertext and we get the secret back!
This one is really straightforward. We know the plaintext, so we can find the keystream and use it to encrypt our token the way that we want.
When we split the obtained ciphertext in 3 blocks:
C1, C2, C3
We can then produce a ciphertext in the following way:
C1, 0, C1
and capture the produced plaintext from the error message.
This means that
P1is the result of
KEY ^ P3(since
P3is unchanged by xoring with
0). We can recover the key through:
P1 ^ P3 = (P3 ^ KEY) ^ P3 = KEY.
Not much to do apart from adding a python sha1 implementation here...