# RHme3 White Box Unboxing qualifier

This example shows how to recover the key from the [whitebox qualifier challenge of Rhme 2017](https://github.com/Riscure/Rhme-2017/tree/master/prequalifications/White%20Box%20Unboxing). The challenge can be completely solved using the SideChannelMarvels framework as described in [this Deadpool writeup](https://github.com/SideChannelMarvels/Deadpool/tree/master/wbs_aes_rhme3_prequal). Here we do it somewhat differently.

We will use the wrapper from Deadpool to trace the whitebox binary with Intel Pin. For key recovery, we will use Jlsca to illustrate the trace pre-processing techniques that it offers. For this toy binary the effect of these techniques is not very pronounced, but it is significant for more serious challenges.

Before computing the correlation, we pre-process the traces to automatically remove samples that are irrelevant for the analysis. Such point-of-interest selection drastically reduces the length of the traces without visual inspection of the trace graph or manual filter configuration. As you can see from the log lines starting with `Reduction`, only about 20 bits per key byte remain. This is what goes into the correlation computation. As a result, the *total* time of the attack (and the amount of human input) is reduced.

A detailed description of these pre-processing techniques is available at https://eprint.iacr.org/2018/095
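
To give a feeling for what such a reduction does, here is a minimal Python sketch. It is not Jlsca's implementation: it is a simplification that assumes the steps named in the `Reduction` log lines (dropping duplicate columns, dropping inverted duplicates, and keeping only samples that depend on one input byte). The function name `reduce_columns` and the toy data are made up for illustration.

```python
def reduce_columns(traces, inputs):
    """Return indices of sample columns worth correlating against.

    traces: list of equal-length lists of 0/1 samples (one list per trace)
    inputs: the targeted input byte of each trace
    Assumption: a column is kept if it is not a (possibly inverted) copy of
    an earlier column, is not constant, and takes the same value whenever
    the input byte is the same.
    """
    seen = set()
    kept = []
    for j in range(len(traces[0])):
        col = tuple(t[j] for t in traces)
        inv = tuple(1 - b for b in col)
        # duplicate and inverted-duplicate columns carry no new information
        if col in seen or inv in seen:
            continue
        seen.add(col)
        # a constant column cannot correlate with anything
        if len(set(col)) == 1:
            continue
        # keep the column only if its value is determined by the input byte
        value_for = {}
        if all(value_for.setdefault(x, b) == b for x, b in zip(inputs, col)):
            kept.append(j)
    return kept


# Toy traceset: 4 traces, 5 one-bit samples each, two distinct input bytes.
toy_inputs = [0, 0, 1, 1]
toy_traces = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [1, 1, 0, 1, 0],
    [1, 1, 0, 1, 1],
]
print(reduce_columns(toy_traces, toy_inputs))  # [0]
```

In the toy data, column 1 is a duplicate of column 0, column 2 is its inverse, column 3 is constant, and column 4 changes while the input stays the same, so only column 0 survives.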

The correlation part of the attack is the same as in [Daredevil](https://github.com/SideChannelMarvels/Daredevil). The output of the pre-processing could be fed to Daredevil. However, the point selection is different for each key byte, so we would need to script 16 separate Daredevil runs with different tracesets. You can do that as an exercise.

In case you do not feel like tracing the binary yourself, tracesets for analysis are available alongside this notebook in [rhme2017-qual-wb-traces.tar.bz2](rhme2017-qual-wb-traces.tar.bz2).

## 0. Tracing the binary

We do it the standard Deadpool way, based on the examples therein. The acquisition script leaves the default filters on the acquired ranges.

```python
#!/usr/bin/env python
import sys
sys.path.insert(0, '../../')
from deadpool_dca import *
def processinput(iblock, blocksize):
    return (None, ['--stdin < <(echo %0*x|xxd -r -p)' % (2*blocksize, iblock)])
def processoutput(output, blocksize):
    return int(''.join([x for x in output.split(' ')]), 16)
T=TracerPIN('./whitebox', processinput, processoutput, ARCH.amd64, blocksize=16, shell=True)
T.run(100)
bin2trs(None, None, False) # get the bit-unpacked trs, keeping the originals
bin2daredevil()            # get the daredevil "split binary", erasing the originals
```
We execute this script in the environment provided by the [Orka docker image](https://github.com/SideChannelMarvels/Orka), updated to the latest state. Of the several output files, the further steps need only `mem_addr1_rw1_100_42808.trace` and `mem_addr1_rw1_100_42808.input`. Other memory ranges can be analysed in the same manner.
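
The two callbacks in the tracing script are compact, so here is a small standalone sketch of what they compute. The helper names `hex_block` and `parse_output` are made up for this illustration, and the sample output string is invented. The only thing taken from the script is that the binary's output is treated as space-separated hex.

```python
def hex_block(iblock, blocksize):
    # '%0*x' takes the field width from the argument list:
    # two hex digits per byte, zero-padded on the left
    return '%0*x' % (2 * blocksize, iblock)


def parse_output(output):
    # join the space-separated hex bytes and read them as one big integer
    return int(''.join(output.split(' ')), 16)


plaintext = 0x000102030405060708090a0b0c0d0e0f
print(hex_block(plaintext, 16))            # 000102030405060708090a0b0c0d0e0f
print(hex(parse_output('de ad be ef\n')))  # 0xdeadbeef
```

The hex string produced by `hex_block` is what the script pipes through `xxd -r -p` to feed raw bytes to the binary's standard input.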

## 1. Converting the files

Though Jlsca accepts the "split binary" format directly, we will convert the traces to a bit-packed representation and save them as trs. For the short traces of this example it hardly matters, so we do it just as an illustration.

Due to the current limitations of the converter we add only the input. This is enough for the attack but we will not be able to verify the key. For this challenge, the key will be distinguishable by its entropy, but in general the converter deserves improvement. :)

Deadpool's `bin2trs` converter from `deadpool_dca.py` can also be used (see the tracing script above), but it does not pack the bits. As an exercise, you can run the attack below on `mem_addr1_rw1_100_42808.trs` and see what happens.

In [1]:
using Jlsca.Trs

# the true parameter at the end tells the converter to pack the bits
splitbin2trs("rhme2017-qual-wb-traces/mem_addr1_rw1_100_42808.input", 16, "rhme2017-qual-wb-traces/mem_addr1_rw1_100_42808.trace", 42808, UInt8, 100, true)
run(`mv output_UInt8_100t.trs rhme2017-qual-wb-traces/mem_addr1_rw1_100_42808_bitpacked.trs`)

Creating Inspector trs file output_UInt8_100t.trs
#samples: 5351
#data:    16
type:     UInt8
Wrote 100 traces in output_UInt8_100t.trs
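
The sample count in the output follows from the packing: 42808 one-bit samples become 42808 / 8 = 5351 bytes. Below is a minimal Python sketch of the idea. The bit order within a byte (first sample in the least significant bit) is an assumption made for illustration and is not necessarily the order Jlsca uses. `pack_bits` and `unpack_bits` are hypothetical helper names.

```python
def pack_bits(bits):
    """Pack 0/1 samples into bytes, 8 samples per byte (LSB first, assumed)."""
    assert len(bits) % 8 == 0
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for pos, bit in enumerate(bits[i:i + 8]):
            byte |= (bit & 1) << pos
        out.append(byte)
    return bytes(out)


def unpack_bits(packed):
    """Inverse of pack_bits: one 0/1 sample per bit."""
    return [(byte >> pos) & 1 for byte in packed for pos in range(8)]


sample_bits = [1, 0, 0, 0, 0, 0, 0, 0,   1, 1, 0, 0, 0, 0, 0, 0]
packed = pack_bits(sample_bits)
print(list(packed))                     # [1, 3]
print(unpack_bits(packed) == sample_bits)  # True
print(len(pack_bits([0] * 42808)))      # 5351
```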


## 2. Recovering the key

Here we will run the analysis with the pre-processing implemented in Jlsca.
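
For readers new to this kind of attack, here is a self-contained Python sketch of bitwise CPA on the first-round S-box output, run on simulated one-bit samples. It is only an illustration under stated assumptions, not Jlsca's code. The scoring (maximum absolute correlation over samples, summed over the 8 predicted bits) follows the `abs global max` and `+` settings shown in the log further down. The names `build_sbox`, `abs_corr_bits`, and `best_key_byte`, the secret byte, and the simulated columns are all made up.

```python
import math


def build_sbox():
    """Compute the AES S-box: inverse in GF(2^8), then the affine map."""
    def gmul(a, b):
        r = 0
        while b:
            if b & 1:
                r ^= a
            a <<= 1
            if a & 0x100:
                a ^= 0x11b
            b >>= 1
        return r

    inv = [0] * 256
    for a in range(1, 256):
        inv[a] = next(b for b in range(1, 256) if gmul(a, b) == 1)
    box = []
    for a in range(256):
        x = y = inv[a]
        for s in range(1, 5):
            y ^= ((x << s) | (x >> (8 - s))) & 0xff
        box.append(y ^ 0x63)
    return box


SBOX = build_sbox()


def abs_corr_bits(xs, ys):
    """Absolute Pearson correlation of two equal-length 0/1 sequences."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x & y for x, y in zip(xs, ys))
    denom = sx * (n - sx) * sy * (n - sy)
    if denom == 0:  # a constant sequence carries no information
        return 0.0
    return abs(n * sxy - sx * sy) / math.sqrt(denom)


def best_key_byte(inputs, columns):
    """Score each key guess: per predicted bit, max |corr| over samples, summed."""
    scores = []
    for k in range(256):
        score = 0.0
        for bit in range(8):
            pred = [(SBOX[p ^ k] >> bit) & 1 for p in inputs]
            score += max(abs_corr_bits(pred, col) for col in columns)
        scores.append(score)
    return max(range(256), key=scores.__getitem__)


# Simulated "traces": one input byte per trace, four one-bit samples each.
secret = 0x61
inputs = list(range(256))
columns = [
    [(SBOX[p ^ secret] >> 0) & 1 for p in inputs],      # leaks S-box bit 0
    [1 - ((SBOX[p ^ secret] >> 5) & 1) for p in inputs],  # leaks inverted bit 5
    [p & 1 for p in inputs],                             # depends on input only
    [0] * len(inputs),                                   # constant
]
recovered = best_key_byte(inputs, columns)
print(hex(recovered))  # 0x61
```

Using the absolute value of the correlation is what lets the inverted column count as a perfect match. This matters for whiteboxes, where the encodings may flip bits.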

In [2]:
using Jlsca.Sca
using Jlsca.Trs
using Jlsca.Aes

filename = "rhme2017-qual-wb-traces/mem_addr1_rw1_100_42808_bitpacked.trs"

# attack configuration
attack = AesSboxAttack()   # attacking AES S-box
attack.keyLength = KL128   # attacking AES-128
attack.mode = CIPHER       # encryption (INVCIPHER would have been for decryption)
attack.direction = FORWARD # attacking from input
analysis = CPA()           # use correlation as a distinguisher
analysis.leakages = [Bit(i) for i in 0:7] # absolute-sum DPA with bitwise "leakages"
params = DpaAttack(attack, analysis) # tie the attack and analysis together
params.dataOffset = 1      # data starts from the very first byte (remember Julia is 1-based)

# pre-processing setup 
trs = InspectorTrace(filename, true)   # open traceset with efficient readout of packed bits
addSamplePass(trs, tobits)             # add bit-unpacking pass over samples
setPostProcessor(trs, CondReduce())    # add automated point selection
# TODO: explain the difference between a sample pass and a post-processor in Jlsca readme

# attack using all available traces
rankData = sca(trs, params, 1, length(trs))
key = getKey(params, rankData)

Opened rhme2017-qual-wb-traces/mem_addr1_rw1_100_42808_bitpacked.trs, #traces 100, #samples 5351 (UInt8), #data 16

Jlsca running in Julia version: 0.6.1, 1 processes/1 workers/2 threads per worker

DPA parameters
attack:       AES Sbox
mode:         CIPHER
key length:   KL128
direction:    FORWARD
analysis:     CPA
leakages:     bit0,bit1,bit2,bit3,bit4,bit5,bit6,bit7
maximization: abs global max
combination:  +
data at:      1

phase: 1 / 1, #targets 16

Running processor "Cond reduce" on trace range 1:1:100, 1 data passes, 1 sample passes


Processing traces 1:100.. 100%|█████████████████████████| Time: 0:00:01


Reduction for 1: 5962 left after global dup col removal, 5097 left after removing the inv dup cols, 20 left after sample reduction
Reduction for 2: 5962 left after global dup col removal, 5097 left after removing the inv dup cols, 19 left after sample reduction
Reduction for 3: 5962 left after global dup col removal, 5097 left after removing the inv dup cols, 20 left after sample reduction
Reduction for 4: 5962 left after global dup col removal, 5097 left after removing the inv dup cols, 19 left after sample reduction
Reduction for 5: 5962 left after global dup col removal, 5097 left after removing the inv dup cols, 20 left after sample reduction
Reduction for 6: 5962 left after global dup col removal, 5097 left after removing the inv dup cols, 19 left after sample reduction
Reduction for 7: 5962 left after global dup col removal, 5097 left after removing the inv dup cols, 20 left after sample reduction
Reduction for 8: 5962 left after global dup col removal, 5097 left after removing t

recovered key: 61316c5f7434623133355f525f6f5235


16-element Array{UInt8,1}:
 0x61
 0x31
 0x6c
 0x5f
 0x74
 0x34
 0x62
 0x31
 0x33
 0x35
 0x5f
 0x52
 0x5f
 0x6f
 0x52
 0x35

As mentioned before, we did not include the output in the traceset. But this key apparently has low enough entropy for us to see that it is the right one. :)

In [3]:
print(String(key))

a1l_t4b135_R_oR5
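
The "low entropy" argument can be made a bit more concrete with a rough back-of-the-envelope check in Python. It assumes that a wrong key would look like 16 uniformly random bytes, which is an idealisation. The hex string is copied from the `recovered key` log line above.

```python
import string

key_hex = "61316c5f7434623133355f525f6f5235"  # from the "recovered key" log line
key = bytes.fromhex(key_hex)

# the 95 printable ASCII characters (letters, digits, punctuation, space)
printable = set(string.ascii_letters + string.digits + string.punctuation + " ")
is_text = all(chr(b) in printable for b in key)

# chance that 16 uniformly random bytes are all printable ASCII
chance = (len(printable) / 256.0) ** len(key)

print(key.decode("ascii"))  # a1l_t4b135_R_oR5
print(is_text)              # True
print(chance < 1e-6)        # True
```

A key guess that decodes to readable text by accident is therefore very unlikely, which is why the missing ciphertext check does not hurt us here.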