# Lab 1 - Power Analysis with Passwords

**SUMMARY:** *This tutorial will introduce you to breaking devices by determining when a device is performing certain operations. Our target device will be performing a simple password check, and we will demonstrate how to perform a basic power analysis.*

**LEARNING OUTCOMES:**

* How power can be used to determine timing information.
* Plotting multiple iterations while varying input data to find interesting locations.
* Using difference of waveforms to find interesting locations.
* Performing power captures with ChipWhisperer hardware (hardware only)


## Prerequisites

Hold up! Before you continue, check you've done the following tutorials:

* ☑ Jupyter Notebook Intro (you should be OK with plotting & running blocks).
* ☑ SCA101 Intro (you should have an idea of how to get hardware-specific versions running).

## Overview

Now that we've seen that power analysis can be used to break a simple password check, let's put it into practice! The firmware we'll be using for this lab is `basic-passwdcheck.c`, located in `chipwhisperer/hardware/victims/firmware/basic-passwdcheck`. If you open that file, you'll see that the firmware does the following:

* Prints some stuff
* Waits for a password to be sent over serial
* Checks the password
* Responds based on whether or not the password is correct

Let's take a closer look at the password check:

```C
for(uint8_t i = 0; i < sizeof(correct_passwd); i++){
    if (correct_passwd[i] != passwd[i]){
        passbad = 1;
        break;
    }
}
```

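As you can see, the target only checks the password until it finds an incorrect character, then breaks out of the password check loop. This means the power trace should look noticeably different depending on how many leading characters are correct: a correct character sends the loop on to the next byte, while the first incorrect byte ends the loop early.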
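
To make the leak concrete, here is a minimal Python model of that loop. It is only a sketch: `early_exit_compare`, the `secret` value, and the handling of a too-short guess are assumptions made for illustration, not part of the firmware.

```python
def early_exit_compare(correct, guess):
    """Python model of the firmware loop; returns (passbad, bytes_compared)."""
    passbad = 0
    compared = 0
    for i in range(len(correct)):
        compared += 1
        # Assumption: a guess that is too short counts as a mismatch here.
        if i >= len(guess) or correct[i] != guess[i]:
            passbad = 1
            break
    return passbad, compared


secret = "k3y9z"  # hypothetical password, not the one used in this lab
for guess in ["zzzzz", "kzzzz", "k3zzz", "k3y9z"]:
    print(guess, early_exit_compare(secret, guess))
```

The number of bytes compared grows with the number of correct leading characters, and that extra work is what shows up in the power trace.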

## Power Trace Gathering

At this point you've got to insert code to perform the power trace capture. There are two options here:
* Capture from physical device.
* Read from a file.

You get to choose your adventure - see the two code blocks `#SIMULATED` and `#HARDWARE` to continue. The `#SIMULATED` block will load previously captured power traces, while the `#HARDWARE` block will capture power traces using a connected ChipWhisperer.

Be sure you get the `"✔️ OK to continue!"` print once you run the next cell, otherwise things will fail later on!

First you'll need to select which hardware setup you have. You'll need to select both a `SCOPETYPE` and a `PLATFORM`. `SCOPETYPE` can either be `'OPENADC'` for the CWLite/CW1200 or `'CWNANO'` for the CWNano. `PLATFORM` is the target device, with `'CWLITEARM'`/`'CW308_STM32F3'` being the best supported option, followed by `'CWLITEXMEGA'`/`'CW308_XMEGA'`, then by `'CWNANO'`. As of CW 5.4, you can select the SimpleSerial version used. For example:

```python
SCOPETYPE = 'OPENADC'
PLATFORM = 'CWLITEARM'
SS_VER = 'SS_VER_2_1'
```

If you're using the `#SIMULATED` block, you can skip this:

## \#HARDWARE

In [None]:
SCOPETYPE="OPENADC"
PLATFORM='CW308_SAM4S'
CRYPTO_TARGET='NONE'
VERSION='HARDWARE'
SS_VER="SS_VER_2_1"

This code will connect the scope and do some basic setup. We're now just going to use a special setup script to do this.

In [None]:
%run "../Setup_Scripts/Setup_Generic.ipynb"

The following code will build the firmware for the target.


In [None]:
%%bash -s "$PLATFORM" "$SS_VER"
cd ../../hardware/victims/firmware/basic-passwdcheck
make PLATFORM=$1 CRYPTO_TARGET=NONE SS_VER=$2 -j

Finally, all that's left is to program the device, which can be done with the following line:


In [None]:
cw.program_target(scope, prog, "../../hardware/victims/firmware/basic-passwdcheck/basic-passwdcheck-{}.hex".format(PLATFORM))

To make interacting with the hardware easier, let's define a function to attempt a password and return a power trace:

In [None]:
def cap_pass_trace(pass_guess):
    reset_target(scope)
    num_char = target.in_waiting()
    while num_char > 0:
        target.read(num_char, 10)
        time.sleep(0.01)
        num_char = target.in_waiting()

    scope.arm()
    target.write(pass_guess)
    ret = scope.capture()
    if ret:
        print('Timeout happened during acquisition')

    trace = scope.get_last_trace()
    return trace

In [None]:
scope.adc.samples = 3000

In [None]:
trace_test = cap_pass_trace("h\n")

#Basic sanity check
assert(len(trace_test) == 3000)
print("✔️ OK to continue!")

## \#SIMULATED

In the hardware version, `cap_pass_trace()` sends a password guess to the target device and returns the power trace associated with that guess. So for example you could run:

```python
    cap_pass_trace("abcde\n")
```
    
to get a power trace of `abcde`.

In the simulated version, we instead have a function that uses pre-recorded data. Run the following block and it should give you access to that function. While you use the function the same way, note the following limitations:

* Not every combination is stored in the system -- instead it stores similar power traces.
* 100 traces are stored for each guess, and it randomly returns one to still give you the effect of noise.
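
As a rough picture of what such a function could look like, here is a hypothetical sketch. It is not the code in `traces/password_sim.ipynb`: the `stored_traces` dictionary, the two keys, the synthetic noise, and the name `sim_cap_pass_trace` are all assumptions for illustration.

```python
import random

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical store: 100 noisy 3000-sample recordings per recorded guess.
stored_traces = {
    "h": [rng.normal(0.0, 0.01, 3000) for _ in range(100)],
    "wrong": [rng.normal(0.1, 0.01, 3000) for _ in range(100)],
}


def sim_cap_pass_trace(pass_guess):
    """Return one randomly chosen stored trace for a 'similar' recorded guess."""
    # Assumption: any guess not starting with 'h' maps to one generic recording.
    key = "h" if pass_guess.startswith("h") else "wrong"
    return random.choice(stored_traces[key])


print(len(sim_cap_pass_trace("h\n")))
```

Returning a random recording each time means two calls with the same guess give slightly different traces, just as real captures would.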

In [None]:
import chipwhisperer as cw
%run "traces/password_sim.ipynb"

trace_test = cap_pass_trace("h\n")

#Basic sanity check
assert(len(trace_test) == 3000)
print("✔️ OK to continue!")

## Exploration

So what can we do with this? Well, first off - I'm going to cheat, and tell you that we have a preset password that starts with `h`, and it's 5 characters long. But that's the only hint so far - what can you do? To start, let's try plotting a comparison of `h` to something else.

You can use the `cw.plot()` function to plot traces.

The following cell shows you how to capture one power trace with `h` sent as a password. From there:

1. Try adding the plotting code and see what it looks like.
2. Send different passwords to the device. We're only going to look at the difference between a password starting with `h` and something else right now.
3. Plot the different waveforms.

In [None]:
#Example - capture 'h' - end with newline '\n' as serial protocol expects that
trace_h = cap_pass_trace("h\n")

print(trace_h)

# ###################
# Add your code here (Code Block 1)
# ###################
raise NotImplementedError("Add your code here, and delete this.")

For reference, the output should look something like this:
<img src="img/spa_password_h_vs_0_overview.png" alt="SPA of Power Analysis" width="450"/>

What you want to notice is that there are two code paths taken, depending on whether the character is correct or incorrect. Here for example is a correct & incorrect character processed:
<img src="img/spa_password_h_vs_0_zoomed.png" alt="SPA of Power Analysis" width="450"/>

OK interesting -- what's next? Let's plot every possible password character we could send.

Our password implementation only recognizes characters in the list `abcdefghijklmnopqrstuvwxyz0123456789`, so we're going to limit it to those valid characters for now.

Write some code in the following block that implements the following algorithm:

```python
plot = cw.plot()
for CHARACTER in LIST_OF_VALID_CHARACTERS:
    trace = cap_pass_trace(CHARACTER + "\n")
    plot *= cw.plot(trace)
display(plot)
```
        
The above isn't quite valid code - so massage it into place! You also may notice the traces are way too long - you might want to make a narrower plot that only shows, say, the first 500 samples of the power trace.

In [None]:
# ###################
# Add your code here (Code Block 2)
# ###################
raise NotImplementedError("Add your code here, and delete this.")

If you zoom in on the end result, you should see a location where a single "outlier" trace doesn't follow the path of all the other traces. That is great news, since it means we learn something about the system from power analysis.

<img src="img/spa_password_list_char1.png" alt="SPA of Power Analysis against all inputs" width="450"/>

## Automating an Attack against One Character

To start with - we're going to automate an attack against a **single** character of the password. Since we don't know the password (let's assume), we'll use a strategy of comparing all possible inputs together.

An easy way to do this might be to use something that we know can't be part of the valid password. As long as it's processed the same way, this will work just fine. `0x00` is actually ignored at the start of passwords, so for now, let's use `0x01` (i.e., an invalid byte) as the password. We can compare this byte to processing something else:

In [None]:
ref_trace = cap_pass_trace("\x01\n")[0:1000]
other_trace = cap_pass_trace("c\n")[0:1000]

cw.plot(ref_trace) * cw.plot(other_trace)

This will plot a trace with an input of "\x01" against a trace with an input of "c". "\x01" is an invalid character, and seems to be processed like any other invalid password.

Let's make this a little more obvious, and plot the difference between a known reference & every other capture. You need to write some code that does something like this:

```python
ref_trace = cap_pass_trace( "\x01\n")
plot = cw.plot()

for CHARACTER in LIST_OF_VALID_CHARACTERS:
    trace = cap_pass_trace(CHARACTER + "\n")
    plot *= cw.plot(trace - ref_trace)
display(plot)
```

Also notice how the earlier `ref_trace`/`other_trace` example reduced the number of samples by slicing each trace with `[0:1000]`. You can do the same here.
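
If you'd like to see the difference idea in isolation first, the sketch below runs it on made-up data. Everything in it is an assumption for illustration: the traces are synthetic (a sine wave plus noise), and the "other code path" is faked as a bump in samples 200-259.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples = 500
base = np.sin(np.linspace(0, 20 * np.pi, n_samples))  # shared trace shape


def synthetic_trace(takes_other_path):
    """Noisy copy of base; the 'other path' adds a bump in samples 200-259."""
    trace = base + rng.normal(0, 0.02, n_samples)
    if takes_other_path:
        trace[200:260] += 0.5
    return trace


ref_trace = synthetic_trace(False)    # plays the role of the "\x01" reference
wrong_trace = synthetic_trace(False)  # another wrong guess: same path as ref
right_trace = synthetic_trace(True)   # correct guess: different path

diff_wrong = wrong_trace - ref_trace
diff_right = right_trace - ref_trace

peak = int(np.argmax(np.abs(diff_right)))
print("largest difference at sample", peak)
print("max |diff|, wrong guess:", np.max(np.abs(diff_wrong)))
print("max |diff|, right guess:", np.max(np.abs(diff_right)))
```

Subtracting the reference removes the shape both traces share, so only noise is left for a wrong guess, while the guess that takes a different path stands out clearly.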

In [None]:
# ###################
# Add your code here (Code Block 3)
# ###################
raise NotImplementedError("Add your code here, and delete this.")

OK great - hopefully you now see one major "difference". It should look something like this:
    
<img src="img/spa_password_diffexample.png" alt="SPA with Difference" width="450"/>
    

What do we do now? Let's make this thing automatically detect such a large difference. Some handy functions to try out are `np.abs()` and `np.sum()`.

`np.abs()` will get absolute values:

```python
import numpy as np
np.abs([-1, -3, 1, -5, 6])

# Out[]: array([1, 3, 1, 5, 6])
```

`np.sum()` will add up all the numbers:

```python
import numpy as np
np.sum([-1, -3, 1, -5, 6])

# Out[]: -2
```

Using just `np.sum()` means positive and negative differences will cancel each other out - so it's better to do something like `np.sum(np.abs(DIFF))` to get a good number indicating how "close" the match was.
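
Here is a small sketch of that cancellation, using a made-up difference trace (the values in `diff` are purely illustrative):

```python
import numpy as np

# Hypothetical difference trace: the two captures disagree, swinging both ways.
diff = np.array([0.4, -0.4, 0.3, -0.3, 0.0])

plain_sum = np.sum(diff)        # positive and negative parts cancel out
abs_sum = np.sum(np.abs(diff))  # the size of the disagreement survives

print(plain_sum, abs_sum)
```

The plain sum comes out at (about) zero even though the traces clearly disagree, while the sum of absolute values reflects how far apart they are.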


In [None]:
import numpy as np
np.abs([-1, -3, 1, -5, 6])

In [None]:
import numpy as np
np.sum([-1, -3, 1, -5, 6])

In [None]:
np.sum(np.abs([-1, -3, 1, -5, 6]))

Taking your above loop, modify it to print an indicator of how closely this matches your trace. Something like the following should work:

```python
ref_trace = cap_pass_trace( "\x01\n")

for CHARACTER in LIST_OF_VALID_CHARACTERS:
    trace = cap_pass_trace(CHARACTER + "\n")
    diff = SUM(ABS(trace - ref_trace))

    print("{:1} diff = {:2}".format(CHARACTER, diff))
```

In [None]:
# ###################
# Add your code here (Code Block 4)
# ###################
raise NotImplementedError("Add your code here, and delete this.")

Now the easy part - modify your above code to automatically print the correct password character. This should be done with a comparison of the `diff` variable - based on the printed characters, you should see one that is 'higher' than the others.
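
One possible way to do the comparison is to keep every score in a dictionary and pick the largest. This is a sketch only, and the numbers in `scores` are invented for illustration, not measured:

```python
# Hypothetical sum-of-absolute-difference scores, one per candidate character.
scores = {"a": 12.1, "b": 11.8, "c": 240.5, "d": 12.4}

# The character whose trace differs most from the reference is the best guess.
best_char = max(scores, key=scores.get)
print(best_char)
```

Picking the maximum avoids having to choose a fixed threshold, at the cost of always capturing every candidate before deciding.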

## Running a Full Attack

Finally - let's finish this off. Rather than attacking a single character, we need to attack each character in sequence.

If you go back to the plotting of differences, you can try using the correct first character & wrong second character. The basic idea is exactly the same as before, but now we loop through 5 times, and just build up the password based on brute-forcing each character.

Another way you could attack this is by running through all the characters and picking the one with the largest difference.

Take a look at the following for the basic pseudo-code:

```python
guessed_pw = "" #Store guessed password so far

do a loop 5 times (max password size):

    ref_trace = capture power trace(guessed_pw + "\x01\n")

    for CHARACTER in LIST_OF_VALID_CHARACTERS:
        trace = capture power trace (guessed_pw + CHARACTER + newline)
        diff = SUM(ABS(trace - ref_trace))

        if diff > THRESHOLD:

            guessed_pw += CHARACTER
            print(guessed_pw)

            break
```
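
To see the whole loop run without any hardware, here is a sketch against a made-up target. `fake_cap_pass_trace`, the `SECRET` value, and the way the trace depends on the number of correct characters are all assumptions for illustration. It uses the "largest difference" variant rather than a threshold.

```python
import numpy as np

rng = np.random.default_rng(2)
SECRET = "k3y9z"  # hypothetical password of the simulated target
CHARSET = "abcdefghijklmnopqrstuvwxyz0123456789"
N_SAMPLES = 600


def fake_cap_pass_trace(pass_guess):
    """Stand-in for cap_pass_trace(): the trace shape depends on how many
    leading characters of the guess are correct."""
    guess = pass_guess.rstrip("\n")
    n_correct = 0
    for s, g in zip(SECRET, guess):
        if s != g:
            break
        n_correct += 1
    trace = rng.normal(0, 0.02, N_SAMPLES)
    # Each matching character delays the "loop exit" spike by 50 samples.
    start = 50 + 50 * n_correct
    trace[start:start + 20] += 1.0
    return trace


guessed_pw = ""
for _ in range(5):  # the password is 5 characters long
    # The reference shares the known-good prefix and then goes wrong.
    ref_trace = fake_cap_pass_trace(guessed_pw + "\x01\n")
    scores = {}
    for c in CHARSET:
        trace = fake_cap_pass_trace(guessed_pw + c + "\n")
        scores[c] = np.sum(np.abs(trace - ref_trace))
    guessed_pw += max(scores, key=scores.get)
    print(guessed_pw)
```

Note that both the reference and the candidate guesses are rebuilt from `guessed_pw` on every pass, which is what keeps the attack moving to the next character.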

In [None]:
# ###################
# Add your code here (Code Block 5)
# ###################
raise NotImplementedError("Add your code here, and delete this.")

You should get an output that looks like this:

```
    h
    h0
    h0p
    h0px
    h0px3
```

If so - 🥳🥳🥳🥳🥳🥳🥳🥳🥳🥳🥳🥳🥳 Congrats - you did it!!!!

If not - check some troubleshooting hints below. If you get really stuck, check the `SOLN` version (there is one for hardware and one for simulated).

## Troubleshooting - Always get 'h'

Some common problems you might run into - first, if you get an output which keeps guessing the first character:

```
    h
    hh
    hhh
    hhhh
    hhhhh
```

Check whether you are updating the prefix of the password when you call `cap_pass_trace` inside the loop (checking the guessed password). For example, the old version of the code (guessing a single character) looked like this:

```python
    trace = cap_pass_trace(c + "\n")
```

But that is always sending our first character only! So we need to send the "known good password so far". In the example code, that looks something like this:

```python
    trace = cap_pass_trace(guessed_pw + c + "\n")
```

Where `guessed_pw` progressively grows with the known good start of the password.

## Troubleshooting - Always get 'a'

This looks like it's always matching the first character in the list of candidates:

```
    h
    ha
    haa
    haaa
    haaaa
```

Check that you update the `ref_trace` - if you re-use the original reference trace, you won't be looking at a reference where the first N characters are good, and the remaining characters are bad. An easy way to do this is again using the `guessed_pw` variable and appending an invalid byte (`\x01`) + newline:

```python
    ref_trace = cap_pass_trace(guessed_pw + "\x01\n")
```

## A Better Password Solution

There are a few different ways to fix this attack vector. The most obvious is to make the password check take a constant amount of time:

```C
// change correct_passwd[] to correct_passwd[32]
// make sure not to use the length of the correct password to avoid leaking password length
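uint8_t passlen = strnlen(passwd, 31);
// use <= so the terminating NUL is compared too; otherwise a prefix of the password would pass
// (assumes passwd is also a 32-byte buffer)
for(uint8_t i = 0; i <= passlen; i++){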
    passbad |= correct_passwd[i] ^ passwd[i];
}
```

Another fix is to use a secure hash function, such as SHA256, to transform the correct and submitted passwords before comparing them. This has the advantage of not requiring the firmware to store the correct password, just its hash. Passwords are often combined with "salt" to protect against additional attacks.
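
As a rough illustration of the hashing idea, here is a sketch in Python rather than firmware C. The helper names `make_record` and `check_password` are made up for this example. `hashlib.sha256`, `os.urandom`, and `hmac.compare_digest` are standard-library calls. Real systems usually prefer a deliberately slow key-derivation function over a single SHA256 pass.

```python
import hashlib
import hmac
import os


def make_record(password):
    """Store a random salt and the SHA256 of salt + password, not the password."""
    salt = os.urandom(16)
    digest = hashlib.sha256(salt + password.encode()).digest()
    return salt, digest


def check_password(record, attempt):
    """Hash the attempt with the stored salt and compare in constant time."""
    salt, stored = record
    digest = hashlib.sha256(salt + attempt.encode()).digest()
    return hmac.compare_digest(stored, digest)


record = make_record("k3y9z")  # hypothetical password
print(check_password(record, "k3y9z"), check_password(record, "k3y9a"))
```

Because the comparison runs on hashes, an attacker who learns how many leading hash bytes match gains little about the password itself, and `hmac.compare_digest` is designed not to exit early anyway.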

In [None]:
assert guessed_pw == 'h0px3', "Failed to break password"