Add a distributed implementation of the correlation-enhanced collision attack #28
Conversation
Signed-off-by: Alphan Ulusoy <alphan@google.com>
Thanks @alphan for this PR, it's awesome work!
LGTM, I just have a couple of minor comments.
As an extension in a follow-up PR, do you think it would be possible to implement a sweep over the number of traces? This would provide the useful information of when the attack starts to succeed, i.e., what the resistance in terms of number of traces actually is. Anyway, worst case we can just wrap your attack with a sweep script. Yours is so incredibly fast now that this would still be okay, I guess.
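For illustration, such a sweep wrapper could look roughly like this (the `perform_attack` name, its signature, and its boolean return value are hypothetical stand-ins for the real entry point, which is not shown here):

```python
# Hypothetical sketch of a trace-count sweep around the attack.
# `perform_attack(num_traces)` is a stand-in for the real entry point;
# it is assumed to return True when the full key is recovered.
def sweep_num_traces(perform_attack, start=1000, stop=100000, factor=2):
    """Return the smallest tried trace count for which the attack succeeds."""
    num_traces = start
    while num_traces <= stop:
        if perform_attack(num_traces):
            return num_traces
        num_traces *= factor
    return None
```

A geometric sweep keeps the number of attack runs small; a binary search between the last failing and first succeeding count could then narrow the threshold further.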
```python
diff_corrcoefs = corrcoefs[alphas, 256 + betas].sum(axis=1)
best_diff = diff_corrcoefs.argmax()
```
It's not fully clear how your implementation works but you basically managed to skip three more loops (compared to mine) which is great 👍
What I am wondering: you don't seem to normalize the correlation coefficients BEFORE doing the argmax. If I remember correctly, this was necessary in my implementation. But still yours works correctly....
> What I am wondering: you don't seem to normalize the correlation coefficients BEFORE doing the argmax. If I remember correctly, this was necessary in my implementation. But still yours works correctly....

IIUC, you are referring to this snippet (lines 98-104 in `correlation-enhanced_collision_attack.py`):
```python
for i in range(256):
    for j in range(256):
        delta = i ^ j
        rho_avg_delta[delta] += rho_mat[i, j]
        rho_avg_delta_num[delta] += 1
for delta in range(256):  # redundant
    rho_avg_delta[delta] /= rho_avg_delta_num[delta]  # redundant
# Normalize correlations.
mean = rho_avg_delta.mean()
rho_avg_delta /= mean
```
Unless I'm missing something, it looks like the loop marked `redundant` doesn't really do anything, except maybe introduce some tiny numerical error due to floating point operations. Since `rho_avg_delta_num[delta]` is 256 for all values of `delta` in [0, 255], and `rho_avg_delta` is normalized again afterwards, this loop just divides all entries by a constant before they are normalized.
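The claim that every `delta` is hit exactly 256 times follows from XOR being invertible in each argument: for a fixed `delta`, each of the 256 values of `i` pairs with exactly one `j = i ^ delta`. A quick standalone check:

```python
from collections import Counter

# Count how many (i, j) pairs map to each delta = i ^ j.
counts = Counter(i ^ j for i in range(256) for j in range(256))
assert all(counts[delta] == 256 for delta in range(256))
```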
> It's not fully clear how your implementation works but you basically managed to skip three more loops (compared to mine) which is great 👍
Thanks :) The main point here is to minimize the back and forth between Python and numpy as much as possible, e.g. by avoiding unnecessary (nested) loops that numpy can handle much more efficiently. In a nutshell:
- `alphas` is a (256,) row array, `diffs` is a (256, 1) column array (so that broadcasting works out), and `betas` is a (256, 256) array where `betas[i, j] = diffs[i] ^ alphas[j]`.
- Defining `betas` this way is important because it lets us pick the values we want from the `np.corrcoef` result using just a single indexing operation and a `sum(axis=1)`, as opposed to two nested `range(0, 256)` loops.
I hope this helps and please let me know if you have any other questions.
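For illustration, the trick can be sketched as follows. The `corrcoefs` matrix here is random stand-in data with the assumed (512, 512) layout (first 256 rows/columns for the alpha candidates, last 256 for the beta candidates), not the output of a real attack:

```python
import numpy as np

alphas = np.arange(256)                  # shape (256,)
diffs = np.arange(256).reshape(256, 1)   # shape (256, 1)
betas = diffs ^ alphas                   # shape (256, 256): betas[i, j] = diffs[i] ^ alphas[j]

# Stand-in for the np.corrcoef result under the assumed layout.
rng = np.random.default_rng(0)
corrcoefs = rng.standard_normal((512, 512))

# One fancy-indexing operation plus one sum replaces two nested range(256)
# loops: element [i, j] picks corrcoefs[alphas[j], 256 + (diffs[i] ^ alphas[j])].
diff_corrcoefs = corrcoefs[alphas, 256 + betas].sum(axis=1)  # shape (256,)
best_diff = diff_corrcoefs.argmax()
```

For each candidate difference `diffs[i]`, the sum over `axis=1` accumulates the correlation between every alpha value and its implied beta value, which is exactly what the two nested loops would compute.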
Thanks for the clarification @alphan.

Regarding the redundant loop above: I added this (before doing the real normalization) because I wasn't sure if there are equally many combinations of `i` and `j` that result in the same `delta`. But yes, I think with the real normalization in place this loop can actually be dropped.
cw/cw305/ceca.py
Outdated
```python
# TODO: Analyze the effect of /diff_corrcoefs.mean() below.
pairwise_diffs_scores[(a, b), (b, a)] = (best_diff,
                                         diff_corrcoefs[best_diff] /
                                         diff_corrcoefs.mean())
```
Ah, here comes the normalization. I think you need this to make all the `pairwise_diffs_scores` comparable. There are pairs for which the correlation is almost flat but constantly higher than the highest peaks of another pair. You need the normalization to "downvote" this flat correlation for `find_best_diffs()`.
I think so too, but I left a TODO nevertheless because IIRC I was able to recover the key without this a couple of times, and I would like to check how the results look when running batch attacks.
LGTM, great work!
""" | ||
|
||
|
||
def timer(): |
I just noticed we are not being consistent across scripts on indentation. Should we use 2 or 4 spaces?
I would prefer 4 spaces to align with PEP8 and leave these to automated tools. I think `black` is a good choice. What do you think?
This PR adds a distributed implementation of the correlation-enhanced power analysis collision attack described in "Correlation-Enhanced Power Analysis Collision Attack" by A. Moradi, O. Mischke, and T. Eisenbarth (https://eprint.iacr.org/2010/297.pdf).
In addition to using Ray (https://docs.ray.io/en/master/index.html) for distributed computation, this implementation uses a modified version of Dijkstra's shortest path algorithm to utilize all available information on differences between key bytes. It also optimizes several operations for improved performance, e.g. various `numpy` operations and encrypting with multiple keys in parallel using `scared` to check whether the attack was successful.

I get the following numbers when I run this on my workstation (12 cores @ 3.7 GHz according to `/proc/cpuinfo`) using the traces in `aes_unhardened_1mio_traces_cw305.tar.bz2` captured by @vogelpi:

The difference between `main()` and `perform_attack()` is Ray's initialization overhead.

I am aware that some of the code can be moved to utility libraries; I plan to revisit that after running this on GCP.
Looking forward to your feedback!