Add correlation-enhanced collision attack #18
Conversation
Thanks @vogelpi! Just some initial feedback:
from util import plot

# Open trace file
Can you add an if __name__ == '__main__' at the bottom and move this code to main()?
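For reference, a minimal sketch of the suggested structure (the body of main() is a placeholder):

def main():
    # The code that previously ran at module level (opening the trace
    # file, running the attack, plotting) moves in here, so importing
    # the module has no side effects.
    pass

if __name__ == '__main__':
    main()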
That's done now.
# Create a local copy of the traces. This makes the remaining operations much
# faster.
traces = np.empty((num_traces, num_samples_use), np.double)
for i_trace in range(num_traces):
    traces[i_trace] = project.waves[i_trace][start_sample_use:stop_sample_use]
Is this because cw loads traces lazily? I wonder if there is something in the API to do this for us.
I believe you can do the following:

traces = np.array(project.waves)[:num_traces, start_sample:stop_sample]

I recommend the following:

import pandas as pd
traces = pd.DataFrame(np.array(project.waves)[:num_traces, start_sample:stop_sample])
# Then you can do the following
mean = traces.mean(axis=0)
std = traces.std(axis=0)
# the following requires pandas_bokeh
pd.set_option("plotting.backend", "pandas_bokeh")
mean.plot()
The problem is that project.waves is a memory-mapped object that is not dense/contiguous and is only accessed indirectly. This is really slow for big data sets.

I played around with this when trying to parallelize the computation. The way the memory mapping is implemented in the CW API, parallelization using multiprocessing etc. fails in most cases: either the unpickling failed, or I got errors that too many files were open (when giving each thread just the part of project.waves it needs, this actually creates a new memory mapping for every thread). In the end, I ended up opening the project file separately for every thread. This worked, but it wasn't really efficient.

In contrast, creating this local dense copy is orders of magnitude faster. For example, the whole script now takes around 80 seconds on my machine. Working on the memory-mapped traces, as in the approach of @moidx, the filtering alone takes around 3-4 minutes.
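The effect is easy to reproduce with a plain numpy.memmap standing in for project.waves (the file name and array sizes below are made up for the illustration; the CW API's mapping adds further indirection on top of this):

import time
import numpy as np

num_traces, num_samples = 2000, 5000

# Build a throwaway memory-mapped array standing in for project.waves.
mm = np.memmap('/tmp/waves.dat', dtype=np.double, mode='w+',
               shape=(num_traces, num_samples))
mm[:] = np.random.rand(num_traces, num_samples)
mm.flush()

# Per-trace access through the mapping.
t0 = time.time()
acc = np.zeros(num_samples, np.double)
for i in range(num_traces):
    acc += mm[i]
print('memmap pass: %.3f s' % (time.time() - t0))

# Dense local copy, then the same reduction on contiguous memory.
traces = np.array(mm)
t0 = time.time()
acc = traces.sum(axis=0)
print('dense pass:  %.3f s' % (time.time() - t0))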
stop_sample = start_sample + num_samples
mean_trace = np.zeros(num_samples, np.double)
mean_sq_trace = np.zeros(num_samples, np.double)
for i in range(len(traces)):
    mean_trace += traces[i][start_sample:stop_sample]
    mean_sq_trace += (traces[i][start_sample:stop_sample]**2)
mean_trace /= num_traces
mean_sq_trace /= num_traces
return mean_trace, mean_sq_trace
I think we can use something like np.mean(traces, axis=0)[start_sample:stop_sample] instead of the loop.
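Concretely, the accumulation loop reduces to something like this (traces is a stand-in for the dense 2-D trace array; the squared version works the same way):

import numpy as np

traces = np.random.rand(100, 1000)  # stand-in for the dense trace array
start_sample, stop_sample = 0, 1000

# Vectorized replacement for the accumulation loop.
mean_trace = np.mean(traces, axis=0)[start_sample:stop_sample]
mean_sq_trace = np.mean(traces**2, axis=0)[start_sample:stop_sample]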
That's changed, thanks for the suggestion.
stop_sample = start_sample + num_samples
mean_trace = np.zeros(num_samples, np.double)
mean_sq_trace = np.zeros(num_samples, np.double)
for i in range(len(traces)):
    mean_trace += traces[i][start_sample:stop_sample]
    mean_sq_trace += (traces[i][start_sample:stop_sample]**2)
traces is already truncated above using start_sample_use and stop_sample_use. Calls seem to be OK, but can you consider removing this for the sake of simplicity?
Done.
mean_trace, mean_sq_trace = get_mean_sq_traces(traces, num_samples_use, 0)
sigma_trace = np.sqrt(mean_sq_trace - (mean_trace**2))
I think we can just use numpy.std and remove mean_sq_trace.
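For what it's worth, numpy's default ddof=0 matches the expression above, i.e. np.std computes sqrt(mean(x**2) - mean(x)**2) along the chosen axis (traces below is again a stand-in array):

import numpy as np

traces = np.random.rand(100, 1000)  # stand-in for the dense trace array

sigma_trace = np.std(traces, axis=0)  # replaces the manual formula
mean_trace = np.mean(traces, axis=0)

# Sanity check against the original expression:
assert np.allclose(sigma_trace,
                   np.sqrt(np.mean(traces**2, axis=0) - mean_trace**2))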
Yes absolutely. I wasn't aware that these functions exist...
One initial comment. I will continue tomorrow.
Force-pushed from 993ce82 to 285fc0b.
This attack is supposed to work both on unmasked AES implementations and implementations using the masked Canright S-Box. It uses the existing simple_capture_traces.py for the capture stage. Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
This attack can be performed on the first round (recovering the deltas between bytes in the initial key) or the last round (recovering the deltas between bytes in the final round key). Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
Force-pushed from 285fc0b to 1df3e2d.
Hey @moidx and @alphan,

$ ./correlation-enhanced_collision_attack.py
Will work with 997740/1000000 traces.
known_key: b'2b7e151628aed2a6abf7158809cf4f3c'
key guess: b'2b7e151628aed2a6abf7158809cf4f3c'
SUCCESS!
86/120 deltas guessed correctly.

I also needed to normalize the correlations such that we indeed select the relationships with the strongest peaks. Below you can see the correlation plot for the last round key: Byte 0 xor Byte 1 (green), Byte 0 xor Byte 2 (orange), Byte 0 xor Byte 3 (blue). Could you please take another look at the PR?
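The PR itself carries the normalization; purely as an illustration of the idea (not necessarily what the script does), one way to make peak heights comparable across byte pairs is to standardize each correlation trace before taking the maximum (rhos below is hypothetical data):

import numpy as np

# Hypothetical correlation traces, one per byte pair.
rhos = [np.random.rand(500) for _ in range(3)]

# Standardize each trace so peaks are compared in units of its own
# standard deviation rather than raw correlation values.
rhos_norm = [(r - r.mean()) / r.std() for r in rhos]
peaks = [r.max() for r in rhos_norm]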
…tack This commit adds the possibility to sweep the number of traces used for the attack, which allows determining the minimum number of traces needed to successfully perform the attack. Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
When sweeping the number of traces used, most of the time is spent computing the m_alpha_j values. This commit parallelizes the corresponding code to speed up the attack. Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
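A sketch of the parallelization pattern (the real m_alpha_j computation lives in the script; m_alpha_j below is only a stand-in, and the dense trace array is deterministic dummy data):

import multiprocessing
import numpy as np

# Deterministic stand-in for the dense trace copy.
traces = np.arange(1000 * 128, dtype=np.double).reshape(1000, 128)

def m_alpha_j(j):
    # Stand-in for the real per-alpha computation on column j.
    return traces[:, j].mean()

if __name__ == '__main__':
    # Fan the per-index computations out over a process pool.
    with multiprocessing.Pool() as pool:
        results = pool.map(m_alpha_j, range(traces.shape[1]))
    print(len(results), 'values computed')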
Update: I've added functionality to sweep the number of traces used for the attack. This makes it possible to produce a plot like the following: the x-axis is the number of traces in steps of 100000, the y-axis is the percentage of correct guesses. The orange curve shows the correctly guessed key bytes (max 16), the green curve the correctly guessed key byte differences (max 120).
Based on yesterday's discussion, should we go ahead and merge this, @alphan?
Thank you and kudos for successfully implementing the attack @vogelpi! I think we should go ahead and merge this. We will revisit this as we make progress on the distributed implementation anyway.
for byte in range(15):
    # Take the most promising delta that involves the available
    # key bytes.
    for rho, delta, a, b in max_rho_deltas:
Would you mind adding a comment here saying that this is a single-pass heuristic? Since we choose the delta value for each byte sequentially from 0 to 15, we don't consider cases where byte x could use the delta value of byte y where y > x.
Yes sure. I'll add a comment here.
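The added comment could look something like this (max_rho_deltas is filled with dummy values just to make the snippet stand on its own):

import numpy as np

# Dummy (rho, delta, byte_a, byte_b) candidates; the real list is
# computed from the correlation peaks earlier in the script.
max_rho_deltas = [(0.91, 0x3a, 0, 1), (0.87, 0x5c, 0, 2)]

# Note: this is a single-pass heuristic. The delta value for each byte
# is chosen sequentially from 0 to 15, so cases where byte x could use
# the delta value of a byte y with y > x are never considered.
for byte in range(15):
    # Take the most promising delta that involves the available
    # key bytes.
    for rho, delta, a, b in max_rho_deltas:
        pass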
Thanks @alphan! I have now added a comment and force pushed, but the update isn't shown here. I don't understand why.
This attack is supposed to work both on unmasked AES implementations and on implementations using the masked Canright S-Box. It uses the existing simple_capture_traces.py for the capture stage.

Note: Currently, the attack is not successful. I suspect this is because of noise in the traces or a wrong configuration of the scope. I tried to use the ResyncSAD class of the ChipWhisperer API and also implemented some very basic filtering, both without success. I am now collecting fresh traces with the FTDI disconnected.

This is related to #11.
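The PR doesn't show details of the "very basic filtering" mentioned above; a simple moving-average low-pass over each trace, done with plain NumPy, would be one minimal variant (traces and width below are illustrative):

import numpy as np

traces = np.random.rand(100, 1000)  # stand-in for the captured traces
width = 5                           # filter width, an arbitrary choice

# Moving-average low-pass applied per trace; mode='same' keeps the
# original trace length.
kernel = np.ones(width) / width
filtered = np.array([np.convolve(t, kernel, mode='same') for t in traces])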