rewrite CBC unpad without data-dependent branches, mitigates lucky 13 (s... #49
Maybe I'm missing something, but as far as I can see, there are still data-dependent branches in this code.
Performing the same computation on both branches of a match is not sufficient, as one may still be able to observe branch mispredictions.
Also, pattern matching is optimized:

```ocaml
match x, y with
| true , true  -> e1
| true , false -> e2
| false, _     -> e3
```

is compiled into something like:

```ocaml
if x then (if y then e1 else e2) else e3
```
so the execution time is not the same depending on the values of `x` and `y`.
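For contrast, here is a minimal sketch (my own, not the PR's code) of the branch-free style this comment is asking for: a comparison whose loop always runs over the full input and accumulates its verdict with integer operations only, so the same instructions execute whether the inputs match or not. The name `ct_equal` is hypothetical.

```ocaml
(* Branch-free equality check on strings of equal (public) length.
   The early length test branches only on public data; the per-byte
   work never branches on the contents. *)
let ct_equal (a : string) (b : string) : bool =
  if String.length a <> String.length b then false
  else begin
    let acc = ref 0 in
    for i = 0 to String.length a - 1 do
      (* lxor is 0 only for equal bytes; lor accumulates any mismatch *)
      acc := !acc lor (Char.code a.[i] lxor Char.code b.[i])
    done;
    !acc = 0
  end
```

Even here the final `!acc = 0` hands a secret-derived boolean back to the caller, so the surrounding code must also avoid branching on it prematurely.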
On lines 288/290, the call to
Finally, on lines 292-294, an allocation is performed in case of success, but not in case of failure.
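To illustrate the allocation point, here is a hypothetical sketch (not the PR's lines 288-294) of a TLS-style CBC unpad that scans a fixed window, masks with arithmetic rather than branches, and performs the output allocation before deciding success or failure. It assumes a nonempty record and 63-bit native ints (a 64-bit platform).

```ocaml
(* In TLS, a pad byte p is followed at the end of the record by p + 1
   copies of p.  Scan the last 256 positions unconditionally and mask
   out the irrelevant ones with arithmetic, not branches. *)
let unpad (data : string) : string option =
  let len = String.length data in
  let pad = Char.code data.[len - 1] in
  (* an impossible pad length is flagged up front; a production version
     would fold this into the mask as well *)
  let bad = ref (if pad + 1 > len then 1 else 0) in
  for i = 0 to min 255 (len - 1) do
    let b = Char.code data.[len - 1 - i] in
    (* in_pad = 1 when i <= pad, else 0, computed without a branch
       (relies on 63-bit OCaml ints) *)
    let in_pad = ((i - pad - 1) asr 62) land 1 in
    bad := !bad lor (in_pad * (b lxor pad))
  done;
  (* allocate the candidate plaintext on both paths, so success and
     failure do comparable work; a production version would also copy a
     fixed maximum length rather than a pad-dependent one *)
  let out_len = max 0 (len - pad - 1) in
  let result = String.sub data 0 out_len in
  if !bad = 0 then Some result else None
```

This only demonstrates the shape of the fix; real constant-time guarantees in OCaml still depend on what the compiler emits.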
I'm wondering if this could be solved via Lwt instead of a fine-grained solution like below. Could we wrap a block of code into an Lwt thread that would guarantee that it would take a constant amount of time, for example by dropping into the scheduler and sleeping on a timer wheel? It's a heavyweight approach, but at least responses could only be observed at the resolution of the timer wheel.
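The same principle can be shown in miniature without Lwt: pad every call out to a fixed time budget, so the observable latency stops depending on which internal path was taken. This is only a sketch of the idea with a hypothetical wrapper name; a real version would sleep on an Lwt timer (e.g. `Lwt_unix.sleep`) rather than busy-wait, and padding only helps while the work stays under the budget.

```ocaml
(* Run f x, then busy-wait until at least [budget] seconds of CPU time
   have elapsed since the call began, so fast and slow paths look the
   same to a caller with a stopwatch. *)
let with_time_floor (budget : float) (f : 'a -> 'b) (x : 'a) : 'b =
  let start = Sys.time () in
  let result = f x in
  while Sys.time () -. start < budget do () done;
  result
```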
On 10 Apr 2014, at 21:07, Jerome Vouillon email@example.com wrote:
The Lwt approach might be a little heavy, but my inclination goes in that general direction: instead of making sure all code paths are the same length, sometimes at increasing cost to code simplicity, and often just guessing about what really happens below us, I'm trying to figure out how to introduce revertible random workloads into the pipeline.
I was greatly inspired by this (standard) RSA timing mitigation. It exploits the fact that RSA is homomorphic with respect to multiplication: fold a random nonce into the ciphertext, decrypt the whole masked lot, and then extract the actual result. This way, the otherwise very timeable RSA becomes much more difficult to steal secrets from with a stopwatch.
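The algebra behind that blinding trick can be shown with the textbook toy key p = 61, q = 53 (so n = 3233, e = 17, d = 2753, phi = 3120). Nothing below is cryptographically real, and the fixed blinding value stands in for a fresh random one; it only demonstrates that decrypting c·r^e and then stripping r recovers m.

```ocaml
let n = 3233
let e = 17
let d = 2753

(* modular exponentiation: b^x mod m, by repeated squaring *)
let rec power b x m =
  if x = 0 then 1
  else
    let h = power b (x / 2) m in
    let h2 = h * h mod m in
    if x mod 2 = 0 then h2 else h2 * b mod m

(* r^-1 mod n via Euler: r^(phi - 1) mod n, valid when gcd(r, n) = 1 *)
let inverse r = power r 3119 n

(* [r] stands in for a fresh random blinding value; fixed here only so
   the example is deterministic *)
let blinded_decrypt ?(r = 7) c =
  let c' = c * power r e n mod n in   (* blind: c becomes c * r^e      *)
  let m' = power c' d n in            (* the timeable step runs on c'  *)
  m' * inverse r mod n                (* unblind: m * r * r^-1 = m     *)
```

Since (c·r^e)^d = c^d·r^(ed) = m·r (mod n), the decryption's timing is decorrelated from the attacker-chosen c, yet the caller still gets m.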
I don't have too many concrete ideas yet, but this is the direction my thinking goes in. This kind of mitigation is much more global, and thus more modular: half way to Lwt-level, but still within time bounds related to the algorithm itself, and not an external scheduler.
@pqwy yes, blinding is well known for asymmetric crypto (but then, https://twitter.com/agl__/status/453471651779182592 might be of interest as well)
strategy here should be: define threat model (do we care about local attackers, or only those on the network?); implement attacker; measure consumed time for possible inputs; show results.
relying on an external scheduler will mitigate against a network attacker, but certainly not against a local one (due to other side channels, such as power consumption).
@hannesm Exactly. Threat model.
In my mind, the threat model for the first cut of the code is strictly a network attacker. Cache eviction, power measurements and other local side-channel subtleties are, at least in my view, something to reconsider only after having a full stack capable of working on the network.
This is because some of the primary targets for the code are small arm boxes which run a dedicated process, or cloud vms running multiple tenants, where these can't know who's sharing the cpu with whom and which can transparently migrate, thus largely voiding local-attack assumptions.
This is also because subtler side channels are a complex topic in themselves, possibly larger than the tls itself. ( If someone can measure cpu power, they can probably spray the ram chips with liquid helium as well, and dump the contents before the memory cells lose the information. This particular threat model then includes potential full memory compromise as well. Now what? That's one. The other is that writing code that bets on exact generated instructions departs from the greatest benefits OCaml can confer; we could do it in Intercal then and it would read about the same. )
Doing differential timings (contingent on varying secrets) is something I'm dying to try out. But I firmly believe that ground functionality should be put in place before going there. Else focus disperses and we end up randomly chasing loose ends without making core progress.
So, as for the threat model and mitigations, I would really, really love if we first did a tls that worked correctly on the logic level (threat model is Eve), proceeding into more serious timing mitigations only after that (with Mallory living on the network), and considering local (or physical) side channels firmly as a third (Cpu is Mallory? Or is she the power cord now?).
I'd like to get mundane stuff done first. My fear is that otherwise we'll never finish.
On 13 Apr 2014, at 17:12, pqwy firstname.lastname@example.org wrote:
more recent code, an attempt at Lucky13 mitigation in Python: https://github.com/tomato42/tlslite-ng/commit/d7b288316bca7bcdd082e6ccff5491e241305233.patch and https://github.com/tomato42/tlslite-ng/commit/60d4f1c6ea70308f07f4999626bb60a81758eb17.patch