Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feed_string #30

Closed
cfcs opened this issue May 29, 2018 · 11 comments
Closed

feed_string #30

cfcs opened this issue May 29, 2018 · 11 comments
Assignees

Comments

@cfcs
Copy link

cfcs commented May 29, 2018

Would be nice to have a feed_string function in addition to feed_bytes so we don't have to allocate a Bytes.t to hash a string.

@cfcs
Copy link
Author

cfcs commented May 29, 2018

feed_substring and feed_subbytes would also be handy (and so would something that can feed from slices of Bigstring.t, but I can't find the definition of that module right now).

@dinosaure dinosaure self-assigned this May 29, 2018
@dinosaure
Copy link
Member

An interface like this should be better:

type 'a iter = ('a -> unit) -> unit

module type S = sig

  val digest_size : int

  type ctx
  type t = string

  val init: unit -> ctx

  val feed_bytes: ctx -> Bytes.t -> ctx
  val feed_string: ctx -> String.t -> ctx
  val feed_bigstring: ctx -> Bigstring.t -> ctx

  val feedi_bytes: ctx -> Bytes.t iter -> ctx
  val feedi_string: ctx -> String.t iter -> ctx
  val feedi_bigstring: ctx -> Bigstring.t iter -> ctx

  val get: ctx -> t

  val digest_bytes: Bytes.t -> t
  val digest_string: String.t -> t
  val digest_bigstring: Bigstring.t -> t

  val digesti_bytes: Bytes.t iter -> t
  val digesti_string: String.t iter -> t
  val digesti_bigstring: Bigstring.t iter -> t

  val digestv_bytes: Bytes.t list -> t
  val digestv_string: String.t list -> t
  val digestv_bigstring: Bigstring.t list -> t

  val hmac_bytes: key:Bytes.t -> Bytes.t -> a
  val hmac_string: key:String.t -> String.t -> t
  val hmac_bigstring: key:Bigstring.t -> Bigstring.t -> t

  val hmaci_bytes: key:Bytes.t -> Bytes.t iter -> t
  val hmaci_string: key:String.t -> String.t iter -> t
  val hmaci_bigstring: key:Bigstring.t -> Bigstring.t iter -> t

  val hmacv_bytes: key:Bytes.t -> Bytes.t list -> t
  val hmacv_string: key:String.t -> String.t list -> t
  val hmacv_bigstring: key:Bigstring.t -> Bigstring.t list -> t

  val compare: t -> t -> int
  val eq: t -> t -> bool
  val neq: t -> t -> bool

  val pp: Format.formatter -> t -> unit

  val of_hex: string -> t
  val to_hex: t -> string
end

Isn't it ?

@dinosaure
Copy link
Member

Arf, I forget *_sub* functions.

@cfcs
Copy link
Author

cfcs commented May 30, 2018

@dinosaure what does this mean (the other hmac_ functions return t?):

  val hmac_bytes: key:Bytes.t -> Bytes.t -> a

@cfcs
Copy link
Author

cfcs commented May 30, 2018

I think I would make the _sub functions using optional arguments to avoid doubling the amount of functions in the interface?

@cfcs
Copy link
Author

cfcs commented May 30, 2018

did you intentionally leave out feedv_ (feeding strings)?

@cfcs
Copy link
Author

cfcs commented May 30, 2018

@hannesm do you have any comments on this API?

@cfcs
Copy link
Author

cfcs commented May 30, 2018

I'm not sure that neq is needed here, wouldn't it be easier to just wrap eq in not ( .. ) ?

Speaking of interface, it would be useful if eq was constant-time so it could be used in security-sensitive applications (see https://codahale.com/a-lesson-in-timing-attacks/ for an example of a bug caused by non-constant comparison).

EDIT: A naive implementation could look something like

let eq a b =
  if (len a) <> (len b) then false else
  let rec comp acc idx =
    if idx = ~-1 then (acc = 0) else
    let c_eq = (get a idx) lxor (get b idx) in (* 0 if equal *)
    comp (acc lor c_eq) (pred idx)
  in comp 0 (len a -1)

@dinosaure
Copy link
Member

Yeah, if you have any issue about security, it could be the best time to explain it now. The next release will break the API and I'm not aware about security but I can spend my time on it. Then, from what I did, digest{,i,v}, hmac{,i,v} and feed{i,v} are a subset of feedi.

About hmac_bytes, I miswrote, sorry.

Optional argument about _sub could be, indeed, a good solution to avoid duplicate code. My idea is to start from feedi_* function and, by a functor, generate all others functions - and by this way surely provide same functions with same semantics independently from the hash implementation.

@cfcs
Copy link
Author

cfcs commented May 30, 2018

@dinosaure apart from timing attacks I can't think of any security problems as long as the implementations are correct. Timing attacks work by learning information about the value being hashed by observing the time required to perform an operation. If the implementations take different amount of times depending on the nature of the data, the attacker can sometimes extrapolate the data from these timings (which is bad).

  • It is common for implementations to ignore mitigations against learning the length of the input, which is unfortunate, but still "industry practice" (no one is going complain if Digestif does the same), and mitigating this type of attack is not trivial and comes with some overhead. I'd suggest we ignore that.
  • What should be handled by Digestif however is that hashing two equal-length strings should take the same amount of time - hashing \x00\x00\x00\x00\x00 ... should take the same amount of time as \xFF \xFF \xFF \xFF .... Usually that is not very hard to do since the reference implementations are already designed to accomodate this. I would be very surprised if the current implementation of Digestif is vulnerable to this type of timing attack.
  • When I mention eq, it is because a common use of this function would be to check something like: hash secret_key = hash user_input. If the attacker can calculate hash user_input on their own, and they can learn the shared prefix by timing, they can adjust their user_input variable to eventually learn the value of hash secret_key.
    • In the light of this I wonder if it would be a problem to cripple the compare function to use the constant-time eq under the hood and always return 0/-1, or 0/1. It would make it impossible to use Digestif for ordered hash maps, but I'm not sure that these slow cryptographic hashes are useful for ordered maps in the first place, and it would make it harder for users to inadvertantly introduce timing attacks in their applications. What do you think?

@dinosaure
Copy link
Member

Close by #31

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants