Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Organization and naming issues #10

Closed
bford opened this issue Jan 10, 2015 · 10 comments
Closed

Organization and naming issues #10

bford opened this issue Jan 10, 2015 · 10 comments
Assignees

Comments

@bford
Copy link
Contributor

bford commented Jan 10, 2015

There are a number of naming and package-structure changes that probably need to happen at some point, but they'll be somewhat invasive and touch a lot of code, and I'd like to get some discussion and feedback on these before committing to any of them. Roughly from highest-level to lower-level:

  • Top-level organization: Originally the main abstract interfaces such as 'Point' and 'Secret' lived in the top-level 'crypto' package. I moved them down into a package called 'abstract' basically so that the top-level package and its godoc-generated documentation could easily contain high-level "working examples" such as those in dh_test.go and enc_test.go without creating circular dependences. (Those examples need to depend on cipher suites in sub-packages, which in turn need to depend on the basic abstract interfaces; hence those examples and the abstract interfaces cannot live in the same package.) However, I've been having second thoughts as to whether this was a good tradeoff; those examples are probably not important enough to justify relegating the library's most crucial, keystone interfaces and definitions to a sub package that users have to dig for to find. So I'm thinking of moving the high-level examples to a sub-package, e.g., 'examples', and moving the contents of 'abstract' back up to the top-level. This means (if the name of the library/repo doesn't change) we'll need to refer to Point as 'crypto.Point' rather than 'abstract.Point', etc. Comments, preferences?
  • Crypto library and repository name: 'crypto' is very generic and probably suggests to people that it provides functionality at a comparable level to the many other "crypto libraries" such as SSL-crypto, NaCl, etc., which is inaccurate since this library's purpose is to provide much higher-level functionality (building on many of the primitives provided in other crypto libraries). The closest analog I'm aware of is the Charm library for Python (http://charm-crypto.com/Main.html), which wisely has a name other than simply 'crypto library'. Any good naming ideas? My current idea is 'curvy', since it's short and all about doing fun stuff with elliptic curves and (with the above reorganization) you'd get to write 'curvy.Point', 'curvy.Secret', etc.
  • Rename Secret to Scalar, to align the two main interfaces to "group elements" to common ECC nomenclature (where an element of the cryptographic group is a "point" and a secret that you multiply a point with to encrypt is typically called a "scalar"). Originally the library's interfaces used something closer to multiplicative group notation traditionally used in pre-ECC Diffie-Hellman etc, but the library has been gradually becoming more ECC-centric and hence incrementally migrating toward ECC and additive-group-centric nomenclature, but that process is still incomplete.
  • An even more fundamental change that's worth considering and discussing, though at the moment I'm not sure I'm in favor of it, is to combine Point and Scalar into one interface, e.g., 'Element', similar to the way the Stanford PBC (pairing-based crypto) library has only one abstract "group element" interface that's used for ECC points and scalars and everything. On the plus side, this makes some sense mathematically and might reduce code and interface duplication here and there. Code duplication might be reduced in certain functions that can be performed on either "points" or "scalars": some parts of the Verifiable Secret Sharing code in crypto/poly come to mind immediately. A significant downside is that eleminating the Point/Scalar type-distinction will significantly weaken the clarity of the distinction between these two kinds of things - which tend to play very different security roles even if they're mathematically very closely-related structures - thereby making code using these interfaces less "self-documenting" and weakening the compiler's ability to statically check the separation of these roles properly in code. Clearly some significant subjective tradeoffs and judgment calls are involved here.
  • Swap the order of the arguments to Point.Mul, so the Secret/Scalar comes first. This method was originally called Exp and its argument-order corresponded to the pre-ECC Diffie-Hellman expoentiation argument-order: e.g., "Exp(x,y)" means "x^y". But in additive-group ECC terminology and notation the scalar typically comes first in scalar multiplication: e.g., "sP" for multiplying point P by the scalar s. Also, P is often nil, and it just looks a bit better for the second argument to be nil than the first.
  • Fix the inconsistency between the method signatures of Secret.Pick and Point.Pick (the latter supports embedded data, the former does not). This could be done in at least two ways: (a) consider the latter to be a different function and rename it to, say, Embed(); (b) do the same as (a) but also have a one-argument point.Pick(rand) that's equivalent to point.Embed(nil, rand); (c) keep the Pick name for both but add data-embedding functionality to Secret/Scalar (which would be technically pretty trivial though I'm not sure how often this capability would actually get used). Thoughts/preferences?

More such possible changes might come to mind, which I'll add to this thread as they do, and feel free to suggest others. I'd like to decide on a batch of such invasive changes and apply them about the same time, to minimize the painfulness of updating all the code that depends on these. Thanks.

@bford bford assigned bford and davidiw and unassigned bford Jan 10, 2015
@davidiw
Copy link
Contributor

davidiw commented Jan 13, 2015

Long issue, in general, it would probably be easier to address these as
separate issues, but let's see where this goes :)

  • I agree, it would make sense to keep the core functionality as a single
    package and then separate examples into a separate package
    (crypto/examples, or something of the like). Ideally examples should mimic
    user code as much as possible.
  • Is this a "advanced cryptographic primitives for Go"? I haven't followed
    the code as closely as I ought to have, but are Integers gone from the
    repository is it truly only a EC system? I do like integers as they're a
    bit more intuitive, but at the same time, their performance tends to not be
    so great (i.e., suck). I think curvy is a cool name and perhaps it doesn't
    matter much if the official name is intuitive ... after all Charm does not
    bring to mind crypto :)
  • Do something and do it well. Although now that you've gotten as far as
    you have with this library 1) given all the different applications, how
    hard is it to maintain both functionality and 2) how portable are crypto
    theory papers written for one group portable to another, Neff shuffle, for
    example.
  • This is a bit tricky. It has been a while since I had to deal with some
    of these nuances, but I thought to some extent this was handled by your
    AbstractGroup interface. I'm in favor of extra code if it protects users,
    especially for a security library, but reducing very nearly identical code
    can make audits and reviews easier and perhaps improve understanding of the
    code base. Could you show a concrete example of this reduction?
  • Regarding mul vs exp, if this library is to be curve centric, it would
    make more sense. I do recall quite a many Windows APIs that have many NULL
    fields mixed in with non-null fields, so if you keep it this way, you're in
    "good company" :).
  • Pick seems to be analogous to constructor producing an object with random
    values. Where did you get the nomenclature for Pick? If all the generation
    of random keys, secrets, etc all use the name pick, it becomes equivalent
    to "GenerateRandom(...)", right? I personally don't think you need to worry
    about "constructor" consistency. I'd also avoid adding code paths that
    either aren't or ought not to be used. I do find it strange that the
    parameters are not in a consistent order (byte[], cipherstream) vs
    (cipherstream).

Off topic: I'll admit, I'm jumping into these questions being 6 months
removed from Go and the library, but hopefully this pushes discussion along
:). I'll spend some time over the next week or so further diving into these
issues and understanding the code base.

On Sat, Jan 10, 2015 at 11:22 AM, Bryan Ford notifications@github.com
wrote:

There are a number of naming and package-structure changes that probably
need to happen at some point, but they'll be somewhat invasive and touch a
lot of code, and I'd like to get some discussion and feedback on these
before committing to any of them. Roughly from highest-level to lower-level:

Top-level organization: Originally the main abstract interfaces such
as 'Point' and 'Secret' lived in the top-level 'crypto' package. I moved
them down into a package called 'abstract' basically so that the top-level
package and its godoc-generated documentation could easily contain
high-level "working examples" such as those in dh_test.go and enc_test.go
without creating circular dependences. (Those examples need to depend on
cipher suites in sub-packages, which in turn need to depend on the basic
abstract interfaces; hence those examples and the abstract interfaces
cannot live in the same package.) However, I've been having second thoughts
as to whether this was a good tradeoff; those examples are probably not
important enough to justify relegating the library's most crucial, keystone
interfaces and definitions to a sub package that users have to dig for to
find. So I'm thinking of moving the high-level examples to a sub-package,
e.g., 'examples', and moving the content s of 'abstract' back up to the
top-level. This means (if the name of the library/repo doesn't change)
we'll need to refer to Point as 'crypto.Point' rather than
'abstract.Point', etc. Comments, preferences?

Crypto library and repository name: 'crypto' is very generic and
probably suggests to people that it provides functionality at a comparable
level to the many other "crypto libraries" such as SSL-crypto, NaCl, etc.,
which is inaccurate since this library's purpose is to provide much
higher-level functionality (building on many of the primitives provided in
other crypto libraries). The closest analog I'm aware of is the Charm
library for Python (http://charm-crypto.com/Main.html), which wisely
has a name other than simply 'crypto library'. Any good naming ideas? My
current idea is 'curvy', since it's short and all about doing fun stuff
with elliptic curves and (with the above reorganization) you'd get to write
'curvy.Point', 'curvy.Secret', etc.

Rename Secret to Scalar, to align the two main interfaces to "group
elements" to common ECC nomenclature (where an element of the cryptographic
group is a "point" and a secret that you multiply a point with to encrypt
is typically called a "scalar"). Originally the library's interfaces used
something closer to multiplicative group notation traditionally used in
pre-ECC Diffie-Hellman etc, but the library has been gradually becoming
more ECC-centric and hence incrementally migrating toward ECC and
additive-group-centric nomenclature, but that process is still incomplete.

An even more fundamental change that's worth considering and
discussing, though at the moment I'm not sure I'm in favor of it, is to
combine Point and Scalar into one interface, e.g., 'Element', similar to
the way the Stanford PBC (pairing-based crypto) library has only one
abstract "group element" interface that's used for ECC points and scalars
and everything. On the plus side, this makes some sense mathematically and
might reduce code and interface duplication here and there. Code
duplication might be reduced in certain functions that can be performed on
either "points" or "scalars": some parts of the Verifiable Secret Sharing
code in crypto/poly come to mind immediately. A significant downside is
that eleminating the Point/Scalar type-distinction will significantly
weaken the clarity of the distinction between these two kinds of things -
which tend to play very different security roles even if they're
mathematically very closely-related structures - thereby maki ng code using
these interfaces less "self-documenting" and weakening the compiler's
ability to statically check the separation of these roles properly in code.
Clearly some significant subjective tradeoffs and judgment calls are
involved here.

Swap the order of the arguments to Point.Mul, so the Secret/Scalar
comes first. This method was originally called Exp and its argument-order
corresponded to the pre-ECC Diffie-Hellman expoentiation argument-order:
e.g., "Exp(x,y)" means "x^y". But in additive-group ECC terminology and
notation the scalar typically comes first in scalar multiplication: e.g.,
"sP" for multiplying point P by the scalar s. Also, P is often nil, and it
just looks a bit better for the second argument to be nil than the first.

Fix the inconsistency between the method signatures of Secret.Pick and
Point.Pick (the latter supports embedded data, the former does not). This
could be done in at least two ways: (a) consider the latter to be a
different function and rename it to, say, Embed(); (b) do the same as (a)
but also have a one-argument point.Pick(rand) that's equivalent to
point.Embed(nil, rand); (c) keep the Pick name for both but add
data-embedding functionality to Secret/Scalar (which would be technically
pretty trivial though I'm not sure how often this capability would actually
get used). Thoughts/preferences?

More such possible changes might come to mind, which I'll add to this
thread as they do, and feel free to suggest others. I'd like to decide on a
batch of such invasive changes and apply them about the same time, to
minimize the painfulness of updating all the code that depends on these.
Thanks.


Reply to this email directly or view it on GitHub
#10.

@bford
Copy link
Contributor Author

bford commented Jan 13, 2015

On Jan 13, 2015, at 1:03 PM, David Isaac Wolinsy notifications@github.com wrote:

Long issue, in general, it would probably be easier to address these as
separate issues, but let's see where this goes :)

Great, thanks for the feedback. (And anyone else is welcome to chime in too.)

  • I agree, it would make sense to keep the core functionality as a single
    package and then separate examples into a separate package
    (crypto/examples, or something of the like). Ideally examples should mimic
    user code as much as possible.
  • Is this a "advanced cryptographic primitives for Go"? I haven't followed
    the code as closely as I ought to have, but are Integers gone from the
    repository is it truly only a EC system? I do like integers as they're a
    bit more intuitive, but at the same time, their performance tends to not be
    so great (i.e., suck). I think curvy is a cool name and perhaps it doesn't
    matter much if the official name is intuitive ... after all Charm does not
    bring to mind crypto :)

There is still support for integer groups - see crypto/nist/residue.go. I think it’s probably worth keeping this in the library at least for now but I’ve been considering integer-group support to be increasingly “deprecated” since ECC has matured and most everything interesting is gradually shifting that way.

I’d certainly be open to a less purely ECC-centric name for the library though (“Charm” is certainly suitably generic). Another idea that occured to me is “crykit” - which could be pronounced like either “cricket” or “cry-kit” - just meaning an advanced crypto toolkit. But if you or anyone has better naming ideas, by all means please suggest them.

  • Do something and do it well. Although now that you've gotten as far as
    you have with this library 1) given all the different applications, how
    hard is it to maintain both functionality and 2) how portable are crypto
    theory papers written for one group portable to another, Neff shuffle, for
    example.

The nice thing is that with this framework, as far as I know, the high-level crypto functionality like Neff shuffles are completely portable across groups, and there’s already significant functionality in the library to test these functions against multiple groups (e.g., NIST curves versus Edwards curves versus integer residue groups). I definitely want to maintain that.

  • This is a bit tricky. It has been a while since I had to deal with some
    of these nuances, but I thought to some extent this was handled by your
    AbstractGroup interface. I'm in favor of extra code if it protects users,
    especially for a security library, but reducing very nearly identical code
    can make audits and reviews easier and perhaps improve understanding of the
    code base. Could you show a concrete example of this reduction?

What exactly is the “this” that you’re saying is “a bit tricky”? And what are “these nuances” that you’re referring to? What extra code? What “nearly identical code” do you have in mind that can/should be reduced?

  • Regarding mul vs exp, if this library is to be curve centric, it would
    make more sense. I do recall quite a many Windows APIs that have many NULL
    fields mixed in with non-null fields, so if you keep it this way, you're in
    "good company" :).

What would make more sense? I think you’re neglecting to preserve some critical context here.

  • Pick seems to be analogous to constructor producing an object with random
    values. Where did you get the nomenclature for Pick? If all the generation
    of random keys, secrets, etc all use the name pick, it becomes equivalent
    to "GenerateRandom(...)", right? I personally don't think you need to worry
    about "constructor" consistency. I'd also avoid adding code paths that
    either aren't or ought not to be used. I do find it strange that the
    parameters are not in a consistent order (byte[], cipherstream) vs
    (cipherstream).

I settled on the name “Pick” mainly because it seemed to express the idea (picking a group element or curve point, picking a secret), and was shorter and simpler than “GenerateRandom”. :) I’m tentatively in favor of normalizing the argument signature one way or another, perhaps by adding (very simple) data-embedding support to the Secret interface.

Off topic: I'll admit, I'm jumping into these questions being 6 months
removed from Go and the library, but hopefully this pushes discussion along
:). I'll spend some time over the next week or so further diving into these
issues and understanding the code base.

Sounds good, thanks. We don’t need to make decisions on these items quickly; just wanted to get them out there and we can plan how and when to address them for real later.

B

samanklesaria pushed a commit to samanklesaria/crypto that referenced this issue Mar 24, 2015
@bford
Copy link
Contributor Author

bford commented Mar 10, 2017

@ineiti @nikkolasg I just wanted to remind you about this now quite old issue, which brought up a number of questions to consider related to reorganization and stabilization of the crypto library. Some we're already on top of (e.g., renaming the top-level), but there are others that I'm not sure we've actually resolved yet (e.g., consistent argument ordering and Pick methods etc).

@nikkolasg
Copy link
Collaborator

Swap the order of the arguments to Point.Mul, so the Secret/Scalar comes first. This method was originally called Exp and its argument-order corresponded to the pre-ECC Diffie-Hellman expoentiation argument-order: e.g., "Exp(x,y)" means "x^y". But in additive-group ECC terminology and notation the scalar typically comes first in scalar multiplication: e.g., "sP" for multiplying point P by the scalar s. Also, P is often nil, and it just looks a bit better for the second argument to be nil than the first.

I'm definitely OK with that.

Fix the inconsistency between the method signatures of Secret.Pick and Point.Pick (the latter supports embedded data, the former does not). This could be done in at least two ways: (a) consider the latter to be a different function and rename it to, say, Embed(); (b) do the same as (a) but also have a one-argument point.Pick(rand) that's equivalent to point.Embed(nil, rand); (c) keep the Pick name for both but add data-embedding functionality to Secret/Scalar (which would be technically pretty trivial though I'm not sure how often this capability would actually get used). Thoughts/preferences?

I'm definitely more for the c) option. Having Pick which only creates an element from a given stream, and Embed which takes a slice of bytes and stores it internally (depending on the structure: a scalar is would just set its internal representation to the given slice, a point would use something like Elligator encoding if available).
This needs also to be thought in conjunction with the SetBytes method added to both types (scalar + point). The interfaces are starting to be a bit too bloated with bytes-level methods: Pick, Embed, SetBytes (un)MarshalBinary. I'm afraid it will lead to more confusion if we keep all four...

@bford
Copy link
Contributor Author

bford commented May 30, 2017

Agreed on most of the above.

On the proliferation of "bytes-level methods": I agree that this is a concern in principle, but in my current understanding these three sets of bytes-level methods each address real and semantically distinct needs. In my understanding:

  • (un)MarshalBinary is for serializing and deserializing scalars/points to/from an appropriate binary representation, and hence applies equally to Point and Scalar. Marshaled representation is fixed-size: these methods require, and consume, a particular fixed number of bytes, and produce errors if given the wrong number of bytes or if the encoded bytes aren't a proper, canonical representation of a Point or Scalar.

  • Bytes/SetBytes is for accessing the byte-encoded numeric representation of an integer, and hence (I seem to recall) applies only to Scalar and not Point. SetBytes is deliberately more forgiving than UnmarshalBinary in what it takes in: for example, you can give SetBytes a number (much) bigger than the scalar modulus, and it will silently apply the modulus, as commonly needed for checking Ed25519 signatures and the like. These methods could in principle be dispensed with by (for example) first creating a big.Int, applying the modulus first in that context, then exporting that via big.Int.Bytes() into a marshaled representation of the correct type, and turning that into the correct scalar via UnMarshalBinary(). But that's just a bit of a pain, and having Bytes/SetBytes calls in Scalar that mirror those same relevant calls in the standard big.Int package seems to make sense given that a Scalar is just an integer in a particular modular group.

  • Pick/Embed are not for encoding/decoding group elements (either Points or Scalars) but for choosing them either uniformly at random (no embedded data) or so as to encode information (with embedded data). These are semantically very different from Marshal/UnMarshal and Bytes/SetBytes in that there is no expectation that a given "embedded data input" should completely define the output: to the contrary, it is expected (and guaranteed in the case of Point) that the embedded data input effectively underspecifies the output, leaving at least some "leeway" for Pick to use additional randomness to pick a group element that embeds the relevant data. This functionality is also important in various cases, namely ElGamal encryption and creating hash-generators. However, it might well be argued that picking an element uniformly at random might and perhaps should be separated from the functionality of embedding data in a (only-partly-random or even non-random) group element.

@ineiti
Copy link
Member

ineiti commented May 31, 2017

This should be copy/pasted to the documentation!

@nikkolasg
Copy link
Collaborator

Yep I agree with all of the above. And for the moment, let's continue like that but one way to reduce the size of the interfaces would be to do that Element interface for which we've been talking for a while. It would clearly separate the arithmetic capabilities from the encoding/bytes-level capabilities.

@nikkolasg
Copy link
Collaborator

I realized while implementing it that we don't need a Embed method for a Scalar if we keep SetBytes. Their actual implementation are exactly the same, no ?

@nikkolasg
Copy link
Collaborator

This issue is more complex than I thought, I'm realizing it while trying to code it...
I understand now why the random.Stream was needed even to embed a point.
I'm now more in favor of your option b), i.e. having a Pick(rand) which is just a call to Embed(nil,rand) since both are using the same implementation.
Elligator encoding should be defined somewhere else as it's not its purpose. I remember seeing some Hiding interfaces somewhere that may be the place...

@nikkolasg
Copy link
Collaborator

I'm closing this issue as most of its items have been implemented in the sub-issues listed in #151

armfazh pushed a commit to armfazh/kyber that referenced this issue Apr 13, 2023
higher threshold / n for resharing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants