Merge pull request #290 from tomato42/docs-updates

Docs updates
tlsfuzzer · Jul 9, 2022 · 3a8bc4e · 3a8bc4e
2 parents 29a3cd0 + 1943ef3
commit 3a8bc4e
Show file tree

Hide file tree

Showing 11 changed files with 662 additions and 97 deletions.
diff --git a/docs/source/basics.rst b/docs/source/basics.rst
@@ -0,0 +1,162 @@
+======================
+Basics of ECC handling
+======================
+
+The :term:`ECC`, as any asymmetric cryptography system, deals with private
+keys and public keys. Private keys are generally used to create signatures,
+and are kept, as the name suggest, private. That's because possession of a
+private key allows creating a signature that can be verified with a public key.
+If the public key is associated with an identity (like a person or an
+institution), possession of the private key will allow to impersonate
+that identity.
+
+The public keys on the other hand are widely distributed, and they don't
+have to be kept private. The primary purpose of them, is to allow
+checking if a given signature was made with the associated private key.
+
+Number representations
+======================
+
+On a more low level, the private key is a single number, usually the
+size of the curve size: a NIST P-256 private key will have a size of 256 bits,
+though as it needs to be selected randomly, it may be a slightly smaller
+number (255-bit, 248-bit, etc.).
+Public points are a pair of numbers. That pair specifies a point on an
+elliptic curve (a pair of integers that satisfy the curve equation).
+Those two numbers are similarly close in size to the curve size, so both the
+``x`` and ``y`` coordinate of a NIST P-256 curve will also be around 256 bit in
+size.
+
+.. note::
+   To be more precise, the size of the private key is related to the
+   curve *order*, i.e. the number of points on a curve. The coordinates
+   of the curve depend on the *field* of the curve, which usually means the
+   size of the *prime* used for operations on points. While the *order* and
+   the *prime* size are related and fairly close in size, it's possible
+   to have a curve where either of them is larger by a bit (i.e.
+   it's possible to have a curve that uses a 256 bit *prime* that has a 257 bit
+   *order*).
+
+Since normally computers work with much smaller numbers, like 32 bit or 64 bit,
+we need to use special approaches to represent numbers that are hundreds of
+bits large.
+
+First is to decide if the numbers should be stored in a big
+endian format, or in little endian format. In big endian, the most
+significant bits are stored first, so a number like :math:`2^{16}` is saved
+as a three bytes: byte with value 1 and two bytes with value 0.
+In little endian format the least significant bits are stored first, so
+the number like :math:`2^{16}` would be stored as three bytes:
+first two bytes with value 0, than a byte with value 1.
+
+For :term:`ECDSA` big endian encoding is usually used, for :term:`EdDSA`
+little endian encoding is usually used.
+
+Secondly, we need to decide if the numbers need to be stored as fixed length
+strings (zero padded if necessary), or if they should be stored with
+minimal number of bytes necessary.
+That depends on the format and place it's used, some require strict
+sizes (so even if the number encoded is 1, but the curve used is 128 bit large,
+that number 1 still needs to be encoded with 16 bytes, with fifteen most
+significant bytes equal zero).
+
+Public key encoding
+===================
+
+Generally, public keys (i.e. points) are expressed as fixed size byte strings.
+
+While public keys can be saved as two integers, one to represent the
+``x`` coordinate and one to represent ``y`` coordinate, that actually
+provides a lot of redundancy. Because of the specifics of elliptic curves,
+for every valid ``x`` value there are only two valid ``y`` values.
+Moreover, if you have an ``x`` value, you can compute those two possible
+``y`` values (if they exist).
+As such, it's possible to save just the ``x`` coordinate and the sign
+of the ``y`` coordinate (as the two possible values are negatives of
+each-other: :math:`y_1 == -y_2`).
+
+That gives us few options to represent the public point, the most common are:
+
+1. As a concatenation of two fixed-length big-endian integers, so called
+   :term:`raw encoding`.
+2. As a concatenation of two fixed-length big-endian integers prefixed with
+   the type of the encoding, so called :term:`uncompressed` point
+   representation (the type is represented by a 0x04 byte).
+3. As a fixed-length big-endian integer representing the ``x`` coordinate
+   prefixed with the byte representing the combined type of the encoding
+   and the sign of the ``y`` coordinate, so called :term:`compressed`
+   point representation (the type is then represented by a 0x02 or a 0x03
+   byte).
+
+Interoperable file formats
+==========================
+
+Now, while we can save the byte strings as-is and "remember" which curve
+was used to generate those private and public keys, interoperability usually
+requires to also save information about the curve together with the
+corresponding key. Here too there are many ways to do it:
+save the parameters of the used curve explicitly, use the name of the
+well-known curve as a string, use a numerical identifier of the well-known
+curve, etc.
+
+For public keys the most interoperable format is the one described
+in RFC5912 (look for SubjectPublicKeyInfo structure).
+For private keys, the RFC5915 format (also known as the ssleay format)
+and the PKCS#8 format (described in RFC5958) are the most popular.
+
+All three formats effectively support two ways of providing the information
+about the curve used: by specifying the curve parameters explicitly or
+by specifying the curve using ASN.1 OBJECT IDENTIFIER (OID), which is
+called ``named_curve``. ASN.1 OIDs are a hierarchical system of representing
+types of objects, for example, NIST P-256 curve is identified by the
+1.2.840.10045.3.1.7 OID (in dotted-decimal formatting of the OID, also
+known by the ``prime256v1`` OID node name or short name). Those OIDs
+uniquely, identify a particular curve, but the receiver needs to know
+which numerical OID maps to which curve parameters. Thus the prospect of
+using the explicit encoding, where all the needed parameters are provided
+is tempting, the downside is that curve parameters may specify a *weak*
+curve, which is easy to attack and break (that is to deduce the private key
+from the public key). To verify curve parameters is complex and computationally
+expensive, thus generally protocols use few specific curves and require
+all implementations to carry the parameters of them. As such, use of
+``named_curve`` parameters is generally recommended.
+
+All of the mentioned formats specify a binary encoding, called DER. That
+encoding uses bytes with all possible numerical values, which means it's not
+possible to embed it directly in text files. For uses where it's useful to
+limit bytes to printable characters, so that the keys can be embedded in text
+files or text-only protocols (like email), the PEM formatting of the
+DER-encoded data can be used. The PEM formatting is just a base64 encoding
+with appropriate header and footer.
+
+Signature formats
+=================
+
+Finally, ECDSA signatures at the lowest level are a pair of numbers, usually
+called ``r`` and ``s``. While they are the ``x`` coordinates of special
+points on the curve, they are saved modulo *order* of the curve, not
+modulo *prime* of the curve (as a coordinate needs to be).
+
+That again means we have multiple ways of encoding those two numbers.
+The two most popular formats are to save them as a concatenation of big-endian
+integers of fixed size (determined by the curve *order*) or as a DER
+structure with two INTEGERS.
+The first of those is called the :term:``raw encoding`` inside the Python
+ecdsa library.
+
+As ASN.1 signature format requires the encoding of INTEGERS, and DER INTEGERs
+must use the fewest possible number of bytes, a numerically small value of
+``r`` or ``s`` will require fewer
+bytes to represent in the DER structure. Thus, DER encoding isn't fixed
+size for a given curve, but has a maximum possible size.
+
+.. note::
+
+    As DER INTEGER uses so-called two's complement representation of
+    numbers, the most significant bit of the most significant byte
+    represents the *sign* of the number. If that bit is set, then the
+    number is considered to be negative. Thus, to represent a number like
+    255, which in binary representation is 0b11111111 (i.e. a byte with all
+    bits set high), the DER encoding of it will require two bytes, one
+    zero byte to make sure the sign bit is 0, and a byte with value 255 to
+    encode the numerical value of the integer.
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -61,6 +61,9 @@
 # so a file named "default.css" will overwrite the builtin "default.css".
 html_static_path = ["_static"]
 
+# Example configuration for intersphinx: refer to the Python standard library.
+intersphinx_mapping = {"https://docs.python.org/": None}
+
 autodoc_default_options = {
     "undoc-members": True,
     "inherited-members": True,

diff --git a/docs/source/ec_arithmetic.rst b/docs/source/ec_arithmetic.rst
@@ -0,0 +1,137 @@
+=========================
+Elliptic Curve arithmetic
+=========================
+
+The python-ecdsa also provides generic API for performing operations on
+elliptic curve points.
+
+.. warning::
+
+    This is documentation of a very low-level API, if you want to
+    handle keys or signatures you should look at documentation of
+    the :py:mod:`~ecdsa.keys` module.
+
+Short Weierstrass curves
+========================
+
+There are two low-level implementations for
+:term:`short Weierstrass curves <short Weierstrass curve>`:
+:py:class:`~ecdsa.ellipticcurve.Point` and
+:py:class:`~ecdsa.ellipticcurve.PointJacobi`.
+
+Both of them use the curves specified using the
+:py:class:`~ecdsa.ellipticcurve.CurveFp` object.
+
+You can either provide your own curve parameters or use one of the predefined
+curves.
+For example, to define a curve :math:`x^2 = x^3 + x + 4 \text{ mod } 5` use
+code like this:
+
+.. code:: python
+
+    from ecdsa.ellipticcurve import CurveFp
+    custom_curve = CurveFp(5, 1, 4)
+
+The predefined curves are specified in the :py:mod:`~ecdsa.ecdsa` module,
+but it's much easier to use the helper functions (and proper names)
+from the :py:mod:`~ecdsa.curves` module.
+
+For example, to get the curve parameters for the NIST P-256 curve use this
+code:
+
+.. code:: python
+
+    from ecdsa.curves import NIST256p
+    curve = NIST256p.curve
+
+.. tip::
+
+    You can also use :py:class:`~ecdsa.curves.Curve` to get the curve
+    parameters from a PEM or DER file. You can also use
+    :py:func:`~ecdsa.curves.curve_by_name` to get a curve by specifying its
+    name.
+    Or use the
+    :py:func:`~ecdsa.curves.find_curve` to get a curve by specifying its
+    ASN.1 object identifier (OID).
+
+Affine coordinates
+------------------
+
+After taking hold of curve parameters you can create a point on the
+curve. The :py:class:`~ecdsa.ellipticcurve.Point` uses affine coordinates,
+i.e. the :math:`x` and :math:`y` from the curve equation directly.
+
+To specify a point (1, 1) on the ``custom_curve`` you can use this code:
+
+.. code:: python
+
+    from ecdsa.ellipticcurve import Point
+    point_a = Point(custom_curve, 1, 1)
+
+Then it's possible to either perform scalar multiplication:
+
+.. code:: python
+
+    point_b = point_a * 3
+
+Or specify other points and perform addition:
+
+.. code:: python
+
+    point_b = Point(custom_curve, 3, 2)
+    point_c = point_a + point_b
+
+To get the affine coordinates of the point, call the ``x()`` and ``y()``
+methods of the object:
+
+.. code:: python
+
+    print("x: {0}, y: {1}".format(point_c.x(), point_c.y()))
+
+Projective coordinates
+----------------------
+
+When using the Jacobi coordinates, the point is defined by 3 integers,
+which are related to the :math:`x` and :math:`y` in the following way:
+
+.. math::
+
+   x = X/Z^2 \\
+   y = Y/Z^3
+
+That means that if you have point in affine coordinates, it's possible
+to convert them to Jacobi by simply assuming :math:`Z = 1`.
+
+So the same points can be specified as so:
+
+.. code:: python
+
+    from ecdsa.ellipticcurve import PointJacobi
+    point_a = PointJacobi(custom_curve, 1, 1, 1)
+    point_b = PointJacobi(custom_curve, 3, 2, 1)
+
+
+.. note::
+
+    Unlike the :py:class:`~ecdsa.ellipticcurve.Point`, the
+    :py:class:`~ecdsa.ellipticcurve.PointJacobi` does **not** check if the
+    coordinates specify a valid point on the curve as that operation is
+    computationally expensive for Jacobi coordinates.
+    If you want to verify if they specify a valid
+    point, you need to convert the point to affine coordinates and use the
+    :py:meth:`~ecdsa.ellipticcurve.CurveFp.contains_point` method.
+
+Then all the operations work exactly the same as with regular
+:py:class:`~ecdsa.ellipticcurve.Point` implementation.
+While it's not possible to get the internal :math:`X`, :math:`Y`, and :math:`Z`
+coordinates, it's possible to get the affine projection just like with
+the regular implementation:
+
+.. code:: python
+
+    point_c = point_a + point_b
+    print("x: {0}, y: {1}".format(point_c.x(), point_c.y()))
+
+All the other operations, like scalar multiplication or point addition work
+on projective points the same as with affine representation, but they
+are much more effective computationally.
diff --git a/docs/source/ecdsa.eddsa.rst b/docs/source/ecdsa.eddsa.rst
@@ -0,0 +1,7 @@
+ecdsa.eddsa module
+==================
+
+.. automodule:: ecdsa.eddsa
+   :members:
+   :undoc-members:
+   :show-inheritance:
diff --git a/docs/source/ecdsa.rst b/docs/source/ecdsa.rst
@@ -16,6 +16,7 @@ Submodules
    ecdsa.der
    ecdsa.ecdh
    ecdsa.ecdsa
+   ecdsa.eddsa
    ecdsa.ellipticcurve
    ecdsa.errors
    ecdsa.keys