From 8694e4934cffa1a6606e251ddc81ba12c825701b Mon Sep 17 00:00:00 2001
From: Jeremy Maitin-Shepard <jbms@google.com>
Date: Wed, 10 Aug 2022 09:46:54 -0700
Subject: [PATCH 1/4] Change data type names and change endianness to be
 handled by a codec

---
 docs/codecs.rst    |  29 ++++++++++
 docs/core/v3.0.rst | 131 +++++++++++++--------------------------------
 2 files changed, 67 insertions(+), 93 deletions(-)
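
For illustration only: a minimal sketch of the byte layouts given as examples
in the new Endian codec section, using Python's standard ``struct`` module.
The helper names ``encode_int32`` and ``encode_complex128`` are hypothetical
and are not defined by the specification. Placing the real component before
the imaginary one is an assumption (it matches NumPy's in-memory layout); the
text only says "two consecutive" binary64 values.

```python
import struct

# Hypothetical mapping from the codec's "endian" configuration value to a
# struct byte-order prefix; not part of the specification.
_PREFIX = {"little": "<", "big": ">"}


def encode_int32(values, endian="little"):
    """Encode integers as consecutive 4-byte two's complement values."""
    return struct.pack(f"{_PREFIX[endian]}{len(values)}i", *values)


def encode_complex128(values, endian="little"):
    """Encode each complex number as two consecutive IEEE 754 binary64 values.

    Assumption: the real component is written first, then the imaginary one.
    """
    parts = []
    for z in values:
        parts.extend((z.real, z.imag))
    return struct.pack(f"{_PREFIX[endian]}{len(parts)}d", *parts)
```

With ``endian`` set to ``"little"`` the output is the default binary
representation, which is why the codec section treats that configuration as
equivalent to omitting the codec.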

diff --git a/docs/codecs.rst b/docs/codecs.rst
index 204b9c28..1b5a1ad2 100644
--- a/docs/codecs.rst
+++ b/docs/codecs.rst
@@ -178,6 +178,35 @@ header. The format of the encoded buffer is defined in [BLOSC]_. The
 reference implementation is provided by the `c-blosc library
 <https://github.com/Blosc/c-blosc>`_.
 
+Endian
+------
+
+Codec URI:
+    https://purl.org/zarr/spec/codec/endian
+
+Encodes array elements using the specified endianness.
+
+Configuration parameters
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+endian:
+    Required.  A string equal to either ``"big"`` or ``"little"``.
+
+Format and algorithm
+~~~~~~~~~~~~~~~~~~~~
+
+Each element of the array is encoded using the specified endian variant of its
+default binary representation.  Array elements are encoded in lexicographical
+order.  For example, with ``endian`` specified as ``"big"``, the ``int32`` data
+type is encoded as a 4-byte big endian two's complement integer, and the
+``complex128`` data type is encoded as two consecutive 8-byte big endian IEEE
+754 binary64 values.
+
+.. note::
+
+   Since the default binary representation of all data types is little endian,
+   specifying this codec with ``endian`` equal to ``"little"`` is equivalent to
+   omitting this codec.
 
 Deprecated codecs
 =================
diff --git a/docs/core/v3.0.rst b/docs/core/v3.0.rst
index 5ab27003..906e8770 100644
--- a/docs/core/v3.0.rst
+++ b/docs/core/v3.0.rst
@@ -177,8 +177,6 @@ draft.
    We propose to develop a draft implementation with extensions and
    see how far we can go. A possible list of extensions to include:
 
-    - Boolean
-    - Complex
     - Datetime
     - Named dimensions
     - Awkward arrays
@@ -316,8 +314,8 @@ conceptual model underpinning the Zarr format.
 *Data type*
 
     A data type defines the set of possible values that an array_ may
-    contain, and a binary representation (i.e., sequence of bytes) for
-    each possible value. For example, the little-endian 32-bit signed
+    contain, and a default binary representation (i.e., sequence of bytes) for
+    each possible value. For example, the 32-bit signed
     integer data type defines binary representations for all integers
     in the range −2,147,483,648 to 2,147,483,647. This specification
     only defines a limited set of data types, but extensions
@@ -488,101 +486,48 @@ Core data types
 
    * - Identifier
      - Numerical type
-     - Size (no. bytes)
-     - Byte order
+     - Default binary representation
    * - ``bool``
-     - Boolean, with False encoded as ``\\x00`` and True encoded as ``\\x01``
-     - 1
-     - None
-   * - ``i1``
-     - signed integer
-     - 1
-     - None
-   * - ``<i2``
-     - signed integer
-     - 2
-     - little-endian
-   * - ``<i4``
-     - signed integer
-     - 4
-     - little-endian
-   * - ``<i8``
-     - signed integer
-     - 8
-     - little-endian
-   * - ``>i2``
-     - signed integer
-     - 2
-     - big-endian
-   * - ``>i4``
-     - signed integer
-     - 4
-     - big-endian
-   * - ``>i8``
-     - signed integer
-     - 8
-     - big-endian
-   * - ``u1``
-     - unsigned integer
-     - 1
-     - None
-   * - ``<u2``
-     - unsigned integer
-     - 2
-     - little-endian
-   * - ``<u4``
-     - unsigned integer
-     - 4
-     - little-endian
-   * - ``<u8``
-     - unsigned integer
-     - 8
-     - little-endian
-   * - ``>u2``
-     - unsigned integer
-     - 2
-     - big-endian
-   * - ``>u4``
-     - unsigned integer
-     - 4
-     - big-endian
-   * - ``>u8``
-     - unsigned integer
-     - 8
-     - big-endian
-   * - ``<f2``
-     - half precision float: sign bit, 5 bits exponent, 10 bits mantissa
-     - 2
-     - little-endian
-   * - ``<f4``
-     - single precision float: sign bit, 8 bits exponent, 23 bits mantissa
-     - 4
-     - little-endian
-   * - ``<f8``
-     - double precision float: sign bit, 11 bits exponent, 52 bits mantissa
-     - 8
-     - little-endian
-   * - ``>f2``
-     - half precision float: sign bit, 5 bits exponent, 10 bits mantissa
-     - 2
-     - big-endian
-   * - ``>f4``
-     - single precision float: sign bit, 8 bits exponent, 23 bits mantissa
-     - 4
-     - big-endian
-   * - ``>f8``
-     - double precision float: sign bit, 11 bits exponent, 52 bits mantissa
-     - 8
-     - big-endian
+     - Boolean
+     - Single byte, with false encoded as ``\\x00`` and true encoded as ``\\x01``.
+   * - ``int8``
+     - Integer in ``[-2^7, 2^7-1]``
+     - 1-byte two's complement
+   * - ``int16``
+     - Integer in ``[-2^15, 2^15-1]``
+     - 2-byte little endian two's complement
+   * - ``int32``
+     - Integer in ``[-2^31, 2^31-1]``
+     - 4-byte little endian two's complement
+   * - ``uint8``
+     - Integer in ``[0, 2^8-1]``
+     - 1 byte
+   * - ``uint16``
+     - Integer in ``[0, 2^16-1]``
+     - 2-byte little endian
+   * - ``uint32``
+     - Integer in ``[0, 2^32-1]``
+     - 4-byte little endian
+   * - ``float16`` (optionally supported)
+     - IEEE 754 half-precision floating point: sign bit, 5 bits exponent, 10 bits mantissa
+     - 2-byte little endian IEEE 754 binary16
+   * - ``float32``
+     - IEEE 754 single-precision floating point: sign bit, 8 bits exponent, 23 bits mantissa
+     - 4-byte little endian IEEE 754 binary32
+   * - ``float64``
+     - IEEE 754 double-precision floating point: sign bit, 11 bits exponent, 52 bits mantissa
+     - 8-byte little endian IEEE 754 binary64
+   * - ``complex64``
+     - real and imaginary components are each IEEE 754 single-precision floating point
+     - 2 consecutive 4-byte little endian IEEE 754 binary32 values
+   * - ``complex128``
+     - real and imaginary components are each IEEE 754 double-precision floating point
+     - 2 consecutive 8-byte little endian IEEE 754 binary64 values
    * - ``r*`` (Optional)
      - raw bits,  use for extension type fallbacks
      - variable, given by ``*``, is limited to be a multiple of 8.
      - N/A
 
-
-Floating point types correspond to basic binary interchange formats as
-defined by IEEE 754-2008.
-
 Additionally to these base types, an implementation should also handle the
 raw/opaque pass-through type designated by the lower-case letter ``r`` followed
 by the number of bits, multiple of 8. For example, ``r8``, ``r16``, and ``r24``

From aaeb3a984db508e30465f95fd26f87ca339a1c94 Mon Sep 17 00:00:00 2001
From: Jeremy Maitin-Shepard <jbms@google.com>
Date: Wed, 2 Nov 2022 10:54:33 -0700
Subject: [PATCH 2/4] Clarify meaning of omitting the codec

---
 docs/codecs.rst | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/docs/codecs.rst b/docs/codecs.rst
index 1b5a1ad2..2b0a9742 100644
--- a/docs/codecs.rst
+++ b/docs/codecs.rst
@@ -206,7 +206,9 @@ type is encoded as a 4-byte big endian two's complement integer, and the
 
    Since the default binary representation of all data types is little endian,
    specifying this codec with ``endian`` equal to ``"little"`` is equivalent to
-   omitting this codec.
+   omitting this codec, because if this codec is omitted, the default binary
+   representation of the data type, which is always little endian, is used
+   instead.
 
 Deprecated codecs
 =================

From 13f3d3f4839dd71a76ca6aa3c4e2f4db2e3a9f05 Mon Sep 17 00:00:00 2001
From: Jeremy Maitin-Shepard <jbms@google.com>
Date: Wed, 2 Nov 2022 10:57:40 -0700
Subject: [PATCH 3/4] Add note in data type section about endian codec

---
 docs/codecs.rst    | 2 ++
 docs/core/v3.0.rst | 5 +++++
 2 files changed, 7 insertions(+)

diff --git a/docs/codecs.rst b/docs/codecs.rst
index 2b0a9742..e202ed1c 100644
--- a/docs/codecs.rst
+++ b/docs/codecs.rst
@@ -178,6 +178,8 @@ header. The format of the encoded buffer is defined in [BLOSC]_. The
 reference implementation is provided by the `c-blosc library
 <https://github.com/Blosc/c-blosc>`_.
 
+.. _endian-codec:
+
 Endian
 ------
 
diff --git a/docs/core/v3.0.rst b/docs/core/v3.0.rst
index 906e8770..386f1592 100644
--- a/docs/core/v3.0.rst
+++ b/docs/core/v3.0.rst
@@ -536,6 +536,11 @@ should be understood as fall-back types of respectively 1, 2, and 3 byte length.
 Zarr v3 is limited to type sizes that are a multiple of 8 bits but may support
 other type sizes in later versions of this specification.
 
+.. note::
+
+   While the default binary representation is little endian, the :ref:`endian
+   codec<endian-codec>` may specified to use big endian encoding instead.
+
 
 .. note::
 

From bf0261ed17f28f52926f416e65e2932c45ca64c5 Mon Sep 17 00:00:00 2001
From: Jeremy Maitin-Shepard <jbms@google.com>
Date: Thu, 3 Nov 2022 10:04:26 -0700
Subject: [PATCH 4/4] Fix wording

---
 docs/core/v3.0.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/core/v3.0.rst b/docs/core/v3.0.rst
index 386f1592..030ba563 100644
--- a/docs/core/v3.0.rst
+++ b/docs/core/v3.0.rst
@@ -539,7 +539,7 @@ other type sizes in later versions of this specification.
 .. note::
 
    While the default binary representation is little endian, the :ref:`endian
-   codec<endian-codec>` may specified to use big endian encoding instead.
+   codec<endian-codec>` may be specified to use big endian encoding instead.
 
 
 .. note::