urbit · sigilante · Mar 26, 2024 · Apr 3, 2024 · Apr 24, 2024 · mopfel-winrux
diff --git a/UIPS/UIP-0122.md b/UIPS/UIP-0122.md
@@ -0,0 +1,190 @@
+---
+uip: "0122"
+title: "Lagoon IEEE 754 Reals"
+description: Support jetted array operations
+author: ~lagrev-nocfep
+status: Draft
+type: Standards Track
+category: Arvo
+created: 2024-03-26
+---
+
+## Abstract
+
+Nock and its predicated software programs are designed to facilitate computing as a legible state machine. At-scale scientific computing has never really been part of the design criteria for the Urbit ecosystem. While it's hard to imagine density functional theory quantum chemistry packages preferring Hoon, it is far easier to envision a world of machine learning and even LLMs that are built on top of or adjacent to Urbit. At their core, essentially all of the “scientific computing” approaches rely heavily on linear algebra packages: classically BLAS and LAPACK, but many others as well today. It behooves us to provide this to future developers building on Mars.
+
+Lagoon has matured through several refactors into a developer-friendly interface.  It is time to prepare and release a subset of completed array-operational work to the developer community.  Since Lagoon relies on jets to operate reasonably quickly, we will prepare and release `%real`-valued operations first.
+
+## Motivation
+
+Lagoon (Linear AlGebra in hOON) will offer BLAS-like vector and matrix operations (like NumPy's linear algebra essentials, but not solvers).  This will facilitate a common and performant interface for all developers relying on a common lagnuage of data types and operators to represent vector, matrix, and tensor data.  By focusing on `%real`s first, we can vet the interface further in production and lay the groundwork for expansion in further types in subsequent releases.
+
+## Specification
+
+### Data Types
+
+Lagoon defines the following data types for `%real`-valued numbers (which are IEEE 754 floats compatible with corresponding `@r` types in Hoon).  (Note that we are commenting out other prospective types for which some work has been done.)
+
+```hoon
+  ::  /sur/lagoon
+::::  Types for Lagoon compatibility
+::
+|%
++$  ray               ::  $ray:  n-dimensional array
+  $:  =meta           ::  descriptor
+      data=@ux        ::  data, row-major order
+  ==
+::
++$  prec  [a=@ b=@]   ::  fixed-point precision, a+b+1=bloq
++$  meta              ::  $meta:  metadata for a $ray
+  $:  shape=(list @)  ::  list of dimension lengths
+      =bloq           ::  logarithm of bitwidth
+      =kind           ::  name of data type
+      fxp=(unit prec) ::  fixed-point scale
+  ==
+::
++$  kind              ::  $kind:  type of array scalars
+  $?  %real           ::  IEEE 754 float
+      :: %uint           ::  unsigned integer
+      :: %int2           ::  2s-complement integer
+      :: %cplx           ::  BLAS-compatible packed floats
+      :: %unum           ::  unum/posit
+      :: %fixp           ::  fixed-precision
+  ==
+::
++$  baum              ::  $baum:  ndray with metadata
+  $:  =meta           ::
+      data=ndray      ::
+  ==
+::
++$  ndray             ::  $ndray:  n-dim array as nested list
+    $@  @             ::  single item
+    (list ndray)      ::  nonempty list of children, in row-major order
+::
++$  slice  (unit [(unit @) (unit @)])
+--
+```
+
+The core data type is a `+$ray`, which consists of a pair of metadata `+$meta` and data as a bit-aligned atom.  The metadata which we track for any particular array are:
+
+1. `shape=(list @)`, the dimensions of an array.
+2. `bloq`, the block size of an array in standard Hoon $2^n$ terms.
+3. `kind`, whether the value is `%real` or otherwise.  (Only `%real` will be supported in this UIP's release.)
+4. `fxp=(unit prec)`, the binary-point scale of a fixed-precision number as a unit.  (This will always be `~` for `%real`s.)
+
+`data` are in CBLAS-like row-major order, with the leading value of the array in the most signficant position bitwise.  A `+$ray`'s `data` atom has a leading `1` bit at the most significant bit of the array plus one.  This allows us to represent leading zeroes.
+
+There is also a `+$ndray` type corresponding to an unpacked list of the same data.
+
+### Operators
+
+At the current time, Lagoon defines the following operators, a handful of which are for internal library convenience.  This interface is the result of several complete refactors and we do not foresee additional major alterations in the course of producing the final product.
+
+- `++print`
+- `++slog`
+- `++to-tank`
+- `++get-term`
+- `++squeeze  `
+- `++submatrix`
+- `++product`
+- `++gather`
+- `++get-item`
+- `++set-item`
+- `++get-row`
+- `++set-row`
+- `++get-col`
+- `++set-col`
+- `++get-bloq-offset`
+- `++get-item-number`
+- `++strides`
+- `++get-dim`
+- `++get-item-index`
+- `++ravel`
+- `++en-ray`
+- `++de-ray`
+- `++get-item-baum`
+- `++fill`
+- `++spac`
+- `++unspac`
+- `++scalar-to-ray`
+- `++eye`
+- `++zeros`
+- `++ones`
+- `++iota`
+- `++magic`
+- `++range`
+- `++linspace`
+- `++urge`
+- `++scale`
+- `++max`
+- `++argmax`
+- `++min`
+- `++argmin`
+- `++cumsum`
+- `++prod`
+- `++reshape`
+- `++stack`
+- `++hstack`
+- `++vstack`
+- `++transpose`
+- `++diag`
+- `++trace`
+- `++dot`
+- `++mmul`
+- `++abs`
+- `++add-scalar`
+- `++sub-scalar`
+- `++mul-scalar`
+- `++div-scalar`
+- `++mod-scalar`
+- `++add`
+- `++sub`
+- `++mul`
+- `++div`
+- `++mod`
+- `++pow-n`
+- `++pow`
+- `++exp`
+- `++log`
+- `++gth`
+- `++gte`
+- `++lth`
+- `++lte`
+- `++mpow-n`
+- `++is-close`
+- `++any`
+- `++all`
+- `++fun-scalar`
+- `++trans-scalar`
+- `++el-wise-op`
+- `++bin-op`
+
+The IEEE 754 rounding mode is a property of the operation rather than of the value(s) involved.  To effect this correctly, `/lib/lagoon` implements a `++lake` wrapper in a door-like pattern to modify the core's behavior.  This information propagates to gates.
+
+### Jetting
+
+Jets will resolve at the level of (e.g.) `++add` rather than `++bin-op`.
+
+Jets are built on top of the [SoftBLAS](https://github.com/urbit/SoftBLAS) which implements BLAS operations on top of SoftFloat.  The build process for `vere` has been modified to link SoftBLAS (in the `lagoon-jets` branch of `urbit/vere`).
+
+Because jets need to be compiled into the `vere` binary at the current time, `/lib/lagoon` needs to ship on the `%base` desk.  However, we propose a departure from past convention:  jet the library in `/lib` rather than move it into `/sys`.
+
+In the long run, as more possibilities for userspace jets become available, `/lib/lagoon` may be moved out of `%base`.  However, one change in policy at a time is all that we propose now.
+
+### Rationale for `%real`
+
+IEEE 754 `%real`s are a well-understood, well-vetted component in the numerical ecosystem for scientific computing and Urbit's `@r` affordances.
+
+We have comprehensive unit test coverage of `/lib/lagoon` and SoftBLAS, lending credence to verified behavior.
+
+Taken together, these lead us to believe that introducing `%real` to the developer community first is a safe and 
+
+## Resources
+
+Primary development is taking place in [`urbit/numerics`](https://github.com/urbit/numerics).
+
+Details are available in `~mopfel-winrux/numeric-computation-and-machine-learning` and [in summary form](https://gist.github.com/sigilante/f9b0d6b5d5a7675f96415e35c0b22d95).
+
+## Copyright
+
+Copyright and related rights waived via [CC0](../LICENSE.md).