Skip to content

Commit

Permalink
Update paper.
Browse files Browse the repository at this point in the history
  • Loading branch information
skaller committed Mar 17, 2023
1 parent accf541 commit 84b4c50
Show file tree
Hide file tree
Showing 2 changed files with 230 additions and 0 deletions.
Binary file modified pdfs/nulang.pdf
Binary file not shown.
230 changes: 230 additions & 0 deletions src/tex/nulang.tex
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,8 @@
\newtheorem{definition}{Definition}
\theoremstyle{plain}
\newtheorem{lemma}{Lemma}
\newtheorem{theorem}{Theorem}
\newtheorem{notation}{Notation}

\mdfdefinestyle{MyFrame}{innerleftmargin=20pt}
\newenvironment{myexample}%
Expand Down Expand Up @@ -181,6 +183,234 @@ \section{Addressing}
to recover the value which we previously took the logarithm of is done
using a decoder which creates a fanout dependent on the base of the logarithm,

\chapter{Type System}
We start needing four classes of types: sequences, groups, rings, and fields.
Each of these type will be considered a finite, linear, subset of the integers
from 0 to $n-1$ where n is the size of the set.

\subsection{Preliminary definitions}
\begin{definition}
A {\em semi-group} is a set together with an associative binary operation; that is
$$(a \cdot b)\cdot c = a \cdot (b \cdot c)$$
where infix $\cdot$ is taken as the symbol for the binary operation.
\end{definition}

Associativity is a crucial property for an operator because it allows concurrent
evaluation of arbitrary subsequences of a sequence of operands. For example given
the operand expression
$$(((1\cdot2)\cdot3)\cdot4)\cdot5$$
we can compute $1\cdot2$ and $4\cdot5$ concurrently:
$$(1\cdot2)\cdot3\cdot(4\cdot5)$$
then then combine 3 to either the LHS or RHS subterm before performing the final combination.
Another way of looking at this property is that the values to be combined can be stored in
the leaves of a tree and any bottom up visitation algorithm can be used to find the total combination.

Associativity means you can add or remove (balanced pairs of) parentheses freely.
In particular it is common practice to leave out the parentheses entirely.

\begin{definition}
A {\em monoid} is a semigroup with a unit $u$, that is
$$x\cdot u = x \rm{\ and\ }u \cdot x = x$$
for all x in the set.
\end{definition}

The existence of a unit means you can freely add or remove units from
anywhere in your computation.

\begin{definition}
A {\em group} is a monoid in which every element has an inverse, that is,
for all $x$ there exists an element $y$ such that
$$x \cdot y = u {\rm\ and\ } y \cdot x = u $$
\end{definition}
where $u$ is the unit of the underlying monoid. For integers of course,
the additive inverse of a value is it's negation.

\begin{definition}
An operation is {\em commutative} if the result is the same with the operands
reversed, that is, for all $a$ and $b$.
$$a \cdot b = b \cdot a$$
A group is said to be commutative if the group operation is commutative.
\end{definition}

Commutativity says you can switch the order of children in the tree representation
of an expression.

If an operation is also associative, commutative, and has a unit, then the operation
is well defined on a set of operands, taking the operation on the empty
set to be the unit.

This means irrespective of what data structure you use to hold the
values to be combined, and what algorithm you use to scan them,
provided you visit each value exactly once, the result of the
operation on them is invariant.

\begin{definition}
A {\em ring} is a set with two operations denoted by $+$ and $*$ such
that the set with $+$ is a group, and the set excluding the additive
unit is a monoid, and the following rule, called the
{\em distributive law} holds for all $a$, $b$ and $c$
$$a * (b + c) = a * b + a * c$$

If the multiplication operation is commutative then it is called
a commutative ring.
\end{definition}

\subsection{The rings $\mathbb{N}_n$}
\begin{definition}
Let $\mathbb{N}_n$ be the subrange of the integers $0..n-1$ with
addtion, subtraction,
multiplication, division and remainder defined as the natural result modulo $n$.
Then $\mathbb{N}_n$ is a commutative ring called a {\em natural ring}.
\end{definition}

The usual linear order is also defined. Negation is defined by
$$-x = n - x$$

Natural computations prior to finding the modular residual present an issue
we resolve by performing these computations in a much larger ring.

\begin{definition}
The {\em size} of a finite ring $R$, written $|R|$, is the number of values of the underlying set.
\end{definition}
\subsection{Representation}
\begin{lemma} The C data types
\begin{minted}{C++}
uint8_t uint16_t uint32_t uint64_t
\end{minted}
with C builtin operations for addition, subtraction, negation, and multiplication
are the rings
\(\mathbb{N}_{2^8}\ \mathbb{N}_{2^{16}}\ \mathbb{N}_{2^{32}}\ \mathbb{N}_{2^{64}} \)
respectively, with the usual comparison operations, unsigned integer division,
and unsigned integer modulus.
\end{lemma}

\begin{theorem}
{\em Representation Theorem}. The values of a ring $\mathbb{N}_n$ can be represented
by values of a ring $\mathbb{N}_{n^2}$ and the operations addition, substraction, negation
multiplication and modulus computed by the respective operations modulo $n$. Comparisions
work without modification.
\end{theorem}
In particular we can use \verb$uint64_t$ to represent rings of index up to
$2^{32}$.

\section{Intrinsics}
An instrinsic is a ring or field which is provided natively by the target system.
We use the following example to show how:
\begin{minted}{felix}
field goldilocks =
intrinsic size = 2^64 - 2^32 - 2,
add = primitive cost 1,
neg = primitive cost 1,
sub = fun (x,y) => add (x, neg y),
mul = primitive cost 1,
udiv = primitive cost 4,
umod = primitive cost 4,
udivmod = primitive cost 4,
recip = primitive cost 1,
fdiv = fun (x,y) => mul (x, recip x),
... // TODO: finish list
;
\end{minted}
Each of the required operations for a field (or ring if a ring is being specified),
must be either implemented in the target natively, in which case the number of
execution cycles required must be specified, or is defined in terms of another
defined operation, in such a way no cyclic dependencies exist. In the latter case
the compile derives the cost from the definition.

\subsection{Versions}
In version 1, the a definition must be given first before it can be used.
In later versions, the a dependency checker will ensure completeness
and consistency. In still later versions defaults may be provided.

\section{Derived structures}
\subsection{One step derivations}
Once we have one or more intrinsic, we can use the type notation
\begin{minted}{felix}
N<n, base>
\end{minted}
to specify a ring of size n, defined using the already defined ring \verb$base$.
The compiler will define all operations automatically, using modular arithmetic
etc, provided $n^2\leq |{\mathtt base}|$, otherwise it will issue a diagnostic
error message that the base ring is not large enough an terminate the compilation.

Note, the base ring does not have to be intrinsic.

To define a new field, we need only a ring sufficiently large for the underlying
ring operations, however we must define the `verb$recip$ operation natively:
\begin{minted}{felix}
field nufield = based goldilocks recip = primitive cost 24;
\end{minted}

\subsection{Families}
All data types derived directly or indirectly on a particular
base for a family, Operations with mixed families require
the specification of isomorphism between abstractly equivalent types,
or embeddings if approproiate. Research is required here to decide
how to handle computations with mixed families.
\begin{minted}{felix}
to be done
\end{minted]

\section{Products}
\section{Compact Products}
We first provide {\em compact linear products}. This is a categorical product
with the type given by the n-ary constructor like
\begin{minted}{felix}
compactproducttype<R0, R1, ... Rnm1)
\end{minted}
and values like
\begin{minted}{felix}
compactproductvalue(v0,v1,... vnm1)
\end{minted}
Note syntactic sugar is yet to be determined.

All the data types must be in the same family, and the product
of their sizes must be less than or equal to the family base size.
All the operators are defined componentwise. However sequencing
is based on the representation.

Projections and slices are provided automatically
\begin{minted}{felix}
compactprojection<index, base> : T -> Rindex
\end{minted}
This is a function which extracts the component selected
by the index. The index must be a constant.

A generalised projection is also provided which accepts an
expression as an argument. Its codomain is the type dual
to the product, that is a sum type consisting of the component
index and value. We note that the representation is uniform because
the constructor arguments are all structures from the same family.

\section{Arrays}
If all the types of a product are the same, it is called an array.
An array projecton accepts an expression as an argument, which cannot
be out of bounds, since the type of the expression must be the index
type of the constructing functor. The array projection is defined by
first applying a generalised projection and then applying the smash
operator which throws out the constructor of a repeated sum.

Note there is a related operator which retains the index value,
so that an iteration will get a index,value pair.

The math is straightforward but the syntax needs to be established.

\section{Sums}
Just as for products, we provide categorical sums including
repeated sums, which are the dual of arrays, along with
injections functions.

However the sum of rings is not a ring in the category or rings.
In the category of types, the operations on the sum of two rings
is instead define by the operations on the representation.

Since our rings are cyclic, the sum of N<3> and N<4> behaves like
N<7>. However note, it is still a proper categorical sum, since we
can decode it to extract a value of one of the injection types.

In a sum of rings, addition is precisely addition with carry,
and multiplication is modulo the size of the sum (which is the
sum of the component sizes).


\end{document}
Expand Down

0 comments on commit 84b4c50

Please sign in to comment.