Skip to content
Go to file
Cannot retrieve contributors at this time
212 lines (155 sloc) 7.3 KB


Over the years I’d developed a terminology for describing the concepts in ERM to myself that have gradually grown to a rather distinct dictionary and taxonomy.

These are not objects. They are message passing functional units with clearly defined properties, in most cases, or fuzzy concepts that map to other concepts.

It might be helpful to understand the CHP hardware design language along the way. Among other things, it has a nice concept in it called a “tree buffer”.

I am also seemingly in love with the letter S for no sound reason I can sink of.

System Types


Gatherers - collectors





Are exactly what they are in conventional electronics. (I might invent a ton of my own terminology, but no point here!). Wavefront arbiters are interesting.







IO output


take something, apply a filter, and send it somewhere else. Conceptually they are a dma engine that can apply a few fairly simple transforms to the data. They can also do joins (assemble data) across multiple sinks to a single dest, using multiple dma engines.

They take a ring buffer of tagged jobs as input. Some kinds are allowed to complete out of order, others, strictly serial. You have a completion ring, rather than cleaning up the tx ring itself.

The tx ring gets cleaned up generationally.


The fundamental operation here is:

r += match == data[++i];

where r is it matched (And, in a loop, how many times it matched). There are a LOT of potential combinations of ways to run forward using this basic operation.

Which … sadly … isn’t all that efficient without hardware that supports these operations in parallel.

In systems that aren’t running ahead this degrades to, as an example:

#define OP(a,b) (a == b)

for(int i = datastart; i<segmentsize; ++i) { if(OP(match,data[i])) send (elsewhere, i); }

where OP can be any of the boolean operations. At higher conceptual levels it can be a function also, but a goal in the data structure designs is that this should be the most complex typical matching routing you have, matching against potentially very large bitfields.

Hence the need for structure assignment.

in NEON, this gets nice with post increment indexing as you just check for the match and store it blindly with the only branch being the counter.


OP is using 4 valued logic, and there are a special set of preprocessed out UTF-8 symbols to express it.


Note that instead of not writing a result to memory, we write it to an unneeded, un-contended (per cpu) area of memory (a 4-16 cacheline-wide wastebin much like /dev/null) (and move on). In some cases, arm NEON conditional instructions can be used to avoid writing the wastebin at all.

In the X86 case we just swap out the dest register:

condition == false ? cmov wastebin, %rdi : ++saving;

write(*rdi++,data); write(*rdi++,data1); write(*rdi++,data2);

then get it back while all that is stalling out


Essentially the “tee” operation


sends an input to a single output. They are a devolved class of assembler.


sends an input through a filter to output(s)


sends an input to as many outputs as are willing to listen

Translators - Filters

Arbs (arbiters)






Translate one input to an output of the same size. An example of this is changing a word from big bit endian to little bit endian.









IO Input




Every memory area is protected by virtual memory and a red zone. There are no inherent checks for running out of bound except that if you run out of space in your area, a memory trap is thrown, and you have to reallocate and start your job over.

Ringbuffers, when the architecture permits, use mmaped on themselves pages so they can free-run. Some ring buffers (like logging errors), are free-running entirely with no checks for overrun. Ringbuffers are strongly typed, and report high and low watermarks in addition to blocking.

The wastebin is both a ringbuffer and a write only memory.

Four Valued logic

Four valued logic concepts are everywhere - if you look. Or maybe I’m just overly sensitive to it.

It’s an essential part of Verilog. (VHDL has 9 states, and I don’t want to talk about it).

36 bit tagged architectures essentially had it, although it was partially wrapped around the separate ideas of garbage collection and higher numerical precision.

The C library sort of has a three or four valued logic - -1 (11111111) usually means an error return. 0XXXXX means you have a valid result. 00 means you did nothing. mmap returns -1 as the address for a failed pointer attempt. Floating point sort of has it - inf, nan, number. (Way too many varieties of NAN!)

Codd and Date struggled with a ternary logic - the 3rd value of NULL is needed but doesn’t fit into the language they designed (SQL) very well. It fit a lot better in later attempts like QUEL and GPRE - but those languages failed in the marketplace.

Most recently - it showed up in Mill Computer’s CPU design - NAR is “Not a result”. (I love the mill. Erm will run like the wind on a mill).

Despite all that, we’ve never had enough bits to spare, (until now) and the legacy of libraries first designed in the 70s lives on, with countless millions of (buggy) lines piled on top of them. C doesn’t map particularly well to this. Go goes and makes the error return another variable entirely. C++ and java have exceptions.

In modern CPUs…

You can sort of get there using arm’s conditional instructions but those are being phased out. You can also use the top bits of the address on all modern 64 bit architectures for something other than their basic purpose. Vector units sort of have it in their 4 way modes with “Select”.

I am attempting to use it consistently in the ERM. Take errno, for example - an error return with the top bit set contains the rest of the errno in the bottom bits. No need to stash errno somewhere else or check for it somewhere else, you already have it. No need to actually use the global errno type either, just the (usually less than -10) specific errors that you are returning for, that you can map back to a conventional errno if you need to. Result: straight line code with no obvious error checking, jumps, or branches required in many cases.

As used in erm it models closely verilog - 01 is true 00 false 10 don’t care 11 error (1,0,X,Z). I have to look over dunn/belnap to see how similar it is. I do not like their symbols much: W∗={∅,{⊥},{⊤},{⊥,⊤}},

NAR = don’t care

I wish I had a preprocessor that could do infix operators in C, but we don’t, so…

Flow model

Written from right to left (function, rather than dataflow syntax) to be more expressible in C.

FIXME: This is incorrect and quite a bit more complex than this - needs something other than ascii text for four valued logic!


l: foreach(source) { while(!(nok |= select(spew(arb(splice(suck(source))))))) ; & }

if(Fix(OK)) goto l;

You can’t perform that action at this time.