<?xml version="1.0" encoding="UTF-8"?>
<commit>
  <added type="array">
    <added>
      <filename>Makefile</filename>
    </added>
    <added>
      <filename>README</filename>
    </added>
    <added>
      <filename>dia1.svg</filename>
    </added>
    <added>
      <filename>dia2.svg</filename>
    </added>
    <added>
      <filename>example-sandbox1.sb</filename>
    </added>
    <added>
      <filename>example-sandbox2.sb</filename>
    </added>
    <added>
      <filename>kernel-patch.diff</filename>
    </added>
    <added>
      <filename>lsmsb-as.cc</filename>
    </added>
    <added>
      <filename>lsmsb-install.c</filename>
    </added>
    <added>
      <filename>lsmsb.c</filename>
    </added>
    <added>
      <filename>lsmsb.html</filename>
    </added>
    <added>
      <filename>lsmsb_external.h</filename>
    </added>
    <added>
      <filename>style.css</filename>
    </added>
  </added>
  <modified type="array">
    <modified>
      <diff>@@ -1,3 +1,4 @@
+&lt;!DOCTYPE html&gt;
 &lt;html&gt;
   &lt;head&gt;
     &lt;title&gt;LSMSB&lt;/title&gt;
@@ -9,11 +10,31 @@
     &lt;h1&gt;LSMSB: A Linux Sandboxing Scheme&lt;/h1&gt;
 
 &lt;p&gt;Adam Langley (&lt;tt&gt;agl@google.com&lt;/tt&gt;)&lt;br&gt;
-Version &lt;tt&gt;20090506&lt;/tt&gt;&lt;/p&gt;
+Version &lt;tt&gt;20090606&lt;/tt&gt;&lt;/p&gt;
 
-&lt;p&gt;This is LSMSB, a sandboxing scheme for Linux based on the ideas of the OS X
-sandbox (which, in turn, was inspired by TrustedBSD and FreeBSD). We aim to
-be:&lt;/p&gt;
+@@TOC
+
+&lt;p&gt;This is LSMSB, a sandboxing scheme for Linux based on the ideas of the &lt;a
+href=&quot;http://www.318.com/techjournal/security/a-brief-introduction-to-mac-os-x-sandbox-technology/&quot;&gt;
+OS X sandbox&lt;/a&gt; (which, in turn, was inspired by TrustedBSD and FreeBSD).&lt;/p&gt;
+
+&lt;p&gt;Imagine that you're working on a university computer and you get a binary
+which promises to do some fiendishly complex calculation, reading from a file
+&lt;tt&gt;./input&lt;/tt&gt; and writing to a file &lt;tt&gt;./output&lt;/tt&gt;. It also talks to a
+specific server to access a pre-computed lookup table. You want to run it, but
+you don't want to have to trust that it won't do anything malicious (save
+giving the wrong answer).&lt;/p&gt;
+
+&lt;p&gt;Your current options are very limited. Without root access you cannot set up a
+chroot jail, as troublesome as that is. If the system has SELinux or AppArmor
+installed, the tools are there but those are MAC systems and only root can
+define their policies.&lt;/p&gt;
+
+&lt;p&gt;Your best bet, currently, is either to use &lt;tt&gt;ptrace&lt;/tt&gt; or to run a whole
+virtual machine. The former is slow and difficult to get right in the face of
+threads and the latter is a sledgehammer when we just want to crack a nut.&lt;/p&gt;
+
+&lt;p&gt;To address these concerns we need a sandboxing system which is:&lt;/p&gt;
 
 &lt;ol&gt;
 &lt;li&gt;Available: The sandbox must be available to normal users. MAC systems
@@ -23,8 +44,7 @@ sandbox must be able to express the correct level of authority.&lt;/li&gt;
 &lt;li&gt;Reliable: The sandbox should not be open to races etc.&lt;/li&gt;
 &lt;li&gt;Deployable: One should be able to use a sandbox via a couple of call in
 &lt;tt&gt;main&lt;/tt&gt;(). If you have to implement an IPC system and pass file descriptors around
-in order to achieve reliability (i.e. current pthread in the face of threads)
-then that's a significant demerit.&lt;/li&gt;
+in order to achieve reliability then that's a significant demerit.&lt;/li&gt;
 &lt;li&gt;Composable: If I choose to impose a sandbox on a process before &lt;tt&gt;exec&lt;/tt&gt;()
 then that process should still be able to impose another sandbox on itself. The
 resulting authority should be the intersection of the two sandboxes.&lt;/li&gt;
@@ -34,6 +54,14 @@ requires a tracing process for every sandboxed process, then that's a
 problem.&lt;/li&gt;
 &lt;/ol&gt;
 
+&lt;p&gt;We present a sandboxing scheme using the LSM hooks in the Linux kernel. At
+the moment this scheme is a prototype only and this code is based off of
+2.6.30-rc&lt;i&gt;x&lt;/i&gt;.&lt;/p&gt;
+
+@/* Kernel code
+
+&lt;div style=&quot;float:left; padding-right: 1em;&quot;&gt;&lt;object width=&quot;212&quot; height=&quot;200&quot; data=&quot;dia1.svg&quot; type=&quot;image/svg+xml&quot; class=&quot;img&quot;&gt;&lt;/object&gt;&lt;/div&gt;
+
 &lt;p&gt;LSMSB uses the Linux Security Modules hooks to intercept security decisions in
 the kernel. The policies are implemented by tables of rules which are uploaded
 from user-space. Each process has a stack of zero or more sandboxes. Each
@@ -43,6 +71,17 @@ processes inherit the sandbox stack of their parents and are free to push extra
 sandboxes onto their stack. By construction, this can only reduce their
 authority.&lt;/p&gt;
 
+&lt;div style=&quot;clear: left;&quot;&gt;&lt;/div&gt;
+
+@&lt;Sandbox structure@&gt;=
+
+struct lsmsb_sandbox {
+        atomic_t refcount;
+        struct lsmsb_sandbox *parent;
+        struct lsmsb_filter *dentry_open;
+        // TODO: add support for more actions
+};
+
 @/ Rule tables
 
 &lt;p&gt;An LSMSB filter evaluates a table of rules to decide if a given action is
@@ -58,20 +97,20 @@ not Turing complete and are guaranteed to terminate.&lt;/p&gt;
 
 &lt;h5&gt;The filter structure&lt;/h5&gt;
 
-&lt;p&gt;Each rule in the table is a single 32-bit unsigned integer. Because of this, we
-need another way to reference constant values as they generally don't fit
-in 32-bits. Thus constants are kept in a side array, linked with each filter.
-The rules in the table can reference them by index.&lt;/p&gt;
+&lt;p&gt;Each rule in the table is a single 32-bit unsigned integer. Because constant
+values often don't fit in 32-bits, we need another way to deal with them.  Thus
+constants are kept in a side array, linked with each filter.  The rules in the
+table can reference them by their index.&lt;/p&gt;
 
 &lt;p&gt;For working storage, the rules in the table have an array of 16 registers which
 can either store a 32-bit unsigned integer or a byte-string (which is a normal
-string, but may contain NUL charactors). If a table is exceedingly complex, it
+string, but may contain NUL characters). If a table is exceedingly complex, it
 may need more than 16-registers of storage to hold temporary values. For these
-situations, we also allow the rule table to specifiy the number of spill slots
+situations, we also allow the table to specify the number of spill slots
 that it needs. Spill slots act just like registers except that they have to be
 read and written explicitly.&lt;/p&gt;
 
-&lt;p&gt;From this, the structure for a filter is pretty obvious:&lt;/p&gt;
+&lt;p&gt;From this, the structure for a filter is obvious:&lt;/p&gt;
 
 @&lt;Filter structure@&gt;=
 struct lsmsb_filter {
@@ -82,10 +121,18 @@ struct lsmsb_filter {
 	struct lsmsb_value constants[0];
 };
 
+#define LSMSB_NUM_REGISTERS 16
+// This is the maximum number of operations in a filter. Note that the code
+// assumes that this value fits in a uint16_t with a couple of values to spare.
+#define LSMSB_FILTER_OPS_MAX 32768
+#define LSMSB_SPILL_SLOTS_MAX 32
+#define LSMSB_CONSTANTS_MAX 256
+#define LSMSB_CONSTANT_LENGTH_MAX 512
+
 @/ Filter values
 
 &lt;p&gt;
-  The contents of registers, spill slots and constants are all `values'. These
+  The contents of registers, spill slots and constants are all &amp;lsquo;values&amp;rsquo;. These
   values are either a 32-bit unsigned int or a bytestring and represented by the
   following structure.
 &lt;/p&gt;
@@ -107,7 +154,7 @@ struct lsmsb_value {
 &lt;p&gt;
   Each rule in the table consists of at least an operation, which is encoded in
   the top 8-bits. Given the operation, there are often other arguments (register
-  numbers etc) encoded in the remaining 24-bits. The format is which is specific
+  numbers etc) encoded in the remaining 24-bits, the format of which is specific
   to each operation.
 &lt;/p&gt;
 
@@ -119,7 +166,7 @@ struct lsmsb_value {
   &lt;tr&gt;&lt;th&gt;Name&lt;/th&gt;&lt;th&gt;Explanation&lt;/th&gt;&lt;th&gt;Effect&lt;/th&gt;&lt;/tr&gt;
 
   &lt;tr&gt;&lt;td&gt;&lt;tt&gt;MOV&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Move&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = reg&lt;sub&gt;2&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;
-  &lt;tr&gt;&lt;td&gt;&lt;tt&gt;LDI&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Load mmediate&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = immediate value&lt;/td&gt;&lt;/tr&gt;
+  &lt;tr&gt;&lt;td&gt;&lt;tt&gt;LDI&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Load immediate&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = immediate value&lt;/td&gt;&lt;/tr&gt;
   &lt;tr&gt;&lt;td&gt;&lt;tt&gt;LDC&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Load constant&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = constant value&lt;/td&gt;&lt;/tr&gt;
   &lt;tr&gt;&lt;td&gt;&lt;tt&gt;RET&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Return value in register&lt;/td&gt; &lt;td&gt;&lt;i&gt;terminates the filter&lt;/i&gt;&lt;/td&gt;&lt;/tr&gt;
   &lt;tr&gt;&lt;td&gt;&lt;tt&gt;JMP&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Jump&lt;/td&gt; &lt;td&gt;Skips the next &lt;i&gt;n&lt;/i&gt; rules&lt;/td&gt;&lt;/tr&gt;
@@ -129,8 +176,8 @@ struct lsmsb_value {
   &lt;tr&gt;&lt;td&gt;&lt;tt&gt;EQ&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Equal?&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = reg&lt;sub&gt;2&lt;/sub&gt; == reg&lt;sub&gt;3&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;
   &lt;tr&gt;&lt;td&gt;&lt;tt&gt;GT&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Greater than?&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = reg&lt;sub&gt;2&lt;/sub&gt; &amp;gt; reg&lt;sub&gt;3&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;
   &lt;tr&gt;&lt;td&gt;&lt;tt&gt;LT&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Less than?&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = reg&lt;sub&gt;2&lt;/sub&gt; &amp;lt; reg&lt;sub&gt;3&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;
-  &lt;tr&gt;&lt;td&gt;&lt;tt&gt;GTE&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Greater than or equal?&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = reg&lt;sub&gt;2&lt;/sub&gt; &amp;lt;= reg&lt;sub&gt;3&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;
-  &lt;tr&gt;&lt;td&gt;&lt;tt&gt;LTE&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Less than or equal?&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = reg&lt;sub&gt;2&lt;/sub&gt; &amp;gt;= reg&lt;sub&gt;3&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;
+  &lt;tr&gt;&lt;td&gt;&lt;tt&gt;GTE&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Greater than or equal?&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = reg&lt;sub&gt;2&lt;/sub&gt; &amp;gt;= reg&lt;sub&gt;3&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;
+  &lt;tr&gt;&lt;td&gt;&lt;tt&gt;LTE&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Less than or equal?&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = reg&lt;sub&gt;2&lt;/sub&gt; &amp;lt;= reg&lt;sub&gt;3&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;
   &lt;tr&gt;&lt;td&gt;&lt;tt&gt;AND&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Bitwise conjunction&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = reg&lt;sub&gt;2&lt;/sub&gt; &amp;amp; reg&lt;sub&gt;3&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;
   &lt;tr&gt;&lt;td&gt;&lt;tt&gt;OR&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Bitwise disjunction&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = reg&lt;sub&gt;2&lt;/sub&gt; | reg&lt;sub&gt;3&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;
   &lt;tr&gt;&lt;td&gt;&lt;tt&gt;XOR&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;Bitwise exclusive-or&lt;/td&gt; &lt;td&gt;reg&lt;sub&gt;1&lt;/sub&gt; = reg&lt;sub&gt;2&lt;/sub&gt; ^ reg&lt;sub&gt;3&lt;/sub&gt;&lt;/td&gt;&lt;/tr&gt;
@@ -158,12 +205,10 @@ enum lsmsb_opcode {
 	LSMSB_OPCODE_ISPREFIXOF,
 };
 
-@/ Availible filter types
+@/ Available filter types
 
-&lt;p&gt;
-  The type of a filter defines the filter will be evaluated and the arguments which
-  are passed to the filter. Here are the currently defined filter types:
-&lt;/p&gt;
+&lt;p&gt;The type of a filter signifies the operation which it intends to filter, as
+well as the context that it runs in. (See &lt;a href=&quot;@@cite:Filter structure@@&quot;&gt;above&lt;/a&gt;). Here are the currently defined filters.&lt;/p&gt;
 
 &lt;table&gt;
   &lt;tr&gt;&lt;th&gt;Name&lt;/th&gt; &lt;th&gt;Explanation&lt;/th&gt; &lt;th&gt;Arguments&lt;/th&gt;&lt;/tr&gt;
@@ -177,7 +222,7 @@ enum lsmsb_filter_code {
 	LSMSB_FILTER_CODE_MAX,  // not a real filter code
 };
 
-@/ Typechecking.
+@/** Typechecking.
 
 &lt;p&gt;
   The rule tables are simple enough that they can be validated such that we can
@@ -188,6 +233,8 @@ enum lsmsb_filter_code {
 
 @&lt;Typechecking@&gt;=
 
+@&lt;Predecessor table utility functions@&gt;
+
 @&lt;Predecessor tables@&gt;
 
 @&lt;Type unification@&gt;
@@ -198,7 +245,7 @@ enum lsmsb_filter_code {
 
 &lt;p&gt;
   For each rule in the table there is a set of rules which can be the immediate
-  predecessor of that rule. In the simple case, the rule preceeding will fall
+  predecessor of that rule. In the simple case, the rule preceding will fall
   though and be the single predecessor. In more complex cases, a rule might be
   the target of several jumps and a fall though.
 &lt;/p&gt;
@@ -220,9 +267,12 @@ enum lsmsb_filter_code {
   we have to deal with the overflows.
 &lt;/p&gt;
 
+
+&lt;div style=&quot;text-align: center;&quot;&gt;&lt;object width=&quot;420&quot; height=&quot;175&quot; data=&quot;dia2.svg&quot; type=&quot;image/svg+xml&quot; class=&quot;img&quot;&gt;&lt;/object&gt;&lt;/div&gt;
+
 @&lt;Predecessor tables@&gt;=
 
-/* This is the magic NULL value in the predecessor table. This value must be
+/* This is the magic unset value in the predecessor table. This value must be
  * all ones because we clear the table with a memset. */
 #define LSMSB_PREDECESSOR_TABLE_INVAL 0xffff
 /* This is a magic value in the predecessor table which marks an overflow. */
@@ -268,7 +318,7 @@ static void lsmsb_predecessor_table_append(uint16_t *ptable, unsigned target,
 
 &lt;p&gt;
   We build the predecessor table by first clearing it then, for each rule, we
-  find its one or two sucessor instructions and mark it as a predecessor for
+  find its one or two successor instructions and mark it as a predecessor for
   those instructions.
 &lt;/p&gt;
 
@@ -292,6 +342,8 @@ static char lsmsb_predecessor_table_fill(uint16_t *ptable, const uint32_t *ops,
 		if (lsmsb_opcode_is_jump(opcode)) {
 			/* 0 &lt;= i, jumplength &lt;= 0xffff */
 			const unsigned target = i + lsmsb_op_jump_length(op);
+			if (target == i)
+				return 0;  /* zero length jump */
 			if (target &gt;= num_ops)
 				return 0;  /* tried to jump off the end */
 			lsmsb_predecessor_table_append(ptable, target, i);
@@ -315,37 +367,6 @@ static char lsmsb_predecessor_table_fill(uint16_t *ptable, const uint32_t *ops,
 @&lt;Clear the predecessor table@&gt;=
 	memset(ptable, 0xff, sizeof(uint16_t) * num_ops * LSMSB_PREDECESSOR_TABLE_WIDTH);
 
-@/ Utility functions
-
-&lt;P&gt;
-  We used a few utility functions in this code which we'll now flesh out.
-&lt;/P&gt;
-
-@&lt;Predecessor table utility functions@&gt;=
-
-static inline enum lsmsb_opcode lsmsb_op_opcode_get(uint32_t op)
-{
-	return op &gt;&gt; 24;
-}
-
-static inline char lsmsb_opcode_falls_through(enum lsmsb_opcode opcode)
-{
-	return opcode != LSMSB_OPCODE_RET &amp;&amp;
-	       opcode != LSMSB_OPCODE_JMP;
-}
-
-static inline unsigned lsmsb_opcode_is_jump(enum lsmsb_opcode opcode)
-{
-	return opcode == LSMSB_OPCODE_JMP || opcode == LSMSB_OPCODE_JC;
-}
-
-static inline unsigned lsmsb_op_jump_length(uint32_t op) {
-	const unsigned opcode = op &gt;&gt; 24;
-	if (opcode == LSMSB_OPCODE_JMP || opcode == LSMSB_OPCODE_JC)
-		return op &amp; 0xff;
-	return 0;
-}
-
 @/ Type unification
 
 &lt;p&gt;
@@ -359,6 +380,26 @@ static inline unsigned lsmsb_op_jump_length(uint32_t op) {
   possibly hold the wrong type, we reject the rule table.
 &lt;/p&gt;
 
+@&lt;Type unification@&gt;=
+
+@&lt;lsmsb_type@&gt;
+
+@&lt;Type vector utility functions@&gt;
+
+@&lt;is_predecessor_of@&gt;
+
+@&lt;Type vector unification@&gt;
+
+@&lt;Type vector updating@&gt;
+
+@&lt;Type vector arrays@&gt;
+
+@&lt;Filter contexts@&gt;
+
+@&lt;type_vector_for_filter@&gt;
+
+@/ Type vectors
+
 &lt;p&gt;
   A &lt;i&gt;type vector&lt;/i&gt; is an array of 2-bit values, one for each register and
   spill slot. Each 2-bit value is either:
@@ -372,7 +413,7 @@ which isn't defined by the context has an undefined type.&lt;/td&gt;&lt;/tr&gt;
   &lt;tr&gt;&lt;td&gt;Conflicting:&lt;/td&gt; &lt;td&gt;the type differs depending on the control flow.&lt;/td&gt;&lt;/tr&gt;
 &lt;/table&gt;
 
-@&lt;Type unification@&gt;=
+@&lt;lsmsb_type|&lt;tt&gt;lsmsb_type&lt;/tt&gt;@&gt;=
 
 enum lsmsb_type {
 	LSMSB_TYPE_UNDEF = 0,
@@ -381,15 +422,17 @@ enum lsmsb_type {
 	LSMSB_TYPE_CONFLICTING,
 };
 
-@&lt;Type vector unification@&gt;
-
-@&lt;Type vector updating@&gt;
-
-@&lt;Type vector arrays@&gt;
+static inline char lsmsb_type_is_value(enum lsmsb_type type)
+{
+	return type == LSMSB_TYPE_U32 || type == LSMSB_TYPE_BYTESTRING;
+}
 
-@&lt;Filter contexts@&gt;
+static inline enum lsmsb_type lsmsb_constant_type_get(const struct lsmsb_value *v)
+{
+	return v-&gt;data ? LSMSB_TYPE_BYTESTRING : LSMSB_TYPE_U32;
+}
 
-@/ Unifing type vectors
+@/ Unifying type vectors
 
 &lt;p&gt;
   Given two type vectors (say, the type vectors from the two predecessors of a
@@ -401,15 +444,17 @@ enum lsmsb_type {
 &lt;p&gt;
   To implement this quickly, we try to operate on many elements concurrently.
   Most of the time a rule table will not use spill slots so the type vector will
-  be 2 * 16 = 32 bits long and we can do them all in one go.
+  be 2 &amp;times; 16 = 32 bits long and we can do them all in one go.
 &lt;/p&gt;
 
 &lt;p&gt;
   To understand the code below, consider a single pair of 2-bit inputs. If we
-  exclusive-or them, we'll end up with a 1 bit iff the inputs differed. If we
-  could map 01, 10, and 11 to 11 and then OR with the original input, that would
-  map equal inputs to themselves and differing inputs to
-  &lt;tt&gt;LSMSB_TYPE_CONFLICTING&lt;/tt&gt;.
+  exclusive-or them, we'll end up with a true bit in the result iff the inputs
+  differed. If we could map 01, 10, and 11 to 11 and then OR with the original
+  input, that would map equal inputs to themselves and differing inputs to
+  &lt;tt&gt;LSMSB_TYPE_CONFLICTING&lt;/tt&gt;. (Remember that
+  &lt;tt&gt;LSMSB_TYPE_CONFLICTING&lt;/tt&gt; &lt;a href=&quot;@@cite:lsmsb_type@@&quot;&gt;has a value of
+  11&lt;/a&gt; in binary).
 &lt;/p&gt;
 
 &lt;p&gt;
@@ -428,7 +473,8 @@ enum lsmsb_type {
 
 @&lt;Type vector unification@&gt;=
 
-static void lsmsb_type_vector_unify(uint8_t *a, const uint8_t *b, unsigned bytelen) {
+static void lsmsb_type_vector_unify(uint8_t *a, const uint8_t *b, unsigned bytelen)
+{
 	unsigned offset = 0;
 	while (bytelen &gt;= 4) {
 		uint32_t u = *((uint32_t *) (a + offset));
@@ -590,7 +636,7 @@ static char lsmsb_type_vector_array_fill(uint8_t *tva,
 	if (filter-&gt;num_operations == 0)
 		return 1;
 
-	memset(tva, tva_width, 0);  /* set the first row to LSMSB_TYPE_UNDEF */
+	memset(tva, 0, tva_width);  /* set the first row to LSMSB_TYPE_UNDEF */
 	memcpy(tva, context_tv, LSMSB_NUM_REGISTERS / 4);
 
 	if (!lsmsb_op_type_vector_update(tva, ops[0], filter-&gt;constants,
@@ -606,9 +652,9 @@ static char lsmsb_type_vector_array_fill(uint8_t *tva,
 		char found_predecessor = 0;
 
 		if (ptable_row[0] == LSMSB_PREDECESSOR_TABLE_OVERFLOW) {
-			@&lt;Type vector array: handle overflowed row@&gt;
+			@&lt;handle overflowed row@&gt;
 		} else {
-			@&lt;Type vector array: handle normal row@&gt;
+			@&lt;handle normal row@&gt;
 		}
 
 		if (!lsmsb_op_type_vector_update(tva_row, ops[i],
@@ -622,7 +668,7 @@ static char lsmsb_type_vector_array_fill(uint8_t *tva,
 	return 1;
 }
 
-@/ Unifing normal rows
+@/ Unifying normal rows
 
 &lt;p&gt;
   With a row where the predecessor table didn't overflow, we can easily find the
@@ -639,7 +685,7 @@ static char lsmsb_type_vector_array_fill(uint8_t *tva,
   keeping it around.
 &lt;/p&gt;
 
-@&lt;Type vector array: handle normal row@&gt;=
+@&lt;handle normal row@&gt;=
 	for (j = 0; j &lt; LSMSB_PREDECESSOR_TABLE_WIDTH; ++j) {
 		const unsigned p = ptable_row[j];
 		const uint8_t *tva_row_p = tva + p * tva_width;
@@ -658,16 +704,16 @@ static char lsmsb_type_vector_array_fill(uint8_t *tva,
 	if (!found_predecessor)
 		return 0;  // Dead code.
 
-@/ Unifing overflow rows
+@/ Unifying overflow rows
 
 &lt;p&gt;
-  When the precedessor table is marked as overflowed, we have to find all the
+  When the predecessor table is marked as overflowed, we have to find all the
   predecessors ourselves. We can do this by checking all previous operations for
   jumps to the current operation and checking the previous instruction for fall
   though. (Recall that we only have forward jumps.)
 &lt;/p&gt;
 
-@&lt;Type vector array: handle overflowed row@&gt;=
+@&lt;handle overflowed row@&gt;=
 	for (j = 0; j &lt; i; ++j) {
 		if (lsmsb_op_is_predecessor_of(ops, j, i)) {
 			if (!found_predecessor) {
@@ -688,6 +734,66 @@ static char lsmsb_type_vector_array_fill(uint8_t *tva,
 	if (!found_predecessor)
 		return 0;  // shouldn't ever happen
 
+@/ Testing for predecessor rules
+
+@&lt;is_predecessor_of|Predecessor testing function@&gt;=
+static char lsmsb_op_is_predecessor_of(const uint32_t *ops, unsigned i,
+                                       unsigned target) {
+	const uint32_t op = ops[i];
+	const enum lsmsb_opcode opcode = lsmsb_op_opcode_get(op);
+	
+	if (i == target - 1 &amp;&amp;
+	    lsmsb_opcode_falls_through(opcode)) {
+		return 1;
+	}
+
+	if (lsmsb_opcode_is_jump(opcode) &amp;&amp;
+	    lsmsb_op_jump_length(op) + i == target) {
+		return 1;
+	}
+
+	return 0;
+}
+
+@/ Type vector utility functions
+
+@&lt;Type vector utility functions@&gt;=
+
+static inline unsigned lsmsb_filter_tva_width(const struct lsmsb_filter *filter)
+{
+	return (LSMSB_NUM_REGISTERS + filter-&gt;num_spill_slots + 3) / 4;
+}
+
+static inline enum lsmsb_type lsmsb_type_vector_reg_get(uint8_t *vector, unsigned reg)
+{
+	const unsigned byte = reg &gt;&gt; 2;
+	const unsigned index = reg &amp; 3;
+
+	return (enum lsmsb_type) ((vector[byte] &gt;&gt; (6 - (index * 2))) &amp; 3);
+}
+
+static inline enum lsmsb_type lsmsb_type_vector_spill_get(uint8_t *vector, unsigned slot)
+{
+	return lsmsb_type_vector_reg_get(vector, slot + LSMSB_NUM_REGISTERS);
+}
+
+static inline void lsmsb_type_vector_reg_set(uint8_t *vector, unsigned reg,
+                                             enum lsmsb_type newtype)
+{
+	const unsigned byte = reg &gt;&gt; 2;
+	const unsigned index = reg &amp; 3;
+	static const uint8_t masks[4] = { 0x3f, 0xcf, 0xf3, 0xfc };
+	const uint8_t new_value = (vector[byte] &amp; masks[index]) |
+	                          newtype &lt;&lt; (6 - (index * 2));
+	vector[byte] = new_value;
+}
+
+static inline void lsmsb_type_vector_spill_set(uint8_t *vector, unsigned slot,
+                                               enum lsmsb_type newtype)
+{
+	lsmsb_type_vector_reg_set(vector, slot + LSMSB_NUM_REGISTERS, newtype);
+}
+
 @/ Bringing typechecking together
 
 &lt;p&gt;
@@ -702,16 +808,18 @@ int lsmsb_filter_typecheck(const struct lsmsb_filter *filter,
 			   const uint8_t *context_type_vector)
 {
 	const unsigned tva_width = lsmsb_filter_tva_width(filter);
-	uint16_t *predecessor_table = kmalloc(
+	uint16_t *predecessor_table;
+	uint8_t *type_vector_array;
+	int return_code = -EINVAL;
+
+	predecessor_table = (uint16_t*) kmalloc(
 		filter-&gt;num_operations *
 		LSMSB_PREDECESSOR_TABLE_WIDTH *
 		sizeof(uint16_t), GFP_KERNEL);
-	uint8_t *type_vector_array = kmalloc(filter-&gt;num_operations *
-					     tva_width, GFP_KERNEL);
-	int return_code = -EINVAL;
-	
 	if (!predecessor_table)
 		return -ENOMEM;
+	type_vector_array = kmalloc(filter-&gt;num_operations *
+				    tva_width, GFP_KERNEL);
 	if (!type_vector_array) {
 		kfree(predecessor_table);
 		return -ENOMEM;
@@ -735,20 +843,267 @@ exit:
 	return return_code;
 }
 
+@/** Evaluating filters
+
+&lt;p&gt;Since typechecking has eliminated most run-time errors from the filter, the
+evaluation of filters can happen without many of those checks.&lt;/p&gt;
+
+&lt;p&gt;The evaluation is straight-forward: a register machine is simulated with
+the register values in an array on the stack. Each instruction is
+dispatched using a switch. There are several pieces of low-hanging fruit that
+could make this code faster but, for now, we choose the simplest code that
+works.&lt;/p&gt;
+
+@&lt;Evaluating filters@&gt;=
+char lsmsb_filter_run(const struct lsmsb_filter *filter,
+		      const struct lsmsb_value *init_values,
+		      unsigned num_init_values)
+{
+	unsigned ip = 0;
+	struct lsmsb_value regs[LSMSB_NUM_REGISTERS];
+	struct lsmsb_value *spills = NULL;
+	uint32_t op;
+	enum lsmsb_opcode opcode;
+	unsigned reg1, reg2, reg3, c1, s1;
+	unsigned imm;
+	char return_value, returned = 0;
+
+	memcpy(regs, init_values, num_init_values * sizeof(struct lsmsb_value));
+
+	if (filter-&gt;num_spill_slots) {
+		spills = kmalloc(sizeof(struct lsmsb_value) *
+				 filter-&gt;num_spill_slots, GFP_ATOMIC);
+		if (!spills) {
+			printk(&quot;lsmsb: failed to allocate %u spill slots\n&quot;,
+			       filter-&gt;num_spill_slots);
+			return 0;
+		}
+	}
+
+	for (ip = 0; !returned; ++ip) {
+		op = filter-&gt;operations[ip];
+		opcode = lsmsb_op_opcode_get(op);
+
+		switch (opcode) {
+		case LSMSB_OPCODE_MOV:
+			reg1 = lsmsb_op_reg1_get(op);
+			reg2 = lsmsb_op_reg2_get(op);
+			memcpy(&amp;regs[reg1], &amp;regs[reg2],
+			       sizeof(struct lsmsb_value));
+			break;
+		case LSMSB_OPCODE_LDI:
+			reg1 = lsmsb_op_reg1_get(op);
+			imm = lsmsb_op_imm_get(op);
+			lsmsb_value_u32_set(&amp;regs[reg1], imm);
+			break;
+		case LSMSB_OPCODE_LDC:
+			reg1 = lsmsb_op_reg1_get(op);
+			c1 = lsmsb_op_constant1_get(op);
+			memcpy(&amp;regs[reg1], &amp;filter-&gt;constants[c1],
+			       sizeof(struct lsmsb_value));
+			break;
+		case LSMSB_OPCODE_RET:
+			reg1 = lsmsb_op_reg1_get(op);
+			return_value = regs[reg1].value &gt; 0;
+			returned = 1;
+			break;
+		case LSMSB_OPCODE_JMP:
+			ip--;
+			ip += lsmsb_op_jump_length(op);
+			break;
+		case LSMSB_OPCODE_SPILL:
+			s1 = lsmsb_op_spill1_get(op);
+			reg1 = lsmsb_op_reg3_get(op);
+			memcpy(&amp;spills[s1], &amp;regs[reg1],
+			       sizeof(struct lsmsb_value));
+			break;
+		case LSMSB_OPCODE_UNSPILL:
+			reg1 = lsmsb_op_reg1_get(op);
+			s1 = lsmsb_op_spill2_get(op);
+			memcpy(&amp;regs[reg1], &amp;spills[s1],
+			       sizeof(struct lsmsb_value));
+			break;
+		case LSMSB_OPCODE_JC:
+			reg1 = lsmsb_op_reg1_get(op);
+			if (regs[reg1].value) {
+				ip--;
+				ip += lsmsb_op_jump_length(op);
+			}
+			break;
+		case LSMSB_OPCODE_EQ:
+			reg1 = lsmsb_op_reg1_get(op);
+			reg2 = lsmsb_op_reg2_get(op);
+			reg3 = lsmsb_op_reg3_get(op);
+			regs[reg1].value = regs[reg2].value == regs[reg3].value;
+			break;
+		case LSMSB_OPCODE_GT:
+			reg1 = lsmsb_op_reg1_get(op);
+			reg2 = lsmsb_op_reg2_get(op);
+			reg3 = lsmsb_op_reg3_get(op);
+			regs[reg1].value = regs[reg2].value &gt; regs[reg3].value;
+			break;
+		case LSMSB_OPCODE_LT:
+			reg1 = lsmsb_op_reg1_get(op);
+			reg2 = lsmsb_op_reg2_get(op);
+			reg3 = lsmsb_op_reg3_get(op);
+			regs[reg1].value = regs[reg2].value &lt; regs[reg3].value;
+			break;
+		case LSMSB_OPCODE_GTE:
+			reg1 = lsmsb_op_reg1_get(op);
+			reg2 = lsmsb_op_reg2_get(op);
+			reg3 = lsmsb_op_reg3_get(op);
+			regs[reg1].value = regs[reg2].value &gt;= regs[reg3].value;
+			break;
+		case LSMSB_OPCODE_LTE:
+			reg1 = lsmsb_op_reg1_get(op);
+			reg2 = lsmsb_op_reg2_get(op);
+			reg3 = lsmsb_op_reg3_get(op);
+			regs[reg1].value = regs[reg2].value &lt;= regs[reg3].value;
+			break;
+		case LSMSB_OPCODE_AND:
+			reg1 = lsmsb_op_reg1_get(op);
+			reg2 = lsmsb_op_reg2_get(op);
+			reg3 = lsmsb_op_reg3_get(op);
+			regs[reg1].value = regs[reg2].value &amp; regs[reg3].value;
+			break;
+		case LSMSB_OPCODE_OR:
+			reg1 = lsmsb_op_reg1_get(op);
+			reg2 = lsmsb_op_reg2_get(op);
+			reg3 = lsmsb_op_reg3_get(op);
+			regs[reg1].value = regs[reg2].value | regs[reg3].value;
+			break;
+		case LSMSB_OPCODE_XOR:
+			reg1 = lsmsb_op_reg1_get(op);
+			reg2 = lsmsb_op_reg2_get(op);
+			reg3 = lsmsb_op_reg3_get(op);
+			regs[reg1].value = regs[reg2].value ^ regs[reg3].value;
+			break;
+		case LSMSB_OPCODE_ISPREFIXOF:
+			reg1 = lsmsb_op_reg1_get(op);
+			reg2 = lsmsb_op_reg2_get(op);
+			reg3 = lsmsb_op_reg3_get(op);
+			regs[reg1].data = NULL;
+			if (regs[reg2].value &gt; regs[reg3].value) {
+				regs[reg1].value = 0;
+			} else {
+				regs[reg1].value =
+				    memcmp(regs[reg2].data, regs[reg3].data,
+					   regs[reg2].value) == 0;
+			}
+			break;
+		default:
+			// should never hit this
+			returned = 1;
+			return_value = 0;
+			break;
+		}
+	}
+
+	if (spills)
+		kfree(spills);
+
+	return return_value;
+}
+
+@/ Utility functions
+
+&lt;P&gt;
+  We used a few utility functions in this code which we'll now flesh out.
+&lt;/P&gt;
+
+@&lt;Predecessor table utility functions@&gt;=
+
+static inline enum lsmsb_opcode lsmsb_op_opcode_get(uint32_t op)
+{
+	return (enum lsmsb_opcode) (op &gt;&gt; 24);
+}
+
+static inline char lsmsb_opcode_falls_through(enum lsmsb_opcode opcode)
+{
+	return opcode != LSMSB_OPCODE_RET &amp;&amp;
+	       opcode != LSMSB_OPCODE_JMP;
+}
+
+static inline unsigned lsmsb_opcode_is_jump(enum lsmsb_opcode opcode)
+{
+	return opcode == LSMSB_OPCODE_JMP || opcode == LSMSB_OPCODE_JC;
+}
+
+static inline unsigned lsmsb_op_jump_length(uint32_t op)
+{
+	const unsigned opcode = op &gt;&gt; 24;
+	if (opcode == LSMSB_OPCODE_JMP || opcode == LSMSB_OPCODE_JC)
+		return op &amp; 0xff;
+	return 0;
+}
+
+static inline unsigned lsmsb_op_reg1_get(uint32_t op)
+{
+	return (op &gt;&gt; 20) &amp; 0xf;
+}
+
+static inline unsigned lsmsb_op_reg2_get(uint32_t op)
+{
+	return (op &gt;&gt; 16) &amp; 0xf;
+}
+
+static inline unsigned lsmsb_op_reg3_get(uint32_t op)
+{
+	return (op &gt;&gt; 12) &amp; 0xf;
+}
+
+static inline unsigned lsmsb_op_spill1_get(uint32_t op)
+{
+	return (op &gt;&gt; 16) &amp; 0xff;
+}
+
+static inline unsigned lsmsb_op_spill2_get(uint32_t op)
+{
+	return (op &gt;&gt; 12) &amp; 0xff;
+}
+
+static inline unsigned lsmsb_op_constant1_get(uint32_t op)
+{
+	return op &amp; 0xff;
+}
+
+static inline unsigned lsmsb_op_imm_get(uint32_t op)
+{
+	return op &amp; 0xfffff;
+}
+
+static inline void lsmsb_value_u32_set(struct lsmsb_value *value, unsigned v)
+{
+	value-&gt;data = NULL;
+	value-&gt;value = v;
+}
+
 @/ Filter contexts
 
-&lt;p&gt;
-  When typechecking a filter, it has a context implied in the code. For
-  typechecking, we also need to mirror that context so that we can create the
-  initial type vector.
-&lt;/p&gt;
+&lt;p&gt;The context of a filter (the semantics and types of the registers on entry)
+is specified implicitly in the code for running each different type of filter.
+In order to perform typechecking we need to duplicate that information,
+or at least the types, here.&lt;/p&gt;
 
 @&lt;Filter contexts@&gt;=
 
-static const char *filter_contexts[] = {
-	&quot;BI&quot;,  // DENTRY_OPEN
+struct filter_context {
+  const char *filter_name;
+  const char *type_string;
 };
 
+const struct filter_context filter_contexts[] = {
+  {&quot;dentry-open&quot;, &quot;BI&quot;}, // LSMSB_FILTER_CODE_DENTRY_OPEN
+  {NULL, NULL}
+};
+
+@/ Getting an initial type vector for a filter
+
+&lt;p&gt;Once we have &lt;tt&gt;filter_contexts&lt;/tt&gt;, we can define a function to build
+an initial type vector for a filter given the type string.&lt;/p&gt;
+
+@&lt;type_vector_for_filter|Getting a type vector for a filter@&gt;=
+
 static uint8_t *type_vector_for_filter(const struct lsmsb_filter *filter,
 				       const char *context_string)
 {
@@ -789,6 +1144,8 @@ static uint8_t *type_vector_for_filter(const struct lsmsb_filter *filter,
 
 @&lt;Installing a constant@&gt;
 
+@&lt;Handling sandbox lifetimes@&gt;
+
 @&lt;Installing a filter@&gt;
 
 @&lt;Installing a sandbox@&gt;
@@ -797,7 +1154,7 @@ static uint8_t *type_vector_for_filter(const struct lsmsb_filter *filter,
 
 &lt;p&gt;
   Several structures are used in the data which is provided to the kernel. These
-  structures thus become part of the kernel ABI. They are named with a |_wire|
+  structures thus become part of the kernel ABI. They are named with a &lt;tt&gt;_wire&lt;/tt&gt;
   suffix to mark them as such.
 &lt;/p&gt;
 
@@ -816,7 +1173,7 @@ struct lsmsb_constant_wire {
 	/* In the case of a bytestring, the bytes follow and |value| is the length */
 };
 
-@/ Installing the sandbox
+@/** Installing the sandbox
 
 &lt;p&gt;
   The filters are prefixed by a &lt;tt&gt;uint32_t&lt;/tt&gt; which contains the number of
@@ -832,6 +1189,8 @@ struct lsmsb_constant_wire {
 
 @&lt;Installing a sandbox@&gt;=
 
+#define LSMSB_MAX_SANDBOXES_PER_PROCESS 16
+
 int lsmsb_sandbox_install(struct task_struct *task,
 			  const char __user *buf,
 			  size_t len)
@@ -842,6 +1201,8 @@ int lsmsb_sandbox_install(struct task_struct *task,
 	unsigned i;
 	int return_code;
 
+	@&lt;Check for limits on the number of sandboxes@&gt;
+
 	if (copy_from_user(&amp;num_filters, buf, sizeof(num_filters)))
 		return -EFAULT;
 	buf += sizeof(num_filters);
@@ -874,8 +1235,9 @@ error:
 
 &lt;p&gt;
   The sandboxes are conceptually in a stack. The top of the stack is pointed by
-  by the &lt;tt&gt;struct task_struct&lt;/tt&gt; of a given process. Thus, pushing a new sandbox on
-  the top of the stack involves an RCU update of the current &lt;tt&gt;struct task_struct&lt;/tt&gt;.
+  the &lt;tt&gt;struct task_struct&lt;/tt&gt; of a given process. Thus, pushing a new
+  sandbox on the top of the stack involves an RCU update of the current
+  &lt;tt&gt;struct task_struct&lt;/tt&gt;.
 &lt;/p&gt;
 
 @&lt;Push a new sandbox onto the current process@&gt;=
@@ -890,12 +1252,24 @@ error:
 	new_creds-&gt;security = sandbox;
 	commit_creds(new_creds);
 
+@/ Limiting the number of sandboxes
+
+&lt;p&gt;In order to stop userspace processes from consuming kernel memory
+unboundedly, we limit the number of sandboxes which can be active for any given
+process.&lt;/p&gt;
+
+@&lt;Check for limits on the number of sandboxes@&gt;=
+	for (i = 0, sandbox = task-&gt;cred-&gt;security; sandbox; ++i)
+		sandbox = sandbox-&gt;parent;
+	if (i &gt;= LSMSB_MAX_SANDBOXES_PER_PROCESS)
+		return -ENOSPC;
+
 @/ Installing a filter
 
 &lt;p&gt;
   Installing a filter involves copying the filter header from userspace, followed
   by the operation stream and any constants. At this point a number of limits are
-  imposed on the sizes of the various structures for sanities sake and also to
+  imposed on the sizes of the various structures for sanity's sake and also to
   avoid having to worry about integer overflows in other parts of the code.
 &lt;/p&gt;
 
@@ -914,16 +1288,17 @@ static int lsmsb_filter_install(struct lsmsb_sandbox *sandbox,
 	struct lsmsb_filter *filter;
 	unsigned i;
 	int return_code = -ENOMEM;
+	uint8_t *type_vector;
 
 	if (copy_from_user(&amp;filter_wire, *buf, sizeof(filter_wire)))
 		return -EFAULT;
 
 	if (filter_wire.num_operations &gt; LSMSB_FILTER_OPS_MAX ||
 	    filter_wire.num_spill_slots &gt; LSMSB_SPILL_SLOTS_MAX ||
-	    filter_wire.num_constants &gt; LSMSB_CONSTANTS_MAX ||
-	    filter_wire.filter_code &gt; LSMSB_FILTER_CODE_MAX) {
+	    filter_wire.num_constants &gt; LSMSB_CONSTANTS_MAX)
 		return -EOVERFLOW;
-	}
+	if (filter_wire.filter_code &gt;= LSMSB_FILTER_CODE_MAX)
+		return -EINVAL;
 
 	*buf += sizeof(struct lsmsb_filter_wire);
 
@@ -956,6 +1331,17 @@ static int lsmsb_filter_install(struct lsmsb_sandbox *sandbox,
 			goto error;
 	}
 
+	type_vector = type_vector_for_filter(
+		filter, filter_contexts[filter_wire.filter_code].type_string);
+	if (!type_vector) {
+		return_code = -ENOMEM;
+		goto error;
+	}
+
+	return_code = lsmsb_filter_typecheck(filter, type_vector);
+	kfree(type_vector);
+	if (return_code)
+		goto error;
 	switch (filter_wire.filter_code) {
 	case LSMSB_FILTER_CODE_DENTRY_OPEN:
 		if (sandbox-&gt;dentry_open) {
@@ -972,21 +1358,14 @@ static int lsmsb_filter_install(struct lsmsb_sandbox *sandbox,
 	return 0;
 
 error:
-	for (i = 0; i &lt; filter_wire.num_constants; ++i) {
-		if (filter-&gt;constants[i].data)
-			kfree(filter-&gt;constants[i].data);
-	}
-	if (filter-&gt;operations)
-		kfree(filter-&gt;operations);
-	kfree(filter);
-
+	lsmsb_filter_free(filter);
 	return return_code;
 }
 
 @/ Installing a constant
 
 &lt;p&gt;
-  Each constant is described by a |struct constant_wire| and, optionally,
+  Each constant is described by a &lt;tt&gt;struct constant_wire&lt;/tt&gt; and, optionally,
   followed by its data if it's a bytestring.
 &lt;/p&gt;
 
@@ -1021,7 +1400,7 @@ static int lsmsb_constant_install(struct lsmsb_value *value,
 	return 0;
 }
 
-@/ Interfacing with LSM
+@/** Interfacing with LSM
 
 &lt;p&gt;
   LSM modules provide a structure of function pointers. Each function pointer
@@ -1035,52 +1414,69 @@ static int lsmsb_constant_install(struct lsmsb_value *value,
 
 @&lt;dentry_open hook@&gt;
 
-@&lt;Handling sandbox lifetimes@&gt;
-
 @&lt;LSM operations structure@&gt;
 
 @/ Handling sandbox lifetimes
 
 &lt;p&gt;
   We only have to provide a couple of functions to handle sandbox lifetimes. One
-  to destroy sandboxes and another to duplicate them. Since the sandboxes are
-  refcounted, we never actually duplicate them.
+  to destroy sandboxes and another to duplicate them. (The sandboxes are
+  reference counted, so &amp;lsquo;duplicating&amp;rsquo; is very cheap.)
 &lt;/p&gt;
 
 @&lt;Handling sandbox lifetimes@&gt;=
-static void lsmsb_filter_free(struct lsmsb_filter *filter) {
+static void lsmsb_filter_free(struct lsmsb_filter *filter)
+{
+	unsigned i;
+
 	if (!filter)
 		return;
-	kfree(filter-&gt;operations);
+	for (i = 0; i &lt; filter-&gt;num_constants; ++i) {
+		if (filter-&gt;constants[i].data)
+			kfree(filter-&gt;constants[i].data);
+	}
+	if (filter-&gt;operations)
+		kfree(filter-&gt;operations);
 	kfree(filter);
 }
 
-static void lsmsb_sandbox_free(struct lsmsb_sandbox *sandbox) {
+static void lsmsb_sandbox_free(struct lsmsb_sandbox *sandbox)
+{
 	lsmsb_filter_free(sandbox-&gt;dentry_open);
 	kfree(sandbox);
 }
 
-static int lsmsb_cred_prepare(struct cred *new, const struct cred *old, gfp_t gfp) {
-	struct lsmsb_sandbox *sandbox = (struct lsmsb_sandbox *) old-&gt;security;
+@&lt;Dealing with cred structures@&gt;
+
+@/ &lt;tt&gt;cred&lt;/tt&gt; structures
+
+&lt;p&gt;A &lt;tt&gt;struct cred&lt;/tt&gt; contains the authority of a process: its various
+UIDs and GIDs and an opaque pointer to the LSM data for a process. In our case,
+that points to the sandbox that is at the top of the stack for the process.&lt;/p&gt;
+
+&lt;p&gt;We need a couple of functions to deal with them. The first comes into play
+when duplicating a &lt;tt&gt;cred&lt;/tt&gt; structure for a new process. The new process
+initially gets the same sandbox stack as the parent so we just copy the pointer
+and increment the reference count.&lt;/p&gt;
+
+&lt;p&gt;The second deals with freeing a &lt;tt&gt;cred&lt;/tt&gt; structure. When freeing a
+sandbox we have to keep in mind that the parent pointer is reference counted.
+Thus, deleting a sandbox might cause its parent to be deleted, and
+so on.&lt;/p&gt;
+
+@&lt;Dealing with cred structures|Dealing with &lt;tt&gt;cred&lt;/tt&gt; structures@&gt;=
+static int lsmsb_cred_prepare(struct cred *new, const struct cred *old, gfp_t gfp)
+{
+	struct lsmsb_sandbox *sandbox = old-&gt;security;
 	new-&gt;security = sandbox;
 	if (sandbox)
 		atomic_inc(&amp;sandbox-&gt;refcount);
 	return 0;
 }
 
-@&lt;Freeing a cred structure@&gt;
-
-@/ Freeing sandboxes
-
-&lt;p&gt;
-  When freeing a sandbox we have to keep in mind that the parent pointer is
-  reference counted. Thus, when we delete a sandbox that might cause it's parent
-  to be deleted and so on.
-&lt;/p&gt;
-
-@&lt;Freeing a cred structure@&gt;=
-static void lsmsb_cred_free(struct cred *cred) {
-	struct lsmsb_sandbox *sandbox = (struct lsmsb_sandbox *) cred-&gt;security;
+static void lsmsb_cred_free(struct cred *cred)
+{
+	struct lsmsb_sandbox *sandbox = cred-&gt;security;
 	struct lsmsb_sandbox *garbage;
 
 	while (sandbox) {
@@ -1096,12 +1492,23 @@ static void lsmsb_cred_free(struct cred *cred) {
 
 @/ The &lt;tt&gt;dentry_open&lt;/tt&gt; hook
 
+&lt;p&gt;We now come to the list of hooks. These functions are LSM hook functions which
+convert their arguments into a context for a filter and evaluate the stack of
+sandboxes.&lt;/p&gt;
+
+&lt;p&gt;&lt;tt&gt;dentry_open&lt;/tt&gt; is called when a process opens a file. This function is
+currently incomplete as it doesn't deal with files which cannot be named in the
+current context (for example, the file is outside the current root for the
+process). Nor does it currently deal with differences between the view of the
+filesystem that was active when the sandbox was installed vs the current view
+of the filesystem.&lt;/p&gt;
+
 @&lt;dentry_open hook@&gt;=
 static int lsmsb_dentry_open(struct file *f, const struct cred *cred)
 {
 	const struct lsmsb_sandbox *sandbox;
 	char buffer[512];
-	struct lsmsb_value constants[2];
+	struct lsmsb_value registers[2];
 
 	struct path root;
 	struct path ns_root = { };
@@ -1110,7 +1517,7 @@ static int lsmsb_dentry_open(struct file *f, const struct cred *cred)
 
 	if (!cred-&gt;security)
 		return 0;
-	sandbox = (struct lsmsb_sandbox *) cred-&gt;security;
+	sandbox = cred-&gt;security;
 
 	while (sandbox) {
 		if (sandbox-&gt;dentry_open)
@@ -1139,15 +1546,15 @@ static int lsmsb_dentry_open(struct file *f, const struct cred *cred)
 	path_put(&amp;root);
 	path_put(&amp;ns_root);
 	
-	constants[0].data = sp;
-	constants[0].value = strlen(sp);
-	constants[1].data = NULL;
-	constants[1].value = f-&gt;f_flags;
+	registers[0].data = sp;
+	registers[0].value = strlen(sp);
+	registers[1].data = NULL;
+	registers[1].value = f-&gt;f_flags;
 
 	while (sandbox) {
 		if (sandbox-&gt;dentry_open) {
 			if (!lsmsb_filter_run(sandbox-&gt;dentry_open,
-					      constants, 2)) {
+					      registers, 2)) {
 				return -EPERM;
 			}
 		}
@@ -1169,11 +1576,11 @@ struct security_operations lsmsb_ops = {
 };
 
 
-@/ Initilising the module
+@/ Initialising the module
 
 &lt;p&gt;
   LSM modules, despite the name, can no longer actually be loadable modules. They
-  are initilisaed during the boot sequence by the security code so that they
+  are initialised during the boot sequence by the security code so that they
   can label kernel objects early on.
 &lt;/p&gt;
 
@@ -1183,7 +1590,7 @@ struct security_operations lsmsb_ops = {
   line.
 &lt;/p&gt;
 
-@&lt;Module initilisation@&gt;=
+@&lt;Module initialisation@&gt;=
 static __init int lsmsb_init(void)
 {
 	if (!security_module_enable(&amp;lsmsb_ops))
@@ -1206,37 +1613,876 @@ security_initcall(lsmsb_init);
 #include &lt;linux/fs_struct.h&gt;
 #include &lt;linux/uaccess.h&gt;
 
-#include &quot;lsmsb.h&quot;
 #include &quot;lsmsb_external.h&quot;
 
+@&lt;Value structure@&gt;
+
+@&lt;Filter structure@&gt;
+
+@&lt;Sandbox structure@&gt;
+
+@&lt;List of operations@&gt;
+
+@&lt;Filter codes@&gt;
+
+@&lt;Typechecking@&gt;
+
+@&lt;Evaluating filters@&gt;
+
 @&lt;Installing sandboxes@&gt;
 
 @&lt;LSM interface@&gt;
 
-@&lt;Module initilisation@&gt;
+@&lt;Module initialisation@&gt;
+
+@{file lsmsb_external.h
+@&lt;External structures@&gt;
+
+@/* The assembler
+
+&lt;p&gt;Obviously humans are going to need some assistance when building these
+filter structures to load into the kernel. Preferably, a high-level language
+would allow programmers to precisely specify their desired level of access.
+For now at least, we only provide a low-level, assembly-like language
+for this purpose.&lt;/p&gt;
+
+&lt;p&gt;The following code defines a sandbox with a single filter for
+&lt;tt&gt;dentry-open&lt;/tt&gt;. If you recall, the context of &lt;tt&gt;dentry-open&lt;/tt&gt;
+specifies that the &lt;tt&gt;mode&lt;/tt&gt; argument to &lt;tt&gt;open&lt;/tt&gt; is provided in
+register one. The following code tests the least significant bit of this
+argument (which requests write access) and fails if it's set.&lt;/p&gt;
+
+@&lt;ex1|LSMSB example code@&gt;=
+filter dentry-open {
+  ldi r2,1;
+  and r2,r1,r2;
+  jc r2,#fail;
+  ldi r0,1;
+  ret r0;
+#fail:
+  ldi r0,0;
+  ret r0;
+}
 
-@{file lsmsb_filter.c
-#include &lt;linux/string.h&gt;
-#include &lt;linux/slab.h&gt;
+@/ A more complex example
 
-#include &quot;lsmsb.h&quot;
-#include &quot;lsmsb_filter.h&quot;
-#include &quot;lsmsb_type_vector.h&quot;
+&lt;p&gt;Here we have a more complex example which involves bytestrings and
+constants. As you can see, we define a bytestring constant named
+&lt;tt&gt;etc-prefix&lt;/tt&gt; which we can use as an argument to &lt;tt&gt;ldc&lt;/tt&gt;.&lt;/p&gt;
 
-@&lt;Typechecking@&gt;
+&lt;p&gt;Once loaded, we take the bytestring in register zero (which, according to
+the context for &lt;tt&gt;dentry-open&lt;/tt&gt;, is the full path to the file to be
+opened) and test whether &lt;tt&gt;/etc/&lt;/tt&gt; is a prefix of the path.&lt;/p&gt;
 
-@{file lsmsb.h
+@&lt;ex2|LSMSB example code@&gt;=
 
-@&lt;Filter structure@&gt;
+filter dentry-open {
+  constants {
+    var etc-prefix bytestring = &quot;/etc/&quot;;
+  }
+
+  ldc r2,etc-prefix;
+  isprefixof r2,r2,r0;
+  jc r2,#fail;
+  ldi r0,1;
+  ret r0;
+#fail:
+  ldi r0,0;
+  ret r0;
+}
+
+@{file example-sandbox1.sb
+@&lt;ex1@&gt;
+
+@{file example-sandbox2.sb
+@&lt;ex2@&gt;
+
+@/ The filter structure
+
+&lt;p&gt;While parsing filters, we build up a structure called a &lt;tt&gt;Filter&lt;/tt&gt; (we've
+switched to C++ for this code).&lt;/p&gt;
+
+@&lt;lsmsb-as-filter|Filter structure@&gt;=
+struct Filter {
+  Filter()
+      : spill_slots(0),
+        type_string(NULL),
+        filter_code(LSMSB_FILTER_CODE_MAX) {
+  }
+
+  @&lt;lsmsb-as-filter-typecheck@&gt;
+  @&lt;lsmsb-as-filter-write@&gt;
+
+  std::string name;  // the name of the filter (e.g. &quot;dentry-open&quot;)
+  std::vector&lt;Constant*&gt; constants;
+  unsigned spill_slots;
+  std::vector&lt;uint32_t&gt; ops;
+  const char *type_string;  // the types of this filter's context
+  unsigned filter_code;  // the enum value of the filter
+};
+
+@/ The constant structure
+
+&lt;p&gt;Constants in this language have a name and a type. We implement this as a base
+class which holds the name with subclasses for each of the types.&lt;/p&gt;
+
+&lt;p&gt;The &lt;tt&gt;Write&lt;/tt&gt; member serialises the constant to standard output in the
+external format which the kernel expects.&lt;/p&gt;
+
+@&lt;lsmsb-as-constant|Constant classes@&gt;=
+struct Constant {
+  explicit Constant(const std::string &amp;n)
+      : name(n) {
+  }
+
+  enum Type {
+    TYPE_U32,
+    TYPE_BYTESTRING,
+  };
+
+  const std::string name;
+
+  virtual Type type() const = 0;
+  virtual bool Write() const = 0;
+};
+
+struct ByteString : public Constant {
+  ByteString(const std::string &amp;name, const std::string &amp;ivalue)
+      : Constant(name),
+        value(ivalue) {
+  }
+
+  Type type() const {
+    return TYPE_BYTESTRING;
+  }
+
+  bool Write() const {
+    struct lsmsb_constant_wire wire;
+
+    wire.type = 1;
+    wire.value = value.size();
+
+    if (!writea(1, &amp;wire, sizeof(wire)) ||
+        !writea(1, value.data(), value.size())) {
+      return false;
+    }
+
+    return true;
+  }
+
+  const std::string value;
+};
+
+struct U32 : public Constant {
+  U32(const std::string &amp;name, uint32_t v)
+      : Constant(name),
+        value(v) {
+  }
+
+  Type type() const {
+    return TYPE_U32;
+  }
+
+  bool Write() const {
+    struct lsmsb_constant_wire wire;
+
+    wire.type = 0;
+    wire.value = value;
+
+    if (!writea(1, &amp;wire, sizeof(wire)))
+      return false;
+
+    return true;
+  }
+
+  const uint32_t value;
+};
+
+@/** Parsing
+
+&lt;p&gt;The parser is written using &lt;a
+href=&quot;http://www.complang.org/ragel/&quot;&gt;Ragel&lt;/a&gt; which generates code for a
+state-machine parser from a description. Readers are directed to the Ragel
+documentation to fully understand the following.&lt;/p&gt;
+
+&lt;p&gt;The parser is non-recursive and uses a single &lt;tt&gt;int&lt;/tt&gt; value for its
+current state. Several magic variables are used in the snippets of code
+embedded in the parser:&lt;/p&gt;
+
+&lt;table&gt;
+  &lt;tr&gt;&lt;td&gt;&lt;tt&gt;start&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;A pointer, into the input, to the start of the current word/string etc.&lt;/td&gt;&lt;/tr&gt;
+  &lt;tr&gt;&lt;td&gt;&lt;tt&gt;fpc&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;A pointer, into the input, to the byte which has just been parsed.&lt;/td&gt;&lt;/tr&gt;
+  &lt;tr&gt;&lt;td&gt;&lt;tt&gt;current_filter&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;A pointer to a &lt;tt&gt;Filter&lt;/tt&gt; structure.&lt;/td&gt;&lt;/tr&gt;
+  &lt;tr&gt;&lt;td&gt;&lt;tt&gt;line_no&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;The current line number.&lt;/td&gt;&lt;/tr&gt;
+  &lt;tr&gt;&lt;td&gt;&lt;tt&gt;op&lt;/tt&gt;&lt;/td&gt; &lt;td&gt;The current operation (a &lt;tt&gt;uint32_t&lt;/tt&gt;).&lt;/td&gt;&lt;/tr&gt;
+&lt;/table&gt;
+
+&lt;p&gt;The first chunk of the parser defines an action (&lt;tt&gt;next_line&lt;/tt&gt;) which
+increments the current line counter, a parser (&lt;tt&gt;ws&lt;/tt&gt;) to skip
+whitespace and another action (&lt;tt&gt;start&lt;/tt&gt;) to set the global &lt;tt&gt;start&lt;/tt&gt;
+variable.&lt;/p&gt;
+
+@&lt;parser-1|First chunk of Ragel code@&gt;=
+  action next_line {
+    line_no++;
+  }
+
+  ws = (' ' | '\t' | ('\n' %next_line) | &quot;//&quot; . (any - '\n')* . ('\n' %next_line) | &quot;/*&quot; . any* :&gt;&gt; &quot;*/&quot;)*;
+
+  action start {
+    start = fpc;
+  }
+
+@/ Parsing a filter
+
+@&lt;parser-filter|Parsing a filter@&gt;=
+  filter = &quot;filter&quot; . ws . (token &gt;start %filter_new) . ws . &quot;{&quot; . ws . constants? . spillslots? . inst* . (&quot;}&quot; @filter_push) . ws;
+
+@/ Starting to parse a new filter
+
+&lt;p&gt;In the Ragel chunk above, a token is parsed for the filter name and, at the
+end of the token, the &lt;tt&gt;filter_new&lt;/tt&gt; action is called. This sets the
+global &lt;tt&gt;current_filter&lt;/tt&gt; variable to a new filter and looks up the filter
+name in the list of supported filters.&lt;/p&gt;
+
+&lt;p&gt;When a filter is complete, it's pushed onto a list of filters.&lt;/p&gt;
+
+@&lt;filter_new|Starting to parse a new filter@&gt;=
+  action filter_new {
+    current_filter = new Filter;
+    current_filter-&gt;name = std::string(start, fpc - start);
+
+    for (unsigned i = 0; ; ++i) {
+      if (filter_contexts[i].filter_name == NULL) {
+        fprintf(stderr, &quot;Error line %u: Unknown filter name '%s'\n&quot;, line_no, current_filter-&gt;name.c_str());
+        abort();
+      }
+
+      if (filter_contexts[i].filter_name == current_filter-&gt;name) {
+        current_filter-&gt;filter_code = i;
+        current_filter-&gt;type_string = filter_contexts[i].type_string;
+        break;
+      }
+    }
+  }
+
+  action filter_push {
+    filters.push_back(current_filter);
+    current_filter = NULL;
+    jmp_targets.clear();
+  }
+
+@/ Parsing constants
+
+@&lt;Parsing constants@&gt;=
+  action constant_new {
+    current_const_name = std::string(start, fpc - start);
+  }
+
+  action constant_bytestring_hex {
+    current_filter-&gt;constants.push_back(new ByteString(current_const_name, hex_parse(std::string(start, fpc - start))));
+  }
+
+  action constant_bytestring_string {
+    current_filter-&gt;constants.push_back(new ByteString(current_const_name, std::string(start, fpc - start)));
+  }
+
+  action constant_u32 {
+    current_filter-&gt;constants.push_back(new U32(current_const_name, u32_parse(std::string(start, fpc - start))));
+  }
+
+  token = [a-zA-Z\-_][0-9a-zA-Z\-_]*;
+  u32 = (&quot;0x&quot; . xdigit+) | digit+;
+  constant_u32 = &quot;u32&quot; . ws . &quot;=&quot; . ws . (u32 &gt;start %constant_u32) . ws;
+  bytestring_literal_hex = &quot;x\&quot;&quot; . ([0-9a-fA-F]* &gt;start %constant_bytestring_hex) . &quot;\&quot;&quot; . ws;
+  bytestring_literal_string = &quot;\&quot;&quot; . ((any - '&quot;')* &gt;start %constant_bytestring_string) . &quot;\&quot;&quot; . ws;
+  bytestring_literal = bytestring_literal_hex | bytestring_literal_string;
+  constant_bytestring = &quot;bytestring&quot; . ws . &quot;=&quot; . ws . bytestring_literal;
+  constant_type_and_value = constant_u32 | constant_bytestring;
+  constant = &quot;var&quot; . ws . (token &gt;start %constant_new) . ws . constant_type_and_value . ws . &quot;;&quot; . ws;
+  constants = &quot;constants&quot; . ws . &quot;{&quot; . ws . constant* . &quot;}&quot; . ws;
+
+@/ Helper functions
+
+@&lt;Constant parsing helper functions@&gt;=
+
+static uint8_t
+from_hex_char(char h) {
+  if (h &gt;= '0' &amp;&amp; h &lt;= '9')
+    return h - '0';
+  if (h &gt;= 'a' &amp;&amp; h &lt;= 'f')
+    return (h - 'a') + 10;
+  return (h - 'A') + 10;
+}
+
+static std::string
+hex_parse(const std::string &amp;in) {
+  uint8_t *bytes = (uint8_t *) malloc(in.size() / 2);
+
+  for (size_t i = 0; i &lt; in.size() / 2; ++i) {
+    bytes[i] = (from_hex_char(in[i*2]) &lt;&lt; 4) |
+                from_hex_char(in[i*2 + 1]);
+  }
+
+  std::string ret((const char *) bytes, in.size() / 2);
+  free(bytes);
+  return ret;
+}
+
+static uint32_t
+u32_parse(const std::string &amp;in) {
+  return strtoul(in.c_str(), NULL, 0);
+}
+
+@/ Parsing the spill slots declaration
+
+@&lt;Parsing the spill slots declaration@&gt;=
+  action set_spill_slots {
+    current_filter-&gt;spill_slots = u32_parse(std::string(start, fpc - start));
+  }
+
+  spillslots = &quot;spill-slots&quot; . ws . (u32 &gt;start %set_spill_slots) . ws . &quot;;&quot; . ws;
+
+@/ Parsing instructions
+
+&lt;p&gt;After the constants and spill-slots declarations, each line is an
+'instruction'. I put the word in quotes because a jump target is included as an
+instruction according to the parser, but it doesn't actually emit an operation
+in the filter. Since all our jumps are forwards, jump resolution is simple and
+we do everything in a single pass.&lt;/p&gt;
+
+@&lt;Parsing instructions@&gt;=
+  inst = jump_target | and | ldc | ldi | jc | jmp | isprefixof | ret;
+
+@/ Parsing simple instructions
+
+&lt;p&gt;First we'll consider how to parse the easy instructions which don't
+reference any other structures.&lt;/p&gt;
+
+@&lt;Parsing simple instructions@&gt;=
+
+@&lt;set_reg-helpers@&gt;
+@&lt;push_op@&gt;
+@&lt;simple-insts@&gt;
+
+@/ Helper actions
+
+&lt;p&gt;We build up the current operation (in the global &lt;tt&gt;op&lt;/tt&gt;) as we parse,
+so we start with a bunch of helper actions for setting various parts of
+&lt;tt&gt;op&lt;/tt&gt;.&lt;/p&gt;
+
+@&lt;set_reg-helpers|Helper actions@&gt;=
+  reg = &quot;r&quot; digit+;
+
+  action set_reg1 {
+    op |= reg_parse(std::string(start, fpc - start)) &lt;&lt; 20;
+  }
+  action set_reg2 {
+    op |= reg_parse(std::string(start, fpc - start)) &lt;&lt; 16;
+  }
+  action set_reg3 {
+    op |= reg_parse(std::string(start, fpc - start)) &lt;&lt; 12;
+  }
+  action set_imm {
+    op |= imm_check(u32_parse(std::string(start, fpc - start)));
+  }
+
+@/ Helper functions
+
+&lt;p&gt;Those helper actions call several helper functions which we'll expand on
+now:&lt;/p&gt;
+
+@&lt;set_reg-functions|Helper functions@&gt;=
+
+static uint32_t
+imm_check(uint32_t v) {
+  if ((v &amp; 0xfffff) != v) {
+    fprintf(stderr, &quot;Immediate value too large: %u\n&quot;, v);
+    abort();
+  }
+
+  return v;
+}
+
+static uint32_t
+reg_parse(const std::string &amp;r) {
+  return strtoul(r.c_str() + 1, NULL, 10);
+}
+
+@/ Pushing new operations
+
+&lt;p&gt;Every time we complete an operation, we need to add it to the current
+filter:&lt;/p&gt;
+
+@&lt;push_op|Pushing operations@&gt;=
+  action push_op {
+    current_filter-&gt;ops.push_back(op);
+    op = 0;
+  }
+
+@/ Parsing the simple operations
+
+@&lt;simple-insts|Parsing the simple operations@&gt;=
+
+  action opcode_ldi {
+    op |= static_cast&lt;uint32_t&gt;(LSMSB_OPCODE_LDI) &lt;&lt; 24;
+  }
+  ldi = (&quot;ldi&quot; %opcode_ldi) . ws . (reg &gt;start %set_reg1) . &quot;,&quot; . ws . (u32 &gt;start %set_imm) . ws . (&quot;;&quot; %push_op) . ws;
+
+  action opcode_ret {
+    op |= static_cast&lt;uint32_t&gt;(LSMSB_OPCODE_RET) &lt;&lt; 24;
+  }
+  ret = (&quot;ret&quot; %opcode_ret) . ws . (reg &gt;start %set_reg1) . ws . (&quot;;&quot; %push_op) . ws;
+
+  action opcode_and {
+    op |= static_cast&lt;uint32_t&gt;(LSMSB_OPCODE_AND) &lt;&lt; 24;
+  }
+  and = (&quot;and&quot; %opcode_and) . ws .
+        (reg &gt;start %set_reg1) . ws . &quot;,&quot; . ws .
+        (reg &gt;start %set_reg2) . ws . &quot;,&quot; . ws .
+        (reg &gt;start %set_reg3) . ws .
+        (&quot;;&quot; %push_op) . ws;
+
+  action opcode_isprefixof {
+    op |= static_cast&lt;uint32_t&gt;(LSMSB_OPCODE_ISPREFIXOF) &lt;&lt; 24;
+  }
+  isprefixof = (&quot;isprefixof&quot; %opcode_isprefixof) . ws .
+               (reg &gt;start %set_reg1) . ws . &quot;,&quot; . ws .
+               (reg &gt;start %set_reg2) . ws . &quot;,&quot; . ws .
+               (reg &gt;start %set_reg3) . ws .
+               (&quot;;&quot; %push_op) . ws;
+
+@/ Parsing &lt;tt&gt;ldc&lt;/tt&gt;
+
+&lt;p&gt;The &lt;tt&gt;ldc&lt;/tt&gt; instruction is a little more complicated since we have to
+translate the constant name, given in the instruction, into an index into the
+constant table.&lt;/p&gt;
+
+&lt;p&gt;We simply use the order in which the constants were defined as the index and
+walk the vector until we find the constant with a matching name.&lt;/p&gt;
+
+@&lt;parse-ldc|Parsing &lt;tt&gt;ldc&lt;/tt&gt;@&gt;=
+  action set_const {
+    const std::string constant_name(start, fpc - start);
+    unsigned i;
+
+    for (i = 0; i &lt; current_filter-&gt;constants.size(); ++i) {
+      if (current_filter-&gt;constants[i]-&gt;name == constant_name) {
+        op |= i;
+        break;
+      }
+    }
+
+    if (i == current_filter-&gt;constants.size()) {
+      fprintf(stderr, &quot;Error line %u: Unknown constant: %s\n&quot;, line_no, constant_name.c_str());
+      abort();
+    }
+  }
+
+  action opcode_ldc {
+    op |= static_cast&lt;uint32_t&gt;(LSMSB_OPCODE_LDC) &lt;&lt; 24;
+  }
+  ldc = (&quot;ldc&quot; %opcode_ldc) . ws . (reg &gt;start %set_reg1) . &quot;,&quot; . ws . (token &gt;start %set_const) . ws . (&quot;;&quot; %push_op) . ws;
+
+@/ Parsing jumps
+
+&lt;p&gt;In the code which the kernel sees, all the jumps are expressed as an
+unsigned number of operations to skip. However, people don't like writing code
+like that, counting lines and all, so we let them specify jump targets as
+strings and define them later.&lt;/p&gt;
+
+&lt;p&gt;So, uniquely for jump instructions, we don't actually know what the
+exact operation is when we finish parsing the instruction: the offset is still
+unknown. Thus we keep a map from each named jump target to a vector of the
+offsets, in the operation list, of the jumps to that label.&lt;/p&gt;
+
+&lt;p&gt;When we parse a jump, we add an entry to the map for the current instruction
+and when we parse a jump target, we look it up in the map and write the correct
+offset into each instruction which jumps there.&lt;/p&gt;
+
+@&lt;Parsing jumps@&gt;=
+  action jmp_mark {
+    const std::string target = std::string(start, fpc - start);
+    const std::map&lt;std::string, std::vector&lt;unsigned&gt; &gt;::iterator i =
+      jmp_targets.find(target);
+
+    if (i == jmp_targets.end()) {
+      jmp_targets[target].push_back(current_filter-&gt;ops.size());
+    } else {
+      i-&gt;second.push_back(current_filter-&gt;ops.size());
+    }
+  }
+
+  action opcode_jmp {
+    op |= static_cast&lt;uint32_t&gt;(LSMSB_OPCODE_JMP) &lt;&lt; 24;
+  }
+  jmp = (&quot;jmp&quot; %opcode_jmp) . ws . '#' . (token &gt;start %jmp_mark) . ws . (&quot;;&quot; %push_op) . ws;
+
+  action opcode_jc {
+    op |= static_cast&lt;uint32_t&gt;(LSMSB_OPCODE_JC) &lt;&lt; 24;
+  }
+  jc = (&quot;jc&quot; %opcode_jc) . ws . (reg &gt;start %set_reg1) . ws . &quot;,&quot; . ws . '#' . (token &gt;start %jmp_mark) . ws . (&quot;;&quot; %push_op) . ws;
+
+  action jump_resolve {
+    const std::string target = std::string(start, fpc - start);
+    const std::map&lt;std::string, std::vector&lt;unsigned&gt; &gt;::iterator i =
+      jmp_targets.find(target);
+
+    if (i == jmp_targets.end()) {
+      fprintf(stderr, &quot;Error line %u: Jump target without any jumps (%s)\n&quot;, line_no, target.c_str());
+      abort();
+    }
+
+    for (std::vector&lt;unsigned&gt;::const_iterator
+         j = i-&gt;second.begin(); j != i-&gt;second.end(); ++j) {
+      current_filter-&gt;ops[*j] |= current_filter-&gt;ops.size() - *j;
+    }
+
+    jmp_targets.erase(i);
+  }
+  jump_target = '#' . (token &gt;start %jump_resolve) . ':' . ws;
+
+@/ Parsing a whole file
+
+&lt;p&gt;A file, as a whole, is just a list of filters:&lt;/p&gt;
+
+@&lt;Parsing a file@&gt;=
+  as := ws . filter*;
+
+@/ The &lt;tt&gt;main&lt;/tt&gt; function
+
+@&lt;main@&gt;=
+int
+main(int argc, char **argv) {
+  @&lt;main-openread@&gt;
+
+  @&lt;parsing-globls@&gt;
+
+  %% write init;
+  %% write exec;
+
+  if (cs == as_error) {
+    fprintf(stderr, &quot;Error line %u: parse failure around: %s\n&quot;, line_no, p);
+    return 1;
+  }
+  @&lt;typecheck-filters@&gt;
+
+  @&lt;serialise-filters@&gt;
+
+  return 0;
+}
+
+@/ Opening and reading the input
+
+@&lt;main-openread|Opening and reading the input@&gt;=
+  if (argc != 2) {
+    fprintf(stderr, &quot;Usage: %s &lt;input file&gt;\n&quot;, argv[0]);
+    return 1;
+  }
+
+  const int fd = open(argv[1], O_RDONLY);
+  if (fd &lt; 0) {
+    perror(&quot;Cannot open input file&quot;);
+    return 1;
+  }
+
+  struct stat st;
+  fstat(fd, &amp;st);
+
+  char *input = (char *) malloc(st.st_size);
+  read(fd, input, st.st_size);
+  close(fd);
+
+@/ Parsing variables
+
+&lt;p&gt;As detailed &lt;a href=&quot;@@cite:parser-1@@&quot;&gt;above&lt;/a&gt;, during parsing a number of
+globals are used. These happen to not actually be globals in the C sense. Since
+the parsing code is expanded inline into this function, the 'globals' are
+actually local variables to this function.&lt;/p&gt;
+
+@&lt;parsing-globls@&gt;=
+  int cs;  // current parsing state
+  uint32_t op = 0;  // current operation
+  char *p = input;  // pointer to input
+  const char *start = NULL;
+  char *const pe = input + st.st_size;
+  char *const eof = pe;
+  unsigned line_no = 1;
+  std::map&lt;std::string, std::vector&lt;unsigned&gt; &gt; jmp_targets;
+  std::vector&lt;Filter*&gt; filters;
+  Filter *current_filter = NULL;
+  std::string current_const_name;
+
+@/ Typechecking the filters
+
+&lt;p&gt;Once we have parsed the filters we typecheck them since the kernel will
+reject them anyway if they fail to typecheck. We simply call the
+&lt;tt&gt;Typecheck&lt;/tt&gt; member on each filter which we'll expand upon below.&lt;/p&gt;
+
+@&lt;typecheck-filters@&gt;=
+  bool filter_failed = false;
+  for (std::vector&lt;Filter*&gt;::const_iterator
+       i = filters.begin(); i != filters.end(); ++i) {
+    if ((*i)-&gt;Typecheck()) {
+      fprintf(stderr, &quot;Filter %s failed typecheck.\n&quot;, (*i)-&gt;name.c_str());
+      filter_failed = true;
+    }
+  }
+
+  if (filter_failed)
+    return 1;
+
+@/ Typechecking a filter
+
+&lt;p&gt;We wouldn't want the typechecking to diverge between the kernel and
+userspace, so we use the exact same code here as we do in the kernel.&lt;/p&gt;
+
+&lt;p&gt;We do so by filling out an &lt;tt&gt;lsmsb_filter&lt;/tt&gt; structure and an array of
+constants.&lt;/p&gt;
+
+@&lt;lsmsb-as-filter-typecheck|bool Filter::Typecheck() const {...@&gt;=
+  bool Typecheck() const {
+    const unsigned data_len = sizeof(struct lsmsb_filter) +
+                              constants.size() * sizeof(struct lsmsb_value);
+    uint8_t *filter_data = reinterpret_cast&lt;uint8_t*&gt;(malloc(data_len));
+    memset(filter_data, 0, data_len);
+    struct lsmsb_filter *filter = reinterpret_cast&lt;lsmsb_filter*&gt;(filter_data);
+
+    filter-&gt;num_operations = ops.size();
+    filter-&gt;num_spill_slots = spill_slots;
+    filter-&gt;num_constants = constants.size();
+    filter-&gt;operations = const_cast&lt;uint32_t*&gt;(&amp;ops[0]);
+    for (unsigned i = 0; i &lt; constants.size(); ++i) {
+      // Only whether |data| is NULL matters here, not what it points to, so
+      // |filter_data| is as good a non-NULL value as any.
+      filter-&gt;constants[i].data = constants[i]-&gt;type() == Constant::TYPE_BYTESTRING ?
+                                  filter_data : NULL;
+    }
+
+    uint8_t* type_vector = type_vector_for_filter(filter, type_string);
+
+    const bool ret = lsmsb_filter_typecheck(filter, type_vector);
+    free(type_vector);
+    free(filter);
+
+    return ret;
+  }
+
+@/ Serialising the filters
+
+&lt;p&gt;Once the filters have been typechecked, we write them to stdout in the
+format which the kernel expects to parse (see &lt;a
+href=&quot;@@cite:Installing a sandbox@@&quot;&gt;the kernel code for installing a
+sandbox&lt;/a&gt;). This code follows the same pattern as typechecking: all the
+actual work is in a &lt;tt&gt;Filter&lt;/tt&gt; member function which we'll expand on
+below.&lt;/p&gt;
+
+@&lt;serialise-filters@&gt;=
+  const uint32_t num_filters = filters.size();
+  if (!writea(1, &amp;num_filters, sizeof(num_filters))) {
+    fprintf(stderr, &quot;Write failure writing to stdout\n&quot;);
+    abort();
+  }
+
+  for (std::vector&lt;Filter*&gt;::const_iterator
+       i = filters.begin(); i != filters.end(); ++i) {
+    if (!(*i)-&gt;Write()) {
+      fprintf(stderr, &quot;Write failure writing to stdout\n&quot;);
+      abort();
+    }
+  }
+
+@/ Serialising a filter
+
+&lt;p&gt;We serialise a filter by filling out the &lt;a
+href=&quot;@@cite:External structures@@&quot;&gt;external structures&lt;/a&gt; which the
+kernel expects and writing them to &lt;tt&gt;stdout&lt;/tt&gt;.&lt;/p&gt;
+
+@&lt;lsmsb-as-filter-write|bool Filter::Write() const {...@&gt;=
+  bool Write() const {
+    struct lsmsb_filter_wire wire;
+    wire.filter_code = filter_code;
+    wire.num_operations = ops.size();
+    wire.num_spill_slots = spill_slots;
+    wire.num_constants = constants.size();
+
+    if (!writea(1, &amp;wire, sizeof(wire)) ||
+        !writea(1, &amp;ops[0], sizeof(uint32_t) * ops.size())) {
+      return false;
+    }
+
+    for (std::vector&lt;Constant*&gt;::const_iterator
+         i = constants.begin(); i != constants.end(); ++i) {
+      if (!(*i)-&gt;Write())
+        return false;
+    }
+
+    return true;
+  }
+
+@/ The &lt;tt&gt;writea&lt;/tt&gt; utility function
+
+&lt;p&gt;This is a very thin wrapper around &lt;tt&gt;write&lt;/tt&gt; which handles short
+writes and retries when a call is interrupted (&lt;tt&gt;EINTR&lt;/tt&gt;).&lt;/p&gt;
+
+@&lt;writea|&lt;tt&gt;writea&lt;/tt&gt;@&gt;=
+static bool writea(int fd, const void *in_data, size_t length)
+{
+  size_t done = 0;
+  const uint8_t *data = reinterpret_cast&lt;const uint8_t*&gt;(in_data);
+
+  while (done &lt; length) {
+    ssize_t result;
+
+    do {
+      result = write(fd, data + done, length - done);
+    } while (result == -1 &amp;&amp; errno == EINTR);
+
+    if (result &lt; 0)
+      return false;
+    done += result;
+  }
+
+  return true;
+}
+
+@/ Compatibility with kernel code
+
+&lt;p&gt;Several of the functions used here were written for the kernel and, as
+such, use kernel-specific functions like &lt;tt&gt;kmalloc&lt;/tt&gt;. To run them in a
+userspace context we provide small shims for them.&lt;/p&gt;
+
+@&lt;Kernel compatibility code@&gt;=
+
+#include &lt;assert.h&gt;
+#include &lt;stdlib.h&gt;
+#define GFP_KERNEL 0
+#define BUG_ON(x) assert(!(x))
+
+uint8_t* kmalloc(size_t size, int unused) {
+  return (uint8_t*) malloc(size);
+}
+void kfree(void* heap) { free(heap); }
+
+@{file lsmsb-as.rl
+#include &lt;string&gt;
+#include &lt;vector&gt;
+#include &lt;map&gt;
+
+#include &lt;stdint.h&gt;
+#include &lt;stdio.h&gt;
+#include &lt;stdlib.h&gt;
+#include &lt;string.h&gt;
+
+#include &lt;unistd.h&gt;
+#include &lt;fcntl.h&gt;
+#include &lt;errno.h&gt;
+
+@&lt;Kernel compatibility code@&gt;
 
 @&lt;Value structure@&gt;
 
+@&lt;Filter structure@&gt;
+
 @&lt;List of operations@&gt;
 
 @&lt;Filter codes@&gt;
 
-@{file lsmsb_filter.h
-@&lt;Predecessor table utility functions@&gt;
+@&lt;Typechecking@&gt;
 
-@{file lsmsb_external.h
 @&lt;External structures@&gt;
+
+@&lt;writea@&gt;
+
+@&lt;Constant parsing helper functions@&gt;
+
+@&lt;set_reg-functions@&gt;
+
+%%{
+  machine as;
+
+@&lt;parser-1@&gt;
+
+@&lt;filter_new@&gt;
+
+@&lt;Parsing constants@&gt;
+
+@&lt;Parsing the spill slots declaration@&gt;
+
+@&lt;Parsing simple instructions@&gt;
+
+@&lt;parse-ldc@&gt;
+
+@&lt;Parsing jumps@&gt;
+
+@&lt;Parsing instructions@&gt;
+
+@&lt;parser-filter@&gt;
+
+@&lt;Parsing a file@&gt;
+
+  write data;
+}%%
+
+@&lt;lsmsb-as-constant@&gt;
+@&lt;lsmsb-as-filter@&gt;
+@&lt;main@&gt;
+
+@/* Using a sandbox
+
+&lt;p&gt;Once a sandbox has been built, using it is very simple. One needs only to
+write the sandbox to &lt;tt&gt;/proc/self/sandbox&lt;/tt&gt;. We provide a very simple
+binary which installs a given sandbox and runs a shell within it.&lt;/p&gt;
+
+&lt;p&gt;Note that, because sandboxes are composable, this can be done multiple
+times.&lt;/p&gt;
+
+@&lt;sb-install|Activate a sandbox@&gt;=
+  const int sandboxfd = open(&quot;/proc/self/sandbox&quot;, O_WRONLY);
+  if (sandboxfd &lt; 0) {
+    perror(&quot;Opening /proc/self/sandbox&quot;);
+    return 1;
+  }
+
+  if (write(sandboxfd, buffer, st.st_size) == -1) {
+    perror(&quot;Installing sandbox&quot;);
+    return 1;
+  }
+
+@{file lsmsb-install.c
+#include &lt;stdint.h&gt;
+#include &lt;stdio.h&gt;
+#include &lt;stdlib.h&gt;
+
+#include &lt;unistd.h&gt;
+#include &lt;fcntl.h&gt;
+#include &lt;sys/stat.h&gt;
+
+static int
+usage(const char *argv0) {
+  fprintf(stderr, &quot;Usage: %s &lt;sandbox file&gt;\n&quot;, argv0);
+  return 1;
+}
+
+int
+main(int argc, char **argv) {
+  if (argc != 2)
+    return usage(argv[0]);
+
+  const int fd = open(argv[1], O_RDONLY);
+  if (fd &lt; 0) {
+    perror(&quot;opening input&quot;);
+    return 1;
+  }
+
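+  struct stat st;
+  if (fstat(fd, &amp;st) != 0) {
+    perror(&quot;fstat&quot;);
+    return 1;
+  }
+
+  uint8_t *buffer = malloc(st.st_size);
+  if (read(fd, buffer, st.st_size) != st.st_size) {
+    perror(&quot;reading input&quot;);
+    return 1;
+  }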
+
+  @&lt;sb-install@&gt;
+
+  fprintf(stderr, &quot;Sandbox installed\n&quot;);
+
+  execl(&quot;/bin/bash&quot;, &quot;/bin/bash&quot;, NULL);
+
+  return 127;
+}</diff>
      <filename>lsmsb.aw</filename>
    </modified>
  </modified>
  <removed type="array"/>
  <parents type="array">
    <parent>
      <id>e7e6fdcb438c4acf034a901f498907b6ccc8138f</id>
    </parent>
  </parents>
  <author>
    <name>Adam Langley</name>
    <email>agl@chromium.org</email>
  </author>
  <url>http://github.com/agl/lsmsb/commit/b8fd2e3fd44eccd316ba3fbdf9c69846c90b6163</url>
  <id>b8fd2e3fd44eccd316ba3fbdf9c69846c90b6163</id>
  <committed-date>2009-06-07T17:37:56-07:00</committed-date>
  <authored-date>2009-06-07T17:37:56-07:00</authored-date>
  <message>Readying for public release</message>
  <tree>2918dbc735412648f7ced7b82593c1c06e69870b</tree>
  <committer>
    <name>Adam Langley</name>
    <email>agl@chromium.org</email>
  </committer>
</commit>
