Merge pull request #126 from sine-fdn/consts

Support for `const` declarations
sine-fdn · Jun 5, 2024 · 28ad240
2 parents cf07d50 + 58bc1ab
commit 28ad240

Showing 17 changed files with 1,225 additions and 254 deletions.
diff --git a/Cargo.lock b/Cargo.lock
diff --git a/Cargo.toml b/Cargo.toml
@@ -1,13 +1,19 @@
 [package]
 name = "garble_lang"
-version = "0.2.0"
+version = "0.3.0"
 edition = "2021"
 rust-version = "1.60.0"
 description = "Turing-Incomplete Programming Language for Multi-Party Computation with Garbled Circuits"
 repository = "https://github.com/sine-fdn/garble/"
 license = "MIT"
 categories = ["command-line-utilities", "compilers"]
-keywords = ["programming-language", "secure-computation", "garbled-circuits", "circuit-description", "smpc"]
+keywords = [
+    "programming-language",
+    "secure-computation",
+    "garbled-circuits",
+    "circuit-description",
+    "smpc",
+]
 
 [[bin]]
 name = "garble"

diff --git a/language_tour.md b/language_tour.md
@@ -123,7 +123,7 @@ pub fn main(x: i32) -> i32 {
 }
 ```
 
-Garble supports for-each loops as the only looping / recursion construct in the language. For-each loops can only loop over _fixed-size_ arrays. This is by design, as it disallows any form of unbounded recursion and thus enables the Garble compiler to generate fixed circuits consisting only of boolean gates. Garble programs are thus computationally equivalent to [LOOP programs](https://en.wikipedia.org/wiki/LOOP_(programming_language)) and capture the class of _primitive recursive functions_.
+Garble supports for-each loops as the only looping / recursion construct in the language. For-each loops can only loop over _fixed-size_ arrays. This is by design, as it disallows any form of unbounded recursion and thus enables the Garble compiler to generate fixed circuits consisting only of boolean gates. Garble programs are thus computationally equivalent to [LOOP programs](<https://en.wikipedia.org/wiki/LOOP_(programming_language)>) and capture the class of _primitive recursive functions_.
 
 ```rust
 pub fn main(_x: i32) -> i32 {
@@ -196,7 +196,7 @@ Panic due to Overflow on line 17:43.
 
 Garble will also panic on integer overflows caused by other arithmetic operations (such as subtraction and multiplication), divisions by zero, and out-of-bounds array indexing.
 
-*Circuit logic for panics is always compiled into the final circuit (and includes the line and column number of the code that caused the panic), it is your responsibility to ensure that no sensitive information can be leaked by causing a panic.*
+_Circuit logic for panics is always compiled into the final circuit (and includes the line and column number of the code that caused the panic), it is your responsibility to ensure that no sensitive information can be leaked by causing a panic._
 
 ## Collection Types
 
@@ -366,13 +366,37 @@ The patterns are not exhaustive. Missing cases:
        | }
 ```
 
+### Constants
+
+Garble supports boolean and integer constants, which need to be declared at the top level and must be provided before compilation. This can be helpful for modelling "pseudo-dynamic" collections, i.e. collections whose size is not known during type-checking but will be known before compilation and execution:
+
+```rust
+const MY_CONST: usize = PARTY_0::MY_CONST;
+
+pub fn main(x: u16) -> u16 {
+    let array = [2u16; MY_CONST];
+    x + array[1]
+}
+```
+
+Garble also supports taking the minimum / maximum of several constants as part of the declaration of a constant, which, for instance, can be useful to set the size of a collection to the size of the biggest collection provided by different parties:
+
+```rust
+const MY_CONST: usize = max(PARTY_0::MY_CONST, PARTY_1::MY_CONST);
+
+pub fn main(x: u16) -> u16 {
+    let array = [2u16; MY_CONST];
+    x + array[1]
+}
+```
+
 ## Mental Model of Garble Programs
 
-Garble programs are boolean *circuits* consisting of a graph of logic gates, not a sequentially executed program of instructions on a von Neumann architecture with main memory and CPU. This has deep consequences for the programming style that leads to efficient Garble programs, with programs that would be efficient in "normal" programming languages resulting in highly inefficient circuits and vice versa.
+Garble programs are boolean _circuits_ consisting of a graph of logic gates, not a sequentially executed program of instructions on a von Neumann architecture with main memory and CPU. This has deep consequences for the programming style that leads to efficient Garble programs, with programs that would be efficient in "normal" programming languages resulting in highly inefficient circuits and vice versa.
 
 One example has already been mentioned: Copying whole arrays in Garble is essentially free, because arrays (and their elements) are just a collection of output wires from a bunch of boolean logic gates. Duplicating these wires does not increase the complexity of the circuit, because no additional logic gates are required.
 
-Replacing the element at a *constant* index in an array with a new value is equally cheap, because the Garble compiler can just duplicate the output wires of all the other elements and only has to use the wires of the replacement element where previously the old element was being used. In contrast, replacing the element at a *non-constant* index (i.e. an index that depends on a runtime value) is a much more expensive operation in a boolean circuit than it would be on a normal computer, because the Garble compiler has to generate a nested multiplexer circuit.
+Replacing the element at a _constant_ index in an array with a new value is equally cheap, because the Garble compiler can just duplicate the output wires of all the other elements and only has to use the wires of the replacement element where previously the old element was being used. In contrast, replacing the element at a _non-constant_ index (i.e. an index that depends on a runtime value) is a much more expensive operation in a boolean circuit than it would be on a normal computer, because the Garble compiler has to generate a nested multiplexer circuit.
 
 Here's an additional example: Let's assume that you want to implement an MPC function that on each invocation adds a value into a (fixed-size) collection of values, overwriting previous values if the buffer is full. In most languages, this could be easily done using a ring buffer and the same is possible in Garble:
 
@@ -402,4 +426,4 @@ The difference in circuit size is staggering: While the first version (with `i`
 
 Such an example might be a bit contrived, since it is possible to infer the inputs of both parties (except for the element that is dropped from the array) from the output of the above function, defeating the purpose of MPC, which is to keep each party's input private. But it does highlight how unintuitive the computational model of pure boolean circuits can be from the perspective of a load-and-store architecture with main memory and CPU.
 
-It can be helpful to think of Garble programs as being executed on a computer with infinite memory, free copying and no garbage collection: Nothing ever goes out of scope, it is therefore trivial to reuse old values. But any form of branching or looping needs to be compiled into a circuit where each possible branch or loop invocation is "unrolled" and requires its own dedicated logic gates. In normal programming languages, looping a few additional times does not increase the program size, but in Garble programs additional gates are necessary. The size of Garble programs therefore reflects the *worst case* algorithm performance: While normal programming languages can return early and will often require much less time in the best or average case than in the worst case, the evaluation of Garble programs will always take constant time, because the full circuit must always be evaluated.
+It can be helpful to think of Garble programs as being executed on a computer with infinite memory, free copying and no garbage collection: Nothing ever goes out of scope, it is therefore trivial to reuse old values. But any form of branching or looping needs to be compiled into a circuit where each possible branch or loop invocation is "unrolled" and requires its own dedicated logic gates. In normal programming languages, looping a few additional times does not increase the program size, but in Garble programs additional gates are necessary. The size of Garble programs therefore reflects the _worst case_ algorithm performance: While normal programming languages can return early and will often require much less time in the best or average case than in the worst case, the evaluation of Garble programs will always take constant time, because the full circuit must always be evaluated.
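
Note on the `### Constants` section added to `language_tour.md` above: a declaration such as `const MY_CONST: usize = max(PARTY_0::MY_CONST, PARTY_1::MY_CONST);` has to be reduced to one concrete number before the array size is known. The following is a minimal sketch of that reduction, not the compiler's actual implementation. `ConstSketch`, `ExternalValues` and `resolve` are hypothetical names introduced only for illustration, and the sketch covers unsigned values only.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a constant expression
/// (an assumption for illustration, not the crate's own type).
#[derive(Debug, Clone)]
enum ConstSketch {
    /// A literal value.
    Literal(u64),
    /// A value supplied by a party, e.g. `PARTY_0::MY_CONST`.
    External { party: String, identifier: String },
    /// The maximum of several constant expressions.
    Max(Vec<ConstSketch>),
    /// The minimum of several constant expressions.
    Min(Vec<ConstSketch>),
}

/// Values provided before compilation: party -> identifier -> value.
type ExternalValues = HashMap<String, HashMap<String, u64>>;

/// Resolves a constant expression to a concrete value,
/// or reports which external value is missing.
fn resolve(expr: &ConstSketch, values: &ExternalValues) -> Result<u64, String> {
    match expr {
        ConstSketch::Literal(n) => Ok(*n),
        ConstSketch::External { party, identifier } => values
            .get(party)
            .and_then(|consts| consts.get(identifier))
            .copied()
            .ok_or_else(|| format!("missing constant {party}::{identifier}")),
        ConstSketch::Max(args) | ConstSketch::Min(args) => {
            // Resolve every argument first; the first error aborts the evaluation.
            let resolved = args
                .iter()
                .map(|arg| resolve(arg, values))
                .collect::<Result<Vec<u64>, String>>()?;
            let picked = if matches!(expr, ConstSketch::Max(_)) {
                resolved.into_iter().max()
            } else {
                resolved.into_iter().min()
            };
            picked.ok_or_else(|| "max/min needs at least one argument".to_string())
        }
    }
}

fn main() {
    let mut values: ExternalValues = HashMap::new();
    values.entry("PARTY_0".into()).or_default().insert("MY_CONST".into(), 3);
    values.entry("PARTY_1".into()).or_default().insert("MY_CONST".into(), 5);

    let expr = ConstSketch::Max(vec![
        ConstSketch::External { party: "PARTY_0".into(), identifier: "MY_CONST".into() },
        ConstSketch::External { party: "PARTY_1".into(), identifier: "MY_CONST".into() },
        ConstSketch::Literal(2),
    ]);
    // prints "Ok(5)"
    println!("{:?}", resolve(&expr, &values));
}
```

In this sketch a missing party value is an error rather than a default, which matches the tour's statement that constants "must be provided before compilation".
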
diff --git a/src/ast.rs b/src/ast.rs
@@ -11,6 +11,10 @@ use crate::token::{MetaInfo, SignedNumType, UnsignedNumType};
 #[derive(Debug, Clone, PartialEq, Eq)]
 #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
 pub struct Program<T> {
+    /// The external constants that the top level const definitions depend upon.
+    pub const_deps: HashMap<String, HashMap<String, (T, MetaInfo)>>,
+    /// Top level const definitions.
+    pub const_defs: HashMap<String, ConstDef>,
     /// Top level struct type definitions.
     pub struct_defs: HashMap<String, StructDef>,
     /// Top level enum type definitions.
@@ -19,6 +23,48 @@ pub struct Program<T> {
     pub fn_defs: HashMap<String, FnDef<T>>,
 }
 
+/// A top level const definition.
+#[derive(Debug, Clone, Hash, PartialEq, Eq)]
+#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
+pub struct ConstDef {
+    /// The type of the constant.
+    pub ty: Type,
+    /// The value of the constant.
+    pub value: ConstExpr,
+    /// The location in the source code.
+    pub meta: MetaInfo,
+}
+
+/// A constant value, either a literal, a namespaced symbol or an aggregate.
+#[derive(Debug, Clone, Hash, PartialEq, Eq)]
+#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
+pub struct ConstExpr(pub ConstExprEnum, pub MetaInfo);
+
+/// The different kinds of constant expressions.
+#[derive(Debug, Clone, Hash, PartialEq, Eq)]
+#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
+pub enum ConstExprEnum {
+    /// Boolean `true`.
+    True,
+    /// Boolean `false`.
+    False,
+    /// Unsigned integer.
+    NumUnsigned(u64, UnsignedNumType),
+    /// Signed integer.
+    NumSigned(i64, SignedNumType),
+    /// An external value supplied before compilation.
+    ExternalValue {
+        /// The party providing the value.
+        party: String,
+        /// The variable name of the value.
+        identifier: String,
+    },
+    /// The maximum of several constant expressions.
+    Max(Vec<ConstExpr>),
+    /// The minimum of several constant expressions.
+    Min(Vec<ConstExpr>),
+}
+
 /// A top level struct type definition.
 #[derive(Debug, Clone, Hash, PartialEq, Eq)]
 #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
@@ -137,6 +183,8 @@ pub enum Type {
     Fn(Vec<Type>, Box<Type>),
     /// Array type of a fixed size, containing elements of the specified type.
     Array(Box<Type>, usize),
+    /// Array type of a fixed size, with the size specified by a constant.
+    ArrayConst(Box<Type>, String),
     /// Tuple type containing fields of the specified types.
     Tuple(Vec<Type>),
     /// A struct or an enum, depending on the top level definitions (used only before typechecking).
@@ -173,6 +221,13 @@ impl std::fmt::Display for Type {
                 size.fmt(f)?;
                 f.write_str("]")
             }
+            Type::ArrayConst(ty, size) => {
+                f.write_str("[")?;
+                ty.fmt(f)?;
+                f.write_str("; ")?;
+                size.fmt(f)?;
+                f.write_str("]")
+            }
             Type::Tuple(fields) => {
                 f.write_str("(")?;
                 let mut fields = fields.iter();
@@ -279,6 +334,8 @@ pub enum ExprEnum<T> {
     ArrayLiteral(Vec<Expr<T>>),
     /// Array "repeat expression", which specifies 1 element, to be repeated a number of times.
     ArrayRepeatLiteral(Box<Expr<T>>, usize),
+    /// Array "repeat expression", with the size specified by a constant.
+    ArrayRepeatLiteralConst(Box<Expr<T>>, String),
     /// Access of an array at the specified index, returning its element.
     ArrayAccess(Box<Expr<T>>, Box<Expr<T>>),
     /// Tuple literal containing the specified fields.
@@ -290,7 +347,7 @@ pub enum ExprEnum<T> {
     /// Struct literal with the specified fields.
     StructLiteral(String, Vec<(String, Expr<T>)>),
     /// Enum literal of the specified variant, possibly with fields.
-    EnumLiteral(String, Box<VariantExpr<T>>),
+    EnumLiteral(String, String, VariantExprEnum<T>),
     /// Matching the specified expression with a list of clauses (pattern + expression).
     Match(Box<Expr<T>>, Vec<(Pattern<T>, Expr<T>)>),
     /// Application of a unary operator.
@@ -309,11 +366,6 @@ pub enum ExprEnum<T> {
     Range((u64, UnsignedNumType), (u64, UnsignedNumType)),
 }
 
-/// A variant literal, used by [`ExprEnum::EnumLiteral`], with its location in the source code.
-#[derive(Debug, Clone, Hash, PartialEq, Eq)]
-#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
-pub struct VariantExpr<T>(pub String, pub VariantExprEnum<T>, pub MetaInfo);
-
 /// The different kinds of variant literals.
 #[derive(Debug, Clone, Hash, PartialEq, Eq)]
 #[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]