Skip to content
Permalink
e78f90fa98
Switch branches/tags
Go to file
 
 
Cannot retrieve contributors at this time

Summary

The Tangle is the core of IOTA's distributed ledger technology. This RFC proposes an API for interacting with and manipulating the Tangle in a type-safe manner that mirrors the flexible semantics of a garbage-collected language with retaining the performance and ownership model of Rust. It also proposes a novel eager solidification algorithm that is (likely to be) more performant than the existing thread-based solidification system, as well as only ever providing users of the API with a fully-informed, fully-inferred view of Tangle solidification.

Motivation

API

Rust's ownership model is designed to prevent concurrent mutable aliasing of a value. It does this by tracking the lifetimes of mutable references and enforcing an ownership model that permits each value to be owned by only one other thing.

The Tangle, being a recursive data structure that permits ownership relationships that do not pertain to Rust's ownership model, cannot be implemented using Rust's ownership semantics and must instead be modelled using a data structure such as a HashMap. As a result, graph APIs written in Rust tend to be rather more awkward to manipulate than those of garbage-collected or memory-unsafe languages.

For example, in a traditional HashMap-driven API:

let mut tangle = Tangle::default();

// Insert a transaction
let tx0_hash = tangle.insert(tx0);

// Navigate from tx0 to the trunk of the trunk of tx0
let tx0_trunk_hash = tangle
    .get(tx0_hash)
    .unwrap()
    .trunk;
let tx0_trunk_trunk_hash = tangle
    .get(tx0_trunk_hash)
    .unwrap()
    .trunk;
let tx0_trunk_trunk = tangle
    .get(tx0_trunk_trunk_hash)
    .unwrap();

As can be seen, walking the Tangle is unnecessarily explicit.

To address this issue we suggest a design that makes use of Rust's zero-cost approach to abstraction to transparently manipulate the Tangle using guard/reference types and the Deref trait. The Deref trait permits a type to dereference to another, thereby allowing one to call the methods of the inner type on the wrapper type.

The equivalent code using the proposed API abstraction:

let mut tangle = Tangle::default();

// Insert a transaction
let tx0 = tangle.insert(tx0);

// Navigate from tx0 to the trunk of the trunk of tx0
let tx0_trunk_trunk = tx0
    .trunk()
    .unwrap()
    .trunk()
    .unwrap();

Evidently, the proposed API makes traversing The Tangle significantly less explicit.

Solidification

IOTA's existing solidification algorithm (implemented in IRI) works by assuming that incoming transactions are unsolid (i.e: not associated with the main body of The Tangle) by default. On occasion, the program will spin up a thread that walks The Tangle searching for transactions that are unsolid but that can be inferred to be solid based on the solidity of transactions that they approve. This approach is unnecessarily slow and requires the entire Tangle structure to be thread-safe, vastly increasing the runtime overhead of accessing and manipulating it.

To address this issue we suggest a novel eager solidification algorithm that considers only Tangle transactions that would be affected by new incoming information when performing solidification. This eradicates the need for a solidification thread, and hence also the requirement for the Tangle to be thread-safe. This should improve performance for hot Tangle operations such as solidification, random walk, etc.

Detailed design

use std::{
    collections::HashMap,
    rc::Rc,
    ops::{Deref, DerefMut},
};

/// A transaction hash. To be replaced later with whatever implementation is required.
#[derive(Copy, Clone, Debug, Eq, PartialEq, Hash)]
pub struct TxHash(u64);

impl TxHash {
    pub fn is_genesis(&self) -> bool {
        unimplemented!()
    }
}

/// A transaction. Cannot be mutated once created.
pub struct Tx {
    trunk: TxHash,
    branch: TxHash,
    body: (),
}

impl Tx {
    pub fn trunk_hash(&self) -> TxHash {
        self.trunk
    }

    pub fn branch_hash(&self) -> TxHash {
        self.branch
    }

    pub fn approvee_hashes(&self) -> [TxHash; 2] {
        [self.trunk, self.branch]
    }
}

/// A vertex within the Tangle. Each vertex represents a transaction and its associated metadata.
pub struct Vertex {
    hash: TxHash,
    tx: Tx,
    solid: bool,
}

impl Vertex {
    pub fn new(hash: TxHash, tx: Tx) -> Self {
        Self {
            hash,
            tx,
            solid: false,
        }
    }

    /// This method is private because all solidification should occur via the solidification
    /// algorithm automatically.
    fn set_solid(&mut self) {
        self.solid = true;
    }

    pub fn is_solid(&self) -> bool {
        self.solid
    }

    pub fn tx(&self) -> &Tx {
        &self.tx
    }
}

impl Deref for Vertex {
    type Target = Tx;

    fn deref(&self) -> &Tx {
        &self.tx
    }
}

/// The main Tangle structure. Usually, this type is used as a singleton.
#[derive(Default)]
pub struct Tangle {
    vertices: HashMap<TxHash, Vertex>,
    txs_to_approvers: HashMap<TxHash, Vec<TxHash>>,
    missing_to_approvers: HashMap<TxHash, Vec<Rc<TxHash>>>,
}

impl Tangle {
    pub fn new() -> Self {
        Self::default()
    }

    pub fn contains(&self, hash: TxHash) -> bool {
        self.vertices.contains_key(&hash)
    }

    /// Get an immutable handle to the transaction with the given hash.
    pub fn get(&self, hash: TxHash) -> Option<VertexRef> {
        if self.contains(hash) {
            Some(VertexRef {
                hash: hash,
                tangle: self,
            })
        } else {
            None
        }
    }

    /// Get a mutable handle to the transaction with the given hash.
    pub fn get_mut(&mut self, hash: TxHash) -> Option<VertexRefMut> {
        if self.contains(hash) {
            Some(VertexRefMut {
                hash: hash,
                tangle: self,
            })
        } else {
            None
        }
    }

    /// Insert a vertex into the Tangle, automatically triggering the solidification algorithm.
    pub fn insert(&mut self, vert: Vertex) -> VertexRefMut {
        let new_hash = vert.hash;
        let new_approvees = vert.approvee_hashes();

        // Don't re-insert a vertex
        if !self.contains(new_hash) {
            // Perform the tangle insertion
            self.vertices.insert(new_hash, vert);
            new_approvees
                .iter()
                .for_each(|a| self.txs_to_approvers
                    .entry(*a)
                    .or_default()
                    .push(new_hash));

            // Does the new vertex approve vertices that we don't yet know about?
            if new_approvees
                // Do any of the new vertex's approvees...
                .iter()
                // ...not exist yet?
                .any(|approvee| !self.contains(*approvee))
            {
                let new_rc = Rc::new(new_hash);
                // For each approvee of the inserted vertex...
                let vertices = &self.vertices;
                let missing_to_approvers = &mut self.missing_to_approvers;
                new_approvees
                    .iter()
                    // ...check to see whether it's missing from the tangle...
                    .filter(|approvee| !vertices.contains_key(*approvee))
                        // ...and remember that visiting it is work we need to do later...
                    .for_each(|approvee| missing_to_approvers
                        .entry(*approvee)
                        .or_default()
                        // ...by associating it with the missing approvee.
                        .push(new_rc.clone()));
            }

            // Attempt to propagate solidification information based on the new
            // information the inserted vertex has provided us with. We do this
            // by checking to see whether any approvers were waiting upon this vertex.
            self.missing_to_approvers
                .remove(&new_hash)
                .into_iter()
                .flatten()
                .filter_map(|hash| Rc::try_unwrap(hash).ok())
                .for_each(|hash| self.try_solidify(hash));
        }

        self.get_mut(new_hash)
            .unwrap() // Can't fail, we just inserted it
    }

    /// Attempt to perform solidification upon a node (and its approvers). This method is private
    /// because it is automatically run whenever the tangle is updated with new information
    fn try_solidify(&mut self, root: TxHash) {
        // Prevent borrow errors by borrowing the fields independently
        let vertices = &mut self.vertices;
        let txs_to_approvers = &self.txs_to_approvers;

        // The algorithm is recursive, but we don't want to use the stack
        let mut stack = vec![root];
        while let Some(current_vert) = stack.pop() {
            if let Some(approvee_hashes) = vertices
                .get(&current_vert)
                .filter(|v| !v.is_solid())
                .map(|v| v.approvee_hashes())
            {
                if approvee_hashes
                    // For each of the current root's approvees...
                    .iter()
                    // ...ensure that they are all solid...
                    .all(|a| vertices.get(&a).map(|a| a.is_solid()).unwrap_or(false) || a.is_genesis())
                {
                    // We can now solidify the current root since we know all approvees are solid
                    vertices
                        .get_mut(&current_vert)
                        .unwrap() // Can't fail
                        .set_solid();

                    // Now, propagate this information to the approvers of the current root by
                    // running the algorithm again for each of them
                    for approver in txs_to_approvers
                        .get(&current_vert)
                        .iter()
                        .map(|approvers| approvers.iter())
                        .flatten()
                    {
                        // Push the approver to the stack as the next vertex to consider
                        stack.push(*approver);
                    }
                }
            }
        }
    }
}

pub struct VertexRef<'a> {
    tangle: &'a Tangle,
    hash: TxHash,
}

impl<'a> VertexRef<'a> {
    pub fn exists(&self) -> bool {
        self.tangle.contains(self.hash)
    }

    pub fn trunk(&'a self) -> Option<Self> {
        let trunk_hash = self.tx.trunk;
        self.tangle.get(trunk_hash)
    }

    pub fn branch(&'a self) -> Option<Self> {
        let branch_hash = self.tx.branch;
        self.tangle.get(branch_hash)
    }
}

impl<'a> Deref for VertexRef<'a> {
    type Target = Vertex;

    fn deref(&self) -> &Self::Target {
        self.tangle.vertices
            .get(&self.hash)
            .unwrap()
    }
}

pub struct VertexRefMut<'a> {
    tangle: &'a mut Tangle,
    hash: TxHash,
}

impl<'a> VertexRefMut<'a> {
    pub fn trunk(&'a mut self) -> Option<Self> {
        let trunk_hash = self.tx.trunk;
        self.tangle.get_mut(trunk_hash)
    }

    pub fn branch(&'a mut self) -> Option<Self> {
        let branch_hash = self.tx.branch;
        self.tangle.get_mut(branch_hash)
    }
}

impl<'a> Deref for VertexRefMut<'a> {
    type Target = Vertex;

    fn deref(&self) -> &Self::Target {
        self.tangle.vertices
            .get(&self.hash)
            .unwrap()
    }
}

impl<'a> DerefMut for VertexRefMut<'a> {
    fn deref_mut(&mut self) -> &mut Self::Target {
        self.tangle.vertices
            .get_mut(&self.hash)
            .unwrap()
    }
}

impl<'a> VertexRefMut<'a> {
    pub fn do_for(&self, f: impl FnOnce(&Vertex)) {
        f(self.tangle.vertices
            .get(&self.hash)
            .unwrap());
    }

    pub fn do_for_mut(&mut self, f: impl FnOnce(&mut Vertex)) {
        f(self.tangle.vertices
            .get_mut(&self.hash)
            .unwrap());
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn mutate() {
        let mut tangle = Tangle::default();

        let hash = unimplemented!();

        let vertex = tangle.get_mut(hash);

        vertex.set_solid();

        vertex.do_for(|vertex| {
            println!("Solid: {:?}", vertex.is_solid());
            println!("Trunk: {:?}", vertex.trunk_hash());
            println!("Branch: {:?}", vertex.branch_hash());
        });
    }
}

Drawbacks

API

  • The API requires a small amount of overhead when accessing vertex attributes such as a re-indexing the vertex from the Tangle structure. In practice, this can be mitigated by changing the implementation of the Tangle.

  • The API makes use of dereference "magic" by lazily performing a lookup whenever the VertexRef type is dereferenced. This may also be considered an advantage for users of the API, but it does make the performance characteristics of the code slightly harder to judge. We propose do_for and do_for_mut methods to permit working on a vertex without lookup overhead (example above)

Solidification

  • The eager solidification algorithm performs solidification upon insertion. It is theoretically possible for a situation to occur in which a backlog of unsolid vertices are inserted in a manner that causes the propagation work to build up until a completing vertex arrives, triggering an unusual amount of work upon insertion. This means that the insertion cost of vertices is not fixed, although in the long term, the total overhead of node solidification will always be optimally minimal.

Rationale and alternatives

  • The Tangle API design is both flexible for users (it feels like it has dynamic ownership) but also flexible for the implementation (it makes it easier to refactor the implementation later into a more efficient design without being tied to any particular ownership model).

  • Alternative solidification algorithms include sticking with the current implementation that uses a worker thread to solidify the tangle. As previously mentioned, the solution proposed will likely have much better performance in the average case.

Unresolved questions

  • How does the design of this Tangle API relate to the implementation of the storage database layer? Should we move to an async query model?