This repository has been archived by the owner on Sep 12, 2018. It is now read-only.
Richard Newman edited this page Feb 21, 2017 · 5 revisions

Welcome to the Project Mentat Wiki!

Context

There are two blog posts that discuss some of the motivations for the project and some of the technical context you might need in order to work on or with Project Mentat.

How it works

At its core, Mentat maintains a set of assertions of the form entity-attribute-value (EAV). The assertions conform to a schema, in which each attribute constrains its associated value or set of values. We call these assertions datoms.

Entities and entids

A Mentat entity is represented by a positive integer. (This agrees with Datomic.) We call such a positive integer an entid.

Partitions

Datomic partitions the entid space in order to separate core knowledge base entities required for the healthy function of the system from user-defined entities. Datomic also partitions in order to ensure that certain index walks of related entities are efficient. Mentat follows suit, partitioning entids into the following partitions:

  • :db.part/db, for core knowledge base entities;
  • :db.part/user, for user-defined entities;
  • :db.part/tx, for transaction entities.

You almost certainly want to add new entities in the :db.part/user partition.

The entid sequence in a given partition is monotonically increasing, although not necessarily contiguous. That is, it is possible for a specific entid to have never been present in the system, even though its predecessor and successor are present.
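As a rough illustration of how an entid maps to a partition, here is a minimal sketch. The start values are taken from the parts table further down this page. The `partition_of` helper is hypothetical, and it assumes that each partition extends up to the start of the next one, which the page does not state explicitly.

```python
from bisect import bisect_right

# (start, partition) pairs, sorted by start. The starts come from the parts
# table on this page; treating them as contiguous ranges is an assumption.
PARTITION_STARTS = [
    (0, ":db.part/db"),
    (65536, ":db.part/user"),
    (268435456, ":db.part/tx"),
]


def partition_of(entid):
    """Return the ident of the partition whose range contains `entid`."""
    if entid < 0:
        raise ValueError("entids are never negative")
    starts = [start for start, _ in PARTITION_STARTS]
    # bisect_right finds the first start greater than entid; step back one.
    return PARTITION_STARTS[bisect_right(starts, entid) - 1][1]
```

For example, `partition_of(65536)` falls in `:db.part/user`, while any entid at or above 268435456 falls in `:db.part/tx`.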

Representation of assertions

Mentat assertions are represented as rows in the datoms SQLite table, and each Mentat row representing an assertion is tagged with a numeric representation of :db/valueType.

The tag is used to restrict queries by value type, so it is placed carefully in the relevant indexes to allow numeric longs and doubles to be searched quickly. The tag is also used to convert SQLite values to the correct Mentat value type on query egress.

The value type tag mapping is currently:

:db/valueType value type tag SQLite storage class examples
:db.type/ref 0 INTEGER 1234
:db.type/boolean 1 INTEGER 0 (false), 1 (true)
:db.type/long 5 INTEGER -4321
:db.type/double 5 REAL -0.369
:db.type/string 10 TEXT arbitrary textual data
:db.type/keyword 13 TEXT :namespaced/keyword

Observe that some Mentat value types share a value type tag: they are differentiated using SQLite's storage class.
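To make the egress conversion concrete, here is a minimal Python sketch that decodes a raw SQLite value using its tag and, for the shared tag 5, its storage class. The tag numbers are copied from the table above. The `decode` and `decode_all` helpers and the `demo` table are hypothetical and are not Mentat's actual code.

```python
import sqlite3

# Tag numbers copied from the table above; everything else is illustrative.
TAG_REF, TAG_BOOLEAN, TAG_NUMERIC, TAG_STRING, TAG_KEYWORD = 0, 1, 5, 10, 13


def decode(value, tag):
    """Turn a raw SQLite value plus its value type tag into (type, value)."""
    if tag == TAG_REF:
        return (":db.type/ref", int(value))
    if tag == TAG_BOOLEAN:
        return (":db.type/boolean", value != 0)
    if tag == TAG_NUMERIC:
        # Longs and doubles share tag 5; the storage class tells them apart.
        # Python's sqlite3 returns INTEGER as int and REAL as float.
        if isinstance(value, float):
            return (":db.type/double", value)
        return (":db.type/long", value)
    if tag == TAG_STRING:
        return (":db.type/string", value)
    if tag == TAG_KEYWORD:
        return (":db.type/keyword", value)
    raise ValueError(f"unknown value type tag: {tag}")


def decode_all():
    """Round-trip the example values through SQLite and decode them."""
    conn = sqlite3.connect(":memory:")
    # The v column has no declared type, so SQLite keeps each value's own
    # storage class instead of coercing it.
    conn.execute("CREATE TABLE demo (v, value_type_tag INTEGER)")
    conn.executemany(
        "INSERT INTO demo VALUES (?, ?)",
        [(1234, 0), (1, 1), (-4321, 5), (-0.369, 5), (":namespaced/keyword", 13)],
    )
    rows = conn.execute(
        "SELECT v, value_type_tag FROM demo ORDER BY rowid"
    ).fetchall()
    conn.close()
    return [decode(v, tag) for v, tag in rows]
```

The sketch relies on the value column being untyped, so that `-4321` comes back as an integer and `-0.369` as a float even though both rows carry tag 5.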

Representation as SQL tables

The authoritative table in the SQL store is the transactions table; from its contents, all other tables can be derived. Each assertion in a transaction (see Transacting) is represented as a row in the transactions table, which has columns roughly

e a v value_type_tag added tx

The added column is a boolean flag that is non-0 if the datom was added and 0 if the datom was retracted from the datom store. We index on tx so that we can quickly extract the datoms added or retracted as part of a particular transaction.
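The following Python sketch shows one way such a table and its tx index could look in SQLite. The column names follow the list above, but the exact column types, the index name, and both helper functions are assumptions made for illustration.

```python
import sqlite3


def make_transactions_table():
    """Create an in-memory store with a transactions table indexed on tx."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE transactions (
            e INTEGER NOT NULL,
            a INTEGER NOT NULL,
            v,                               -- untyped: keeps the storage class
            value_type_tag INTEGER NOT NULL,
            added INTEGER NOT NULL,          -- non-0 = added, 0 = retracted
            tx INTEGER NOT NULL
        )
        """
    )
    # The index on tx makes "what happened in transaction N?" cheap.
    conn.execute("CREATE INDEX idx_transactions_tx ON transactions (tx)")
    return conn


def datoms_in_tx(conn, tx):
    """Return the (e, a, v, value_type_tag, added) rows of one transaction."""
    return conn.execute(
        "SELECT e, a, v, value_type_tag, added FROM transactions "
        "WHERE tx = ? ORDER BY rowid",
        (tx,),
    ).fetchall()
```

Calling `datoms_in_tx(conn, tx)` returns both the additions and the retractions recorded under that transaction ID, distinguished by the `added` flag.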

The most important table in the SQL store is the datoms table. Queries extract data from the datoms table (see Querying). It is the materialized view of the transactions table, taking into account all transacted additions and retractions. Each assertion that currently holds is represented as a row in the datoms table, which has columns roughly

e a v value_type_tag tx FLAGS
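To illustrate what "materialized view of the transactions table" means, the sketch below replays a transaction log in Python and keeps only the assertions that still hold. It is a simplified model under stated assumptions, not Mentat's implementation: the `materialize` function is hypothetical, the flags column is omitted, and the attribute entids 100 and 101 in the sample log are invented.

```python
def materialize(transactions):
    """Replay (e, a, v, value_type_tag, added, tx) rows into current datoms."""
    current = {}
    # sorted() is stable, so rows within one transaction keep their input order.
    for e, a, v, tag, added, tx in sorted(transactions, key=lambda row: row[5]):
        # type(v) is part of the key so that the long 1 and the double 1.0,
        # which compare equal in Python, stay distinct.
        key = (e, a, type(v), v, tag)
        if added:
            current[key] = tx       # assertion: remember the asserting tx
        else:
            current.pop(key, None)  # retraction: the datom no longer holds
    return [(e, a, v, tag, tx) for (e, a, _, v, tag), tx in current.items()]


# A sample log: the second transaction replaces "hello" with "world".
LOG = [
    (65536, 100, "hello", 10, 1, 268435456),
    (65536, 101, 42, 5, 1, 268435456),
    (65536, 100, "hello", 10, 0, 268435457),
    (65536, 100, "world", 10, 1, 268435457),
]
```

After replaying `LOG`, the retracted `"hello"` assertion is gone, while the untouched assertion on attribute 101 keeps the tx that originally asserted it.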

We (really, SQLite) maintain several indexes and partial indexes to make particular types of queries efficient, at the cost of increased transaction time and database fragmentation. In particular, we maintain the same set of indexes that Datomic does:

Index Contains
EAVT all datoms
AEVT all datoms
AVET datoms with attributes that have :db/index or :db/unique
VAET datoms with attributes that have :db/valueType :db.type/ref

The EAVT and AEVT indexes make it efficient to enumerate entities and attributes, respectively. The AVET index makes it efficient to map attribute-value pairs to matching entities. The VAET index makes reverse lookups efficient: given an entity, find the datoms whose value refers to it.
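The sketch below shows how these four indexes might be expressed as ordinary and partial SQLite indexes. It is an assumption-laden illustration: the flag columns `index_avet` and `index_vaet`, the index names, the column order inside each index, and the two query helpers are all hypothetical, and the transaction component of each index is left out for brevity.

```python
import sqlite3


def make_datoms_table():
    """Create an in-memory datoms table with EAVT, AEVT, AVET and VAET indexes."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE datoms (
            e INTEGER NOT NULL,
            a INTEGER NOT NULL,
            v,
            value_type_tag INTEGER NOT NULL,
            tx INTEGER NOT NULL,
            index_avet INTEGER NOT NULL DEFAULT 0,  -- hypothetical flag column
            index_vaet INTEGER NOT NULL DEFAULT 0   -- hypothetical flag column
        );
        -- Full indexes: every datom appears in both.
        CREATE INDEX idx_datoms_eavt ON datoms (e, a, value_type_tag, v);
        CREATE INDEX idx_datoms_aevt ON datoms (a, e, value_type_tag, v);
        -- Partial indexes: only flagged datoms are indexed.
        CREATE INDEX idx_datoms_avet ON datoms (a, value_type_tag, v, e)
            WHERE index_avet != 0;
        CREATE INDEX idx_datoms_vaet ON datoms (v, a, e)
            WHERE index_vaet != 0;
        """
    )
    return conn


def entities_with(conn, a, v, tag):
    """AVET-style lookup: which entities have attribute a with value v?"""
    rows = conn.execute(
        "SELECT e FROM datoms WHERE a = ? AND value_type_tag = ? AND v = ? "
        "AND index_avet != 0 ORDER BY e",
        (a, tag, v),
    ).fetchall()
    return [e for (e,) in rows]


def referrers(conn, target):
    """VAET-style reverse lookup: which (e, a) pairs refer to `target`?"""
    return conn.execute(
        "SELECT e, a FROM datoms WHERE v = ? AND index_vaet != 0 ORDER BY e",
        (target,),
    ).fetchall()
```

Repeating the flag condition in each query's WHERE clause is deliberate: SQLite only considers a partial index when the query's conditions imply the index's own WHERE clause.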

Representation of metadata as SQL tables

The transactor maintains three metadata tables: idents, schema, and parts. These are materialized views capturing the current state of (the schema part of) the transactions table.

The idents table maintains the set of ident mappings from keyword ident (like :db/ident) to numeric entid (like 1). It looks like

ident entid
:db/ident 1
:db.part/db 2
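As a small illustration, the ident mapping can be queried in both directions. The sketch below assumes a two-column SQLite table with uniqueness constraints on both columns. The constraints and the helper names `make_idents`, `entid_for`, and `ident_for` are assumptions, not part of Mentat.

```python
import sqlite3


def make_idents(pairs):
    """Create an in-memory idents table from (ident, entid) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE idents ("
        "ident TEXT NOT NULL PRIMARY KEY, "
        "entid INTEGER NOT NULL UNIQUE)"
    )
    conn.executemany("INSERT INTO idents VALUES (?, ?)", pairs)
    return conn


def entid_for(conn, ident):
    """Resolve a keyword ident to its entid, or None if it is unknown."""
    row = conn.execute(
        "SELECT entid FROM idents WHERE ident = ?", (ident,)
    ).fetchone()
    return None if row is None else row[0]


def ident_for(conn, entid):
    """Resolve an entid back to its keyword ident, or None if it has none."""
    row = conn.execute(
        "SELECT ident FROM idents WHERE entid = ?", (entid,)
    ).fetchone()
    return None if row is None else row[0]
```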

The schema table maintains the flags and types of the Mentat schema. It looks like

ident attribute v value_type_tag
:db/txInstant :db/cardinality 31 0
:db/txInstant :db/index 1 1
:db/txInstant :db/valueType 25 0

Observe that the value type is stored with a value type tag of 0, since it is a reference to the entid whose ident is :db.type/long (in future, :db.type/instant), whereas the index flag is stored with a value type tag of 1, since it is a boolean. This may all change in future as we make the representation more compact or make certain operations more efficient.

The parts table maintains the partition ranges and especially the next ID to be allocated in each partition. It looks like

part start idx
:db.part/db 0 38
:db.part/user 65536 65536
:db.part/tx 268435456 268435457

Each transaction that allocates temporary IDs will increment the idx of the appropriate partition. Every transaction allocates a single transaction ID, so the :db.part/tx index should tick up regularly.
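The allocation behaviour described here can be sketched in a few lines of Python. The sketch assumes that idx holds the next entid to hand out and that temporary IDs are resolved into :db.part/user. The `Partitions` class and the `transact` function are hypothetical names used only for illustration.

```python
class Partitions:
    """In-memory model of the parts table: ident -> start and next idx."""

    def __init__(self, parts):
        # parts maps a partition ident to (start, idx), as in the table above.
        self.parts = {
            name: {"start": start, "idx": idx}
            for name, (start, idx) in parts.items()
        }

    def allocate(self, part, count=1):
        """Reserve `count` consecutive entids in `part` and return them."""
        entry = self.parts[part]
        first = entry["idx"]
        entry["idx"] = first + count
        return list(range(first, first + count))


def transact(partitions, tempids):
    """Allocate one tx entid, plus one user entid per distinct temporary ID."""
    tx = partitions.allocate(":db.part/tx")[0]
    # dict.fromkeys removes duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(tempids))
    entids = partitions.allocate(":db.part/user", len(unique))
    return tx, dict(zip(unique, entids))
```

Starting from the values in the table, a transaction with two distinct temporary IDs would advance the :db.part/user idx by two and the :db.part/tx idx by one.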