Bedrock
A truly composable data layer for urbit (and beyond).
The goal of bedrock is to make data composability a real possibility by centralizing application state data in a single agent with a consistent interface that other agents or front-end clients can interact with. This lets developers of one application surface data from another application in a repeatable way. Previous attempts at urbit composability have been stymied by the fact that each agent writes its own data interface, so each integration must be custom.
Bedrock takes the view that databases are a good technology and models its concepts on proven ones like tables, rows, and ids. Doing this makes caching solutions using sqlite3 (or your preferred technology) much easier, which enables the responsive performance that users demand and urbit often lacks.
Bedrock modifies the typical database idea by being distributed-by-default, though of course you can use it "by yourself" like a normal database. To this end, we require every row to belong to a "path" which is essentially a set of peers who should keep in sync with that data, and some meta-data rules about who can modify what. We use the one-at-a-time subscriptions pattern to keep peers up to date so that peers who go offline for long periods of time do not impose an ever increasing memory burden of "unsent pokes."
Bedrock provides a core set of "common" types for the typical social primitives that many apps require and re-invent over and over. Things like vote, react, and comment are relatively easy to provide and should prove useful, but a more interesting feature is relay, which is essentially a re-tweet for any arbitrary piece of data. We make use of the newly released remote-scry functionality within urbit to make this work. With it, an "urbit-feed" app that aggregates across many data-types and applications becomes possible.
Bedrock also has a "thread-poke" interface that gives users the more conventional API experience of a synchronous request/response cycle, as opposed to the default urbit model in which request and response are separate (which we also support).
Bedrock has two "special" tables, the paths-table and the peers-table, and then everything else. These two are closely related and could in fact be combined, but for ergonomic and conceptual reasons they are separate. These special tables are essential for the distributed nature of the database to work. Essentially, a path is "where" the data lives and the peers are who is "in" that path. Each row in the peers table is just a (path, ship, role) and some timestamps. %host is the only meaningful role; the meaning of every other role is up to application-specific logic. Currently each path has one %host, who is the "ultimate source of truth" for all the data in that path. Other distribution protocols are specifiable, but currently only %host is supported.
+$  peers  (map path (list peer))
::  when we create an object, we must specify who our peers are for the /path
+$  peer
  $:  =path  :: same as path.row
      =ship
      =role  :: %host or any other custom role
      created-at=@da
      updated-at=@da
      received-at=@da
  ==
The paths-table is a little more complicated, because it is the metadata container for the "rules" of the path. Basically, the path-row specifies who is allowed to create, modify, or delete what kind of data within that path. It also contains typical database "constraint" rules like "id column must be unique in the foo table".
+$  paths  (map path path-row)
+$  path-row
  $:  =path
      host=ship
      =replication
      default-access=access-rules  :: for everything not found in the table-access
      =table-access  :: allows a path to specify role-based access rules on a per-table basis
      =constraints  :: if there is not a constraint rule for a given type, the default constraints for types will be applied
      space=(unit [=path =role:membership])  :: if the path-row is created from a space, record the info
      created-at=@da
      updated-at=@da
      received-at=@da
  ==
+$  replication  ?(%host %gossip %shared-host)  :: for now only %host is supported
+$  table-access  (map type:common access-rules)
+$  access-rules  (map role access-rule)
+$  access-rule  [create=? edit=permission-scope delete=permission-scope]
+$  permission-scope  ?(%table %own %none)
::  by default the host can CED everything and everyone else can CED the objects they created
++  default-access-rules  (~(gas by *access-rules) ~[[%host [%.y %table %table]] [%$ [%.y %own %own]]])
+$  constraints  (map type:common constraint)
+$  constraint  [=type:common =uniques]
+$  uniques  (set unique-columns)  :: the various uniqueness rules that must all be true
+$  unique-columns  (set column-accessor)  :: names of columns that taken together must be unique in the table+path
++  default-vote-constraint  [%vote (silt ~[(~(gas in *unique-columns) ~[1 2 3 "ship.id"])]) ~]
++  default-rating-constraint  [%rating (silt ~[(~(gas in *unique-columns) ~[3 4 5 "ship.id" 2])]) ~]
++  default-constraints
  %-  ~(gas by *constraints)
  :~  [%vote default-vote-constraint]
      [%rating default-rating-constraint]
  ==
+$  column-accessor  ?(@ud tape)
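To make the uniqueness rules concrete, here is a small Python model of how a unique-columns rule such as the default vote constraint could be checked. It is only a sketch, not bedrock's Hoon implementation. It assumes that integer accessors are zero-based indexes into the row's data columns and that a string like "ship.id" is read as a Hoon wing (the ship inside the row's id). The names resolve, unique_key, violates, and make_vote are hypothetical helpers invented for this illustration.

```python
from typing import Any, Dict, Iterable, Sequence, Tuple, Union

# Stands in for the Hoon type ?(@ud tape).
ColumnAccessor = Union[int, str]
Row = Dict[str, Any]


def resolve(row: Row, accessor: ColumnAccessor) -> Any:
    """Fetch the value an accessor points at.

    Assumption: an int is a zero-based index into row["data"];
    a dotted string is a wing, read right to left ("ship.id" = ship of id).
    """
    if isinstance(accessor, int):
        return row["data"][accessor]
    value: Any = row
    for name in reversed(accessor.split(".")):
        value = value[name]
    return value


def unique_key(row: Row, unique_columns: Sequence[ColumnAccessor]) -> Tuple[Any, ...]:
    """The combination of values that must be unique within a table+path."""
    return tuple(resolve(row, accessor) for accessor in unique_columns)


def violates(existing: Iterable[Row], candidate: Row,
             unique_columns: Sequence[ColumnAccessor]) -> bool:
    """True if some existing row already holds the candidate's key."""
    key = unique_key(candidate, unique_columns)
    return any(unique_key(row, unique_columns) == key for row in existing)


# Mirrors ~[1 2 3 "ship.id"]: parent-type, parent-id, parent-path, creator.
VOTE_UNIQUE = (1, 2, 3, "ship.id")


def make_vote(ship: str, t: int, up: bool, parent_id: Tuple[str, int]) -> Row:
    """Hypothetical vote row; data is [up, parent-type, parent-id, parent-path]."""
    return {"id": {"ship": ship, "t": t},
            "data": [up, "comment", parent_id, "/demo/path"]}
```

Under these assumptions the default vote constraint means one vote per ship per parent object: a second vote by the same ship on the same parent is rejected, whether it is an upvote or a downvote.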
Urbit is a distributed network, so bedrock must be a distributed database. Paths and peers specify the rules around the data distribution, but how does that data actually sync up between ships? We use the one-at-a-time subscription model described here. When a new path is created, the peers in that path are all poked by the host agent, telling them they have been added to a path. In response, they update their own paths-table and peers-table to match, and then subscribe to the "next" update from the host. This subscription remains open until the host actually sends out some update (like creating a new row). They receive the update on the subscription, are kicked, mutate their own state to process the update, and then resubscribe for the next update. By forcing peers to subscribe for each update, we avoid the unbounded memory problem that undelivered pokes can have, since when a peer goes offline, they simply will not be re-subscribing to the next update, and won't be "bothering" the host anymore until they come back online, at which point the host just brings them up to speed on what they missed. This is essentially a custom implementation of solid-state subscriptions we developed which solves some of the issues we have seen in our development of chat-db.
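The subscribe, receive, kick, resubscribe loop can be modelled in a few lines of Python. This is a toy simulation, not bedrock's agent code. The Host and Peer classes are invented for illustration, and the catch-up mechanism (the peer tells the host how many updates it has applied, and the host replays the rest from its own log) is an assumption, since the text above only says the host brings the peer "up to speed".

```python
from typing import Dict, List


class Host:
    """Toy host: keeps the data log and at most one open subscription per peer."""

    def __init__(self) -> None:
        self.log: List[str] = []             # every update, in order
        self.waiting: Dict[str, "Peer"] = {}  # peers waiting for the "next" update

    def subscribe(self, peer: "Peer") -> None:
        missed = self.log[peer.cursor:]
        if missed:
            # The peer is behind: answer immediately with what it missed.
            peer.receive(missed)
        else:
            self.waiting[peer.name] = peer

    def publish(self, update: str) -> None:
        self.log.append(update)
        subscribers = list(self.waiting.values())
        self.waiting.clear()  # delivering an update ends the subscription (the kick)
        for peer in subscribers:
            peer.receive([update])


class Peer:
    def __init__(self, name: str, host: Host) -> None:
        self.name = name
        self.host = host
        self.online = True
        self.state: List[str] = []

    @property
    def cursor(self) -> int:
        """How many updates this peer has applied so far."""
        return len(self.state)

    def receive(self, updates: List[str]) -> None:
        if not self.online:
            return  # offline: nothing is applied and no resubscription happens
        self.state.extend(updates)  # mutate local state
        self.host.subscribe(self)   # then ask for the next update

    def come_online(self) -> None:
        self.online = True
        self.host.subscribe(self)
```

The point of the design shows up in Host.waiting: a peer that goes offline drops out of it after a single delivery attempt, so the host never accumulates a per-peer queue of unsent updates. When the peer returns and resubscribes, it is caught up from the log in one step.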
When the host adds a data-row to the path, it's no big deal, because he just sends out the update since he's the host. But what if a peer wants to add (or edit or delete) a data-row? The peer simply forwards its create/edit/delete poke to the host, who checks if that peer actually has permissions to modify the data in that way, and then the host does the modification and pushes the update to subscribers on the original peer's behalf. Simple.
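The permission check the host performs can be sketched as follows, again as a Python model rather than the real Hoon. It assumes the lookup order suggested by the type comments: the table-access entry for the row's type if there is one, otherwise default-access; then the rule for the peer's role, falling back to the empty role (%$ in the defaults) for "everyone else". The names AccessRule, rule_for, and allowed are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class AccessRule:
    create: bool
    edit: str    # permission scope: "table", "own", or "none"
    delete: str  # permission scope: "table", "own", or "none"


AccessRules = Dict[str, AccessRule]    # role -> rule
TableAccess = Dict[str, AccessRules]   # row type -> rules for that table

# Mirrors default-access-rules: the host can do everything, and everyone
# else (the empty role, %$) can create rows and edit/delete their own.
DEFAULT_ACCESS: AccessRules = {
    "host": AccessRule(True, "table", "table"),
    "": AccessRule(True, "own", "own"),
}


def rule_for(role: str, row_type: str, table_access: TableAccess,
             default_access: AccessRules) -> Optional[AccessRule]:
    """Pick the rule that applies to this role for this table (assumed order)."""
    rules = table_access.get(row_type, default_access)
    return rules.get(role, rules.get(""))


def allowed(action: str, actor: str, role: str, row_type: str,
            row_creator: Optional[str], table_access: TableAccess,
            default_access: AccessRules = DEFAULT_ACCESS) -> bool:
    """Would the host accept this create/edit/delete forwarded by `actor`?"""
    if action not in ("create", "edit", "delete"):
        raise ValueError(f"unknown action: {action}")
    rule = rule_for(role, row_type, table_access, default_access)
    if rule is None:
        return False
    if action == "create":
        return rule.create
    scope = rule.edit if action == "edit" else rule.delete
    if scope == "table":
        return True
    if scope == "own":
        return actor == row_creator
    return False
```

With the defaults, a peer may edit a comment it created but not one created by someone else, while the host may delete any row in the path.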
Bedrock is a distributed database, and the key word there is data. We provide several "common" types that we think will be useful, but we're not omniscient, so we allow users to specify (pretty much) any type they desire. They do this by choosing a uniquely-named data type, using the %general columns type, and specifying a schema. The schema is just an ordered list of [name, type] pairs, one per column in their custom %general type. In addition to all typical atom codes, we support string maps, sets, lists, and paths, as well as our own id type, which is useful for defining dependent relationships (a hallmark of any relational database).
+$  row
  $:  =path  :: application-specific logic about what this row is attached to (ie /space/space-path/app/app-name/thing)
             :: is used to push data out to peers list for that path
      =id:common
      =type:common  :: MUST always be same as table type
      v=@ud  :: data-type version
      data=columns  :: the actual content
      created-at=@da  :: when the source-ship originally created the row
      updated-at=@da  :: when the source-ship originally last updated the row
      received-at=@da  :: when this ship actually got the latest version of the row, regardless of when the row was originally created
  ==
+$  columns
  $%  [%general cols=(list @)]
      [%vote vote:common]
      [%rating rating:common]
      [%comment comment:common]
      [%tag tag:common]
      [%link link:common]
      [%follow follow:common]
      [%relay relay:common]
      [%react react:common]
  ==
+$  schemas  (map [=type:common v=@ud] schema)
+$  schema  (list [name=@t t=@t])  :: list of [column-name type-code]
::  allowable @t codes are:
::    @t @ud etc (any atom code)
::    id (for an id:common type of [=ship t=@da], useful for referencing other rows from within your custom-type)
::    unit (for a (unit @t) only)
::    path
::    list (for a list of @t)
::    set (for a set of @t)
::    map (for a map of @t to @t)
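As an illustration of how a schema constrains a %general row, the Python sketch below checks a list of column values against a list of (name, type-code) pairs. It is a model under stated assumptions, not bedrock's validation logic: values are shown already decoded into Python objects (in Hoon the columns are a list of atoms), only @t and @ud are checked precisely among atom codes, and check_value and validate are hypothetical names.

```python
from typing import Any, List, Sequence, Tuple

Schema = Sequence[Tuple[str, str]]  # ordered (column-name, type-code) pairs


def _all_str(values: Any) -> bool:
    return all(isinstance(v, str) for v in values)


def check_value(code: str, value: Any) -> bool:
    """Does a decoded value fit the given type code? (Simplified model.)"""
    if code == "@t":
        return isinstance(value, str)
    if code == "@ud":
        return isinstance(value, int) and not isinstance(value, bool) and value >= 0
    if code == "id":  # (ship, t) pair, like id:common
        return (isinstance(value, tuple) and len(value) == 2
                and isinstance(value[0], str) and isinstance(value[1], int))
    if code == "unit":  # (unit @t): either nothing or a string
        return value is None or isinstance(value, str)
    if code == "path":
        return isinstance(value, str) and value.startswith("/")
    if code == "list":
        return isinstance(value, list) and _all_str(value)
    if code == "set":
        return isinstance(value, (set, frozenset)) and _all_str(value)
    if code == "map":
        return (isinstance(value, dict) and _all_str(value.keys())
                and _all_str(value.values()))
    if code.startswith("@"):  # other atom codes are not modelled precisely here
        return isinstance(value, (int, str))
    return False


def validate(schema: Schema, cols: Sequence[Any]) -> List[str]:
    """Return a list of problems; an empty list means the row fits the schema."""
    if len(schema) != len(cols):
        return [f"expected {len(schema)} columns, got {len(cols)}"]
    return [f"column '{name}' is not a valid {code}"
            for (name, code), value in zip(schema, cols)
            if not check_value(code, value)]
```

For example, a hypothetical recipe type could declare the schema [("title", "@t"), ("servings", "@ud"), ("author", "id"), ("tags", "set")], and validate would then reject a row whose servings column is negative or whose column count does not match.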
As far as "common" types go, the ones we have defined so far are:
- %vote
- %rating
- %comment
- %tag
- %follow
- %relay
- %react
+$  id  [=ship t=@da]  :: ship is who created the row, t is when it was created since that's inherently unique in one-at-a-time only creation fashion
::  like/dislike upvote/downvote
+$  vote
  $:  up=?  :: true for like/upvote, false for dislike/downvote 0 -> 2
      parent-type=type  :: table name of the thing this vote is attached to 1 -> 6
      parent-id=id  :: id of the thing this vote is attached to 2 -> 14
      parent-path=path  :: 3 -> 30
  ==
::  5 star rating, 100% scoring, etc
+$  rating
  $:  value=@rd  :: the rating. any real number. up to app to parse properly
      max=@rd  :: the maximum rating the application allows. (useful for aggregating, and making display agnostic)
      format=@tas  :: an app-specific code for indicating what "kind" of rating it is (5-star or 100% or 7/10 cats or whatever)
      parent-type=type  :: table name of the thing being rated
      parent-id=id  :: id of the thing being rated
      parent-path=path
  ==
::  plain text snippet referencing some other object
+$  comment
  $:  txt=@t  :: the comment
      parent-type=type  :: table name of the thing being commented on
      parent-id=id  :: id of the thing being commented on
      parent-path=path
  ==
::  reaction (emoji)
+$  react
  $:  react=@t  :: the emoji code
      parent-type=type  :: table name of the thing being reacted to
      parent-id=id  :: id of the thing being reacted to
      parent-path=path
  ==
::  tag some <thing> with metadata (ex: 'funny' 'based' 'programming' etc)
+$  tag
  $:  tag=@t  :: the tag (ex: 'based')
      parent-type=type  :: table name of the thing being tagged
      parent-id=id  :: id of the thing being tagged
      parent-path=path
  ==
::  classic social graph information
+$  follow
  $:  leader=ship
      follower=ship
      domain=path  :: maybe I only want to follow ~zod's %recipes, not their %rumors posts
  ==
::  the relay table is necessary for making retweets work on urbit
::  the goal includes the ability to count retweets within a space
::  (should come with ability to relay to all paths or just to a
::  particular path)
+$  relay-protocol  ?(%static %edit %all)
::  %static relays never change
::  %edit relays will push new versions when edits come through
::  %all will also delete when/if the original is deleted
+$  relay
  $:  =id  :: the id of what is being relayed
      =type  :: type of what is being relayed
      =path  :: where the thing originally came from
      revision=@ud
      protocol=relay-protocol
      deleted=?
  ==
--
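The three relay protocols differ only in how a relay reacts to changes in the original row, which the following Python sketch tries to capture. It is an interpretation of the comments on relay-protocol, not the actual implementation. In particular, bumping revision by one on each propagated edit is an assumption about what the revision field is for, and Relay and apply_event are names made up for this example.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Relay:
    target_id: Tuple[str, int]  # id of what is being relayed
    target_type: str            # type of what is being relayed
    origin_path: str            # where the thing originally came from
    revision: int = 0
    protocol: str = "static"    # "static", "edit", or "all"
    deleted: bool = False


def apply_event(relay: Relay, event: str) -> Relay:
    """Return the relay as it looks after the original is edited or deleted."""
    if event not in ("edit", "delete"):
        raise ValueError(f"unknown event: {event}")
    if relay.deleted:
        return relay  # nothing further happens to a deleted relay
    if event == "edit" and relay.protocol in ("edit", "all"):
        # Assumption: each propagated edit bumps the revision counter.
        return replace(relay, revision=relay.revision + 1)
    if event == "delete" and relay.protocol == "all":
        return replace(relay, deleted=True)
    return relay  # %static ignores everything; %edit ignores deletes
```

Read this way, a static relay is a frozen snapshot, an edit relay follows edits but survives deletion of the original, and an all relay mirrors the original's full lifecycle.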