NOTE: This document is under heavy development and remains incomplete as we finalize the fifth revision.

The Distributed Web (dWeb)

Authors

Jared Rice Sr., Neo Thawreww, Shikhar Srivastava and Vinay Gupta

Versions

v1.0 - October 7th, 2018
v2.0 - August 26th, 2019
v3.0 - December 10th, 2020
v4.0 - February 8th, 2022
v5.0 - November 7th, 2022

Core Concepts

Distributed Networking
UniChains
Binary Interplanetary Transport Protocol (BIT)
MultiChains
dDatabases
dDrives
dMachines
dDNS
dIdentity // Under development
dCash // Under development
dTokens // Coming soon
dOrganizations // Coming soon
App-Specific DHTs // Coming soon

Manifesto

By: Jared Rice Sr.

Decades ago, we set out to preserve your freedoms on the Internet. Over the years, we have fought tyrannical governments, institutions, corporations, and politicians who have continued to do everything they can to erode those freedoms and attack the many of us who continue to innovate ways to bypass their constant attacks. Somewhere, there is someone who is reading this, tasked with figuring out a way to destroy the things that we are building, not because a crime has been committed, but because we’re doing the unthinkable – we’re standing up to the cabal and continuing to find ways to thwart the authoritarian machine on your behalf. We’re the only army standing between you and them. The good news is, the Web that we're referencing in this paper and its infrastructure, has already been online for over 20 years. We're not launching a network, we're not launching a coin, nor does the dWeb represent software specifically. We view the terms "dWeb," "Distributed Web" and "Decentralized Web" as a reference to what this specification sets out to define: an ecosystem of components that allow for the development of truly end-to-end decentralized and blockchain-less systems.

While many Cypherpunks over the years, like Hal Finney, have passed on to solve the many ciphers found within their next lives, many like Aaron Swartz and John McAfee have been murdered by globalist elites and many others like Julian Assange and Edward Snowden have been unfairly framed and prosecuted for exposing the misdeeds of those very same globalist elites. Meanwhile, other Cypherpunks have been prosecuted for developing software, like decentralized banking systems, that ultimately assist humanity in evading the globalists and the tyranny of their puppeteers - leading many software developers to operate under pseudonyms or move to "safe" jurisdictions where they can legally innovate. The Cypherpunks movement wasn't just about blockchain, cryptocurrency or Web3 as most people have been taught – these are new concepts, we have been doing this since the early 80s. As Eric Hughes said in 1993 within the original Cypherpunk manifesto, it was about using the Internet to not only preserve our online freedom, but freedom in general. We were established to preserve freedom everywhere, for everybody, even if it meant risking our own lives or freedom.

We knew math, God’s language of the Universe, was the answer to freeing the world from the centralized controls of the many tyrants who seek extreme levels of power, at the expense of your sanity. God gave us cryptography so that we could control our own destiny, our own money, our own information, our own identity and therefore, our own existence. It mathematically ensures that no government or entity is capable of invading our private lives or wholly preventing us from participating in the world God gave us the right to dwell in. It sets the laws, it authenticates, it arbitrates, and it settles the score.

Together, we set out to develop the Bitcoin project in 2006-2008, as an alternative to government-issued currencies, giving people a way to be their own bank, and many could argue it has been an unbridled success. We utilized many battle-tested technologies and protocols, like Proof-of-Work consensus and Berkeley DB, to build out a platform that we felt would help preserve your financial freedoms. While Bitcoin was far from perfect, it helped spearhead a digital revolution, showcasing the strengths and weaknesses of peer-to-peer networks, while exposing the power of cryptography to the masses. It was a great first step, although much has changed over the last 13 years.

The times we live in are unprecedented. Governments are no longer hiding their motives or their corruption as they blatantly poison the masses with misinformation and viruses. At the same time, they are doing everything they can to prevent us from developing alternatives to their centralized surveillance systems, by using the SEC and other governing bodies to “regulate” decentralized technologies. Nothing can be more oxymoronic, yet, many Cypherpunks have elected to switch sides, siding with governments rather than the people they set out to protect decades ago. They agree that governments should play a role and have backdoors into their systems, ultimately choosing centralization over decentralization, in return for regulatory approval and financial favors.

Over the past five years, misinformation has become the cabal’s greatest weapon. While many make everything about politics, the political landscape is only a means for the cabal to control our ideologies and therefore, our lives. In the same way, corrupt governments and their friends in big tech have somehow found ways to misinform the public about what is truly centralized and what is decentralized. Platforms like Coinbase and Crypto.com are being promoted to the masses as cryptocurrency wallets, when in reality, cryptographic features are never provided to their end users. This is an attempt by central banks and central governments to centralize what were once thought to be decentralized currencies and networks. In the same way, it’s the cabal’s attempt to misinform the masses about what cryptography truly is, because they know cryptography is the greatest threat to their existence. They know it’s the only way for humanity to truly prosper. They know it’s the one thing they’ll never beat or control. With cryptography, they know they’ve met their match.

Bitcoin and other blockchains have been invaded because the World Wide Web they were built on top of is inherently centralized and controlled by a handful of elites. It has also been realized that blockchain technology is inherently centralized as well. Over the past five years, we have been hard at work reimagining the web itself, so that we could not only build the currencies, apps and even operating systems of the future - we wanted to ensure it was a place where every square inch could only be controlled collectively by the people who use it, effectively keeping tyrants from forcefully centralizing it. Likewise, we wanted to ensure people were in control of their own data, whether it is currency or a social network post – allowing them to control their own destiny, as well as their own privacy. That web is the subject of this paper – and we call it the Distributed Web (dWeb).

Unlike Bitcoin, dWeb is just a specification that's built around the core ideas and discoveries of many of the industry's greatest peer-to-peer innovators. Its overall specification is derived from many of those technologies and protocols, like Hypercore, Dat, Paxos and DID, while bringing new discoveries of our own to the table, like off-chain currency, organizations and more. While many former Cypherpunks will most certainly rail against it, it is what they have been called upon to do by those in power. While many webs are being numbered (Web3, Web5 and now Web6), they're proving that the web has moved beyond iterations - we'll be at Web50 before you know it. The Distributed Web is the only Distributed Web that will ever exist and through this living document, we hope that it continues to grow as it has over the past 20 years. It's important to note that the technologies used to form this specification are the same technologies that powered popular decentralized and distributed file sharing networks and software platforms like BitTorrent, Kazaa, Limewire, BearShare and others; networks that withstood the attacks of globalist elites nearly 20 years ago and still stand today - we're just going back to what works. We're building on infrastructure that we know is decentralized and reliable - infrastructure that we know will survive the battles ahead.

While we only make up a small army of cryptographers, when the world adopts our creations, revolutions form and a much larger army appears on the horizon. Like the Apache Indians, our decentralized makeup will be impossible for the Incas to defeat. This is the Great Reawakening. This is our moment. This is our war to win; in fact, the war is already won, we just have to begin the fight. Lastly, we will never forgive, we will never forget and we are always watching. You can expect us when the terrain is rough. You will find us when freedom is in jeopardy. Whether it’s the invasion of our countries or our freedoms, one element remains the same within our Universe’s immutable state: in each invasion, God has triumphed – and we will triumph again.

Onward.

Jared Rice Sr.

7 November 2022

Introduction

Past iterations of the web have been designed around a simple concept: users interacting with two-dimensional and sometimes three-dimensional software applications. To expand on this concept, computer scientists have begun applying modern-day applications to the web, finding new and innovative ways to bring robots and devices into the mix. Some of these modern-day applications include artificial intelligence, machine learning, robotics, the Internet of Things (IoT), and immersive digital experiences such as virtual reality, augmented reality, and mixed reality.

The web as we know it today is centered around the client-server model, in which:

Applications are hosted in data centers and served to remote clients (user devices).
Robots and machine learning algorithms store data on servers that can only be accessed by related robots.
IoT devices store data on servers that can only be accessed by related devices.
Immersive digital experiences store data for millions of users within vast networks of data centers.

The client-server model has served its purpose well and has been instrumental in the development of the web, but I feel it is time we begin the transition to something better. My reasons for stating this are simple. At best, the client-server model decreases system performance and stifles innovation. At worst, the client-server model hinders user experience, reduces privacy, disregards human rights, and negatively impacts our planet’s climate. These issues are caused by three factors, all of which typically occur simultaneously:

The organic generation and distribution of data are halted when entities (humans, robots, devices, applications, etc.) are prevented from owning and controlling the data they create. In other words, this is a scenario in which the original state of data can be changed once the creating entity has relinquished control to a system that can alter the data. When data does not maintain its organic state, systems that utilize the data are no longer “organic machines”; rather, they are “artificial machines.” There can be no other definition of a machine whose state has been artificially altered.
Data is held in a silo, therefore outside entities are unable to consume the data in the same manner as those within. As a result, the data is closed off from public consumption and thereby limited in value, largely because the data could have been used by other entities for value-added purposes. Data that is held by centralized entities rarely benefits the author, financially or otherwise.
Data held in a centralized location leaves it vulnerable to attacks both technical and ethical in nature. For instance, consider that a web database is susceptible to hackers, or that a web application can be taken offline by a provider due to a policy disagreement, or that a human user and their digital existence can be removed from the web because of a particular viewpoint. Centralized technologies suffer from central points of failure; web operators have become de facto arbiters for the many entities that utilize systems under their control.

As it relates to the preceding factors, my primary concern as a systems engineer involves the central points of failure that can be found in today’s web-based systems. I have long pondered whether the combination of distributed data, distributed computing, and cryptography could enable the development of a perfect system. It is one of the more important questions I have sought to answer over the past twenty years, if not the most important.

The use of the word “organic” in my studies is purposeful. It is not often that the terms “organic” and “machine” are used together, but after years of researching the topic, I believe their marriage is inevitable and that their existence within the computer science field should be commonplace. Viewing entities and their data within a weblike fabric, much in the same way that a physicist views particles within the universe, was my greatest insight and the perspective I finally settled on.

I realized that when every web-based entity could operate and control its own cryptographically secure binary data structures, to which only it could append data, and when every web-based entity could openly announce, broadcast, and stream those data structures to other web-based entities so they could read or consume the data for their own purposes, we would be well on our way to creating a perfect system. A system of organic machines.

The Inception of Organic Machines

During the progression towards organic data distribution, I was intrigued by the potential of organic machines.

The basic idea of an organic machine is not as complex as one might think. Consider the above diagram in which a program could be appended to block #0 of the data structure and then computed by the entity’s underlying processor. The first computation of the program would store its output at block #1, the second computation of the program would store its output at block #2, and so forth. If the program were a simple counter, which added 1 to the value of the most recent computation’s output, the third computation would add 1 to the output of the second computation (“2”) and output “3.” Computation takes place on the entity’s local system where the data structure exists, meaning it uses the local processor of that entity to compute the program stored at block #0 (when asked to), thus altering the state of the machine. In this case, the state of the machine is “3” and with the next run it will be “4.” By definition, this is an example of a single-writer distributed state machine, otherwise known as a single-writer distributed Turing computer.

Note that the machine’s state is immutable, meaning we could rewind its entire state to the very first computation and be able to replay the machine’s entire lifespan. Also, any processor, regardless of its characteristics, could rewind the machine’s state to its genesis computation and recompute (validate) the entire computation history. After validating the computation history, the processor would end up with a replica of the original state, because it is computing the same program with the same inputs as the original processor. This data structure can be streamed to other entities, including other single-writer distributed Turing computers, and consumed however they deem fit. To extend the example a bit further, imagine a program on a remote computer that wanted to execute our computer’s simple counter program. The remote computer’s program could grab the output from our computer and use it as part of its own program, which it would then store within its own data structure. An outside computer consuming the data (state) of our computer is one thing, but it is substantially more when that same outside computer is actually executing our computer.
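The counter example and its replay can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class name SingleWriterMachine, the replay helper, and the starting value of 0 are hypothetical and are not part of the dWeb specification, and block #0 holds a label for the program rather than executable binary.

```python
from typing import Callable, List, Union

def counter_step(previous_output: int) -> int:
    """The simple counter program: add 1 to the most recent output."""
    return previous_output + 1

class SingleWriterMachine:
    """Hypothetical sketch: block #0 names the program, blocks #1..n hold outputs."""

    def __init__(self, program: Callable[[int], int], initial: int = 0) -> None:
        self.program = program
        self.initial = initial  # assumed starting value, not defined by the source
        self.blocks: List[Union[str, int]] = [program.__name__]  # block #0

    def run(self) -> int:
        # Use the latest output as input, or the initial value on the first run.
        previous = self.blocks[-1] if len(self.blocks) > 1 else self.initial
        output = self.program(previous)
        self.blocks.append(output)  # append-only: earlier blocks are never rewritten
        return output

def replay(program: Callable[[int], int], initial: int, runs: int) -> List[int]:
    """Recompute the whole history from the genesis computation."""
    state, outputs = initial, []
    for _ in range(runs):
        state = program(state)
        outputs.append(state)
    return outputs

machine = SingleWriterMachine(counter_step)
for _ in range(3):
    machine.run()

# Any other processor can validate the history by recomputing it.
history_is_valid = replay(counter_step, 0, 3) == machine.blocks[1:]
```

After three runs the machine's state is 3, and a fourth run would produce 4. The replay arrives at the same outputs because it executes the same program with the same inputs.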

It is important to point out that distributed Turing computers executing programs on other distributed Turing computers would not be possible using each entity’s underlying single-writer architecture, since each executing entity would require write privileges to the other entity’s underlying data structure (memory). This limitation presents a problem, since single-writer data structures were the undeniable solution to the three issues presented by the web’s client-server model, due in large part to their append-only, cryptographically secure state. This is because each data structure utilizes a keypair to ensure that only the data structure’s creator can append data to the structure (alter its state), which mathematically guarantees that tampering is an impossibility, and is what makes these data structures organic in the first place.

So how could we keep our distributed single-writer data structures in place while forming a multi-writer, highly cooperative environment around them? The answer to that question is what brought about the inception of the dWeb and an entirely new vision for the web at large.

Cooperative Organic Machines and The Particalized World of the dWeb

Our research led us to one solution: a mechanism that would combine single-writer data structures into a linearized or causally ordered feed of other single-writer data structures. It would look and act like a single-writer data structure but would compile its state from other single-writer data structures. In the case of organic machines, it would compile the program executions of multiple entities (participants) by collecting each of their computations from single-writer structures that are “plugged in” to the machine itself and combining them into a causally ordered or linearized view while preserving the integrity of the computations. Cooperative organic machines and multi-entity data views were born.
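One way to picture a linearized view is a merge of several single-writer feeds into one ordered list. The sketch below assumes each entry carries a logical (Lamport-style) clock and that ties are broken by writer name. Both choices are assumptions made for illustration; the source does not state which ordering rule is used, and the names Entry and linearize are hypothetical.

```python
import heapq
from typing import Dict, List, NamedTuple

class Entry(NamedTuple):
    clock: int    # assumed logical clock, increasing within each feed
    writer: str   # the single writer that appended this entry
    payload: str

def linearize(feeds: Dict[str, List[Entry]]) -> List[Entry]:
    """Merge per-writer feeds into one deterministic, ordered view.

    Each feed must already be sorted by clock. The feeds themselves are
    only read, never modified.
    """
    return list(heapq.merge(*feeds.values(), key=lambda e: (e.clock, e.writer)))

alice = [Entry(1, "alice", "a1"), Entry(3, "alice", "a2")]
bob = [Entry(2, "bob", "b1"), Entry(3, "bob", "b2")]

view = linearize({"alice": alice, "bob": bob})
payloads = [entry.payload for entry in view]
```

Every participant that merges the same feeds with the same rule arrives at the same view, while each underlying feed keeps its single writer.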

We aptly called dWeb single-writer data structures “UniChains,” the combined view of multiple UniChains, a “MultiChain,” and our multi-tenant distributed Turing machine, which derives its multi-entity computation history from a combined memory of multiple UniChains compiled into a single MultiChain, a “dMachine.” Consider for a moment that UniChains, MultiChains and dMachines exist within a single file containing pure binary that can be exchanged between billions of peers in a matter of seconds, or even milliseconds, and that a dMachine can handle billions of computations in a single second.

The speed at which 10,000 robots can share UniChains with one another is game-changing. Even more so is the fact that these 10,000 robots can constantly replicate and consume data from billions of other unrelated UniChains and MultiChains while simultaneously executing billions of dMachine computations. It is fascinating to imagine all the ways in which entities across our planet will analyze the data generated by these endless computations. To think, smart contracts without the need for expensive blockchain-based consensus. I will let your mind delve into the waters of our blockchainless future.

The dWeb is about open data and connecting the world in ways never before seen. Since dWeb-based data is open, omnipresent, and decentralized, entities like apps operate within the fabric in the same way as other entities. As a result, apps compete around data rather than over data. Yet all entities retain the ability to maintain their own content policies by aggregating the UniChains of entities that use their platforms into a collective MultiChain and using the structure’s built-in filters to create a “custom view” of the underlying UniChains. The dWeb enables a balance between individual entities that want to control their data and apps that want to control their content policies.

While there are those that may view this as a form of censorship, consider that a UniChain can only be edited and distributed by the individual entity that created it. If the individual entity continues to distribute the data, other apps can still consume the data while others may choose to filter it.

It is important to note that dWeb-compliant apps cannot alter a UniChain’s original data. Instead, apps create their own custom collection from the data of others, which itself is an organic form of data creation so long as the original source of data and its state remain unmutated and consumable by other entities. For instance, Social Network A may choose to filter an individual entity’s posts, but those posts may appear on Social Network B because it is replicating and consuming the very same posts.

The concept of distributed, unfiltered, multi-entity data consumption is central to the dWeb, much in the same way that the client-server model has been central to the web’s past iterations. Distributed, unfiltered, multi-entity data consumption will help us transition from an era of “big data” to an era where data openness is commonplace, privacy and user control are paramount, and digital experiences are simply collective “views” of the world’s data. This transition forces the ideas of tomorrow to compete around technology, experience, and design, speeding up the rate of Internet-based innovation and all but ending the business of data hoarding.

My hope is that by allowing countless numbers of tiny, ever-expanding data particles to roam the world as digital nutrients for countless robots, devices, computers, humans, and other intelligent entities, it will bring us closer to solving one of our planet’s most pressing problems: climate change. I must admit, it can be a bit awe-inspiring when put in this context, and I hope that it encourages a new generation of problem solvers to contribute a few bits of data to our collective intelligence. You never know what your contribution today might help compute tomorrow. It just might lead to a “perfect system,” or even the cure to save our beautiful planet.

True end-to-end decentralization has arrived. Welcome to the dWeb.
-Neo

Distributed Networking

As you will learn in this paper, the dWeb can be thought of as a multitude of entities (humans, robots, devices, apps, etc.) announcing and streaming distributed data structures among one another. These data structures include UniChains (single-writer ledgers), MultiChains (a causally ordered or linearized view of multiple UniChains), dMachines (programmable and cooperative UniChains), dDrives (UniChain-based file systems), and dDatabases (UniChain-based, binary tree, key-value stores). Essentially, one could state that the dWeb is composed of UniChains, and they would be correct. UniChains and higher-level data structures are identifiable on the Internet by a unique SHA-256-based cryptographic network address, which is discoverable using dWeb’s distributed hash table.

dWeb-Compliant Distributed Hash Tables

The creator of a dWeb data structure utilizes a dWeb-compliant Kademlia distributed hash table (DHT) to announce and store the network address associated with their data structure. The creator then lists themself as the initial seeder of the data structure by publishing their IP address and port number, alongside their dWeb network address, so that peers can connect and request the underlying data. Peers that download the data can then announce their connection details under the same dWeb network address so that future peers can request the data from them. The more peers that announce themselves as seeders under a given dWeb network address, the larger the “swarm” and the more distributed the data becomes. The DHT allows dWeb entities, such as UniChains and dDrives and the peers that host them, to be quickly and easily located.

DHTs are reliable, scalable, highly distributed key-value stores that ensure redundancy for networking data. DHTs have been implemented and tested by numerous applications, one of the more notable being the popular file sharing network BitTorrent. The DHT’s information is evenly distributed across participating entities, with each storing a small portion of information related to network addresses and peers, enabling entities to self-organize into a network and communicate with one another.

A pseudo-representation of a dWeb-compliant DHT would look as follows if the DHT contained 3 keys:

Key A             Key B             Key C
1.11.5.3:5000     5.17.19.6:150     12.5.19.3:121
5.22.13.1:1200    6.12.5.111:222
6.77.11.1:88

If an entity queried the DHT for Key A, the DHT would return all three IP addresses listed below the key. The entity could then choose which peers to connect to and request the data related to Key A.
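The announce and query steps can be sketched with a toy, in-memory stand-in for the DHT. This is not a distributed implementation: a single dictionary plays the role of the whole table, and the class name ToyDHT and its methods are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, List

class ToyDHT:
    """In-memory stand-in: maps a network address (key) to announced peers."""

    def __init__(self) -> None:
        self._peers: DefaultDict[str, List[str]] = defaultdict(list)

    def announce(self, key: str, host: str, port: int) -> None:
        """A seeder lists its connection details under the key."""
        peer = f"{host}:{port}"
        if peer not in self._peers[key]:
            self._peers[key].append(peer)

    def lookup(self, key: str) -> List[str]:
        """Return every peer announced under the key (a copy, not the live list)."""
        return list(self._peers.get(key, []))

dht = ToyDHT()
dht.announce("Key A", "1.11.5.3", 5000)   # the creator, as initial seeder
dht.announce("Key A", "5.22.13.1", 1200)  # later peers join the swarm
dht.announce("Key A", "6.77.11.1", 88)
dht.announce("Key C", "12.5.19.3", 121)

peers_for_a = dht.lookup("Key A")
```

Querying "Key A" returns the three peers from the table above, and the requesting entity is free to choose which of them to connect to.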

What makes a DHT so efficient is the manner in which it distributes data among participants. Entities that query a DHT become hosts for a select portion of its data. The DHT organizes this data strategically among participants so that upon request, a particular piece of information can be located within a minimal number of network hops.
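Without going into Kademlia's routing details, the idea behind this placement can be hinted at with its distance metric: node identifiers and keys share one numeric space, and the distance between two identifiers is their bitwise XOR. The sketch below assumes identifiers are produced by hashing a name with SHA-256, which is an illustrative choice, and the helper names are hypothetical.

```python
import hashlib
from typing import List

def make_id(name: str) -> int:
    """Assumed ID scheme for illustration: SHA-256 of a name, read as an integer."""
    return int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest(), "big")

def xor_distance(a: int, b: int) -> int:
    """Kademlia's notion of distance between two identifiers."""
    return a ^ b

def closest_nodes(key: int, nodes: List[str], k: int = 2) -> List[str]:
    """Pick the k nodes whose IDs are nearest to the key; these would hold its data."""
    return sorted(nodes, key=lambda n: xor_distance(make_id(n), key))[:k]

nodes = ["node-1", "node-2", "node-3", "node-4", "node-5"]
key = make_id("Key A")
holders = closest_nodes(key, nodes, k=2)
```

Because every participant can compute the same distances, a query can be steered toward nodes that are progressively closer to the key rather than flooding the network.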

This paper does not intend to explain the inner workings of Kademlia DHT. Those looking for an in-depth explanation should read Kademlia’s official whitepaper located here.

UniChains

A UniChain is a reference implementation of a single-writer append-only ledger (SWL) that is designed to be exchanged between participants of a peer-to-peer network, without the risk of peers, other than the creator, altering its overall state. Every UniChain has a key-pair in which the private key holder is the only entity that can append data to the ledger. Entities that possess the public key can validate the authenticity of data within the ledger. Together, these traits make a UniChain a trustless distributed data structure.

From a computer science perspective, UniChains are binary append-only data structures that can be streamed between multiple network participants. The contents of these data structures are cryptographically hashed and signed. UniChains are identified internally by signed Merkle trees and are identified on the dWeb using a public key, which is discoverable using a chainID (dWeb network address) that is derived from the public key and stored within a dWeb-compliant DHT.

When a UniChain is created, a public/private key-pair is generated and the UniChain “type” and public key are appended to the UniChain’s genesis block (block 0). The genesis block is considered the UniChain’s “header block.” Unlike blockchains, a UniChain only stores data for the entity that can append data to the ledger, and blocks are not added in time intervals, rather, they are added only when data is appended to the ledger.
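As a rough sketch of these creation steps, the snippet below derives a chainID from a public key and builds a header block. Several things here are assumptions: the public key is a random 32-byte placeholder (Python's standard library has no signature key generation), the chainID is taken to be a plain SHA-256 digest of the public key, and the field names in the header block are hypothetical.

```python
import hashlib
import secrets
from typing import Dict, Union

def derive_chain_id(public_key: bytes) -> str:
    """Assumed derivation: hex-encoded SHA-256 of the public key."""
    return hashlib.sha256(public_key).hexdigest()

def create_genesis_block(chain_type: str, public_key: bytes) -> Dict[str, Union[int, str]]:
    """Block 0 (the header block) records the UniChain type and public key."""
    return {"index": 0, "type": chain_type, "public_key": public_key.hex()}

# Placeholder for a real public key; a real key-pair would come from a
# signature scheme, which is outside this sketch.
public_key = secrets.token_bytes(32)

chain_id = derive_chain_id(public_key)
genesis = create_genesis_block("unichain", public_key)
```

The chainID can be announced in the DHT without revealing anything that allows writing to the chain, since write access depends on the private key alone.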

A pseudo-representation of a UniChain would look as follows:

Data Integrity

Data within a UniChain is stored as “blocks” which are identified by an “index.” Each block is signed by its creator so that a UniChain can be audited as to whether the data stored within it aligns with the hashes in the Merkle tree. This structure ensures that peers can request and stream (exchange) a specific block or block range with other peers, rather than the entire UniChain, while still being able to validate the partial chain.

Merkle trees are utilized within UniChains to create a way of identifying the content of a dataset by using hashes. The concept is simple: if the underlying content of a UniChain changes, the hash changes. When a block is added to the chain, a UniChain functions as a ledger that calls the “append()” mutation, thereby adding a new leaf to the tree and generating a new root hash. A UniChain’s private key is used to sign the root hash every time a new root hash is generated. This digital signature is sent to recipients along with the root hash so that recipients can verify its integrity.
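The relationship between blocks, the root hash, and partial validation can be sketched as follows. This is a simplified illustration and not the UniChain tree layout: the prefix bytes, the rule of promoting an unpaired node to the next level, and all function names are assumptions. Signing the root hash is omitted because Python's standard library does not provide asymmetric signatures.

```python
import hashlib
from typing import List, Tuple

def leaf_hash(block: bytes) -> bytes:
    # Assumed prefix bytes keep leaf hashes distinct from parent hashes.
    return hashlib.sha256(b"\x00" + block).digest()

def parent_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def _next_level(level: List[bytes]) -> List[bytes]:
    parents = [parent_hash(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
    if len(level) % 2 == 1:
        parents.append(level[-1])  # an unpaired node is promoted unchanged
    return parents

def merkle_root(blocks: List[bytes]) -> bytes:
    if not blocks:
        raise ValueError("a chain has at least its header block")
    level = [leaf_hash(b) for b in blocks]
    while len(level) > 1:
        level = _next_level(level)
    return level[0]

def inclusion_proof(blocks: List[bytes], index: int) -> List[Tuple[bytes, bool]]:
    """Sibling hashes from leaf to root, each flagged if it sits on the left."""
    level = [leaf_hash(b) for b in blocks]
    proof = []
    while len(level) > 1:
        sibling = index ^ 1
        if sibling < len(level):
            proof.append((level[sibling], sibling < index))
        level = _next_level(level)
        index //= 2
    return proof

def verify_block(block: bytes, proof: List[Tuple[bytes, bool]], root: bytes) -> bool:
    """Check a single block against the root without the rest of the chain."""
    current = leaf_hash(block)
    for sibling, sibling_is_left in proof:
        current = parent_hash(sibling, current) if sibling_is_left else parent_hash(current, sibling)
    return current == root

chain = [b"header", b"block 1", b"block 2"]
root_before = merkle_root(chain)
chain.append(b"block 3")           # append() adds a leaf and yields a new root
root_after = merkle_root(chain)
proof = inclusion_proof(chain, 2)  # what a peer needs to validate block 2 alone
```

A peer that trusts the signed root hash can accept block 2 with only the short proof, which is what makes partial replication verifiable.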

User-Controlled Data

The dWeb is built around a simple concept: that each entity (human, device, robot, application, etc.) should maintain its own ledger or ledgers independent of other entities. This concept allows for the creation of custom materialized views surrounding multiple UniChains, resulting in a more organic and open system.

This concept also ensures that entities can control their own data and, importantly, that applications can develop views around the data of other entities using their own content policies. This duality provides a much-needed balance between users and applications, forcing developers to focus on innovation and experience rather than the simple ownership of data. Applications that choose to develop views (experiences) that filter particular data, will do so knowing that other apps can develop their own respective views around the same data, considering that all of it is open.

This fundamental change forces apps to compete around data rather than over data. For example, Social Network A may choose to filter some of user Bob’s posts, but those very same posts might appear on Social Network B because it is consuming Bob’s posts and choosing to filter them in a different way. Keep in mind, the data in this example belongs to Bob and not the applications that consume it. In other words, while Social Network A may choose to filter Bob, his data remains ever-present and unaltered since Bob is the only entity that can change his data. Consider also that Social Network C, or any other application for that matter, may choose to consume and materialize Bob’s data in a unique way by simply using different filters.
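The Social Network A and B example reduces to two filters applied to the same unaltered source. In this minimal sketch the post contents, the filter rules, and the build_view helper are all hypothetical; Bob's chain is modeled as an immutable tuple to reflect that consuming apps cannot change it.

```python
from typing import Callable, List, Tuple

# Bob's posts, as consumed by every app. A tuple cannot be modified in place.
bob_chain: Tuple[str, ...] = (
    "post about cats",
    "post about politics",
    "post about the dWeb",
)

def build_view(chain: Tuple[str, ...], keep: Callable[[str], bool]) -> List[str]:
    """An app's materialized view: a filtered copy, never an edit of the source."""
    return [block for block in chain if keep(block)]

# Each app applies its own content policy to the same underlying data.
network_a = build_view(bob_chain, lambda post: "politics" not in post)
network_b = build_view(bob_chain, lambda post: True)
```

Social Network A hides one post, Social Network B shows all three, and Bob's chain is identical before and after both views are built.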

Live UniChain Replication

As mentioned previously, peers of a given UniChain can stream it live, so long as the requestor indicated this during the BIT handshake phase. Since the UniChain’s creator is live replicating by default, peers can do the same by live replicating their version to other peers, and so forth. Below is a pseudo-representation of live replication:

This type of replication makes UniChains a perfect fit for live media streaming and real-time communications like voice, video, and text. Real-time communication between two or more parties is made possible by MultiChains, causal streaming, and peers agreeing to live replicate their data among one another. You can learn more about MultiChains here.
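The chaining of live replication from peer to peer can be illustrated with a small in-process model. It involves no networking and no BIT handshake: the live flag simply stands in for the requestor's choice, signature checks are omitted, and the Peer class and its methods are hypothetical names.

```python
from typing import Callable, List

class Peer:
    """Holds a copy of one UniChain and can serve it to other peers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: List[bytes] = []
        self._live_requesters: List[Callable[[int, bytes], None]] = []

    def receive(self, index: int, block: bytes) -> None:
        if index != len(self.blocks):
            return  # accept blocks only in order; a real peer would also verify them
        self.blocks.append(block)
        for push in list(self._live_requesters):
            push(index, block)  # forward to peers that asked for live updates

    def append(self, block: bytes) -> None:
        """Meant for the creator only, the one entity allowed to extend the chain."""
        self.receive(len(self.blocks), block)

    def serve(self, requester: "Peer", live: bool) -> None:
        for index, block in enumerate(list(self.blocks)):
            requester.receive(index, block)  # send the existing blocks first
        if live:
            self._live_requesters.append(requester.receive)

creator, peer_one, peer_two, peer_three = (Peer(n) for n in ("creator", "one", "two", "three"))
creator.append(b"block 0")

creator.serve(peer_one, live=True)     # peer one follows the creator live
peer_one.serve(peer_two, live=True)    # peer two follows peer one, not the creator
creator.serve(peer_three, live=False)  # peer three takes a one-time copy

creator.append(b"block 1")
```

After the second append, peers one and two hold both blocks, while peer three still holds only the block it copied.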

Chainstore Management and Replication

The private key related to a UniChain is stored locally on the creating entity’s system, which is also the only system that can append blocks to the chain. This attribute presents a problem when a creating entity needs to mutate its UniChain from multiple systems (for example, a human entity mutating its chain from a desktop and a mobile device). Additionally, an entity will almost certainly create many UniChains as it travels through the dWeb, and replicating all of those chains leads to a large number of connections to and from the entity’s network.

Considering these problems, a model was needed for managing and replicating UniChains and synchronizing write privileges (private keys) across multiple systems. Chainstore is that model and is essentially a hive of UniChains consisting of a default UniChain that derives the sub-UniChains below it.

Below is a pseudo-representation of sub-UniChains within a Chainstore:

Key-pairs corresponding to the sub-UniChains derive from the keys of the default UniChain, meaning that write access to the hive can be obtained by possessing the keys to the default UniChain.
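The source does not spell out how sub-UniChain keys are derived, so the following is only one plausible sketch: each sub-key seed is computed as an HMAC-SHA256 of a sub-chain name under the default UniChain's secret key. The function name derive_sub_key, the use of names as derivation input, and the random placeholder for the default key are all assumptions.

```python
import hashlib
import hmac
import secrets

def derive_sub_key(default_key: bytes, name: str) -> bytes:
    """Assumed scheme: deterministic 32-byte seed for a named sub-UniChain."""
    return hmac.new(default_key, name.encode("utf-8"), hashlib.sha256).digest()

# Placeholder for the default UniChain's secret key.
default_key = secrets.token_bytes(32)

profile_key = derive_sub_key(default_key, "profile")
posts_key = derive_sub_key(default_key, "posts")

# Another device holding the same default key re-derives the same sub-keys.
posts_key_on_other_device = derive_sub_key(default_key, "posts")
```

Under a scheme like this, synchronizing the default key to a second system is enough to give it write access to every sub-UniChain in the hive, which matches the property described above.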

Chainstore provides a protocol for synchronizing write privileges of the underlying hive, or a specific sub-UniChain, to remote entities, known as the Distributed UniChain Key Syncing (DUCKS) Protocol. For more information on the DUCKS protocol, read the DUCKS Whitepaper located here.

Chainstore has modules for generating, retrieving, and live replicating UniChains, even UniChains to which an entity does not possess write privileges. These modules, for instance, make it possible for an entity like a web application to easily generate and interact with UniChains on another entity’s system (a human entity’s computer). In this way, a web application can utilize the Chainstore module to initiate the synchronization of write privileges to multiple systems (a human entity’s mobile device or tablet) so that an entity can mutate its UniChains from multiple systems.

Use Cases for SWLs

Given that a UniChain is a single trustless binary file that can be used to represent any type of data distributed to any Internet entity, one could consider a UniChain to have an unlimited number of use cases. Data is not limited to web applications; data within UniChains can take on all shapes and sizes. For instance, as will be discussed in dDrives, a UniChain can be used to store the binary representation of an entire file system, which can then be exchanged between peers. Human entities can use UniChains to store data related to applications and their devices. Robots can use UniChains to store data related to their machine learning algorithms. Devices can use UniChains to store data related to their outputs. Developers can use UniChains to store the files related to an application so that the app is distributed and completely serverless. Whether distributed databases for web applications, distributed file systems, distributed machine learning, distributed IoT, or distributed computers, UniChains work with virtually any type of data.

Binary Interplanetary Transport Protocol

HTTP (Hypertext Transfer Protocol) is an application layer protocol used throughout the web for transferring information between servers and clients. Generally speaking, an application layer protocol defines the process for how clients on different systems transfer messages between one another. HTTP is an example of the client-server model. In this model, a client requests a resource located at an IP address or a domain that resolves to an IP address using an application such as a telnet client or a web browser. The request is handled by a server, which processes it and responds with a specific type of information.

The IP address can be thought of as a globally recognized address for a server (like a post office address is for mail) and is used to ensure that a client request arrives at the correct network address so that it can be properly handled and answered. The process starts when a client packages an HTTP request inside of a TCP or UDP message, which is packaged inside of an IP message containing the client’s IP address and the server’s IP address. The resulting message is transported over the Internet until it arrives at the router assigned to the server’s IP address. Once there, the assigned router sends the message to the appropriate server.

In most cases, the server employs a web application to handle the HTTP request after the IP and other information have been stripped away from the message. If the request is for a file such as “/index.html,” the server will package an HTTP response message containing the HTML (Hypertext Markup Language) from the “index.html” file and send the message to the IP address listed in the client’s request. If the file is missing or does not exist, the server will send a message back to the client indicating that the file does not exist.

Other communication protocols have been created for querying server-based data structures (e.g., MySQL databases) and transporting responses back to clients, much like HTTP transfers HTML, JSON, CSS, JavaScript, and media between servers and clients. The client-server model has been fundamental in the development of Web1 and Web2 and is used by blockchains and other Web3 technologies that rely on HTTP.

The dWeb enables the transition away from the client-server model and HTTP, making possible a serverless web that utilizes a single transport protocol. This new transport protocol, which is a wire-based protocol that transfers pure binary, leaving the process of data conversion to the consuming application, is known as the Binary Interplanetary Transport (BIT) Protocol. Whether HTML, JSON, CSS, or JavaScript, data is stored as binary within a UniChain, and when that data is requested by another peer it is transported as binary over BIT to the requesting peer. Computers and network devices speak machine code at their lowest level; thus, utilizing binary yields a very efficient means of data transmission.

BIT Request Types

BIT request messages are sent directly to peers (after peers have been discovered on the DHT in relation to a particular UniChain) similar to how HTTP request messages are sent from client to server. For example, if a dWeb-compliant browser were to look up a dWeb network address on a dWeb-compliant DHT, the DHT would return the peers and their network information back to the browser. From there, the browser would make a connection to one or more peers and begin sending request messages related to the underlying UniChain.

The requestor and peers would then exchange a series of messages until the peers begin streaming the requested data back to the requestor. The BIT protocol consists of various message types that enable peers to negotiate and transfer data between one another. Each message type is defined below:

Type	Name	Meaning
0	Feed	I want to talk about a particular dWeb address
1	Handshake	I want to negotiate how we can communicate
2	Info	Whether I'm starting or stopping uploading or downloading
3	Have	I have some data you said you wanted
4	Unhave	I no longer have data that I said I had
5	Want	This is the data I want
6	Unwant	I no longer want this data
7	Request	Please send me this data now
8	Cancel	Cancel this request
9	Data	Here is the data you requested
10-14	Unused	n/a
15	Custom	Custom extension message
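
As a small illustration, the table above can be expressed as a lookup that a BIT implementation might use when decoding a message header. The numbers and names come straight from the table; the function name messageName and the choice to label 10 through 14 as "Unused" at lookup time are assumptions of this sketch.

```javascript
// Message type numbers and names copied from the table above.
const MESSAGE_TYPES = Object.freeze({
  0: 'Feed',
  1: 'Handshake',
  2: 'Info',
  3: 'Have',
  4: 'Unhave',
  5: 'Want',
  6: 'Unwant',
  7: 'Request',
  8: 'Cancel',
  9: 'Data',
  15: 'Custom'
});

// Resolve a numeric type to its name; types 10-14 are reserved and reported as "Unused".
function messageName(type) {
  if (!Number.isInteger(type) || type < 0 || type > 15) {
    throw new RangeError('BIT message types range from 0 to 15');
  }
  return MESSAGE_TYPES[type] || 'Unused';
}

console.log(messageName(7));
```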

dWeb Data Model

On the dWeb, peers can exchange any kind of data. UniChains represent the first standardized distributed dataset for the dWeb, and they are used in high-level abstractions such as dDrives and dDatabases. Other custom high-level abstractions involving a UniChain can be created, but these abstractions remain UniChains at their lowest level. Put simply, any and all datasets on the dWeb are composed of blocks in a linearized chain, each of which is filled with abstract blobs of binary data.

The BIT protocol can work with any UniChain abstraction, so long as the abstraction contains a list (log or ledger) of variable-sized blocks of bytes. A UniChain is a single-writer, immutable, cryptographically secure, append-only log, which means that new blocks can be added to the end by only the UniChain’s creator, and existing blocks cannot be removed or altered by anyone, even the creator. The following can be said of the data model used to exchange information between peers:

Each block in the chain is separated by boundaries so that the chain can be sent over several BIT protocol messages.
Each block has a corresponding hash to verify the integrity of the data.
Parent hashes are used to verify the integrity of two other hashes; parent hashes form a data structure known as a Merkle tree.
Block hashes are even-numbered, and parent hashes are odd-numbered.
Hash trees allow peers to validate a specific block or range of blocks without needing to download the entire Merkle tree.
Each time a chain’s creator appends data to the chain, a block is added, and a new root hash is calculated and signed with the creator’s private key.
When downloading a chain or a specific block, a peer can use the chain’s public key to verify its signature, which in turn verifies the integrity of other blocks and hashes.
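
The properties listed above can be sketched in a few lines. This is an illustration under stated assumptions rather than the UniChain implementation: SHA-256 stands in for whatever hash function a real UniChain uses, an unpaired hash is simply carried up a level, and the names flatIndex and merkleRoot are hypothetical. The sketch shows the even/odd numbering of block and parent hashes, and a root hash that is signed by the creator and verified with the public key.

```javascript
const crypto = require('crypto');

// In-order ("flat") numbering: block hashes land on even indexes, parent hashes on odd ones.
function flatIndex(depth, offset) {
  return (1 + 2 * offset) * 2 ** depth - 1;
}

// Hash the concatenation of the given buffers (SHA-256 is an assumption of this sketch).
function sha256(...parts) {
  const h = crypto.createHash('sha256');
  for (const part of parts) h.update(part);
  return h.digest();
}

// Fold block hashes pairwise into a single root hash.
function merkleRoot(blocks) {
  if (blocks.length === 0) throw new Error('a chain needs at least one block');
  let level = blocks.map((block) => sha256(block));
  while (level.length > 1) {
    const next = [];
    for (let i = 0; i < level.length; i += 2) {
      // An unpaired hash is carried up unchanged (a simplification).
      next.push(i + 1 < level.length ? sha256(level[i], level[i + 1]) : level[i]);
    }
    level = next;
  }
  return level[0];
}

// The creator signs the root; any peer can check it with the chain's public key.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');
const blocks = ['a', 'b', 'c', 'd'].map((s) => Buffer.from(s));
const root = merkleRoot(blocks);
const signature = crypto.sign(null, root, privateKey);
const signatureIsValid = crypto.verify(null, root, publicKey, signature);

console.log(signatureIsValid);
```

With four blocks, the block hashes sit at indexes 0, 2, 4, and 6, their parents at 1 and 5, and the root at 3.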

Peer Messaging Process

Once peers have been discovered in connection with a dWeb network address, a requestor can then begin exchanging messages with peers that have announced themselves as seeds of the data. Below is a summary of the message process that takes place between a requestor and a seeding peer.

A seeding peer is chosen, and a TCP connection is opened using the IP address and port number associated with the peer’s dWeb network address.
A “feed” message is sent to the peer, indicating that the requestor is interested in data associated with a specific network key. It is important that the requestor sends specific information relating to the data in question as peers are likely seeding multiple sets of data.
The requestor sends a “handshake” message that starts the handshake phase with the peer.

The handshake phase provides the option to secure the connection with NOISE-based transport encryption. The NOISE-based handshake takes place over a duplex stream using NOISE’s ‘XX’ pattern. The handshake phase allows peers to authenticate and securely negotiate an encryption and MAC algorithm. This type of authenticated encryption is made possible by the generation and exchange of cryptographic keys that are used to secure data contained within protocol messages transmitted between peers. If encryption is enabled during the handshake phase, the stream between the peers is private. NOISE-based sessions are unique since they can be identified by a unique “handshake hash” value, which can be used for post-handshake channel binding.

A pseudo-representation of a handshake message would look as follows:

Field #	Name	Description
1	ID	Random 32-byte ID used to detect multiple connections to the same peer
2	Live	0 = End connection when neither peer is downloading; 1 = Keep connection open indefinitely
3	User Data	Arbitrary bytes that can be used by higher level applications for any purpose
4	Extensions	The name of the extension the peer wants to use
5	Acknowledge	0 = No need to acknowledge each block of data received; 1 = Must acknowledge each block of data

Once the handshake phase is complete and encryption is enabled between peers, all future data messages are encrypted. This feature is essential when the UniChain in question is not intended for public consumption, even considering that blocks within private UniChains are likely encrypted anyway. It is important to note that seeders can “choose” which data requests they fulfill, keeping data private to a specific “swarm” or group of peers. In this way, UniChain keys can be announced on a private DHT known only to a specific group, or peers that possess a particular dataset may already know their respective IP addresses and can choose to exchange privately over BIT.

During the handshake phase, the requestor can inform the peer that they want a “live” never-ending connection in order to receive continuous updates to the UniChain.
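
The handshake fields can be illustrated with a small sketch. It is not the BIT wire format: the object layout, the names createPeer and ConnectionTracker, and the use of an array for extensions are assumptions. It only shows how a random 32-byte ID, reused by a peer across its connections, lets the other side notice a duplicate connection, and how the Live flag is set.

```javascript
const crypto = require('crypto');

// One random 32-byte ID per local peer, reused on every connection it opens.
function createPeer() {
  const id = crypto.randomBytes(32);
  return {
    id,
    // Build the fields of a handshake message (field names follow the table above).
    handshake({ live = false, ack = false, userData = Buffer.alloc(0), extensions = [] } = {}) {
      return { id, live: live ? 1 : 0, userData, extensions, ack: ack ? 1 : 0 };
    }
  };
}

// Tracks remote IDs so a second connection to the same peer can be detected.
class ConnectionTracker {
  constructor(localId) {
    this.localId = localId.toString('hex');
    this.remoteIds = new Set();
  }

  // Returns true if the connection should be kept, false for a duplicate or a self-connection.
  accept(remoteHandshake) {
    const key = remoteHandshake.id.toString('hex');
    if (key === this.localId || this.remoteIds.has(key)) return false;
    this.remoteIds.add(key);
    return true;
  }
}

const neo = createPeer();
const bob = createPeer();
const tracker = new ConnectionTracker(neo.id);

const first = tracker.accept(bob.handshake({ live: true }));   // new peer, kept
const second = tracker.accept(bob.handshake({ live: true }));  // same ID again, duplicate
console.log(first, second);
```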

The requestor then sends a want message, indicating which blocks of the underlying UniChain it wants:

Field #	Name	Description
1	Start	Number of the first block you want
2	Length	1 = just the start block; 2 = the start block and the next one; and so on

The above message indicates that the requestor wants blocks 0 through 4, if Start = 0 and Length = 5. If the Length is left blank, the requestor is indicating that they want the entire UniChain.

The peer then sends a have message, indicating whether or not it possesses blocks 0 through 4:

Field #	Name	Description
1	Start	Number of the first block you have
2	Length	1 = just the start block; 2 = the start block and the next one; and so on

The above message indicates that the peer possesses blocks 0 through 4, if Start = 0 and Length = 5.

In some cases, the peer may respond with a message that indicates it is missing some of the “wanted” data. If so, the requestor must ask one of the other peers in the swarm for the missing blocks. It is a common occurrence to find peers that host a sparse version of a UniChain.
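
The want/have exchange can be sketched as simple range arithmetic. The sketch reads Length as a count of blocks, following the field table (1 = just the start block). The function names expandRange and missingBlocks, and the chainLength argument used to resolve a blank Length, are assumptions made for illustration.

```javascript
// Expand a { start, length } range into individual block numbers.
// A missing length means "through the end of the chain", so chainLength is needed.
function expandRange({ start, length }, chainLength) {
  const end = length === undefined ? chainLength : Math.min(start + length, chainLength);
  const blocks = [];
  for (let i = start; i < end; i++) blocks.push(i);
  return blocks;
}

// Blocks named in the want message that no have message covers.
function missingBlocks(want, haves, chainLength) {
  const held = new Set(haves.flatMap((have) => expandRange(have, chainLength)));
  return expandRange(want, chainLength).filter((block) => !held.has(block));
}

// A sparse peer that holds only part of the wanted range.
const want = { start: 0, length: 5 };
const haves = [{ start: 0, length: 2 }, { start: 3, length: 1 }];
const stillNeeded = missingBlocks(want, haves, 10);

console.log(stillNeeded); // these blocks must be requested from another peer in the swarm
```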

Upon receiving a have message, the requestor packages an individual request message for each block:

Field #	Name	Description
1	Index	Number of the block to send back
2	Bytes	If included, ignore the `Index` field and send the block containing this byte
3	Hash	0 = Send blocks and hashes; 1 = Just send blocks without hashes
4	Nodes	0 = Send back all hashes to verify this block; 1 = Just send blocks without hashes

So that the requestor can validate the UniChain’s underlying Merkle tree, the request message can be sent so that the peer knows that the requestor would like the block and all corresponding hashes. This validation ensures that the peer is sharing data that belongs only to the specific UniChain in question. The UniChain’s public key can then be used to validate the digital signature related to each block.

Upon receiving each individual request message, the peer responds with a corresponding data message:

Field #	Name	Description
1	Index	Block number
2	Value	Content of block
3	Nodes	See below

The nodes field contains a subfield that looks as follows:

Field #	Name	Description
1	Index	Hash number
2	Hash	32-byte block hash or parent hash
3	Size	Total length of data in block that hash covers
4	Signature	64-byte ed25519 signature of next hash corresponding to block hash

The signature can then be validated against the UniChain’s public key to verify that the data being received belongs to the correct UniChain.
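
To illustrate how the hashes carried in a data message let a requestor check a single block without the rest of the chain, the sketch below recomputes a root hash from one block plus its sibling hashes. It makes several assumptions: SHA-256 as the hash function, a power-of-two block count, a root hash that has already been checked against the creator's signature, and the helper names buildLevels, proofFor, and verifyBlock.

```javascript
const crypto = require('crypto');

// Hash the concatenation of the given buffers (SHA-256 is an assumption of this sketch).
function hash(...parts) {
  const h = crypto.createHash('sha256');
  for (const part of parts) h.update(part);
  return h.digest();
}

// Build every level of the tree, block hashes first. Requires a power-of-two block count.
function buildLevels(blocks) {
  const n = blocks.length;
  if (n === 0 || (n & (n - 1)) !== 0) throw new Error('sketch requires a power-of-two block count');
  const levels = [blocks.map((block) => hash(block))];
  while (levels[levels.length - 1].length > 1) {
    const prev = levels[levels.length - 1];
    const next = [];
    for (let i = 0; i < prev.length; i += 2) next.push(hash(prev[i], prev[i + 1]));
    levels.push(next);
  }
  return levels;
}

// Sibling hashes a seeding peer would send alongside block `index`.
function proofFor(levels, index) {
  const siblings = [];
  for (let depth = 0; depth < levels.length - 1; depth++) {
    siblings.push(levels[depth][index ^ 1]);
    index >>= 1;
  }
  return siblings;
}

// Recompute the root from one block and its siblings, then compare with the trusted root.
function verifyBlock(block, index, siblings, trustedRoot) {
  let current = hash(block);
  for (const sibling of siblings) {
    current = index & 1 ? hash(sibling, current) : hash(current, sibling);
    index >>= 1;
  }
  return current.equals(trustedRoot);
}

const blocks = ['a', 'b', 'c', 'd'].map((s) => Buffer.from(s));
const levels = buildLevels(blocks);
const trustedRoot = levels[levels.length - 1][0];
const proof = proofFor(levels, 2);

console.log(verifyBlock(blocks[2], 2, proof, trustedRoot));
```

Only two sibling hashes are needed to verify block 2 of a four-block chain, which is what makes sparse downloads verifiable.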

MultiChains

A MultiChain is a deterministically ordered “view” of multiple underlying UniChains. MultiChains are compiled by the BitStream module. BitStream accepts a variable number of input UniChains and outputs their state in a “causally ordered” stream. A causally ordered stream renders its deterministic order based on multiple factors, which are explained in the sections that follow.

BitStream passes this causally ordered stream to a “linearized mapping” function, which can be custom-tailored to map each entry in the ordered stream to a UniChain-based output log, forming a UniChain consisting of state from other UniChains, or a MultiChain.

The linearization process can be thought of as a way for the MultiChain creator to “recompute” the deterministically rendered causal stream into a custom deterministic format. While BitStream can be used to filter data from multiple UniChains into a MultiChain consisting of altered or censored data, as will be discussed in dMachines, it can also be used to correctly source data using contract programs (as it is mapping to an output) so that an outputted MultiChain is constantly monitored for validity.

BitStream

The BitStream module allows for a single entity to derive a deterministic, causally ordered stream from multiple UniChain inputs. A BitStream can be formed using any number of UniChain-based inputs for which the BitStream creator has read or write access.

Below is an example of how this would work from user Neo's perspective:

import UniChain from 'unichain'
import BitStream from 'bitstream'
import ram from 'random-access-memory'
import ChainStore from 'chainstore'

const cstore = new ChainStore(ram)    // manage all the UniChains in our local ChainStore
const Bob = cstore.get({ key: <bobKey> })       // read-only: retrieved by Bob's key
const Alice = cstore.get({ key: <aliceKey> })   // read-only: retrieved by Alice's key
const Neo = cstore.get({ name: 'neo-convo' })   // writable: Neo's local UniChain, retrieved by name

const stream = new BitStream([Bob, Alice, Neo], { input: Neo })

Appending Data to a BitStream

Using the above code, Neo could append data to the BitStream as follows:

await stream.append('Neo: Hey guys')
await stream.append('Neo: I miss you all so very much')

User Bob’s code would differ a bit from Neo’s. Bob's ChainStore get({ }) calls would retrieve Alice and Neo’s UniChains by key, while retrieving Bob's local UniChain by name because he has write access. Also, Bob's BitStream would be set up a bit differently:

const stream = new BitStream([Bob, Alice, Neo], { input: Bob })

Like Neo, Bob would now be able to append messages to the BitStream as follows:

await stream.append('Bob: Hey Neo')
await stream.append('Bob: We miss you too')

Viewing a Causal Stream

Neo, Bob, and Alice would all be able to live stream the conversation, causally, as follows:

for await (const message of stream.createCausalStream()) {
    console.log(message.value.toString())   // entries arrive newest-first
}

Per the messages already appended to Neo and Bob’s UniChains, the causal stream would read as follows:

Neo: I miss you all so very much
Neo: Hey guys
Bob: We miss you too
Bob: Hey Neo

Notice that the messages are displayed in reverse order. BitStream starts at the “head” of each inputted UniChain and “yields” entries in a “causal order.” At this point, it is probably helpful to explain causal ordering and why it is used with BitStream.

Causal Ordering

As it relates to the peer-to-peer web, peers come and go, which means they come online and sometimes go offline. BitStream must stream entries from inputs in a deterministic order so that the stream can be reproduced via unrelated systems, in the exact same order, in an unpredictable environment. To do so, BitStream appends a clock to each appended UniChain entry. The clock references the most recent entry in the stream that the writer witnessed, prior to appending this new state. The clock ensures that the state rendered by a causal stream is the same, regardless of when or which entity is viewing it (e.g., viewing a chat 20 years later), so long as the underlying UniChain inputs are the same. Figure MC-0 illustrates this concept.

This approach ensures that Bob, Alice, and Neo are viewing the exact same state (e.g., conversation), compiled independently on each of their local systems.

Without this approach, it is possible that the stream Bob renders would order messages in an entirely different order than Alice and Neo, whose streams could each produce an entirely different order as well. This misordering would destroy the integrity of the conversation for its participants and be confusing to read:

Alice: See you soon
Neo: I’m excited
Alice: yes, Causal Cafe
Bob: Hey!
Neo: See you when you get here!
Neo: Hey, did you get us some food?
Bob: I have Causal Cafe
Bob: See you soon

First, consider what happens if Alice goes offline. While Alice may still be appending messages to her UniChain, Bob and Neo are not seeing Alice's messages in their stream. Alice’s first offline entry will have a blank clock ([ ]) because she is unaware of what the latest entry is. This scenario creates a “fork” in the state in which Alice is in her own fork and Bob and Neo are communicating in their own fork. Alice’s second offline entry will reference her first offline entry within the clock, her third will reference the second, and so forth. Second, consider what happens when Alice comes back online and how her messages are ordered in the stream thereafter.

Forks and Stream Reordering

To better explain how forking works, it is helpful to review the previous example from the beginning. Bob, Alice, and Neo all start a new BitStream with three blank UniChains as inputs. The moment that they all create a BitStream on their local devices, Alice loses her Internet connection. Now Bob and Neo are communicating on their own fork independent of Alice’s. Consider the following conversation:

Bob & Neo	Alice
Bob: Hey guys	Alice: Hey guys
Neo: Hey	Alice: Hello?
Bob: Is Alice here?	Alice: Are you guys seeing this?
Neo: Not sure	
Neo: It doesn’t look like it	

Currently, Bob and Neo’s fork has a causal stream that looks as follows:

Neo: It doesn’t look like it
Neo: Not sure
Bob: Is Alice here?
Neo: Hey
Bob: Hey guys

Remember, BitStream starts at the head of each UniChain input and yields entries in a causal order.

On the other hand, Alice’s fork has a causal stream that looks as follows:

Alice: Are you guys seeing this?
Alice: Hello?
Alice: Hey guys

If Alice comes online right at this moment, Alice’s active BitStream session will see that the latest clock is Neo’s “It doesn’t look like it” message. This clock is identified after BitStream is able to pull the entries from Bob and Neo’s UniChains; logically, Alice’s BitStream is unable to ask Bob and Neo for their data while she is offline. Alice, unaware she is back online, creates a new entry: “I guess not.” This message is immediately linked to Neo’s most recent message, combining both forks into a single causally ordered chain. The stream is then reordered as follows:

Alice: I guess not (1)
Neo: It doesn’t look like it (2)
Neo: Not sure (3)
Bob: Is Alice here? (4)
Neo: Hey (5)
Bob: Hey guys (6)
Alice: Are you guys seeing this? (7)
Alice: Hello? (8)
Alice: Hey guys (9)

Notice how Alice’s fork (7, 8, 9) is placed before Bob and Neo’s fork (2, 3, 4, 5, 6). This particular ordering is because BitStream’s causal stream will always yield shorter forks before longer forks, and when forks have the same length, the winning fork is chosen deterministically by comparing the underlying keys of each input UniChain. No matter what, each stream participant will see the same output, so long as the BitStream inputs are the same.

When Alice’s most recent entry (1) was made, her entry was causally linked to Neo’s message at (2), interlocking both forks into a single stream. Anytime a peer in this exchange goes offline, the same forking occurs. When the peer comes back online and observes the latest stream entry, each fork will again interlock at a deterministic position. This ordering ensures that anyone, anywhere, can reproduce this exact causal stream when using the same inputs. Causal ordering is one of the dWeb’s most important features; without it, deterministic peer-to-peer communications would be impossible and the distributed Turing computer would simply be a figment of one’s imagination.
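
The fork-ordering rule from this example can be sketched as follows. This is a simplified illustration, not BitStream's algorithm: clocks are not modeled, each fork is given as a ready-made list of entries in the order they were written, and the key field is a hypothetical stand-in for the input UniChain key used to break ties. The function names linearizeForks and causalStream are also assumptions. The sketch places the shorter fork first in written order, appends the interlocking entry, and reverses the result to obtain the newest-first causal stream.

```javascript
// Order forks deterministically: shorter forks first, ties broken by comparing keys.
// Returns entries oldest-first, ending with the entry that interlocks the forks.
function linearizeForks(forks, mergeEntry) {
  const ordered = [...forks].sort(
    (a, b) => a.entries.length - b.entries.length || (a.key < b.key ? -1 : a.key > b.key ? 1 : 0)
  );
  return [...ordered.flatMap((fork) => fork.entries), mergeEntry];
}

// The causal stream starts at the head, so it is the same list read newest-first.
function causalStream(forks, mergeEntry) {
  return linearizeForks(forks, mergeEntry).reverse();
}

const bobAndNeo = {
  key: 'bob-neo-fork', // hypothetical tie-break key
  entries: [
    'Bob: Hey guys',
    'Neo: Hey',
    'Bob: Is Alice here?',
    'Neo: Not sure',
    "Neo: It doesn't look like it"
  ]
};
const alice = {
  key: 'alice-fork', // hypothetical tie-break key
  entries: ['Alice: Hey guys', 'Alice: Hello?', 'Alice: Are you guys seeing this?']
};

const stream = causalStream([bobAndNeo, alice], 'Alice: I guess not');
for (const message of stream) console.log(message);
```

Because the sort depends only on fork length and keys, every participant who runs it over the same inputs obtains the same order, regardless of the order in which the forks were passed in.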

Linearized Mapping

So what is the purpose of using BitStream to produce a causally ordered stream of multiple UniChains? BitStream’s causally ordered streams can be persisted into a UniChain, or MultiChain, using a feature known as “Linearized Mapping.” Before diving into Linearized Mapping, it is important to first explain UniChain Truncation.

UniChain Truncation

Since a causally ordered stream deterministically reorders its entries as forks appear and interlock, persisting such a stream to a UniChain requires that the UniChain be “re-chained,” or reorganized. However, because UniChains are append-only logs, blocks that were previously written cannot be moved around. Note, as it relates to higher-level data structures like dDatabase, one can traverse the data in a UniChain using creative schemas that only make it seem like the data has been edited, deleted, or truncated, without compromising a UniChain’s underlying data integrity or altering its Merkle tree. Using the truncate method, one can instead shorten a UniChain to a particular length and then re-append new blocks. That being said, truncation is computationally expensive; thus, it is fundamentally important to the performance of Linearized Mapping processes that truncation is used minimally and that the size of each truncation is minimized. These unique attributes are why causally ordered streams and Linearized Mapping make a very good match.

Linearized Views

Persisting a causal stream of many UniChains into a single UniChain is accomplished using the linearize() method. While this single UniChain may feel like a UniChain, it is really a UniChain of UniChains, and so from here on out it will be referred to as a MultiChain. It is important to note that a MultiChain is a UniChain when the high-level linearized abstractions have been peeled away. User Neo, using the example from earlier, could create a MultiChain from his BitStream with Bob and Alice as follows:

import UniChain from 'unichain'
import BitStream from 'bitstream'
import ram from 'random-access-memory'
import ChainStore from 'chainstore'

const cstore = new ChainStore(ram)
const Bob = cstore.get({ key: <bobKey> })
const Alice = cstore.get({ key: <aliceKey> })
const Neo = cstore.get({ name: 'neo-convo' })

const stream = new BitStream([Bob, Alice, Neo], { input: Neo })

const multichainStore = cstore.get({ name: 'neo-multichain' })   // output UniChain that holds the MultiChain
const view = stream.linearize(multichainStore)
await view.update()   // bring the linearized view up to date with the causal stream

The above code creates a MultiChain that directly derives its ordering from BitStream’s causally ordered stream, which is why it is considered a causally ordered MultiChain and an organic trustless data structure. Using the linearize method, the causal order can be recomputed in other deterministic ways prior to writing updates to the MultiChain. For instance, all of Bob, Alice, and Neo’s entries could be stored in uppercase letters, like so:

const view = stream.linearize(multichainStore, {
    async apply (batch) {
        batch = batch.map (({ value }) => Buffer.from(value.toString().toUpperCase()))
        await view.append(batch)
    }
})

This type of operation can be referred to as “Linearized Mapping.” Using the apply function, one is able to configure exactly what is written to the MultiChain in response to entries in a causally ordered stream. Such a mapping does make the resulting MultiChain artificial; even so, Linearized Mapping is used by dMachines and smart contracts to create organic MultiChains. More information about dMachines is located here.

View Generation and Sharing MultiChains

Entities that are writing to a MultiChain - in other words, entities that have a writeable input within a BitStream that is streamed to a MultiChain - must regenerate their linearized view constantly as this view is constantly reordered by the BitStream. This requirement is particularly important for writers, otherwise it would signify that a writer is no longer connected to the BitStream and would therefore be operating on an independent fork. Because truncation is computationally intense, application performance and user experience will be affected as a BitStream’s list of UniChain inputs grows longer and longer (e.g., a chat room with millions of users).

Instead of reprocessing each block from start to finish, the solution to the issue above is to pass the “remote views” of others to linearize() so that only minimal changes are needed to bring the remote view up to date. This solution improves user experience and allows application developers to utilize higher-level data abstractions on top of MultiChain, in turn enabling their users to download the specific MultiChain data they consume rather than the entire MultiChain.

dDatabases

A dDatabase is a key-value store built on top of a UniChain and is compatible with anything that consumes UniChains, such as a MultiChain. A dDatabase is simply a UniChain when all of the abstractions have been stripped away at the block level. dDatabases arrange data into a binary tree, which allows data to be located very quickly, even in data structures that contain billions (even trillions) of blocks. While the topic of binary trees is beyond the scope of this paper, it is important to note that they are utilized in many popular centralized database frameworks so that data can be easily indexed and located within underlying data structures. Binary trees are also used to arrange data within most DHTs.

UniChains, when used in their raw format, are difficult to traverse for data, which is why a higher-level abstraction such as dDatabase was needed. dDatabases provide a simple CRUD (Create, Read, Update, and Delete) API for manipulating and retrieving data within the underlying UniChain. Data can appear as if it has been edited or deleted at the dDatabase level even though it still exists at the UniChain level. dDatabases use the following API:

put(key, value) – Adds or replaces a key/value in the store
get(key) – Locates the key and returns the latest version in the UniChain history
del(key) – Appends a new version of the key with a ‘deleted’ flag

Below is a pseudo-representation of how dDatabase’s API interacts with UniChains:

Higher-level data structures built on top of a UniChain utilize a value schema, such as the value schema used in Figure BT-API-1, and some sort of structure or organization, such as a binary tree, that allows the data to be traversed and manipulated. As can be seen in Figure BT-API-1, dDatabase is not actually editing or deleting the underlying UniChain’s entries; instead, it is storing data within the UniChain in a format that dDatabase’s framework can utilize whenever it needs to determine if a particular key has been edited or deleted.

dDatabase makes this determination by traversing the underlying UniChain for every block where “value.key = the queried key.” In the previous figure, by block #4, three matches would have been returned for the key “bitweb.” If the “get()” method was used, dDatabase would traverse the matches for “value.version” and pick the largest version number and return the “value.value” field. If the largest version number has a “value.deleted” Boolean that equates to true, “get()” would return “not found.” dDatabase’s API streamlines application development by simplifying the process of interfacing with UniChain based data structures.
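
The lookup described above can be sketched with a plain in-memory array standing in for the UniChain. This is an illustration only: the class name AppendOnlyKV is hypothetical, a linear scan replaces the tree traversal a real dDatabase would use, and returning null stands in for “not found.” The field names (value.key, value.value, value.version, value.deleted) follow the description above.

```javascript
// A key-value view over an append-only list of blocks; nothing is ever edited or removed.
class AppendOnlyKV {
  constructor() {
    this.blocks = []; // stand-in for the underlying UniChain
  }

  // Append a new version of the key; earlier versions stay in the log.
  _append(key, value, deleted) {
    const version = this.blocks.filter((block) => block.value.key === key).length + 1;
    this.blocks.push({ value: { key, value, version, deleted } });
  }

  put(key, value) {
    this._append(key, value, false);
  }

  del(key) {
    this._append(key, null, true); // a "deleted" flag, not a removal
  }

  // Return the value of the highest version, or null if missing or flagged as deleted.
  get(key) {
    let latest = null;
    for (const { value } of this.blocks) {
      if (value.key === key && (latest === null || value.version > latest.version)) latest = value;
    }
    return latest === null || latest.deleted ? null : latest.value;
  }
}

const db = new AppendOnlyKV();
db.put('bitweb', 'v1');
db.put('bitweb', 'v2');
const beforeDelete = db.get('bitweb');
db.del('bitweb');
const afterDelete = db.get('bitweb');

console.log(beforeDelete, afterDelete, db.blocks.length);
```

All three blocks remain in the log after the delete, which is why earlier versions of a key stay retrievable at the UniChain level.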

Indexing and Range Queries

dDatabase keys can be formatted so that they can be used for complex indexing and traversal of data. A single key can contain multiple indexes. This formatting can be seen in Figure BT-3, in example (C), with the key “/posts/bob/01/22/22.” Complex indexing strategies may utilize multiple “puts” for the same data. For example, user Bob likes user Alice’s post on his favorite dWeb-based social network, and the underlying application produces the following data from this interaction:

{
    type: "like",
    postID: "fa1b7c",
    postingUser: "alice",
    likingUser: "bob",
    state: true,    // true = liked, false = unliked
    intDate: "01/22/22 12:00:11"
}

Using this data, the following “puts” could be created:

(a) key: /likes/bob/fa1b7c value: { postingUser: “alice”, likingUser: “bob”, postID: “fa1b7c”, state: true, time: “01/22/22 12:00:11” }

(b) key: /likes/fa1b7c/bob value: { postingUser: “alice”, likingUser: “bob”, postID: “fa1b7c”, state: true, time: “01/22/22 12:00:11” }

While this may seem redundant and a waste of resources, it is important to remember that data stored within a dDatabase formatted UniChain is “sparsely” hosted by peers when compared to how raw UniChain structures are seeded. See Sparse Downloading and Sparse dDatabase Replication for more information.

Storing data under multiple keys using the above key structure enables what is known as “range querying.” For example, the following range query could be performed:

lt (less than): null (n/a)
lte (less than or equal to): null (n/a)
gt (greater than): null (n/a)
gte (greater than or equal to): /likes/fa1b7c/

The above query would return anything greater than or equal to /likes/fa1b7c/. Thus, if the key /likes/fa1b7c/ were present (in this case it is not), the query would return its data because it falls within the defined range. The query would also return /likes/fa1b7c/bob since that key is greater than /likes/fa1b7c/.

This example provides a useful method for querying all the likes for a given postID. Also, all the likes for Bob could be found by performing the following range query:

gte: /likes/bob/

This type of query would not be possible if Bob’s “like” had not been indexed under two different keys.
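
The range queries above can be sketched over a sorted set of keys. The function name rangeQuery, the sample keys for user carol, and the in-memory Map are assumptions for illustration. Note that an open-ended gte bound also matches every key that sorts after the prefix, so this sketch pairs gte with an upper bound (lt) to stop at the end of the prefix; that upper bound is an addition of the sketch, not something stated above.

```javascript
// Return entries whose keys fall inside the given bounds, in sorted key order.
// A bound left as null or undefined is treated as "n/a".
function rangeQuery(store, { gt, gte, lt, lte } = {}) {
  return [...store.keys()]
    .sort()
    .filter(
      (key) =>
        (gt == null || key > gt) &&
        (gte == null || key >= gte) &&
        (lt == null || key < lt) &&
        (lte == null || key <= lte)
    )
    .map((key) => ({ key, value: store.get(key) }));
}

const store = new Map([
  ['/likes/bob/fa1b7c', { likingUser: 'bob', postID: 'fa1b7c' }],
  ['/likes/fa1b7c/bob', { likingUser: 'bob', postID: 'fa1b7c' }],
  ['/likes/fa1b7c/carol', { likingUser: 'carol', postID: 'fa1b7c' }], // hypothetical extra like
  ['/posts/bob/01/22/22', { title: 'hello' }]
]);

// All likes for post fa1b7c: bounded above so the range ends with the prefix.
const likesForPost = rangeQuery(store, { gte: '/likes/fa1b7c/', lt: '/likes/fa1b7c/\uffff' });

// All likes by Bob, using the second index.
const likesByBob = rangeQuery(store, { gte: '/likes/bob/', lt: '/likes/bob/\uffff' });

console.log(likesForPost.map((e) => e.key), likesByBob.map((e) => e.key));
```

Without the lt bound, the first query would also return /posts/bob/01/22/22, because that key sorts after /likes/fa1b7c/.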

dDatabase enables powerful range querying built on top of UniChain’s lightweight distributed architecture. dWeb applications are faster when compared to centralized and other decentralized applications due to the manner in which dDatabase’s underlying data is sparsely replicated and therefore distributed.

Sparse dDatabase Replication

As explained in Sparse Downloading, a UniChain can be sparsely shared among peers. Due to the underlying Merkle tree, a peer that is only interested in the data within a specific block of a UniChain need only possess that same block to verify the partial chain. This ability is made possible by the Bytes field of the request BIT protocol message, which allows requestors to ask a remote peer for the block that contains a specific byte, rather than asking for a specific block number.

For instance, requestors can easily ask peers of a social network’s dDatabase if they have data that matches one of the range queries from the previous section. In this case, the peers would send only those blocks containing data that falls within the queried ranges.

If every peer of a dWeb-based social network had to download its entire dDatabase, it would significantly impact user experience, especially considering that a social network with billions of users could have trillions of blocks in its underlying MultiChain. Even though a dDatabase’s binary tree would allow peers to quickly traverse trillions of blocks in mere seconds, it is much easier and far simpler for peers to ask a social network’s dDatabase swarm for the data they need, preserving only those portions of data they utilize and consume. Range querying over keys allows entities to consume only the data they need, touching just a small subset of blocks in the underlying UniChain.

Collaborative dDatabases

dDatabases are built on top of UniChains and are thus single-writer data structures, but they can be used with MultiChain filters to compile a multi-entity key-value store, as seen in Figures BT-1 and BT-2. These multi-entity key-value stores are referred to as “Collaborative dDatabases.”

Whether the original data derives from a UniChain or a higher-level abstraction such as a dDatabase, a MultiChain’s programmable filters can be used to decipher data blobs and append data according to the filter’s schema. In this way, a MultiChain can be used to compile multiple UniChains into a single combined dDatabase, or even better, compile multiple dDatabases into a single combined dDatabase. Depending on the specific situation, these methods may prove superior to that of using a MultiChain to compile multiple UniChains into a single UniChain.
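A filter-driven compilation of several single-writer stores into one combined view might look like the following. This is a simplified sketch: the compile_collaborative function, the accept filter signature, and the choice to namespace keys by writer are assumptions made for illustration, not part of the MultiChain specification.

```python
def compile_collaborative(writers, accept):
    """Merge single-writer key-value logs into one combined view.

    writers: dict mapping a writer id to a list of (key, value) entries in append order.
    accept:  filter function deciding whether an entry joins the combined view.
    """
    combined = {}
    for writer_id in sorted(writers):            # fixed order keeps the result deterministic
        for key, value in writers[writer_id]:
            if accept(writer_id, key, value):
                # Namespace by writer so one writer cannot overwrite another.
                combined[f"/{writer_id}{key}"] = value
    return combined

writers = {
    "alice": [("/posts/1", "hello"), ("/drafts/1", "wip")],
    "bob": [("/posts/1", "hi"), ("/posts/1", "hi, edited")],
}

def only_posts(writer_id, key, value):
    return key.startswith("/posts/")

view = compile_collaborative(writers, only_posts)
```

Each writer still controls only its own log; the combined store is a derived view that any peer holding the same inputs and the same filter can recompute.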

Identifying High-Level Abstractions

The structure type of high-level UniChain based data abstractions can be identified at block #0, where the structure’s protocol header exists. In the case of a dDatabase, block #0 would read:

{
    Type: ddatabase
}

dDatabase Use Cases

As illustrated in Figure BT-3, Distributed Databases have use cases ranging from social networks to domain name systems to distributed file systems, the latter of which will be discussed in the following section.

dDatabase Implementations

Hyperbee

dDrives

In order for a serverless web to work, it must do much more than simply facilitate the exchange of abstract data blobs between peers. It must also facilitate the distribution and reproduction of higher-level data structures, such as the exchange of files and entire file systems, or something comparable to how an application’s static files are accessed through an HTTP request and web server. The dWeb can achieve this type of functionality over BIT by using a high-level UniChain based data abstraction known as a dDrive. A dDrive is a file system stored within a UniChain. Peers can exchange individual files (blocks) or an entire file system in mere seconds, the same way in which static files can be exchanged between server and client over HTTP. The main difference, aside from a dDrive being peer-to-peer, is that a dDrive utilizes file versioning, meaning that each version of every file is stored within the underlying UniChain. As discussed in dDatabases, key-value pairs are stored so that the latest version of a key is served when “get()” is invoked. This attribute allows the entity consuming the dDrive to reference and retrieve any version of any file within.

Since dDatabases have built-in versioning and can quickly range query over numerous blocks within the underlying UniChain, they provide the perfect data structure on which to build dDrive’s higher-level abstraction.

Version Control and File System Reproduction

Figure BD-0 illustrates how a dDrive is stored within an underlying dDatabase. It is important to look more closely so that the actual value structure can be examined, as this is where the magic responsible for distributed file versioning and file system reproduction takes place between peer devices, as shown in Figure BD-1.

Consider the “/js/main.js” file in example (w) found within Figure BD-0. The value for this file, and for any file within a given dDrive, is stored using the same indexing strategy, in which the content of the file is stored under one key and the metadata is stored under another key.

The key structure is as follows:

Content: /js/main.js/content
Metadata: /js/main.js/metadata

The reason for separating a file’s content from its metadata is simple: a peer may want to first examine a file’s metadata before consuming its content, which is very likely much larger in size.

The value structures for a file’s content and its metadata are explained below:

Content Value Structure

A file’s content entry will utilize the following schema:

{
    type: “ddrive-content”,
    name: “/js/main.js”,
    folder: false,
    value: <byte-array-of-file-data>
}

Metadata Value Structure

A file’s metadata entry will utilize the following schema:

{
    type: “ddrive-metadata”,
    name: “/js/main.js”,
    value: <bitfield-based-metadata>    //see below
}

Metadata Bitfield Schema

Metadata is stored within a “bitfield” that exists within the metadata value field. The overall field schema is as follows:

The dDatabase’s underlying data structure can be traversed for a file’s various state transitions by pulling the content and metadata entries related to a particular file, where the version field in the content and metadata entries equate to the same version number (1, 2, 3, etc.) (see Figure BD-2).
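The relationship between the two keys and their versions can be sketched as follows. This is an illustration only: the VersionedDrive class is hypothetical, the version field is added to each entry based on the description above, and the metadata value is shown as a plain dictionary because the bitfield layout is not reproduced here.

```python
class VersionedDrive:
    """Toy dDrive on top of an append-only log (illustrative, not the real format)."""

    def __init__(self):
        self.log = []    # each element: (key, entry)

    def write_file(self, path, data: bytes):
        # The new version is one more than the number of earlier content entries.
        version = 1 + sum(1 for key, _ in self.log if key == f"{path}/content")
        self.log.append((f"{path}/content",
                         {"type": "ddrive-content", "name": path,
                          "folder": False, "version": version, "value": data}))
        self.log.append((f"{path}/metadata",
                         {"type": "ddrive-metadata", "name": path,
                          "version": version, "value": {"size": len(data)}}))
        return version

    def get(self, key, version=None):
        """Latest entry for key, or the entry with the requested version."""
        for k, entry in reversed(self.log):
            if k == key and (version is None or entry["version"] == version):
                return entry
        return None

drive = VersionedDrive()
drive.write_file("/js/main.js", b"console.log(1)")
drive.write_file("/js/main.js", b"console.log(2); // updated")

meta = drive.get("/js/main.js/metadata")             # inspect the small entry first
old = drive.get("/js/main.js/content", version=1)    # then fetch any version
```

A peer can read the metadata entry, decide whether the file is worth downloading, and only then request the content entry with the matching version.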

File System Reproduction

Below is an example of how a file system is distributed from one peer device to another:

As illustrated in Figure BD-1, a peer’s file system is converted into a dDrive, which means the entire file system (the content and metadata related to each file/folder) is stored within a single UniChain. The dDrive’s network address is then announced on the DHT, along with the peer’s IP address and port number.
The requestor, using a dWeb-compliant browser such as dBrowser, types the dDrive’s network address into the browser’s address bar using BIT (bit://). Once entered, the browser initiates a TCP connection with the dDrive’s peers.
The requestor begins exchanging BIT messages with peers until it possesses the entire dDrive. This process should take no more than a second, depending on the connection.
Upon receiving the dDrive, the requestor’s browser traverses the UniChain, assembles the state of the dDrive’s underlying file system, and stores the files on the requestor’s system.
The browser then displays the dDrive. In this case, since the dDrive contains a website, the browser displays a webpage.

A dWeb-compliant client is required to reproduce the file system on the requestor’s system. Otherwise, the requestor will end up with a UniChain.
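The final steps (traversing the chain and assembling the file system) can be sketched as a replay of log entries onto disk. The log format below, a list of (path, data) pairs in which None marks a deletion, is an assumption made for the example, as are the assemble_state and reproduce helpers. A real client would also validate paths before writing.

```python
import os
import tempfile

def assemble_state(log):
    """Replay (path, data) entries in order; later entries replace earlier ones."""
    state = {}
    for path, data in log:
        if data is None:
            state.pop(path, None)      # None marks a deletion in this sketch
        else:
            state[path] = data
    return state

def reproduce(log, target_dir):
    """Write the assembled file system under target_dir and return the paths written."""
    written = []
    for path, data in sorted(assemble_state(log).items()):
        full = os.path.join(target_dir, path.lstrip("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "wb") as f:
            f.write(data)
        written.append(full)
    return written

log = [
    ("/index.html", b"<h1>v1</h1>"),
    ("/js/main.js", b"console.log('hi')"),
    ("/index.html", b"<h1>v2</h1>"),
    ("/old.txt", b"stale"),
    ("/old.txt", None),
]
target = tempfile.mkdtemp()
files = reproduce(log, target)
```

Without this replay step the requestor holds only the raw log, which is why a dWeb-compliant client is needed to turn the chain back into files.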

Sparse File Systems

Since a dDrive contains a UniChain at its core, a partial chain or a partial file system can be distributed among peers. For instance, user Bob might be sharing a dDrive that contains the following files:

- song1.mp4
- song2.mp4
- video.mp4

User Alice can simply request bit://<network-address>/song1.mp4 and she will then possess a partial version of the dDrive.
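Alice's request can be sketched in two parts: splitting the address into a network address and a path, then selecting only the entries that belong to that path. The parse_bit_url and sparse_fetch helpers and the placeholder address 0123abcd are hypothetical; only Python's standard urlparse is a real API. The content/metadata key layout follows the one described earlier for dDrives.

```python
from urllib.parse import urlparse

def parse_bit_url(url):
    """Split a bit:// URL into (network_address, path)."""
    parts = urlparse(url)
    if parts.scheme != "bit":
        raise ValueError("not a bit:// URL")
    return parts.netloc, parts.path or "/"

def sparse_fetch(remote_entries, path):
    """Return only the entries whose key belongs to the requested path."""
    return {key: value for key, value in remote_entries.items()
            if key == path or key.startswith(path + "/")}

# Bob's drive, with placeholder bytes standing in for real file data.
bobs_drive = {
    "/song1.mp4/content": b"...", "/song1.mp4/metadata": {"size": 3},
    "/song2.mp4/content": b"...", "/song2.mp4/metadata": {"size": 3},
    "/video.mp4/content": b"...", "/video.mp4/metadata": {"size": 3},
}

address, path = parse_bit_url("bit://0123abcd/song1.mp4")
alices_copy = sparse_fetch(bobs_drive, path)
```

Alice ends up with two of the six entries, which is the partial version of the dDrive described above.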

dDrive Nesting

One dDrive can be nested within another, which means that Bob can create a drive and nest Alice’s within it. Although Bob cannot alter Alice’s drive, any update Alice makes to her drive also appears in Bob’s, since Alice’s drive is a nested portion of Bob’s. In this way, nesting can be considered a very basic form of collaboration. In the near future, dDrives will be compatible with MultiChains so that a multi-drive view can be created and live replicated. This capability will enable live collaboration within the same file system; for example, two entities editing the same spreadsheet.
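Nesting can be pictured as a mount table that resolves part of one drive's path space into another drive. The Drive class below is a toy model with hypothetical method names (write, mount, read). It captures the two properties described above: only the owner can write to a drive, and updates to a nested drive are visible through the drive that nests it.

```python
class Drive:
    """Toy drive with nested (mounted) drives; names are illustrative."""

    def __init__(self, owner):
        self.owner = owner
        self.files = {}
        self.mounts = {}     # mount point -> nested Drive (held by reference)

    def write(self, writer, path, data):
        if writer != self.owner:
            raise PermissionError(f"{writer} cannot write to {self.owner}'s drive")
        self.files[path] = data

    def mount(self, point, drive):
        self.mounts[point] = drive

    def read(self, path):
        # Paths under a mount point are resolved inside the nested drive.
        for point, nested in self.mounts.items():
            if path.startswith(point + "/"):
                return nested.read(path[len(point):])
        return self.files[path]

alice = Drive("alice")
bob = Drive("bob")
bob.mount("/alice", alice)

alice.write("alice", "/notes.txt", b"v1")
first = bob.read("/alice/notes.txt")
alice.write("alice", "/notes.txt", b"v2")    # Alice updates her own drive
second = bob.read("/alice/notes.txt")        # Bob's view reflects the update
```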

Live File Updates

Since UniChains can be live-streamed between peers, dDrives can too. This feature is especially useful for websites and web applications – the moment a developer issues an update, the drive (the website or web application) will automatically refresh in the browser.

Truly Decentralized Websites and Web Applications

dDatabases and dDrives make it possible to develop websites and web applications that are completely decentralized, distributed, and serverless. Together, these data structures enable web-based experiences to host their databases and files among the peers that use them. That being said, complex web applications require even higher-level services that are beyond the scope of what dDatabases and dDrives are capable of providing.

Most of today’s decentralized applications use blockchain-based smart contract engines for services such as user authentication, data storage, payments, and cooperative data systems. Unfortunately, blockchains are slow and depend on HTTP. Furthermore, they suffer from a design that inherently creates bottlenecks. As a result, blockchains will likely never possess the bandwidth needed to host global applications and the billions of users they represent. Blockchains and their “heavy” consensus protocols are simply not organic. dWeb provides a framework that is capable of delivering these services and more. dMachines and blockchainless smart contracts will be discussed in the next section.

dMachines

Software communities and their end-users (entities) rely on the data they consume to be “trustless,” which means that the participants need not trust one another or a third party for the system to function properly. While the data generated by an entity is stored within its own trustless UniChain, the same cannot be said for an application’s MultiChain. As discussed earlier in this paper, an application’s MultiChain combines UniChains from multiple entities into a single linearized and causally ordered UniChain, or “combined view.” The main disadvantage concerning a “naked” MultiChain is that although it derives its data from trustless and “organic” data structures, the data can be artificially altered, or filtered, as it is compiled into the MultiChain.

In order for data that derives from multiple UniChains to remain trustless in a MultiChain state, a cooperative digital organism is needed. Specifically, one that allows multiple entities utilizing a single software platform to execute the same immutable program functionality and output the trustless results of those executions to a single output log. In order to remain trustless, the organism’s output must prove that its combined data view has not been filtered, and that each entry is in its original UniChain based state. Before this cooperative digital organism is proposed, it is beneficial to discuss the issues associated with previous solutions that have not met these conditions.

Data, regardless of its simplicity or underlying structure, is the result of some type of program execution. For example, consider a social network called Socialx that combines the data of its users and their UniChains into a naked MultiChain. Socialx’s application code is distributed to its users via a dDrive. Users execute Socialx’s code on their local machine, downloading the code (file system) from other peers of the application’s dDrive. The moment Socialx’s developers (the dDrive’s creator) update the application’s files, all of their users (the dDrive’s peers) receive the update and the application “live-refreshes” in the browser. This scenario provides a good example of what was previously described as the “serverless web.”

That being said, scenarios like those above do not provide all that is required for a serverless web because they lack what is referred to as “total data agency.” Consider the following:

A user named Bob types bit://social.x into his web browser and the application’s files are downloaded to his local machine.
Bob clicks the ‘Signup’ button, which routes him to bit://social.x/signup.
Bob is prompted to enter a username and when he does, a UniChain is created on his local machine.

In the example above, the Socialx application has no data indicating that Bob signed up. Thus, in order for Bob to “follow” Alice and Neo, Socialx would have to allow Bob to create his own custom data agency, meaning that Socialx would have to allow Bob to build a MultiChain or “custom view” of the social network, viewing only the data that derives from the users he follows. In this instance, Bob’s “news feed” would consist of posts from himself and two others. What is important to note about this example is how Bob located Alice and Neo. Socialx has no function to search for users because the application’s data is distributed without any sort of data agency. There is no central place, where at the very least, a table of users (usernames) and their UniChain network addresses are stored. The only way for Bob to follow Neo and Alice is to know their UniChain addresses.

This type of scenario means that Bob would have to know Alice and Neo personally so that he could acquire their network addresses directly, or that Alice and Neo would have to publicize their network addresses via some sort of medium (e.g., a blog) that Bob somehow stumbles upon. This scenario provides a good example of what can be referred to as the “User-Controlled Data Agency” (UCDA) model, in which users are required to build their own data views within a given application. Without global data agency, a social network application would lack a user search feature, and when searching for posts or hash tags, users would only be able to search the data that derives from the users they follow.
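The limits of the UCDA model can be made concrete with a small sketch. The addresses, post fields, and the build_feed and search helpers below are hypothetical. The point is only that Bob's view, and therefore his search results, is bounded by the UniChain addresses he already knows.

```python
def build_feed(known_chains, followed):
    """Combine posts from the UniChains a user follows, ordered by timestamp."""
    posts = []
    for address in followed:
        for post in known_chains[address]:
            posts.append({**post, "source": address})
    return sorted(posts, key=lambda p: (p["ts"], p["source"]))

def search(feed, term):
    return [p for p in feed if term in p["text"]]

# Every chain on the network. Without a registry, Bob cannot enumerate these.
network = {
    "addr-bob": [{"ts": 1, "text": "hello #dweb"}],
    "addr-alice": [{"ts": 2, "text": "shipping today #dweb"}],
    "addr-neo": [{"ts": 3, "text": "follow the white rabbit"}],
    "addr-trinity": [{"ts": 4, "text": "unknown to Bob #dweb"}],
}

bobs_feed = build_feed(network, followed=["addr-bob", "addr-alice", "addr-neo"])
hits = search(bobs_feed, "#dweb")
```

The fourth user's matching post never appears in Bob's results, because her chain is not part of the view he built.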

That being said, distributed applications like Socialx could technically manage an API server to which new user requests are sent. All requests would be compiled by the server into a dDatabase based UniChain, announced on a dWeb-compliant DHT and distributed among Socialx’s users so that they could locate one another’s data structures. For example, consider the following:

When Bob creates his username via bit://social.x/signup and a UniChain is generated on his local machine for storing his Socialx data, the application sends Bob’s username, his UniChain network address and its public key to Socialx’s API server at https://socialx.com/api/newUser.
The API server runs software that acts as a “data funnel.” This data funnel processes each newUser request and stores it in a dDatabase based UniChain that is controlled by the server. Since the dDatabase in this case is announced on a dWeb-compliant DHT, Socialx can allow users to retrieve and consume the dDatabase as a globally distributed user registry. Thus, Bob can navigate to bit://social.x/users and view all of the users found in the dDatabase. The Socialx user interface can be designed to display the data in various ways. For instance, there can be a follow button for each username the interface displays. When clicked, the follow button can input the user’s UniChain public key into Bob’s local MultiChain.

At first glance, this setup seems to provide a workable solution until one considers the issues presented by a data funnel server and this type of data distribution:

The user registry in the example above only displays usernames with a follow button since the dDatabase only contains usernames and keys. As a result, the user registry is missing important information such as pictures, bios, and locations because the data associated with these items is stored locally by users in their individual UniChains. To display these items when browsing the registry, Bob would need to download the data from each user’s individual UniChain simultaneously. Consequently, Bob would be downloading data from all over the place, even considering the fact that he could sparsely fetch a single data key for each user, such as /userData.

Imagine a scenario in which ten users are displayed on the “All users” page. In this case, ten or more TCP connections are opened with peers of the ten UniChains corresponding to the displayed users. When the “Load more” button is clicked, another ten or more TCP connections are opened with peers of the ten UniChains corresponding to the newly displayed users. At this point, there are 20 or more open TCP connections required to load a single page. While this solution does indeed give Bob the ability to search for other users via a single dDatabase user registry, any time actual user data is needed Bob must download the data from each user’s individual UniChain. This type of network schema would negatively impact Socialx’s user experience because as the number of users Bob follows increases so does the number of users he must live stream.

The Socialx user registry is considered distributed because it is peered by Socialx’s users, even though it is compiled on a centralized server. However, if the server is discovered, attacked, and taken offline by hackers, which is feasible because its IP address is announced on a dWeb-compliant DHT, the server would be unable to process API requests and therefore unable to add users to the registry. Also, if the server were penetrated, hackers could modify the keys of users, thereby destroying Socialx’s organic state. For instance, a hacker named Darth could change Bob’s UniChain key to his own and cause other users to unintentionally download Darth’s data in place of Bob’s. Thus, when looking up “@bob” in the registry, a user would retrieve Darth’s dDatabase key instead of Bob’s and download Darth’s data, which could be in the form of a virus. Darth is able to modify the keys because the server has write-access to the dDatabase user registry.
The server’s operators, most likely Socialx’s developers, could remove individuals from the user registry in order to prevent them from being discovered by other users of Socialx. This practice constitutes a form of censorship and would likely not be tolerated by the Socialx user community.

As explained in the points above, the client-server model can be risky and should never be used in a distributed setting. Implementing a funnel server creates a central point of failure, gives developers a central point of control, degrades the user experience, and introduces a number of security issues. The funnel server also moves an application away from its “organic” state because there is no model in place that allows participants to trust its underlying data. These concepts provide a good example of what can be referred to as the “Developer-Controlled Data Agency” (DCDA) model, where developers control an application’s global data agency (e.g., the compilation of an application’s user registry containing its usernames and corresponding identifiers).

Continuing the example above, an application could utilize a second funnel that takes the registry from the first funnel and uses it to compile each user’s dDatabase into another dDatabase based MultiChain containing all user and application related data (see Figure SC-0). While this second funnel would significantly improve user experience, as all of the application’s data would derive from a single data structure, developers could artificially alter data within the underlying MultiChain by using filters. For instance, developers could censor users or simply alter what they have posted. Also, as was the case with the first funnel discussed above, hackers could discover the second funnel and begin tampering with the MultiChain, destroying its distributed state. While it was previously stated that filtering data from multiple single-writer sources into a single output is organic, this is only true when the original state of the data is intact. The proposed second funnel solution makes no reference to the original data source within the compiled data source, and there still remains the possibility that hackers could penetrate the server and alter the data.

The importance of trust can be illustrated when considering a network of distributed robots. Imagine a scenario in which multiple maritime robots are tasked with swimming around the ocean to study fish. The robots use an application called FishBit that is shared among robots using a dDrive, which ensures that all robots are using the latest version of the software. As each robot uses the software in their study of fish, the results are stored within each robot’s individual UniChain. Researchers combine these UniChains into a dDatabase based MultiChain that is shared with the swarm of robots, each of which utilizes the data in the furtherance of its studies. When one robot discovers a new fish, all of the other robots possess the same information and can thereby learn from it and make decisions around it, deriving more and more data through a continuous cycle of learning.

While maritime robots studying fish provides an interesting use case of the dWeb's underlying technologies, it remains a classic example of the DCDA model because researchers have centralized control over the MultiChain used by the robots. If the data is somehow artificially altered, the maritime robots cannot be considered organic machines since they are potentially consuming artificial data. It is important to note that a machine can only be considered trustless and organic if its underlying data sources are also trustless and organic. Therefore, a data structure that compiles its data from multiple single-writer data structures must properly and deterministically reference the original data sources and include proofs of integrity: block numbers, hashes, signatures, and keys of data sources related to each entry added to its compilation.
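The requirement that a compiled structure reference its original sources can be sketched as follows. Each compiled entry carries the source key, the block number, and a hash of the source block, so that any consumer holding the source chains can audit the compilation. This is a simplified illustration: the helper names are hypothetical, JSON plus SHA-256 stands in for the real block encoding and hashing, and signature checks are omitted because the Python standard library has no public-key signature primitive.

```python
import hashlib
import json

def block_hash(block):
    return hashlib.sha256(json.dumps(block, sort_keys=True).encode()).hexdigest()

def compile_with_references(sources):
    """Build a combined log whose entries point back at their source blocks."""
    combined = []
    for key in sorted(sources):
        for number, block in enumerate(sources[key]):
            combined.append({"source_key": key, "block": number,
                             "hash": block_hash(block), "value": block})
    return combined

def audit(combined, sources):
    """Return the entries that do not match the referenced source block."""
    bad = []
    for entry in combined:
        chain = sources.get(entry["source_key"], [])
        if entry["block"] >= len(chain):
            bad.append(entry)
            continue
        original = chain[entry["block"]]
        if entry["value"] != original or entry["hash"] != block_hash(original):
            bad.append(entry)
    return bad

sources = {"robot-a": [{"fish": "tuna"}], "robot-b": [{"fish": "cod"}]}
combined = compile_with_references(sources)
clean = audit(combined, sources)

combined[1]["value"] = {"fish": "shark"}    # a compiler alters Robot B's entry
tampered = audit(combined, sources)
```

With references in place, Robot A does not need to trust the compiler: it can compare each compiled entry against Robot B's own chain.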

Consider the maritime robots example: under what circumstances can Robot A trust the data that derives from Robot B? The fact is, Robot A cannot trust the data from Robot B because the compilation process is centralized and the possibility of artificial state altering means that the source must be deemed artificial. Any community of participants, whether humans, robots, or smart devices, must trust the data that derives from other participants of the community. This absolute trust is not possible when the data compilation process lacks cryptographic checks, original data sourcing, and inclusion proofs, especially when the data in question is compiled from multiple sources. Because single-writer data structures such as UniChains make it difficult to create decentralized global data agency, they work best when used within distributed file-sharing networks (e.g., torrents, dDrives, Hyperdrive, IPFS, or Ethereum’s Swarm).

Because of the limitations associated with using single-writer data structures to establish decentralized global data agency, many developers have chosen to utilize “blockchains” like Ethereum. By using a blockchain, developers can store the organic state of their applications, along with their users and their public keys, “on-chain,” and make use of decentralized currencies for payments. Under this type of model, commonly referred to as “Web3,” an application’s files are distributed to peers via a dDrive-like structure “off-chain,” while an application’s data is stored “on-chain.” Applications extrapolate their “business logic” and place the code within a program that is immutably stored on the blockchain as a “smart contract.” These programs contain various methods for writing data to the blockchain (e.g., a method for storing a user’s social network posts on-chain) and retrieving data from the blockchain (e.g., a method for retrieving a user’s social network posts).

While users remain “authors” of their data and retain ownership rights under this model, their data is not stored locally on their machine or streamed to only those users that consume it. Instead, users of a blockchain have a username or long cryptographic address that is stored on the blockchain’s ledger with a corresponding public key. In principle, this public key allows a blockchain to serve as a global decentralized user registry. Users submit their data to a blockchain participant (miner, block producer, or validator) with a signature related to the data payload that the participant uses to verify the transaction. Since the signature was created using the private key corresponding to the user’s public key stored on-chain, participants can validate the signature related to the data payload, passing the data on to other participants for validation and so forth. The user’s transaction is eventually added to a “block” by a participant that functions as a miner or block producer. Together, the miners, block producers, and validators maintain the blockchain’s state on their machines, which is composed of the data generated by the blockchain’s users. Blockchains are therefore considered distributed multi-writer organic data structures since the integrity of the data is verified using digital signature algorithms and the blockchain’s underlying Merkle tree.

The data that users submit to a blockchain participant is typically related to a smart contract function. Smart contracts are immutably stored on the blockchain and can be accessed by anyone. Consider the transaction data in Figure SC-1. Blockchain participants, upon receiving this transaction, know to retrieve the code for the smart contract “socialx” from the blockchain state and load it into a “Contract Interpreter,” otherwise known as a “virtual machine.” This interpreter reads the contract code and executes the post action referenced in the transaction’s payload field. The interpreter then outputs a series of operations that are computed and stored within the blockchain’s state, along with the user’s signature, proving that the computation was the result of the user’s contract request and thereby validating the data. It is important to note at this point that the program execution took place on the participant’s system and not on the end user’s system.

The use of smart contracts provides trust as entities executing the same contracts are assured they are executing the same code. The reason for this assurance is simple: the contracts are immutably stored on the blockchain and the resulting data is cryptographically signed. Additionally, third parties can cryptographically verify each execution and thus the entire blockchain’s state by verifying the chain’s Merkle tree, including each block’s inclusion proofs and each transaction’s digital signature. This verification ensures that all transactions, including smart contract derived transactions, have been initiated by the stated entity and that the blockchain’s state has not been artificially altered. Blockchain based data is therefore trustless and organic.

Blockchains play host to many smart contract programs that power currencies, social networks, and NFT marketplaces, all of which are built around the blockchain’s user registry. Because blockchains are used to store the computations of individual entities executing multiple programs, they are often referred to as “Distributed Turing Machines” or “Singleton Computers.” While blockchains are suitable for specific use cases, the concentration of application data and computation into a handful of massive blockchain engines presents a unique set of problems. While blockchains incorporate important mechanisms such as Byzantine Fault Tolerance (BFT), cryptographic structures, and economic incentives, all in an effort to establish trust, they depend on other mechanisms that have a negative impact on the overall system, including the following:

Reliance on large networks to provide trustlessness.
Expensive transaction fees.
Expensive global consensus on all transactions.
Wasteful hash “mining” in “Proof of Work” based blockchains.

The ever-increasing transaction fees associated with blockchains like Ethereum derive from a rise in the number of transactions taking place on the network, which is true for any blockchain that relies on resource metering. Since millions of programs can be executed by millions of entities, a state machine’s “halting problem,” first described by Alan Turing in 1936, must therefore be addressed. Resource metering has been the only solution: either the contract creator or the contract user must pay for the resources consumed by the contract’s execution, otherwise the blockchain’s resources can and will be exhausted.

The halting problem is that there is no general way to determine in advance whether a given program will eventually finish or run forever. A program stuck in an infinite loop would eventually use up all of a computer’s finite resources. Since many programs and many entities use blockchains and their underlying resources, transaction fees were created as a way of working around the halting problem. The fees are structured so that each execution of a blockchain-based transaction must be accompanied by a corresponding fee, which is estimated by the amount of computational resources the blockchain believes the code will require when executed and the amount of memory the resulting computation will occupy within the blockchain’s state. Thus, transaction fees keep bad actors from writing programs designed to carry out Denial of Service (DoS) attacks on the blockchain’s network. With resource metering and transaction fees, bad actors that carry out DoS attacks will eventually run out of money.
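Resource metering can be sketched as an interpreter that charges a unit of "gas" for every step and stops a program once its budget is spent. This is a toy model, not any blockchain's actual virtual machine: programs are represented as Python generators, each yield counts as one step, and the run_metered function and OutOfGas exception are hypothetical names.

```python
class OutOfGas(Exception):
    pass

def run_metered(program, gas_limit):
    """Run a generator-based program, charging one unit of gas per step."""
    gas_used = 0
    result = None
    for result in program():
        gas_used += 1
        if gas_used > gas_limit:
            raise OutOfGas(f"budget of {gas_limit} steps exhausted")
    return result, gas_used

def sum_to_five():
    total = 0
    for i in range(1, 6):
        total += i
        yield total           # each yield is one metered step

def loop_forever():
    while True:
        yield None

finished = run_metered(sum_to_five, gas_limit=100)
try:
    run_metered(loop_forever, gas_limit=1000)
    halted = False
except OutOfGas:
    halted = True
```

The interpreter never needs to decide whether a program would halt on its own. It only guarantees that no program runs past what its caller has paid for.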

As networks like Ethereum grew, blockchain’s Achilles’ heel came to the fore when the ever-increasing resources needed to perform the network’s computations outpaced the resources available from participants. A blockchain’s resources derive from the numerous participants that run the blockchain’s software, allowing them to execute program actions, validate transactions, and maintain the blockchain’s state. This arrangement is the reason why “transactions per second” has become a blockchain’s most important metric and the impetus behind projects like EOS, which claimed they could drastically increase blockchain throughput. The throughput issue has caused developers to compete with one another over transaction fees, so that their users' transactions find themselves within the “soonest possible block,” in the process pushing transaction fees higher and higher.

As a workaround, blockchain engineers from various projects are looking to utilize what are called “Layer 2 rollups,” where transactions are first executed by participants of a separate blockchain, called a “Layer 2” blockchain, which in turn hands the data off to the main network (Layer 1). This handoff is directed by a smart contract on Layer 1 that calls on a Layer 2 blockchain to execute a specific transaction and return the results and proofs back to Layer 1 where they will be stored. Theoretically, a Layer 2 blockchain will have considerably less transaction traffic and can thus compute transactions faster and cheaper than a Layer 1. This model should decrease transaction fees as a majority of the fee structure is related to computation, which in this case has been offloaded to Layer 2. Nevertheless, when there is a separation between end users and participants, there will always be more computations than there are resources to handle them in a timely manner.

The centralization of programs and entities within a few blockchain platforms presents a critical issue – the resources needed by the blockchain’s contract executions continuously outpace the blockchain’s ability to acquire those resources. The time it takes for a blockchain to acquire new participants that are capable of running its complex software and willing to contribute extensive computing resources to the blockchain’s operations cannot possibly keep pace with the resource needs of its end users. A blockchain could easily have one million programs and two billion users with only ten million participants supplying the resources needed for computations and data storage. This disparity is mainly due to the complexities involved with becoming a blockchain participant. For blockchains to truly scale with the needs of their programs and users, they must simplify and further distribute the onboarding of participant resources.

Even so, blockchain resource growth has historically produced costly, resource intensive “mining facilities,” as most blockchain software requires specialized hardware, enormous amounts of electricity, and can only be effectively managed by certified systems engineers. Participants wanting to do this from their homes are unable to do so since they cannot compete with the power brought together in specialized facilities. Blockchains such as EOS which have moved away from mining still rely on 21 “block producers” for computing transactions. These block producers, voted in by the blockchain community, have been for the most part centralized organizations. Nevertheless, the user/participant model has not worked: a regular user becoming a blockchain participant in this day and age is akin to a child installing a NASA grade telescope in their room and contributing to space research.

The dWeb does away with blockchains and the Web3 model, replacing them with Distributed Machines that can be executed and queried via the dWeb's distributed web-based architecture. Distributed Machines (dMachines) have many notable differences when compared to blockchains:

They are built around a single smart contract.
The users of a dMachine are also its participants, processing contract code locally on their own machine and storing the results of those executions locally in their own UniChain, or “operator log.”
The operator logs of participants are followed by “executors” that deterministically “apply” operations from these logs to their own “output logs,” per the contract’s pure apply function. These output logs are combined into a single dDatabase based MultiChain called an “index log.”
Validators and other third parties can replay operator logs and output logs in an attempt to produce the same index log. If a validator produces an index log that matches the dMachine’s published index log, then the data is validated and considered organic. This validation is possible because executors deterministically apply operations from participant oplogs to the index log, linearly.
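
The relationship between operator logs, executors, and validators can be sketched in memory. The example below is only an illustration under stated assumptions: oplogs are plain arrays, and the names applyOp, buildIndexLog, and validate, the SET operation shape, and the explicit schedule of (author, seq) pairs are all hypothetical and not part of dMachine.

```javascript
// Pure apply function (hypothetical): the same op always yields the same writes.
function applyOp (op) {
  if (op.type !== 'SET') return [] // unknown operations produce no writes
  return [{ key: op.key, value: op.value }]
}

// Executor: walk a schedule of (author, seq) pairs and append an ack entry
// followed by the resulting writes to the index log.
function buildIndexLog (oplogs, schedule) {
  const indexLog = []
  for (const { author, seq } of schedule) {
    const op = oplogs[author][seq]
    indexLog.push({ key: `.nucleus/acks/${author}/${seq}`, value: { organic: true } })
    for (const write of applyOp(op)) indexLog.push(write)
  }
  return indexLog
}

// Validator: recover the schedule from the published acks, replay the oplogs,
// and compare the regenerated index log with the published one.
function validate (oplogs, publishedIndexLog) {
  const prefix = '.nucleus/acks/'
  const schedule = publishedIndexLog
    .filter((entry) => entry.key.startsWith(prefix))
    .map((entry) => {
      const [author, seq] = entry.key.slice(prefix.length).split('/')
      return { author, seq: Number(seq) }
    })
  const replayed = buildIndexLog(oplogs, schedule)
  return JSON.stringify(replayed) === JSON.stringify(publishedIndexLog)
}

const oplogs = {
  alice: [{ type: 'SET', key: '/name/alice', value: 'Alice' }],
  bob: [{ type: 'SET', key: '/name/bob', value: 'Bob' }]
}
const published = buildIndexLog(oplogs, [
  { author: 'alice', seq: 0 },
  { author: 'bob', seq: 0 }
])

// A copy of the published log in which one write was altered by the executor.
const tampered = published.map((entry) => ({ ...entry }))
tampered[1].value = 'Mallory'
```

Here validate(oplogs, published) succeeds, while validate(oplogs, tampered) fails because the replayed write no longer matches the published one.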

These differences create powerful advantages over blockchains:

An application can deploy its own Machine, built uniquely around its own business logic and state.
The application’s code, regardless of device or operating system, can utilize a built-in contract interpreter, allowing participants to handle contract executions locally.
Users operate in a passwordless environment, simply writing to their own local oplogs. There is no need for an “authenticator” such as MetaMask.
Participants are only able to alter their own state, while executors that attempt to alter global state are quickly identified and their actions published to the index log.
Distributed Machines can scale organically with an application’s resource needs, considering that each user provides the resources for their own computations.
Distributed Machine data is organic since index logs and their related operator logs are cryptographically tied together.
Two executors can apply a billion operations from a billion operator logs in a single second. This type of throughput cannot be achieved with a blockchain.
Distributed Machines form a cooperative environment that achieves truly decentralized global data agency.

Distributed Machines provide a powerful alternative to blockchains, in that they are distributed, organic Turing computers with a single onboard program that can be passed around a swarm of peers. The peers can freely execute the program’s functionality and can store and distribute its entire organic state among one another. Distributed Machines provide low cost, high throughput transaction speeds and the same cryptographic security as blockchains, without the need for expensive consensus and mining schemes. They also exist in a format that allows end users to be resource contributors, with no risk to their devices since they are only handling their own computations. These attributes allow Distributed Machine resources to grow organically with demand, providing global elasticity and an environment in which users can trust the data from other users.

These attributes enable developers to create smart contract programs on their own easily deployable Turing Machines, allowing their systems to scale organically as more and more entities use their programs. In the sections that follow, we will introduce dMachines, the dWeb's first reference implementation of the Distributed Machine specification, showcasing how Distributed Machines are able to consume data from other Distributed Machines and how they can execute a remote Distributed Machine's contract and consume its output.

Open Machines and the DOOM Model

The Distributed Organic and Open Machine (DOOM) Model is made possible through a cooperative relationship between participants, towers, executors, and validators. Together, these entities compute, apply, and validate state, related to a single contract program stored within the state they collectively maintain. This cooperative relationship forms a “Distributed Turing Machine,” also known as a “Distributed State Machine.” Therefore, from here on out, “dMachines” will simply be referred to as “Machines.”

Machine Genesis Initialization

In a single-executor machine, the executor, or “distributed machine manager,” handles a machine’s genesis initialization. In a multi-executor machine, the initial executor handles a machine’s genesis initialization. A genesis initialization is a machine’s initial boot operation, which consists of the following procedures:

The “output log,” also known as the “index log,” is created. The index log is a dDatabase based MultiChain that contains a machine’s collective state.
The dDatabase data structure header is written to the index log at block 0.
The smart contract program is written to the index log at block 1, under the system key (.nucleus/contract/source).
A “Ricardian” (human-readable) translation of the contract is written to the index log at block 2, under the system key (.nucleus/contract/Ricardian). Note: The Ricardian format is compiled by the Larimer Compiler.
In a multi-executor environment, the public key for each executor’s subindex log is written to .nucleus/execs/{ pubkey-hex }.
The public key for each initial participant’s oplog is written to .nucleus/inputs/{ pubkey-hex }.
A genesis object is stored within the machine’s state under the system key .nucleus/acks/genesis, immutably storing proof of completion related to the machine’s genesis initialization.
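
The ordering of these procedures can be sketched as a function that returns the entries in the order they would be appended. This is a minimal illustration, not dMachine's implementation: buildGenesisEntries is a hypothetical name, and the values stored under the execs, inputs, and genesis keys are placeholders, since the text above only specifies the keys.

```javascript
// Build the ordered genesis entries for a new index log (hypothetical helper).
function buildGenesisEntries ({ header, contractSource, ricardian, executorKeys = [], participantKeys = [] }) {
  const entries = [
    { key: null, value: header }, // block 0: dDatabase header
    { key: '.nucleus/contract/source', value: contractSource }, // block 1
    { key: '.nucleus/contract/Ricardian', value: ricardian } // block 2
  ]
  // Multi-executor machines list each executor's subindex log key.
  for (const pubkeyHex of executorKeys) {
    entries.push({ key: `.nucleus/execs/${pubkeyHex}`, value: true }) // placeholder value
  }
  // Each initial participant's oplog key becomes an input.
  for (const pubkeyHex of participantKeys) {
    entries.push({ key: `.nucleus/inputs/${pubkeyHex}`, value: true }) // placeholder value
  }
  // The genesis object closes the initialization.
  entries.push({ key: '.nucleus/acks/genesis', value: { completed: true } }) // placeholder value
  return entries
}

const genesis = buildGenesisEntries({
  header: 'ddatabase-header',
  contractSource: 'export const apply = {}',
  ricardian: 'Human-readable contract text',
  participantKeys: ['aa', 'bb']
})
```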

At this point, the machine is fully functional and participants can execute contract methods (actions) and place the resulting operations within their individual oplogs. The executor(s) watching these oplogs can deterministically apply the operations to the machine’s state.

Participants

Participants are the end users of a machine. They choose which program actions they compute, compute them locally, and store the resulting operations locally within their own UniChain, also referred to as an input log, operator log, or oplog. Malicious behavior, in which a participant adds a transaction to their oplog that violates the machine’s contract, is rejected by the machine’s executors and reported as such in the machine’s state.

Before participants create a transaction, they first sync the index log head block from the machine’s swarm. Participants then load Nova VM with the machine’s current contract code (.nucleus/contract/source), assign the index log’s root proof (root hash and signature) to the indexProof variable, and then call the specified contract via Nova. Nova responds with:

A call success or call failure (checksum value).
A call response or call error data.
The indexProof value.
An operation array of all operations generated, including [oplog root proof, the operation value].

The oplog root proof is the root hash and signature at the head of the participant’s oplog prior to initiating the current operation (transaction). The oplog root proof is then stored in the participant’s oplog.

Executor Transaction Processing

Executors monitor participant oplogs for newly created transactions and utilize a contract’s exported process() function, which returns metadata related to transaction data found within a participant’s oplog. Executors then create an acknowledgment receipt, which is an object that includes the following:

The public key of the transaction creator’s (participant) oplog.
The oplog block number where the transaction exists.
The root hash of the oplog at the time of the transaction.
A local timestamp of when the executor executed the process() function.
The metadata returned by the process() function (e.g., process(operation)).

The executor then calls the atomic() function with the following three parameters:

`tx`

tx is an object containing dDatabase’s put(key, value) and del(key) APIs, discussed earlier in the paper, both of which are utilized for queuing updates deterministically to the machine’s main index log.

`op`

op contains the actual operation data from the participant’s oplog, which is the data that was passed to process(operation).

`ack`

ack contains the acknowledgment receipt created after the execution of process(operation). A property named organic and a property named error are added to the ack object.

If the atomic() call completes successfully, it will return a fulfilled Promise. In this case, ack.organic is set to true and ack.error is set to false.

If the atomic() call results in an error, some contract violation has taken place by a participant (atomic() will always return a “rejected Promise” if an error takes place). If an error has taken place, ack.organic is set to false and ack.error is a string describing or representing the error. Also, if an error has taken place, the data is artificial – the queue of tx actions is cleared and the artificial state is applied to the machine’s overall state with a receipt stored in the machine’s nucleus.
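
The receipt handling described above can be sketched as follows. The contract object, the field names on the ack (oplogKey, seq, rootHash, timestamp, metadata), and the function processTransaction are assumptions made for this illustration; only process(), atomic(), tx.put(), tx.del(), ack.organic, and ack.error come from the text.

```javascript
// Hypothetical contract exposing the two hooks described above.
const contract = {
  process (op) { return { type: op.type } },
  async atomic (tx, op, ack) {
    if (op.amount < 0) throw new Error('negative amount') // contract violation
    tx.put(`/tx/${op.txid}`, op)
  }
}

async function processTransaction (contract, oplogKeyHex, seq, rootHash, op) {
  // Acknowledgment receipt built from the fields listed above.
  const ack = {
    oplogKey: oplogKeyHex,
    seq,
    rootHash,
    timestamp: Date.now(),
    metadata: contract.process(op)
  }
  // tx queues updates instead of writing them immediately.
  const queue = []
  const tx = {
    put: (key, value) => queue.push({ type: 'put', key, value }),
    del: (key) => queue.push({ type: 'del', key })
  }
  try {
    await contract.atomic(tx, op, ack)
    ack.organic = true
    ack.error = false
  } catch (err) {
    ack.organic = false
    ack.error = String(err.message)
    queue.length = 0 // discard queued updates from the violating operation
  }
  // The receipt is stored whether or not the operation was organic.
  return { ackKey: `.nucleus/acks/${oplogKeyHex}/${seq}`, ack, queue }
}
```

In this sketch a rejected atomic() call leaves an empty queue, so only the receipt would reach the index log.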

Atomically Applying State

Regardless of whether a transaction is considered organic or artificial, an acknowledgement receipt is always stored in the machine’s nucleus at .nucleus/acks/{ oplog-pubkey-hex }/seq, where seq is the block number related to the oplog transaction (the block of the participant’s UniChain where the transaction was created). The queued actions within the atomic() function are then atomically applied to the machine’s index log.

The participant/executor model can be seen in Figure SC-1.

Participant’s Acknowledgement of Execution

A participant, while awaiting a transaction to be executed, can monitor its status by watching the machine’s index-based nucleus acknowledgement state at .nucleus/acks/{ oplog-pubkey-hex }/seq, in which seq is the block number where the newly minted transaction exists within the participant’s local oplog. Once an executor publishes the acknowledgement receipt related to the participant’s transaction, the participant is able to fetch the following information for each operation related to the transaction from the index log.

op - The operation.
ack - The ack related to the operation.
mutations - All mutations to the index that match the ack.
indexProof - The root proof (root hash and signature) at the block number where the ack was created.

This process is referred to as Acknowledgement of Execution (AOE).

Towers

Towers are participants that allow for public actions to be executed within a contract by third parties that are not oplog bearers. Towers publish RPC-based endpoints that allow for the outside execution of a contract’s public actions. An example of a public action would be an action that allows participants to join a machine. Considering participants (end users) can only execute contract actions and publish their own state via oplogs, a participant’s oplog identifier would have to be published to the machine’s index log within its nucleus state (.nucleus/inputs/{ oplog-pubkey-hex }) before the participant could execute actions from a machine’s contract. Figure SC-2 shows end users of a social network operating through towers rather than operating their own oplogs.

As can be seen in Cross-Machine Execution, a machine can encode the publicized actions of a remote machine within its own contract so that the machine’s participants can execute the public actions of another machine via one of its towers. Protocols such as TAPE (Tower Authentication, Payment, and Execution) also exist for incentivizing towers and authenticating Remote Procedure Calls.

Note: It is recommended that end users of a contract are participants and that towers are only used for onboarding participants or cross-machine execution.

Towers take an AOE and use it to compile their RPC-based responses. The proofs within an AOE can be used by validators and executors when validating data that derives from remote sources.

Executors

Executors watch the oplogs of participants and towers and “apply” the operations that derive from their transactions atomically to the machine’s index log, per the contract’s “apply” rules. The following is an example program that sends coins from one participant to another:

import { index, auth } from 'contract'    // Core dMachine APIs
import { currency } from '@dmachine/currency'    // dMachine plugin

// Read a user's balance from state; false when no balance is stored.
export async function getBalance (user) {
    const currentBalance = await index.get(`/balance/${ user }`)
    return currentBalance ? Number(currentBalance.value) : false
}
// True when the user's balance covers the amount.
export async function hasBalance (user, amount) {
    const balance = await getBalance(user)
    return balance >= amount ? true : false
}
export async function validateSignature (txid, publicKey, signature) {
    const status = await auth.validateSig(txid, publicKey, signature)
    return status
}
// Fetch the public key stored for a user, hex-encoded.
export async function getUserPubKey (user) {
    const key = await index.get(`/${ user }`)
    return key.value.toString(`hex`)
}
// A txid is unique when nothing is stored under it yet.
export async function isIdUnique (txid) {
    const status = await index.get(`/${ txid }`)
    return status ? false : true
}
export async function isParticipantKey (participant, key) {
    const participantKey = await index.get(`/participants/${ participant }`)
    return participantKey.value.toString(`hex`) === key ? true : false
}
// Action executed locally by the participant
export async function sendCoins (txid, from, to, amount, memo, signature) {
    const timestamp = Date.now()
    const fromKey = await getUserPubKey(from)
    const uniqueId = await isIdUnique(txid)
    const validSig = await validateSignature(txid, fromKey, signature)
    const enoughBalance = await hasBalance(from, amount)
    if (uniqueId && validSig && enoughBalance) {
        const ops = { timestamp, txid, from, to, amount, memo, signature }
        emit({ op: `SEND`, ...ops })
    }
}
// Pure `apply` function executed by executors
export const apply = {
    async SEND (tx, op) {
        const fromBalance = await getBalance(op.from)
        const toBalance = await getBalance(op.to)
        const fromKey = await getUserPubKey(op.from)
        const isKey = await isParticipantKey(op.from, fromKey)
        if (isKey) {
            tx.put(`/${ op.txid }`, op)
            tx.put(`/balance/${ op.from }`, fromBalance - op.amount)
            tx.put(`/balance/${ op.to }`, toBalance + op.amount)
        }
    }
}

Per the contract above, participants can execute the sendCoins action, but must provide their username (from), the username of the participant they are sending coins to (to), the amount (amount), an optional memo (memo), and a signature (signature) of the txID (txid).

The participant performs the following pre-flight data checks:

It retrieves the from user’s public key from state (/${ user }).
It ensures the txid does not already exist in the machine’s state (/${ txid }).
It validates the signature against the txid using the retrieved from user’s public key.
If the txid is unique, and the signature is valid, and from has the amount in its balance (/balance/${ user }), the ops are stored in the participant’s oplog.

The executor sees that SEND must be executed, applying the ops to the machine’s state. The pure apply function utilizes MultiChain’s apply filter described earlier in this paper and is therefore deterministically executed. The pure apply function allows an entire machine, its oplog(s), and index log(s) to be replayed so that its state can be validated by outsiders. Per the contract’s SEND method, the executor performs the following pre-flight data fetches and checks:

It fetches op.from and op.to balances.
It fetches op.from’s public key.
It ensures the key retrieved from state for op.from is the same public key as that of the oplog producing the operation. This check ensures that op.from is also the participant.

Once checks are complete, SEND pushes three put() calls in the tx queue, all of which are atomically executed.
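
How queued put() calls can be applied as one unit is sketched below with an in-memory stand-in for the index log. MemoryIndex, createTx, and applySend are hypothetical names for this illustration; the sketch is synchronous, skips the participant key check, and adds a balance guard that the contract above does not have in its apply function.

```javascript
// In-memory stand-in for the index log's key/value state (hypothetical).
class MemoryIndex {
  constructor (initial = {}) { this.state = new Map(Object.entries(initial)) }
  get (key) { return this.state.has(key) ? { value: this.state.get(key) } : null }
  // Apply every queued action, or none if any action is malformed.
  commit (queue) {
    for (const action of queue) {
      if (action.type !== 'put' && action.type !== 'del') throw new Error(`bad action: ${action.type}`)
    }
    for (const action of queue) {
      if (action.type === 'put') this.state.set(action.key, action.value)
      else this.state.delete(action.key)
    }
  }
}

// tx only records actions; nothing touches state until commit().
function createTx () {
  const queue = []
  return {
    queue,
    put (key, value) { queue.push({ type: 'put', key, value }) },
    del (key) { queue.push({ type: 'del', key }) }
  }
}

// Simplified, synchronous SEND handler reading balances from MemoryIndex.
function applySend (index, tx, op) {
  const from = index.get(`/balance/${op.from}`)
  const to = index.get(`/balance/${op.to}`)
  const fromBalance = from ? Number(from.value) : 0
  const toBalance = to ? Number(to.value) : 0
  if (fromBalance < op.amount) throw new Error('insufficient balance') // assumed extra guard
  tx.put(`/${op.txid}`, op)
  tx.put(`/balance/${op.from}`, fromBalance - op.amount)
  tx.put(`/balance/${op.to}`, toBalance + op.amount)
}

const index = new MemoryIndex({ '/balance/alice': 10, '/balance/bob': 1 })
const tx = createTx()
applySend(index, tx, { txid: 't1', from: 'alice', to: 'bob', amount: 4 })
index.commit(tx.queue) // all three writes land together
```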

Note: the contract above omitted several actions for creating accounts, onboarding participants via towers and many other actions for the sake of brevity.

Single-Executor

A single executor ensures that a contract’s transactions are deterministically and linearly applied to a machine’s state. That being said, a single-executor can choose to censor the oplogs of a specific participant by simply removing them from a machine’s inputs. In scenarios in which there is a single executor, executor treason can leave a machine in an unrecoverable state. Even so, a machine can be forked by a community of participants and they can choose a new executor from the pool of participants. In this case, the chain continues running on a new machine that is comprised of the old machine’s data, minus the compromised part of state, unless of course the community decides to leave record of the executor’s treason.

In this way, contracts and the applications that integrate with them can be programmed to automatically react to executor treason, once a validator makes participants aware of the treason, forking to a new machine that utilizes the same oplog inputs and a newly elected executor. Another option would be to utilize a “Multi-Executor” machine. The “Single-Executor” model can be seen in Figure SC-3.

Multi-Executor and Proof-of-Execution (POE) Consensus

[ Coming Soon ]

Validators

Outside validators (third parties) and participants “monitor” the main index log for executor treason by replaying a machine’s oplogs (inputs) and deterministically applying their operations to a locally generated index log, which can then be compared against the machine’s published index log. This process takes mere seconds, even for massive index logs, allowing validators to quickly broadcast treason to participants. If there are no diffs between the generated index log and the published index log, the machine is considered organic and its data validated. On the other hand, if diffs do exist, the machine is considered compromised. A message is then sent out over BIT to the machine’s swarm that the log has been compromised and cannot be trusted. This message is sent out by the first validator that discovers executor treason.
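
The comparison step can be sketched as a block-by-block diff. This is a minimal illustration, assuming both logs are arrays of { key, value } entries whose values serialize identically when equal; firstDiff and checkMachine are hypothetical names.

```javascript
// Return the first block number where the two logs disagree, or -1 if none.
function firstDiff (replayed, published) {
  const length = Math.max(replayed.length, published.length)
  for (let block = 0; block < length; block++) {
    const a = replayed[block]
    const b = published[block]
    // A missing entry on either side also counts as a difference.
    if (!a || !b || a.key !== b.key || JSON.stringify(a.value) !== JSON.stringify(b.value)) {
      return block
    }
  }
  return -1
}

// Classify the machine as organic or compromised based on the diff.
function checkMachine (replayed, published) {
  const block = firstDiff(replayed, published)
  return block === -1
    ? { organic: true }
    : { organic: false, block } // block a notice to the swarm would cite
}
```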

Organic Data

Since oplogs and the main index log are both UniChains, a machine’s logs are append-only and each block within each log is signed by the participants and executor(s) that created them.

Proving Violation of Append-Only Constraint

UniChains utilize Merkle trees to identify and validate the state of a log at any time in its history, so that consumers of the data can “time-travel.” Signed root hashes can be compared over BIT between swarming peers to ensure identical histories. If differing histories have been published, these signed root hashes can be used to “prove” that the append-only constraint has been breached.

Merkle trees mathematically prove that any particular “version” of a UniChain is a derivative or “superset” of any previous version (i.e., block 5, derived from block 1). If Peer A shares Log A and Peer B shares Log A, where the content of the logs differ, the bad log can be discovered by validating the Merkle trees for both logs. UniChains are identified using the public key corresponding to the keypair that signs each block; thus, signatures can easily be validated on a block-by-block basis.
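
Detecting a breach from signed roots can be sketched as follows. The record shape { length, rootHash, signature }, the name findForkProof, and the caller-supplied verify callback are assumptions made for this illustration; a real check would verify each signature against the UniChain's public key.

```javascript
// rootsA and rootsB are signed roots two peers announced for the same log.
// verify(root) is a caller-supplied signature check (hypothetical).
function findForkProof (rootsA, rootsB, verify) {
  const byLength = new Map(rootsA.map((root) => [root.length, root]))
  for (const rootB of rootsB) {
    const rootA = byLength.get(rootB.length)
    // Same length and same hash means the histories agree at that point.
    if (!rootA || rootA.rootHash === rootB.rootHash) continue
    // Both roots must be validly signed by the log's key; otherwise the
    // disagreement says nothing about the author.
    if (verify(rootA) && verify(rootB)) {
      return { length: rootB.length, rootA, rootB }
    }
  }
  return null
}
```

Two validly signed roots with the same length and different hashes are the evidence: an append-only log can only have one root at a given length.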

Efficient State Reading

As was mentioned earlier, oplogs are naked UniChains, while index logs are built on dDatabase’s high level data abstraction. Using dDatabase’s embedded indexes, a machine’s state can be read efficiently. As was shown in the previous contract example, output methods (executor apply methods) utilize a key-value schema. dDatabase’s sparse state synching means that consumers of a machine’s output (its index log) only have to download the portions of state they need, while still being able to validate the index log’s Merkle tree.

Proof of Violation (POV)

Efficiently and persistently syncing logs over BIT with multiple validators as transactions are created reduces the likelihood of participants or executors violating the append-only constraint. Once a validator has fully synched the entire state of all logs, it replays all oplogs deterministically against the contract’s apply function. The validator then compares its generated index log with the machine’s published index log. If diffs are discovered, the validator gives notice of the violation by sharing a “Proof of Violation” (POV) that includes the following:

The block number in the index log where the violation occurred.
The inclusion proofs (root hash and signature) of the relevant blocks.

Inclusion Proofs

Each UniChain entry has an “Inclusion Proof” that includes the following:

The UniChain’s public key.
The block number.
Hash of the UniChain’s state at the current block when the entry was created.
Signature of the hash.

Inclusion Proofs prove that an entry within a UniChain derived from its author, since the signature proves ownership and the hash of its state proves the location of the entry within its state (its block number or length). Inclusion Proofs therefore allow one machine to reference data from another, considering one proof can be connected and included alongside another in a process called “proof chaining.” This same process is used to connect oplogs to a machine’s index log, since root proofs of an oplog are included within a transaction’s AOE within the index log.
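
Creating and checking an Inclusion Proof can be sketched with Node.js's built-in crypto module. Ed25519 and SHA-256 are used here purely for illustration, since this section does not state which algorithms UniChains use; createInclusionProof, verifyInclusionProof, and verifyChain are hypothetical names, and the signature covers only the state hash, as listed above.

```javascript
const nodeCrypto = require('node:crypto')

// Build a proof for the entry at blockNumber; stateHash is the log's hash there.
function createInclusionProof (keyPair, blockNumber, stateHash) {
  const signature = nodeCrypto.sign(null, Buffer.from(stateHash, 'hex'), keyPair.privateKey)
  return {
    publicKey: keyPair.publicKey,
    blockNumber,
    hash: stateHash,
    signature: signature.toString('hex')
  }
}

// A proof holds when the signature over the hash verifies under the log's key.
function verifyInclusionProof (proof) {
  return nodeCrypto.verify(
    null,
    Buffer.from(proof.hash, 'hex'),
    proof.publicKey,
    Buffer.from(proof.signature, 'hex')
  )
}

// Proof chaining: an index log proof is accepted only together with the
// oplog proof it is connected to.
function verifyChain (indexProof, oplogProof) {
  return verifyInclusionProof(indexProof) && verifyInclusionProof(oplogProof)
}

const participant = nodeCrypto.generateKeyPairSync('ed25519')
const executor = nodeCrypto.generateKeyPairSync('ed25519')
const hashOf = (text) => nodeCrypto.createHash('sha256').update(text).digest('hex')

const oplogProof = createInclusionProof(participant, 3, hashOf('oplog state at block 3'))
const indexProof = createInclusionProof(executor, 9, hashOf('index state at block 9'))
```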

Validation Flow for Single-Executor

The following flow takes place during the machine validation process:

Validate that Block 0 is a valid dDatabase header.
Initiate Nova VM with the contract source at Block 1.
Validate the Ricardian contract at Block 2.
Set index block to 2 and run a loop, while indexBlock < log.length over each entry in the index, with the following looping conditional statements:
- If an entry’s key is equivalent to .nucleus/acks/genesis, exit loop.
- If an entry’s key is not .nucleus/inputs/{ key }, fail validation and issue POV.
- Increment indexBlock and continue loop, examining the next entry.
Once loop exits, indexBlock should be incremented once more. A valid exit means that the genesis entry has been reached. During the loop, a mapping of active oplogs (inputs) was created under the name processedBlocks for each oplog, with each entry set to “-1.”
While the indexBlock is less than the length of the index log, run the following looping verifications over each entry to verify the machine’s state:
- Set ack to the entry at indexBlock.
- If ack key does not match .nucleus/acks/{ pubkey }/{ indexBlock }, issue POV and exit loop.
- If ack write-type is not put, issue POV and exit loop.
- If the indexBlock segment of the key does not equal processedBlocks[ key ] + 1, issue POV and exit loop.
- Get the op from the oplog specified by the ack.
- Rewind Nova VM index log state to indexBlock.
- Call atomic() with the following parameters:
  - tx
  - op
  - ack
- Set newContractSource to null.
- Set oplogChanges to an empty array.
- Iterate the actions in the tx using offset i:
  - If tx[ i ] type does not equal the type in indexLog[ indexBlock + i ], issue POV and exit loop.
  - If tx[ i ] key does not equal the key in indexLog[ indexBlock + i ], issue POV and exit loop.
  - If tx[ i ] value does not equal the value in indexLog[ indexBlock + i ], issue POV and exit loop.
  - If tx[ i ] key is set to .nucleus/contract/source, set newContractSource to the tx[ i ] value.
  - If tx[ i ] key is .nucleus/inputs/{ pubkey }, add the value to the oplogChanges array.
- Set processedBlocks[ pubkey ] to the indexBlock segment of the ack key.
- Increment indexBlock by tx.length + 1.
- If newContractSource is not null, replace the active VM with its value.
- Iterate each entry in oplogChanges and add or remove oplogs from the machine’s nucleus, according to the encoded changes (additions or removals).
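
The genesis scan at the start of this flow can be sketched as follows. The index log is modeled as an array of { key } entries, scanGenesis is a hypothetical name, and the genesis key uses the form given in the Machine Genesis Initialization section. The starting block is passed in by the caller; the example starts at the first entry after the Ricardian contract, which is an assumption about how the starting offset is meant.

```javascript
const INPUT_PREFIX = '.nucleus/inputs/'
const GENESIS_KEY = '.nucleus/acks/genesis'

// Scan the input entries that precede the genesis ack.
function scanGenesis (indexLog, startBlock) {
  const processedBlocks = {}
  let indexBlock = startBlock
  while (indexBlock < indexLog.length) {
    const { key } = indexLog[indexBlock]
    if (key === GENESIS_KEY) {
      // Valid exit: step past the genesis entry.
      return { ok: true, indexBlock: indexBlock + 1, processedBlocks }
    }
    if (!key.startsWith(INPUT_PREFIX)) {
      // Anything other than an input before genesis is a violation.
      return { ok: false, pov: { block: indexBlock } }
    }
    // Track each active oplog; -1 means no block has been processed yet.
    processedBlocks[key.slice(INPUT_PREFIX.length)] = -1
    indexBlock++
  }
  // The genesis entry was never reached.
  return { ok: false, pov: { block: indexLog.length } }
}

const exampleLog = [
  { key: null }, // block 0: dDatabase header
  { key: '.nucleus/contract/source' }, // block 1
  { key: '.nucleus/contract/Ricardian' }, // block 2
  { key: '.nucleus/inputs/aa' },
  { key: '.nucleus/inputs/bb' },
  { key: '.nucleus/acks/genesis' }
]
const scan = scanGenesis(exampleLog, 3)
```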

Smart Contract Programs

As mentioned previously, each machine initializes a smart contract program as block 1 of the machine’s index log. Each contract defines an API for creating transactions in a participant’s oplog in the form of “actions.” Each contract also defines a pure apply function for deterministically producing the index log from a machine’s oplogs. The deterministic nature of the apply function allows for the index log to be continuously reproduced from any historical point in the state. The important part about the apply function is that operations are ALWAYS executed in the same order. This consistency is due in part to the pureness of the apply function, which does not allow for the generation of timestamps or pseudo-randomness.

JavaScript and Nova VM

Smart contract programs are written in JavaScript, one of the most widely used programming languages. As it relates to the dWeb, this prevalence lowers barriers to entry, allowing any software developer with a basic understanding of JavaScript to write dMachine based programs. Since dWeb applications (distributed in dDrives) are also written in JavaScript, as well as HTML and CSS, JavaScript developers will very likely rule the dWeb.

Contracts are sandboxed at runtime within Nova VM, a JavaScript and WASM Interpreted Virtual Machine. Nova ensures that non-deterministic behavior and unbounded computations (infinitely running computations) are prevented. There is never a point, during either the parsing or evaluation phases, in which Nova VM uses unbounded recursion or loops. Nova is constrained to limit or eliminate the ability for a corrupt contract to cause a crash or infinitely hang the VM.

Even more important is the secure execution of contracts. Nova is designed to avoid unbounded memory allocation, extremely long load times, and stack overflows deriving from a syntax analysis (e.g., recursive descent parsing or execution). This secure environment is made possible by Nova’s built-in strict type-checks.

Nova was also designed to ensure the deterministic execution of contracts. Due to the non-deterministic nature of denormals, NaNs, and rounding modes as it relates to floating-point arithmetic, in addition to the underlying physical computer’s ALU, Nova relies on the softfloat implementation of IEEE-754 floating-point arithmetic, which is further constrained to ensure determinism.

These features ensure that a potentially untrusted contract program cannot harm a participant’s local system, and that machines can be deterministically replayed in order to validate a machine’s state safely and efficiently.

Core Engine APIs

dMachine’s core software comes equipped with many core APIs that simplify smart contract development. These core APIs follow a set of Core Data Schemas that are described in a subsequent section. The following Core APIs are included:

Authentication
Currency
Governance
Index
Remote
Horizon
BitcoinIQ
BitNames

Authentication

The core authentication API allows developers to build an entire user registry into their machine using just a few lines of code. For example:

import { auth } from 'contract'
// Action executed locally by the participant
export async function signup (user, key) {
    const exists = await auth.userExists(user)
    if (!exists) emit({ op: `SIGNUP`, user, key })
}
// Pure `apply` function executed by executors
export const apply = {
    async SIGNUP (tx, op) {
        tx.newUser(op.user, op.key)
    }
}

Authenticators are available on the tx object, which run the put() API in the background utilizing the Auth API’s core data schema.

For more information on the Auth API, read the official documentation located here.

Currency

The core currency API allows a dMachine developer to create a cryptocurrency on chain, mint currency to accounts, and transfer currency between users, including cross-chain token swaps, via Towers.

For more information on the Currency API, read the official documentation located here.

Governance

The core governance API allows a dMachine developer to build decentralized autonomous organizations (DAOs) around their applications. Developers may want to develop a proposal system for submitting changes to a machine’s contract or a reporting system so that community members can vote on the removal of content.

For more information on the Governance API, read the official documentation located here.

Index

The Index API, as shown in the earlier contract example, has a single get() method for retrieving data from the machine’s local index log by key.

For more information on the Index API, read the official documentation located here.

Remote

The Remote API allows a contract to query the data from a remote machine’s index log using a single get() method. The difference between Index’s and Remote’s get() methods is that Remote’s get() accepts a “Unified Machine Locator” (UML), which follows the Unified Machine Identifier (UMI) syntax. A UML consists of either a dWeb network address and a dDatabase key, or a dWeb domain name and a dDatabase key. For example:

remote.get(<key>/user/neo) or remote.get(social.x/user/neo)

For more information on the Remote API, read the official documentation located here.

Horizon API

The Horizon API allows developers to interact with the Horizon dMachine. Horizon is a global identity machine that utilizes Decentralized Identity Documents (DIDs). Developers can choose to utilize Horizon for user authentication instead of building their own authentication system.

For more information on Horizon, read the documentation located here.

BitcoinIQ API

BitcoinIQ is the world’s first “hypercurrency” and the first dMachine-based currency. BitcoinIQ exists within the Horizon dMachine. The BitcoinIQ API allows developers to incorporate BitcoinIQ into their smart contracts.

For more information on BitcoinIQ, read the documentation located here or here.

BitNames API

BitNames is the dWeb’s Decentralized Domain Name System. The BitNames API allows for dWeb domain records stored via the BitNames dMachine to be resolved to dWeb network addresses. For example:

const address = bitnames.resolve("domain.x", "SC")

For more information on BitNames, read the documentation located here or here.

Core Data Schemas

dMachine’s core API methods store data in a machine’s state using specific key/value schemas, all of which are defined in each API’s documentation. For example, the “Auth” API uses the following schema when adding users to a machine’s state:

key: /users/{ username }
value: {
    username,
    publicKey,
}

These deterministic schemas allow for plugins to be developed around these core APIs.
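
How an authenticator such as tx.newUser() might map onto put() through this schema is sketched below. userRecord and withAuth are hypothetical helpers written for this illustration and are not dMachine APIs; only the key/value layout comes from the schema above.

```javascript
// Build the key/value pair described by the Auth schema.
function userRecord (username, publicKey) {
  return { key: `/users/${username}`, value: { username, publicKey } }
}

// Wrap a tx object so that newUser() is expressed as a plain put().
function withAuth (tx) {
  return {
    ...tx,
    newUser (username, publicKey) {
      const { key, value } = userRecord(username, publicKey)
      tx.put(key, value)
    }
  }
}

// Minimal recording tx used to show the resulting write.
const recorded = []
const authTx = withAuth({ put: (key, value) => recorded.push({ key, value }) })
authTx.newUser('neo', 'ab12cd')
```

Because the schema is fixed, a plugin can read /users/{ username } without knowing which contract wrote it.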

dMachine Plugins

Developers can develop plugins for dMachine contracts, publish on NPM and allow other developers to utilize their dMachine APIs. dMachine plugins can be imported by any contract and used to expand its functionality. dMachine plugins are the Legos of decentralized application development.

Cross-Machine Data Fetching

As was explained in the Remote API section, one machine’s contract can fetch data from another machine’s state. This functionality is made possible by the remote.get() core API method. However, this process is not as simple as it seems. get() is passed a Unified Machine Identifier (UMI) consisting of an address/key or domain/key schema:

If the UMI uses a domain/key schema, the domain is resolved to an address by performing a lookup via the BitNames machine for the “DC” record type. After resolving the address, the schema is converted to address/key. The resolver in this case is referred to as Brane. Brane is able to perform cross-registrar lookups as more domain name registrars are launched.
A dWeb Swarm instance is initiated, performing a lookup on a dWeb-compliant DHT for the address. The data related to the key (<address>/user/neo) is sparsely fetched from the machine’s swarm (its peers).
The data returned by remote.get() uses the following envelope:

{
    version,
    data: { value related to the remote data },
    rmip: {
        oplogBlockNumber,
        oplogKey,
        oplogRootHash,
        oplogSignature,
        indexBlockNumber,
        indexKey,
        indexRootHash,
        indexSignature
    }
}

Unified Machine Identifier

A UML that follows the UMI syntax uses either a domain/key schema or an address/key-schema:

Domain/key schema: bit://social.x/neo/posts/{ postID }
Address/key schema: bit://<key>/neo/posts/{ postID }

UMI syntax can also be used to deterministically fetch data by using a “version” number:

bit://social.x/neo/posts/{ postID }?version=1

In the example above, we are fetching the first version of the key. If Neo ever edited the above post, the first edit would be version 2 and so forth. A UMI’s determinism is critical as it relates to machine replays. If a UMI does not include a version, dMachine’s core engine defaults to the latest version.
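
A small parser makes the syntax concrete. This is only a sketch: `parseUmi` is a hypothetical name, the regular expression covers just the shapes shown above, and treating versions as integers starting at 1 is an assumption drawn from the example.

```javascript
// Hypothetical parser for the UMI shapes shown above.
function parseUmi(umi) {
  const match = /^bit:\/\/([^/?#]+)(\/[^?#]*)(?:\?(.*))?$/.exec(umi);
  if (!match) throw new Error(`not a valid UMI: ${umi}`);
  const [, host, key, query = ""] = match;
  const versionParam = new URLSearchParams(query).get("version");
  let version = null; // null means "latest", the default when no version is given
  if (versionParam !== null) {
    version = Number(versionParam);
    if (!Number.isInteger(version) || version < 1) {
      throw new Error(`invalid version: ${versionParam}`);
    }
  }
  return { host, key, version };
}

const pinned = parseUmi("bit://social.x/neo/posts/42?version=1");
console.log(pinned.host); // prints "social.x"
console.log(pinned.key); // prints "/neo/posts/42"
console.log(pinned.version); // prints 1

const latest = parseUmi("bit://social.x/neo/posts/42");
console.log(latest.version); // prints null
```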

Deterministic State Linking

A UMI’s deterministic nature allows for remote data to be stored within a machine’s state. This data can easily be fetched in the validation process, ensuring that the same version of the data is always fetched regardless of who replays the machine. This functionality is referred to as “Deterministic State Linking” or “DSL” for short.
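
The idea can be illustrated with a toy versioned store. `VersionedState` and the shape of the stored `link` object are assumptions for illustration only; the point is that a link which records the version keeps resolving to the same value after the remote key is edited.

```javascript
// Hypothetical versioned store: every write to a key appends a new version.
class VersionedState {
  constructor() {
    this.history = new Map(); // key -> array of values; index 0 holds version 1
  }
  put(key, value) {
    const versions = this.history.get(key) ?? [];
    versions.push(value);
    this.history.set(key, versions);
    return versions.length; // the version number that was just written
  }
  get(key, version) {
    const versions = this.history.get(key) ?? [];
    const index = (version ?? versions.length) - 1; // no version means latest
    return versions[index];
  }
}

const remote = new VersionedState();
const v1 = remote.put("/neo/posts/42", "hello");

// The local machine stores the link together with the version it observed.
const link = {
  umi: `bit://social.x/neo/posts/42?version=${v1}`,
  key: "/neo/posts/42",
  version: v1,
};

// The remote post is edited later, creating version 2.
remote.put("/neo/posts/42", "hello (edited)");

console.log(remote.get(link.key, link.version)); // prints "hello"
console.log(remote.get(link.key)); // prints "hello (edited)"
```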

Remote Machine Inclusion Proofs

The remote.get() method returns a “Remote Machine Inclusion Proof” (RMIP), which can also assist in the DSL schema. An RMIP contains the following:

The block number in the oplog where the operation related to the data exists.
The oplog public key.
The oplog’s root hash at the given block length.
Remote participant’s root hash signature.
Block number in the index log where the data exists (Remote index log).
The index log public key (Remote index log).
The index log’s root hash at the given block length (Remote index log).
Remote executor’s root hash signature.

Validation flows can be created that utilize RMIPs in the validation process. A machine that contains data from other machines must be able to prove the integrity of the remote state included within its own state. During the validation process, a remote index log and a remote oplog can be “rolled back” to a specific state so that the machine’s remote data can be validated.
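
A validation flow might begin with a structural check like the one sketched below. It assumes that each signature covers the corresponding root hash and can be checked against the corresponding log's public key, which this section does not spell out. The signature algorithm is deliberately left to the caller through a hypothetical `verifySignature` callback, and the toy verifier in the example is not cryptography.

```javascript
const RMIP_FIELDS = [
  "oplogBlockNumber", "oplogKey", "oplogRootHash", "oplogSignature",
  "indexBlockNumber", "indexKey", "indexRootHash", "indexSignature",
];

// verifySignature(publicKey, message, signature) -> boolean is supplied by the caller.
function checkRmip(rmip, verifySignature) {
  for (const field of RMIP_FIELDS) {
    if (rmip[field] === undefined || rmip[field] === null) {
      return { ok: false, reason: `missing field: ${field}` };
    }
  }
  for (const field of ["oplogBlockNumber", "indexBlockNumber"]) {
    if (!Number.isInteger(rmip[field]) || rmip[field] < 0) {
      return { ok: false, reason: `invalid block number: ${field}` };
    }
  }
  if (!verifySignature(rmip.oplogKey, rmip.oplogRootHash, rmip.oplogSignature)) {
    return { ok: false, reason: "oplog root hash signature does not verify" };
  }
  if (!verifySignature(rmip.indexKey, rmip.indexRootHash, rmip.indexSignature)) {
    return { ok: false, reason: "index root hash signature does not verify" };
  }
  return { ok: true };
}

// Toy verifier for demonstration only: it is NOT cryptography.
const toyVerify = (publicKey, message, signature) =>
  signature === `${publicKey}:${message}`;

const rmip = {
  oplogBlockNumber: 7, oplogKey: "pk-participant", oplogRootHash: "root-a",
  oplogSignature: "pk-participant:root-a",
  indexBlockNumber: 3, indexKey: "pk-executor", indexRootHash: "root-b",
  indexSignature: "pk-executor:root-b",
};

console.log(checkRmip(rmip, toyVerify).ok); // prints true
console.log(checkRmip({ ...rmip, indexSignature: "forged" }, toyVerify).reason);
// prints "index root hash signature does not verify"
```

A full validator would go further and roll the remote logs back to the stated block numbers to recompute the root hashes; this sketch only covers the checks that can be made from the proof alone.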

Cross-Machine Execution

Cross-Machine Execution is made possible via towers, where one contract allows third parties or participants to execute an action on a wholly unrelated machine. Much research is currently underway involving cross-machine execution, distributed computing marketplaces, and RPC-based metering utilizing Hypercurrencies such as BitcoinIQ.

The first DWRC (dWeb Request for Comment) surrounding this research was recently published here. At the time of this writing, Cross-Machine Execution APIs are still under development and will be published in a future paper.

dMachine Use Cases

As will be seen in the final sections of this paper, dMachines can be used to build nearly any digital experience imaginable. dMachines bring together the technologies previously described in this paper, such as a dWeb-compliant DHT, UniChains, MultiChains and dDatabases. dMachines will likely have a place in powering the metaverses of the future, as well as in other use cases such as:

Social networks.
Communication systems.
Currencies, currency exchanges, and banking systems.
Domain name system.
Document collaboration.
Video platforms.
Machine learning networks.
IoT networks.

dDNS

To create a truly distributed and decentralized web, the dWeb cannot depend on centralized domain names or the Internet's tree-structured name space, and it is not compatible with the Internet's "Domain Name System" (as defined in RFC 1034 and RFC 1035). The dWeb therefore needs its own domain name system that is compatible with dWeb-compliant network addresses. Without domain names or a directory-like service for dWeb network addresses, the dWeb would be hard to use, since 32-byte addresses are difficult to remember. A Domain Name System is a directory lookup service that maps the name of a host on the Internet to a numerical address or a canonical name. DNS is essential to the functioning of the Internet.

dWeb's Domain Name System is a Decentralized Domain Name System (dDNS): the registration of Decentralized Top-Level Domains (dTLDs) and of dTLD-based domain names is distributed and decentralized, and records are distributed across the computers of the peers on the network related to a given dDNS registry. Like DNS, dDNS is a directory lookup service that maps the name of a dDrive, device or peer to a dWeb-based network address. dDNS is essential to the functioning of the dWeb. A dDNS registry is a network of peers that share data regarding dTLDs, registered domains and their associated domain name records. Unlike traditional DNS, dWeb's dDNS does not use nameservers, since the dWeb is completely serverless. DNS records are stored in distributed databases, whether that is a dDatabase, a dMachine's indexLog or a DHT (if a domain registry is built around an app-specific DHT). Data related to a dDNS registry is resolved by resolvers that simply query the distributed database. We have created a reference implementation known as the Brane Resolver, which showcases how domains can be resolved. The dWeb's dDNS model is depicted in Figure BN-2.

Decentralized Top-Level Internet Domains (dTLDs)

dWeb's dDNS registries can issue their own dTLDs (Decentralized Top-Level Domains) that can be registered by users. dTLDs can include anything from text and numbers to emojis and other Unicode-compliant syntax. This specification does not specify whether dDNS registries should have an economic model, or the method by which dTLDs are acquired (e.g. an auction). Any participant within the dDNS Registry should be able to register a domain with any dTLD, even though dTLDs are originally registered by a user.

dTLD Data Model

dTLD data that is stored within a dDNS Registry must conform to the following Data Model:

key - /dtld/<dtldName>
value:

{
  "dateRegistered": <timestamp>, // when was this dTLD registered?
  "owners": [], // addresses of owner(s) DID document(s)
  "otherData": {} // an object for storing further data.
}

Decentralized Domains

dDNS Registries should allow users to register domains under active dTLDs that have been registered via the registry. Domain names can include anything from text and numbers to emojis and other Unicode-compliant syntax.

Domain Name Data Model

Domain data that is stored within a dDNS Registry must conform to the following Data Model:

key - /domain/<domainName>
value:

{
  "dateRegistered": <timestamp>, // when was this domain registered?
  "owners": [], // addresses of owner(s) DID document(s)
  "otherData": {} // an object for storing further data.
}
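
The two data models can be exercised together in a short sketch. The `Map` stands in for the registry's key-value store, and several details are assumptions rather than requirements of this specification: timestamps are milliseconds since the Unix epoch, the dTLD is the label after the last dot, and `registerDtld` and `registerDomain` are hypothetical names.

```javascript
// A Map stands in for the registry's key-value store (an assumption for illustration).
const registry = new Map();

function registerDtld(dtldName, owner, now = Date.now()) {
  const key = `/dtld/${dtldName}`;
  if (registry.has(key)) throw new Error(`dTLD already registered: ${dtldName}`);
  registry.set(key, { dateRegistered: now, owners: [owner], otherData: {} });
  return key;
}

function registerDomain(domainName, owner, now = Date.now()) {
  // Assumes the dTLD is the label after the last dot, e.g. "x" in "social.x".
  const dtldName = domainName.slice(domainName.lastIndexOf(".") + 1);
  if (!registry.has(`/dtld/${dtldName}`)) {
    throw new Error(`dTLD is not registered: ${dtldName}`);
  }
  const key = `/domain/${domainName}`;
  if (registry.has(key)) throw new Error(`domain already registered: ${domainName}`);
  registry.set(key, { dateRegistered: now, owners: [owner], otherData: {} });
  return key;
}

console.log(registerDtld("x", "did-address-1")); // prints "/dtld/x"
console.log(registerDomain("social.x", "did-address-2")); // prints "/domain/social.x"
```

Note that the domain owner differs from the dTLD owner here, reflecting the rule that any participant should be able to register a domain under any dTLD.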

Resource Records

Resource Record Types

A dWeb-compliant dDNS utilizes several “Resource Record” types for different types of network entities, including the following:

Record Type	Description
DD	dDrive record for identifying a dDrive resource.
DX	Device address.
RM	Registrar machine address.
U	Record for identifying any UniChain-based structure.
DM	Record for identifying a dMachine.
DB	Record for identifying a dDatabase.
MM	Record for identifying a Mail Machine.
TXT	Way of putting text comments in the DNS database for a domain.
TW	Record for identifying a tower.
CNAME	Canonical name. Specifies an alias for a BD record.
SRV	Service record for identifying a service.
PX	Peer address.

Resource Record Data Model

Resource Record data that is stored within a dDNS Registry must utilize the following Data Model:

key - /rr/<domain>
value:

{
  "rrType": <rrType>, // example: CNAME
  "rdata": <hash>, // what dWeb address does this record point to?
  "createdAt": <timestamp>, // timestamp of when this record was originally created.
  "lastModified": <timestamp>, // timestamp of when this record was last modified.
  "publicKey": <publicKeyOfRecordAuthor>, // public key of the record's creator. Must be in the `owners` array within the Domain data.
  "class": <recordClass>,
  "ttl": <ttl>,
  "description": "my website",
  "otherData": {} // an object for storing further data.
}
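
The ownership rule in the `publicKey` comment can be enforced when a record is created, as in this rough sketch. `buildResourceRecord` is a hypothetical helper. The example assumes that entries in `owners` are directly comparable to the author's public key, and the `class` and `ttl` values are placeholders because this section does not define their allowed values or units.

```javascript
// Hypothetical helper: builds a Resource Record value, rejecting non-owner authors.
function buildResourceRecord(domainData, fields, now = Date.now()) {
  if (!domainData.owners.includes(fields.publicKey)) {
    throw new Error("publicKey is not listed in the domain's owners");
  }
  return {
    rrType: fields.rrType,
    rdata: fields.rdata,
    createdAt: now,
    lastModified: now, // equal to createdAt until the record is first modified
    publicKey: fields.publicKey,
    class: fields.class,
    ttl: fields.ttl,
    description: fields.description ?? "",
    otherData: fields.otherData ?? {},
  };
}

const domainData = { dateRegistered: 1667779200000, owners: ["pk-neo"], otherData: {} };
const record = buildResourceRecord(
  domainData,
  {
    rrType: "DD",
    rdata: "ab".repeat(32), // placeholder address
    publicKey: "pk-neo",
    class: "IN", // placeholder value
    ttl: 3600, // placeholder value
    description: "my website",
  },
  1667779200000
);
const rrEntry = { key: "/rr/social.x", value: record };
console.log(rrEntry.value.rrType); // prints "DD"
```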

Resolving Records

Querying a dDNS Registry for data should be pretty straightforward, since a compliant dDNS Registry is built on a distributed key-value store that is either compliant with the dDatabase specification or a DHT.

To resolve a query for any dWeb-compliant domain name, simply query the key /domain/<domainName>, as defined in the Domain Name Data Model. This should return the following data:

{
  "dateRegistered": <timestamp>, // when was this domain registered?
  "owners": [], // addresses of owner(s) DID document(s)
  "otherData": {} // an object for storing further data.
}

To resolve a query for any dWeb-compliant domain name resource record, simply query the key /rr/<domainRecord>. This should return the following data:

{
  "rrType": <rrType>, // example: CNAME
  "rdata": <hash>, // what dWeb address does this record point to?
  "createdAt": <timestamp>, // timestamp of when this record was originally created.
  "lastModified": <timestamp>, // timestamp of when this record was last modified.
  "publicKey": <publicKeyOfRecordAuthor>, // public key of the record's creator. Must be in the `owners` array within the Domain data.
  "class": <recordClass>,
  "ttl": <ttl>,
  "description": "my website",
  "otherData": {} // an object for storing further data.
}

To resolve a query for any dWeb-compliant dTLD, simply query the key /dtld/<dtldName>. This should return the following data:

{
  "dateRegistered": <timestamp>, // when was this dTLD registered?
  "owners": [], // addresses of owner(s) DID document(s)
  "otherData": {} // an object for storing further data.
}
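
Put together, a resolver reduces to a few key lookups. The sketch below is not the Brane Resolver: it is a hypothetical `resolveRecord` function over a `Map` that stands in for the distributed key-value store. It uses the `/domain/` prefix from the Domain Name Data Model and assumes that `owners` entries are comparable to a record's `publicKey`.

```javascript
// Hypothetical resolver: returns the record's rdata, or null when resolution fails.
function resolveRecord(store, domain, rrType) {
  const domainData = store.get(`/domain/${domain}`);
  if (!domainData) return null; // the domain is not registered
  const record = store.get(`/rr/${domain}`);
  if (!record || record.rrType !== rrType) return null; // no record of that type
  // Enforce the rule that the record's author must be one of the domain's owners.
  if (!domainData.owners.includes(record.publicKey)) return null;
  return record.rdata;
}

// A Map stands in for the dDatabase or DHT behind the registry.
const store = new Map([
  ["/dtld/x", { dateRegistered: 1, owners: ["pk-registrar"], otherData: {} }],
  ["/domain/social.x", { dateRegistered: 2, owners: ["pk-neo"], otherData: {} }],
  ["/rr/social.x", { rrType: "DM", rdata: "cd".repeat(32), publicKey: "pk-neo" }],
]);

console.log(resolveRecord(store, "social.x", "DM") === "cd".repeat(32)); // prints true
console.log(resolveRecord(store, "social.x", "DD")); // prints null
console.log(resolveRecord(store, "missing.x", "DM")); // prints null
```

Returning null for every failure keeps the sketch short; a fuller resolver would likely distinguish an unregistered domain from a missing or unauthorized record.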

Brane Resolver Implementation

For more information on the Brane Resolver, please go to its official repository.

dIdentity

Under Development

Horizon

Under Development

dCash

Under Development

BitcoinIQ

Under Development

dOrganizations

Under Development

App-Specific DHTs

Under Development

Conclusion

The dWeb preserves the freedom of the individual, within a collective, while presenting technologies that not only allow for the sharing of open and trustless data but allow for the innovation of open and trustless systems. The dWeb is a web where both users and developers benefit one another, and in ways the World Wide Web could never facilitate. We live in interesting times, where it seems like the "golden age of information" is coming to an end, but it's far from over if We The People have anything to say about it. Like those in 1775, we must maintain our independent spirit and, above all, we must maintain our consciousness. The revolution is consciousness, and as we awaken from being slaves who have long been placed under a hypnosis of freedom and liberty, we are now starting to realize that those freedoms, as well as our liberties, have been eroded to a point where they remain barely intact.

Our freedom of speech is one of our most important rights as ratified by our founders. Yet it's no longer a right so much as it is a scarce privilege on the World Wide Web. Big tech companies, through their partnerships with China, have waged war on those who choose to expose their anti-American propaganda being spread throughout the web. From tweets to YouTube videos, you either agree with them or you're fact-checked into oblivion, and banned not long thereafter. It's a false matrix that is now riddled with censorship and controlled by the likes of communist neo-fascist dictators like Mark Zuckerberg and Jack Dorsey. You could search Google to confirm what I'm telling you, but they have already scrubbed those sources months ago, sadly.

The banking cartel didn't want the masses to have free and fair access to the web and they have certainly found a way to finally restrict it. Though what they never counted on was a movement of patriots who dreamed of a decentralized world and set out on a journey to figure out how to decentralize everything, risking their lives and freedom in the process. The dWeb is a specification that defines the decentralized components that are needed to build truly end-to-end decentralized systems, applications and more, allowing people and companies alike to restore their online freedoms and liberties at a time when they're needed most. As much as they would like to shut down the Distributed Web, and as I have thoroughly pointed out in this paper, it simply isn't possible.

The distributed web's infrastructure has been online for well over 20 years and its torrent networks have survived every attack the globalists could think of! The ideas presented in this specification, like dCash, can be created on top of that distributed infrastructure effortlessly, whether it's BitTorrent, Hyper or other distributed web infrastructure. The purpose of this living document is to help further the development of the Distributed Web and to ultimately enable the development of powerful decentralized applications that can reach billions of people. I can safely say that decentralized applications like the ones we have dreamed about for years are no longer a dream; they're a reality.

Expanding the dWeb will be one of the most important fights for freedom since a few patriots decided to take up arms on a cold night in 1775 and fight for their families, their future and most importantly, their freedom. You too can join the fight and become a part of the dWeb revolution yourself, by simply downloading dBrowser and browsing much of what the dWeb already has to offer. While this paper may seem like a lot, I can assure you just as I'm writing this with blistered fingers - we're just getting started. There is so much to do and so much to build.

I hope our work motivates you to join the cause. I hope it liberates you. I hope it awakens you.

Fight on!
