\newfontfamily{\play}[Path=./, Scale=1.1]{Play-Regular.ttf}
\newfontfamily{\PlayBold}[Path=./, Scale=1]{Play-Bold.ttf}
colorlinks = true,
\renewcommand\thesubsection{\arabic{subsection}. }
{-3.25ex\@plus -1ex \@minus -.2ex}%
{0ex \@plus .2ex}%
{\play\Large}}% from \large
\renewcommand\thesection{\arabic{section}. }
\newcommand{\code}[1]{{\PlayBold #1}}
\def\Stepx{0.5} %% separation between dots
\def\Stepy{0.6} %% separation between dots
\def\Size{5pt} %% radius of the dot
\def\Toty{60} %% adjust
\def\Totx{55} %% adjust
scale=1, angle=10, position={current}, contents={%
\begin{tikzpicture}[remember picture,overlay]
\node at (current {\usebox\mybox};
breakatwhitespace=false, % sets if automatic breaks should only happen at whitespace
escapeinside={\%*}{*)}, % if you want to add LaTeX within your code
language=Octave, % the language of the code
showtabs=false, % show tabs within strings adding particular underscores
\node[inner sep=0pt,outer sep=0pt] (tocancel) {#1};
\draw[black] (tocancel.south west) -- (tocancel.north east);
\list{}{\leftmargin2.1cm \rightmargin\leftmargin}
\endlist \par\bigskip
% \renewcommand*{\ttdefault}
% \EverySelectfont{%
% \fontdimen2\font=0.4em% interword space
% \fontdimen3\font=0.2em% interword stretch
% \fontdimen4\font=0.1em% interword shrink
% \fontdimen7\font=0.1em% extra space
% \hyphenchar\font=`\-% to allow hyphenation
% }
\title{\fontsize{16}{17}\selectfont\textnormal{\MakeLowercase{\play{cyber: \uppercase{C}omputing the knowledge from web3}}}}
@xhipster \& @litvintech work in progress as of \today
A consensus computer allows the computing of provably relevant answers without opinionated BlackBox intermediaries such as Google, Amazon or Facebook. Stateless, content-addressable peer-to-peer communication networks such as IPFS, and stateful consensus computers such as Ethereum, provide only part of the solution needed to obtain such answers. There are at least 3 problems with these implementations. The first is the subjective nature of relevance. The second is the difficulty of scaling consensus computers to oversized knowledge graphs. The third is the poor quality of such knowledge graphs: they are prone to various attacks, such as Sybil attacks, and to the selfish behaviour of the interacting agents. In this document, we define a protocol for provable consensus computing of relevance between IPFS objects. It is based on Tendermint consensus over cyber•rank, which is itself computed on GPUs. We also outline the design of the initial distribution games, based on our previous experience. We believe that a minimalistic protocol architecture is critical for the formation of a network of domain-specific knowledge consensus computers. As a result of our work, applications that have never existed before will emerge. We expand this work with our vision of possible features and potential applications.
\titleSection{The Great Web}\label{The Great Web}
The original protocols of the Internet, such as TCP/IP, DNS, URL and HTTP/S, have brought the web to the stalemate in which it now finds itself. Along with all the benefits that these protocols produced for the initial development of the web, they have brought significant obstacles. Globality, a vital property of the web, has been under real threat since its inception. The speed of connections keeps degrading as the network grows. Other issues include ubiquitous government interference with user privacy and user security.
One property, not evident in the beginning, becomes important with everyday usage of the Internet: the ability to exchange permanent links, so that they \linkgreen{}{would not break after time had passed}. Reliance on a ‘one ISP at a time’ architecture allows governments to censor packets effectively. For every engineer who is concerned about the future of our children, this is the last straw for the traditional web stack.
Other properties, while perhaps not as critical, are very desirable: offline and real-time connection. While offline, the average internet user should still be able to carry on working with the state that they already possess. After acquiring a connection, they should be able to sync with the global state and continue to verify the validity of their state in real time. Currently, these properties are only offered at the application level, while we believe that they should be integrated into lower-level protocols.
The emergence of \linkgreen{}{a brand-new web-stack} creates an opportunity for a superior Internet. The community calls it web 3.0. We call it The Great Web. We believe that various types of low-level communications should be immutable and should not change for decades, e.g. immutable content links. They seem very promising at removing the problems of the conventional protocol stack, and they add better speed and a more accessible connection to the new web. However, as with any concept that offers something unique, new problems emerge. One such concern is general-purpose search. The existing general-purpose search engines are restrictive, centralized databases that everybody is forced to trust. These search engines were designed primarily for client-server architectures based on the TCP/IP, DNS, URL and HTTP/S protocols. ‘The Great Web’ creates both a challenge and an opportunity for a search engine that is based on emerging technologies and designed specifically for these purposes. Surprisingly, a permissionless blockchain architecture allows building a general-purpose search engine in a way that was inaccessible to previous architectures.
\titleSection{On the adversarial examples problem}\label{On the adversarial examples problem}
\linkgreen{}{The current architecture of search engines} is a system in which one entity processes everything. This approach suffers from one challenging and distinct problem that has yet to be solved, even by the brilliant scientists at Google: \linkgreen{}{the adversarial examples problem}. The problem, which Google acknowledges, is that it is rather difficult to reason algorithmically about whether or not a particular sample is adversarial. This holds regardless of how good the learning technology itself is. A crypto-economic approach can change the beneficiaries in this game and, consequently, effectively remove possible Sybil-attack vectors. It also removes the need for a single entity to hard-code model crawling and meaning extraction, handing this work over to the whole world. Learning Sybil-resistant models will probably lead to results that are orders of magnitude more predictive.
\titleSection{Cyber protocol}\label{Cyber protocol}
At its core, the protocol is very minimalistic and can be expressed by the following steps:
\item Computing the genesis of cyber-protocol, based on the distribution rules defined by this paper
\item Defining the state of the knowledge graph
\item Gathering cyberlinks
\item Checking the validity of the signatures
\item Checking the validity of CIDv1
\item Checking the bandwidth limit
\item Applying cyberlinks if the signatures, the bandwidth limit and CIDv1 are all valid
\item Calculating cyber•rank deltas every round for the knowledge graph
The rest of this document discusses the rationale and the details of the proposed protocol.
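The validation steps listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the cyberd code: the \code{Cyberlink} record, the two checker callbacks and the flat per-link cost are hypothetical names introduced here, and real signature and CIDv1 checks are left to the caller.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple


@dataclass(frozen=True)
class Cyberlink:
    agent: str      # hypothetical agent address
    src: str        # content address the link starts from
    dst: str        # content address the link points to
    signature: str  # opaque signature blob, checked by the caller's callback


def apply_cyberlinks(
    graph: Set[Tuple[str, str, str]],
    links: List[Cyberlink],
    signature_ok: Callable[[Cyberlink], bool],  # assumed signature checker
    cid_ok: Callable[[str], bool],              # assumed CIDv1 validity checker
    bandwidth_left: Dict[str, int],             # remaining bandwidth per agent
    link_cost: int = 1,                         # assumed flat cost per link
) -> Set[Tuple[str, str, str]]:
    """Return a new graph with every valid link applied; invalid links are skipped.

    Note that bandwidth_left is updated in place for each applied link.
    """
    new_graph = set(graph)
    for link in links:
        if not signature_ok(link):
            continue  # invalid signature
        if not (cid_ok(link.src) and cid_ok(link.dst)):
            continue  # invalid content address
        if bandwidth_left.get(link.agent, 0) < link_cost:
            continue  # bandwidth limit exceeded
        bandwidth_left[link.agent] = bandwidth_left.get(link.agent, 0) - link_cost
        new_graph.add((link.agent, link.src, link.dst))
    return new_graph
```

Rank recomputation is deliberately left out of this sketch, since it happens once per round rather than once per link.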
\titleSection{Knowledge graph}\label{Knowledge graph}
We represent a knowledge graph as a weighted graph of directed links between content addresses, also known as content identifiers (CIDs) or simply IPFS hashes. In this document, we use these terms as synonyms.
Content addresses are essentially web3 links. Instead of using the unclear and mutable:
we can use the exact object itself:
By using content addresses to build the knowledge graph we gain \linkred{}{the much-needed} superpowers of \linkgreen{}{IPFS}-\linkgreen{}{like} P2P protocols, which are desirable for a search engine:
\item mesh-network future-proofing
\item interplanetary accessibility
\item censorship resistance
\item technological independence
Our knowledge graph is generated by web-agents. The agents add themselves to the knowledge graph with a single transaction. In doing so, they prove possession of the private keys that correspond to the content addresses of their revealed public keys. Using this mechanism, a consensus computer can achieve provable differentiation between subjects and objects in the knowledge graph.
Our \code{cyber} implementation is based on \linkred{}{cosmos-SDK} identities and \linkred{}{CIDv1} content addresses.
Web-agents produce the knowledge graph by applying \code{cyberlinks}.
To understand how cyberlinks function, we need to understand the difference between a \code{URL link} (aka a hyperlink) and an \code{IPFS link}. A URL link points to the location of the content, whereas an IPFS link points to the content itself. The difference between a web architecture based on location links and one based on content links is drastic, and hence requires a unique approach.
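The difference between the two kinds of link can be illustrated with a short sketch. Here a plain hex SHA-256 digest stands in for a content address; a real CIDv1 additionally encodes a multibase prefix, a codec and a multihash, so the function below is a simplification and the URL is a made-up example.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Hex SHA-256 digest of the content; a simplified stand-in for a CID."""
    return hashlib.sha256(data).hexdigest()


# A location link: the same URL can serve different bytes over time.
location = {"https://example.org/paper": b"version 1"}
before = content_address(location["https://example.org/paper"])

location["https://example.org/paper"] = b"version 2"
after = content_address(location["https://example.org/paper"])

# The URL did not change, but the content address did: a content link
# identifies exactly one object and cannot silently point to another.
changed = before != after
```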
\code{Cyberlink} is an approach to link two content addresses or \code{IPFS links} semantically:
The above cyberlink means that the cyberd presentation at cyberc0n references the Cosmos white paper. The concept of cyberlinks is a convention around the simple semantics of a communication format in any P2P network:
<content-address X>.<content-address Z>
You can see that a cyberlink represents a link between the two links above. Easy peasy!
A cyberlink is a simple yet powerful semantic construction for building a predictive model of the universe. Using cyberlinks instead of hyperlinks gives us superpowers that were inaccessible to the previous architecture of general-purpose search engines.
Cyberlinks can be extended, i.e. they can form linkchains if there are two or more cyberlinks in existence from one agent, where the second link in the first cyberlink is equal to the first link in the second cyberlink:
<content-address X>.<content-address X>
<content-address X>.<content-address A>
If the web-agents wish to expand native \code{IPFS links} with something semantically richer, for example:
links, then the web3-agents will be able to reach consensus more naturally on execution rules of a specific program.
The \linkred{}{cyber} implementation of \code{cyberlinks} is based on the \linkred{}{DURA} specification, which is available in the \linkred{}{.cyber} app of the new web3 browser \linkred{}{cyb}.
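The linkchain rule stated earlier in this section (same agent, and the second link of one cyberlink equals the first link of another) can be sketched as follows. The tuple layout and the function name are assumptions made for this illustration only.

```python
from typing import List, Tuple

# A cyberlink as (agent, from_cid, to_cid); this layout is an assumption.
Link = Tuple[str, str, str]


def linkchains(links: List[Link]) -> List[Tuple[Link, Link]]:
    """Return ordered pairs of cyberlinks by one agent that form a linkchain."""
    pairs = []
    for i, first in enumerate(links):
        for j, second in enumerate(links):
            if i == j:
                continue  # a link does not chain with itself
            same_agent = first[0] == second[0]
            connects = first[2] == second[1]  # end of first == start of second
            if same_agent and connects:
                pairs.append((first, second))
    return pairs


example = [
    ("alice", "X", "Y"),
    ("alice", "Y", "Z"),  # chains with the first link
    ("bob", "Y", "Z"),    # same content addresses, but a different agent
]
chains = linkchains(example)
```

The quadratic scan is chosen for readability; indexing links by agent and starting address would be the obvious refinement for larger graphs.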
Based on \code{cyberlinks} we can compute the relevance of subjects and objects within the knowledge graph. This is why we need a consensus computer.
\titleSection{The notion of a consensus computer}\label{The notion of a consensus computer}
A consensus computer is an abstract computing machine that emerges from the interaction of agents. A consensus computer has capacity in terms of fundamental computing resources, such as memory and computation. To interact with agents, a computer needs bandwidth. An ideal consensus computer is a computer where:
the sum of all computations and memory available to all individuals
is equal to
the sum of all the verified computations and the memory of the consensus computer
We know that:
verifications of computations < (computations + verifications of computations)
Hence, we will never be able to achieve an ideal consensus computer. The CAP theorem and the scalability trilemma lend further support to this statement. Yet, this theory can work as a performance indicator for a consensus computer. After 6 years of investing in consensus computers, we have come to realize that the \linkgreen{}{Tendermint} consensus offers a good balance between the coolness required for our task and readiness for production. Therefore, we have decided to implement the \code{cyber} protocol using the Tendermint consensus, with settings very close to those of the Cosmos Hub.
The \code{cyber} implementation is a 64-bit Tendermint consensus computer of relevance for a 64-byte string-space. It is far from ideal: it reaches at most 1/146 of the ideal, because we have 146 validators who verify the same computations over a knowledge graph of the same size.
We must bind computation, storage and the bandwidth supply of the consensus computer with a maximized demand for queries. Computation and storage in case of a basic relevance machine can be easily predicted based on bandwidth; but bandwidth requires a limiting mechanism.
\titleSection{The relevance machine}\label{The relevance machine}
We define a relevance machine as a machine that transitions the state of a knowledge graph based on the will of the agents wishing to learn and read that knowledge graph. The will is projected by every agent's cyberlinks. The more agents inquire of the knowledge graph, the more valuable the knowledge becomes. Based on these projections, the relevance between pieces of content can be computed. The relevance machine enables a simple construction for the search mechanism via querying and delivering answers.
One property of the relevance machine is crucial: it must have inductive reasoning properties or follow the BlackBox principle.
A machine must be able to infer predictions without any knowledge about the objects,
except for who, when and what was cyberlinked
If we assume that a consensus computer must have some information about the linked objects, then the complexity of such a model will grow unpredictably, and with it the memory and computation requirements of the processing computer. Thanks to content addressing, a relevance machine that follows the BlackBox principle does not need to store the data, but can still effectively operate on top of it. Because deducing meaning inside the consensus computer is expensive, such a design can depend on the blindness assumption. Instead of deducing the meaning inside the consensus computer, we have designed a system in which meaning extraction is incentivized. This is achieved because agents need CYB tokens to express their will, based on which the relevance machine can compute rank.
At the centre of the spam protection system is the assumption that ‘write’ operations can be executed only by those who have a vested interest in the evolutionary success of the relevance machine. Every 1\% of effective stake within the consensus computer gives the ability to use 1\% of the network's possible bandwidth and computing capabilities. A simple rule prevents abuse by the agents: a token can vote for a given content address only once.
EffectiveStake = ActiveStake + BondedStake, where:
BondedStake - stake that is deducted out of your account and is put as a deposit to take part in the consensus
ActiveStake - stake which is currently available for direct transfers or not-bonded stake
There are only two ways to change the effective stake of an account: direct token transfers and bonding operations.
Cyber uses a very simple bandwidth model. The principal goal of this model is to limit the daily network growth to a given constant, so that validators can forecast any future investment in infrastructure. To this end, we introduce \code{ResourceCredits}, or 'RC'. Each message type has an assigned RC cost. The constant \code{DesirableBandwidth} determines the desirable amount of RC spent by the network during one \code{RecoveryWindow}. The recovery window defines how fast an agent can recover their bandwidth from 0 back to the maximum. An agent has a maximum RC proportional to their effective stake, determined by the following formula:
AgentMaxRC = EffectiveStake * DesirableBandwidth
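As a rough sketch of the formula above, we can treat the effective stake as the agent's share of the total stake, which matches the 1\% rule stated earlier. The linear recovery rule and all function names below are assumptions introduced for illustration; the text only says that bandwidth recovers from 0 to the maximum within one recovery window.

```python
def agent_max_rc(stake_share: float, desirable_bandwidth: float) -> float:
    """AgentMaxRC = EffectiveStake * DesirableBandwidth.

    stake_share is assumed to be the agent's fraction of the total effective
    stake (0.01 means 1 percent of the network).
    """
    return stake_share * desirable_bandwidth


def recovered_rc(current_rc: float, max_rc: float,
                 elapsed: float, recovery_window: float) -> float:
    """Assumed linear recovery: an empty account refills in one recovery window.

    elapsed and recovery_window must use the same unit (blocks or seconds).
    """
    return min(max_rc, current_rc + max_rc * elapsed / recovery_window)


# An agent holding 1 percent of the stake in a network that targets
# 1,000,000 RC per recovery window.
max_rc = agent_max_rc(0.01, 1_000_000)
half_recovered = recovered_rc(0.0, 10_000.0, elapsed=50, recovery_window=100)
```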
During each \code{PriceAdjustWindow}, the network sums up how much RC was spent in that period: \code{AdjustPricePeriodTotalSpent}. There is also a constant, \code{AdjustPricePeriodDesiredSpent}, which is used to calculate the network load.
The ratio \code{AdjustPricePeriodTotalSpent / AdjustPricePeriodDesiredSpent} is called 'the fractional reserve ratio'. If network usage is low, the fractional reserve ratio lowers the message cost, which allows agents with a lower stake to commit more transactions. If the demand for resources increases, the fractional reserve ratio goes \code{>1}, which increases the message cost and limits the final tx count over the long term (RC recovery will be \code{<} than RC spending).
As no one uses all of the bandwidth they possess, we can safely use up to 100x fractional reserves within a 2-minute recalculation target period. This mechanism offers a discount for cyberlinking, thus effectively maximizing the demand for it. You can see that the proposed design needs demand for the full bandwidth for the relevance to become valuable.
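The price adjustment described above can be sketched as a multiplier on the base message cost. How exactly the ratio enters the cost is not specified in the text, so the rule below is an assumption: the cost scales linearly with the ratio, and the discount is capped at 1/100 of the base cost to reflect the 100x fractional reserve figure. The function names are hypothetical.

```python
def fractional_reserve_ratio(total_spent: float, desired_spent: float) -> float:
    """AdjustPricePeriodTotalSpent / AdjustPricePeriodDesiredSpent."""
    return total_spent / desired_spent


def message_cost(base_cost: float, ratio: float,
                 min_multiplier: float = 0.01) -> float:
    """Assumed pricing rule: scale the base RC cost by the ratio.

    min_multiplier caps the discount at 100x (an assumption derived from the
    'up to 100x fractional reserves' statement).
    """
    return base_cost * max(ratio, min_multiplier)


# Low load: half of the desired RC was spent, so messages cost half as much.
low_load = message_cost(100.0, fractional_reserve_ratio(50.0, 100.0))

# High load: twice the desired RC was spent, so messages cost twice as much.
high_load = message_cost(100.0, fractional_reserve_ratio(200.0, 100.0))
```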
Human intelligence is organized in such a manner that non-relevant and non-important memories are pruned and forgotten with time. The same can be applied to the relevance machine. Another useful property of the relevance machine is that it needs to store neither the past nor the full current state in order to remain useful or, more precisely, \verb|relevant|. The relevance machine can implement \linkgreen{}{aggressive pruning strategies}, such as pruning the history of the formation of the knowledge graph, or forgetting links that become 'less relevant'.
As a result, the implemented ‘cybernomics’ of the CYB token serves not just as a will-expression and spam-protection mechanism, but also as an economic regulation tool that can align the validators processing the knowledge graph with the market demand for that processing. The \code{cyber} implementation of the relevance machine is based on a very straightforward mechanism called cyber•Rank.
Ranking with a consensus computer can be challenging, because consensus computers have serious resource constraints. For example, \linkgreen{}{Nebulas} has failed to deliver anything useful on-chain. First, we must ask ourselves: why do we need to compute and store the rank on-chain rather than follow the same path as \linkgreen{}{Colony} or \linkgreen{}{Truebit}?
When a rank is computed inside a consensus computer, one has easy access to the content distribution of that rank and an easy way to build provable applications on top of it. Hence, we have decided to follow a more 'cosmic' architecture. In the next section, we describe the ‘proof of relevance’ mechanism, which allows the network to scale with the help of ‘domain-specific relevance machines’ that work concurrently thanks to the IBC protocol.
Eventually, the relevance machine needs (1) a deterministic algorithm that allows rank to be computed for a continuously growing network, which can itself scale to orders of magnitude comparable to Google. Additionally, a perfect algorithm (2) must have linear memory and computational complexity. Most importantly, it must have (3) the highest provable prediction capabilities for the existence of relevant cyberlinks.
After \linkred{}{some research}, we have found that it is impossible to obtain such a ‘silver bullet’. Therefore, we have decided to find a more basic, bulletproof way to bootstrap the network: \linkred{}{the rank} that Larry and Sergey used to bootstrap their previous network. The key problem with the original PageRank is that it was not resistant to Sybil attacks. A token-weighted PageRank that is limited by a token-weighted bandwidth model does not inherit this problem of the original PageRank, because it is resistant to Sybil attacks. For the time being, we will call it cyber•Rank, until something more suitable emerges. The following algorithm is applied to the implementation at Genesis:
$$ CIDs \ V, cyberlinks \ E, Agents \ A $$
$$agents(e): E \rightarrow 2^{A}$$
$$stake(a): A \rightarrow {\rm I\!R}^+ $$
$$rank(v, t): V \times {\rm I\!N} \rightarrow {\rm I\!R} $$
$$weight(e) = \sum\limits_{a \in agents(e)}{stake(a)}$$
$$rank(v, t + 1) = \frac{1 - d}{N} + d\sum\limits_{u \in V, (u, v) \in E}{\frac{weight(u, v)}{\sum_{w \in V, (u, w) \in E}{weight(u, w)}}rank(u, t)} $$
$$rank(v) = \lim\limits_{t \rightarrow \infty} rank(v, t)$$
\Input{Set of CIDs $V$; \\ Set of cyberlinks $E$; \\ Set of agents $A$; \\ Cyberlink authors $ agents(e) $; \\ Stake of each agent $ stake(a) $; \\Tolerance $\epsilon$; \\ Damping factor $d$}
\Output{$\textbf{R}$, computed value of $rank(v)$ for each node from $V$}
Initialize $\textbf{R}_{v}$ with zeros for all $v \in V$\;
Initialize $\Delta$ with value $\epsilon$ + 1\;
$N_{\emptyset} \leftarrow |\{v|v \in V \land (\nexists u, u \in V, (u, v) \in E )\}|$ \;
$R_{0} \leftarrow (1 + d \cdot N_{\emptyset} / |V|) \cdot (1 - d) / |V| $ \;
\While{$\Delta > \epsilon$}{
\For{$v \in V$}{
$S \leftarrow 0$\;
\For{$u \in V, (u, v) \in E$}{
$W_{uv} \leftarrow \sum_{a \in agents(u, v)}stake(a)$ \;
$W_{u} \leftarrow \sum_{w \in V, (u, w) \in E}\sum_{a' \in agents(u, w)}stake(a')$ \;
$S \leftarrow S + W_{uv} \cdot \textbf{R}_{u} / W_{u}$ \;
$\textbf{R}'_v \leftarrow d \cdot S + R_{0}$ \;
$\Delta \leftarrow \max\limits_v(|\textbf{R}_v - \textbf{R}'_v|)$ \;
Update $\textbf{R}_{v}$ with $\textbf{R}'_{v}$ for all $v \in V$\;
\caption{cyberRank algorithm v1.0}\label{algo_disjdecomp}
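The listing above translates almost line by line into Python. The following is a minimal CPU sketch for small graphs, not the CUDA implementation: the function name, the \code{(agent, u, v)} tuple layout and the iteration cap are assumptions added for illustration. It also assumes that stakes are positive, that the set of CIDs is non-empty and that every linked CID is present in it.

```python
from collections import defaultdict


def cyber_rank(cids, links, stake, d=0.85, eps=1e-9, max_iter=1000):
    """Iterate the rank update until the largest change is at most eps.

    cids  -- iterable of content addresses (the set V)
    links -- iterable of (agent, u, v) cyberlinks (the set E with its authors)
    stake -- dict mapping each agent to a positive stake
    max_iter is a safety cap that is not part of the listing.
    """
    cids = list(cids)
    n = len(cids)

    # weight(u, v): sum of the stakes of the distinct agents who linked u -> v.
    weight = defaultdict(float)
    for agent, u, v in set(links):
        weight[(u, v)] += stake[agent]

    out_weight = defaultdict(float)  # W_u: total weight leaving u
    incoming = defaultdict(list)     # v -> [(u, W_uv), ...]
    for (u, v), w in weight.items():
        out_weight[u] += w
        incoming[v].append((u, w))

    # N_empty counts CIDs without incoming links, as defined in the listing.
    n_empty = sum(1 for v in cids if not incoming[v])
    r0 = (1 + d * n_empty / n) * (1 - d) / n

    rank = {v: 0.0 for v in cids}
    for _ in range(max_iter):
        new_rank = {
            v: d * sum(w * rank[u] / out_weight[u] for u, w in incoming[v]) + r0
            for v in cids
        }
        delta = max(abs(new_rank[v] - rank[v]) for v in cids)
        rank = new_rank
        if delta <= eps:
            break
    return rank


# Two agents link A to different targets; alice holds three times bob's stake.
stake = {"alice": 3.0, "bob": 1.0}
links = [("alice", "A", "B"), ("bob", "A", "C")]
ranks = cyber_rank(["A", "B", "C"], links, stake)
```

In this toy graph the target linked by the larger stake ends up with the higher rank, and swapping the two stakes swaps the order of B and C.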
We understand that the ranking mechanism will always remain a red herring. This is why we expect to rely on an on-chain governance mechanism that can define the most suitable mechanism at any given time. We suppose that the network can switch from one algorithm to another, based not simply on subjective opinion, but rather on economic A/B testing through ‘Hard Spooning’ of domain-specific relevance machines.
cyber•Rank embodies two design decisions that are of paramount importance: (1) it accounts only for the current intention of the agents, and (2) it encourages rank inflation of cyberlinks. The first property ensures that cyber•Rank cannot be gamed. If an agent decides to transfer CYB tokens out of their account, the relevance machine will adjust all the cyberlinks submitted from this account according to the current intentions of the agent. Conversely, if an agent transfers CYB tokens into their account, all of the cyberlinks submitted from this account will immediately gain more relevance. The second property is essential in order not to get cemented in the past. As new cyberlinks are continuously added, they dilute the rank of the already existing links proportionally. This property prevents a situation where new and better content has a lower rank simply because it was submitted recently. We expect these decisions to enable good inference quality for recently added content in the long tail of the knowledge graph.
We would also like to discuss the problem of manual vote-buying. Vote-buying in itself is not that bad. The dilemmas with vote-buying appear in systems where voting affects the allocation of that system's inflation, for example \linkgreen{}{Steem} or any fiat state-based system. Vote-buying can easily become profitable for an adversary that plays a zero-sum game without the need to add value. Our original idea of decentralized search was based on this approach, but we have rejected it and removed the incentive for the formation of the knowledge graph from the consensus level. In our environment, in which every participant must bring some value to the system in order to affect the predictive model, vote-buying becomes an NP-hard problem and is therefore beneficial to the system.
The current implementation of the relevance machine is based on CUDA. It can answer and deliver relevant results for any given search request in the 64-byte CID space. However, this is not enough to build a network of domain-specific relevance machines: consensus computers must also be able to prove relevance to one another.
\titleSection{Proof of relevance}\label{Proof of relevance}
We have designed the network under the assumption that, with regard to search, there is no such thing as malicious behaviour, since no malicious behaviour can be found in the intention of finding answers. This approach significantly reduces the attack surface.
Ranks are computed based on the fact that something has been searched for,
thus linked, and as a result, has affected the predictive model.
A good analogy is observation in quantum mechanics. This is why we have no need for such a thing as negative voting. By doing this, we remove subjectivity from the protocol and can define proof of relevance.
Rank state =
rank values that are stored in a one-dimensional array
and in the Merkle tree of those values
Each new CID receives a unique number. Numbering starts at zero and increments by one for each new CID. Therefore, we can store the rank in a one-dimensional array whose indices are the CID numbers. Merkle tree calculations are based on the \linkgreen{}{RFC-6962 standard}.
We now possess proof of rank for any given content address. While relevance is still subjective by nature, we have a collective proof that something was relevant to a certain community at some point in time.
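The rank state described above can be sketched with the Merkle tree hash and audit path definitions from RFC 6962 (SHA-256, a \code{0x00} prefix for leaves and a \code{0x01} prefix for inner nodes). How cyberd serializes a rank value into a leaf is not stated here, so encoding each rank as a big-endian 64-bit float is an assumption, as are the function names.

```python
import hashlib
import struct


def leaf_hash(data: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + data).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def split_point(n: int) -> int:
    """Largest power of two strictly smaller than n (for n >= 2)."""
    k = 1
    while k * 2 < n:
        k *= 2
    return k


def merkle_root(leaves) -> bytes:
    n = len(leaves)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return leaf_hash(leaves[0])
    k = split_point(n)
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))


def audit_path(index: int, leaves) -> list:
    """Sibling hashes from the leaf up to the root (the topmost sibling is last)."""
    n = len(leaves)
    if n <= 1:
        return []
    k = split_point(n)
    if index < k:
        return audit_path(index, leaves[:k]) + [merkle_root(leaves[k:])]
    return audit_path(index - k, leaves[k:]) + [merkle_root(leaves[:k])]


def root_from_path(index: int, size: int, leaf: bytes, path: list) -> bytes:
    """Recompute the root from one leaf, its index, the tree size and its path."""
    if size == 1:
        return leaf_hash(leaf)
    k = split_point(size)
    if index < k:
        return node_hash(root_from_path(index, k, leaf, path[:-1]), path[-1])
    return node_hash(path[-1], root_from_path(index - k, size - k, leaf, path[:-1]))


# Rank array indexed by CID number; the float encoding is an assumption.
ranks = [0.1, 0.2, 0.3, 0.4, 0.5]
leaves = [struct.pack(">d", r) for r in ranks]
root = merkle_root(leaves)

# Proof of relevance for CID number 3: its rank plus the audit path.
proof = audit_path(3, leaves)
verified = root_from_path(3, len(leaves), leaves[3], proof) == root
```

A verifier who trusts only the root hash can check the rank of one CID using a path whose length is logarithmic in the number of CIDs, which is what makes the proof cheap to relay between chains.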
For any given CID it is possible to prove relevance
Using this type of proof, any two \linkgreen{}{IBC compatible} consensus computers can prove relevance to one another. This means that domain-specific relevance machines can flourish.
In our \code{cyber} implementation, a relevance machine for common knowledge, the proof of relevance root hash is computed on CUDA GPUs every round.
\titleSection{Performance speed}\label{Performance speed}
We require instant confirmation times to give users the feeling of a conventional web application. This is a powerful architectural requirement that shapes the economic topology and the scalability of the cyber protocol. The proposed blockchain design is based on the \linkgreen{}{Tendermint consensus} algorithm with 146 validators, and has a very quick tx finality time of 5 seconds. An average confirmation time closer to 1 second could make complex blockchain interactions almost invisible to agents.
We note one particular cyberd property in the context of speed: rank computation. Being a part of the consensus, it occurs in parallel with transaction validation within the rounds. A round is a consensus variable defined by the stakeholders. At inception, one round is set to 100 blocks. In practice, this means that every 300 seconds the network must agree on the current root hash of the knowledge graph. It also means that every submitted cyberlink becomes a part of the knowledge graph almost instantly, while acquiring a rank within an average period of 150 seconds. In the early days of Google, rank was recomputed roughly every week. We believe that web3 agents will be pleased to observe ranking changes in real time, and we have decided to launch the network with the assumption that this 'window' is enough. It is expected that, with the development of the cyber protocol, the duration of each round will decrease due to competition between validators. We are aware of certain mechanisms that could make this function orders of magnitude faster:
\item optimization of the consensus parameters
\item better parallelization of rank computation
\item \linkred{}{a better clock} for the consensus
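The round arithmetic used above can be made explicit. As a sketch, we assume a constant block time and a cyberlink submitted at a time $s$ that is uniformly distributed within a round of length $T$; the symbols $\tau$, $T$ and $s$ are introduced here only for illustration.
$$ \tau = \frac{300\ \mathrm{s}}{100\ \mathrm{blocks}} = 3\ \mathrm{s\ per\ block} $$
$$ \mathrm{E}[T - s] = \frac{1}{T}\int_{0}^{T}(T - s)\,ds = \frac{T}{2} = 150\ \mathrm{s}, \qquad T = 300\ \mathrm{s} $$
Under these assumptions, halving the round length also halves the average time that a new cyberlink waits for its rank.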
We require an architecture that will allow us to scale our idea to a size comparable to that of Google. Let us assume that our node implementation, which is based on \code{cosmos-SDK}, can process 10k transactions per second. This means that every day at least 8.64 million agents will be able to submit 100 cyberlinks each and impact the search results simultaneously. This is enough to verify all the assumptions in the wild, but not enough to say that it will work at the current scale of the Internet. Given the current state-of-the-art research done by our team, we can safely state that no consensus technology exists that would allow a single blockchain to scale to the size that we require. Hence, we introduce the concept of domain-specific knowledge graphs. One can either launch one's own domain-specific search engine by forking cyberd, which is focused on \textit{common public knowledge}, or simply 'plug' cyberd as a module into an existing chain, e.g. the Cosmos Hub. The inter-blockchain communication protocol introduces concurrent mechanisms for syncing state between relevance machines. Therefore, in our search architecture, domain-specific relevance machines will be able to learn from common knowledge, just as common knowledge can learn from domain-specific relevance machines.
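The throughput estimate above follows from simple arithmetic, assuming that each cyberlink is submitted as one transaction and that the node sustains 10k transactions per second for a full day:
$$ 10^{4}\ \mathrm{tx/s} \times 86\,400\ \mathrm{s/day} = 8.64 \times 10^{8}\ \mathrm{tx/day} $$
$$ \frac{8.64 \times 10^{8}\ \mathrm{tx/day}}{100\ \mathrm{tx\ per\ agent}} = 8.64 \times 10^{6}\ \mathrm{agents/day} $$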
\titleSection{In-browser implementation}\label{In-browser implementation}
We wanted to imagine how our network would operate in a web3 browser. To our disappointment, we \linkred{}{were not able} to find a web3 browser that could showcase the coolness of the proposed approach in action. This is why we have decided to develop our own web3 browser, \linkred{}{cyb}, which has a sample application, .cyber, for interacting with the \code{cyber://} protocol.
As another good example of delivery, we have created \linkred{}{a Chrome extension} that allows anyone to pin any web page to IPFS with just one click and to index it by any keywords, thus making it searchable.
The current search snippets are unpleasant, but we presume that they can be easily extended using IPLD for different types of content. Eventually, they can become even more attractive than those of Google.
During the implementation of the proposed architecture, we have realized at least 3 key benefits that Google would probably not be able to deliver with its conventional approach:
\item the search results can be easily delivered from any P2P network: e.g. .cyber can play videos
\item payment buttons can be embedded right into search snippets. This means that a web3 agent can interact with the search results, e.g. an agent can buy an item right in \code{.cyber}. This means that e-commerce can flourish fairly thanks to a transparent conversion
\item search snippets do not have to be static but can be interactive, e.g. \code{.cyber} can deliver your current wallet balance
Due to technical limitations, we have to bootstrap the ecosystem using 2 tokens: THC and CYB
\item THC (pronounced 'tech') is a creative cyber proto substance. THC, being an Ethereum ERC-20 compatible token, has utility value in the form of control over cyber•Foundation (an Aragon DAO) and the ETH from the auction proceeds. THC was emitted during the creation of cyber•Foundation as an Aragon organization. The creative powers of THC come from the ability to receive 1 CYB token for each THC token locked during Game of Thrones and cyber•Auction.
\item CYB is a native token of the sovereign Cyber protocol powered by the Tendermint consensus algorithm. It has 3 primary
uses: (1) staking for consensus, (2) bandwidth limiting for submitting links, and (3) expression of will for the computing
of cyber•rank
Both tokens remain functional and will track value independently of one another, because their utilities are very different by nature.
Overall, the deployment process has the following structure:
\item cyber•Congress deploys cyber•Foundation and organizes the Game of Links
\item The community participates in the Game of Links
\item cyber•Congress deploys contracts for Game of Thrones and for cyber•Auction
\item The community proposes a Genesis block with results from the Game of Links
\item The community participates in Game of Thrones that lasts for 20 days after Genesis. ETH donors stake THC tokens to get CYB tokens
\item cyber•Congress distributes CYB tokens after Game of Thrones
\item The community participates in cyber•Auction for a period of 500 rounds after the Game of Thrones. Donors stake THC tokens to get CYB tokens
\item cyber•Congress distributes CYB tokens continuously during cyber•Auction
\item cyber•Congress burns the remaining CYB tokens and reports on the end of the initial distribution process
cyber•Congress is the entity that developed the cyber protocol. Within the context of the protocol's future, cyber•Congress
has 2 roles:
\item To deploy and execute the initial distribution program, which is impossible to automate. Because there is no trustless infrastructure for message swapping between ETH and ATOM, cyber•Congress introduces a single point of failure in the initial distribution scheme. We have decided to send CYB tokens to THC stakers manually because we feel that now is the
right time to launch the network we have created. We also believe that an ongoing auction is vital to the initial distribution process. If cyber•Congress fails to deliver on its distribution obligations for any reason, we hope that the community will be able to fork the network and distribute the CYB tokens as promised (hopefully with every operation designed to be provable and transparent). All operations will be executed using the official cyber•Congress 2-of-3 multisig.
\item To support the growth of the cyber protocol until the community takes over the development. Up to 15\% of CYB tokens will be distributed based on donations in ATOMs during Game of Links (GoL) and Game of Thrones (GoT). All ATOM donations
that are routed to the cyber•Congress multisig will become its property. The role of the ATOM donations is the following:
thanks to ATOM, we want to secure a lifetime commitment of cyber•Congress to the development of both the Cosmos
and Cyber ecosystems. For example, ATOM donations will allow cyber•Congress to use staking rewards for continuous funding
of the Cyber protocol without the necessity to dump CYB tokens
We want to give as many agents as possible the ability to evaluate the proposed approach, without adding complexity such as KYC and/or captcha. That is why we give away 8\% of CYB tokens in Genesis to Ethereum, 1\% to Cosmos, and 1\% to Urbit communities. The following rules are applied to reproduce the Genesis:
\item Every non-contract account within the Ethereum Foundation network that has at least 1 outgoing transaction and
holds more than 0.2 ETH at block 8080808
\item Every non-zero account within Cosmos hub-2 at block 1110000
\item Every account which holds galaxies (30\%), stars (30\%), or planets (40\%) at block 8080808 according to the number of objects
The key purpose of this gift is for every account in Genesis to be able to make at least 1 cyberlink within 24 hours while the network is unloaded. This is why we have decided to make the distribution curve more even by radically changing it to a quadratic curve: we distribute CYB tokens proportionally to the square root of each account balance at the time of the snapshots. However, because a quadratic design is too easy to “play around with”, we calculated the amount of distributed CYB tokens at the proposed blocks before this fact became known to the public. We do not apply the quadratic rule to Urbit aliens.
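The square-root rule can be illustrated with a minimal Python sketch. The function name \code{quadratic\_gift}, the account names, the balances and the rounding rule are hypothetical and serve only as an illustration; the real Genesis is computed from the snapshots listed above.

```python
import math


def quadratic_gift(balances, total_gift):
    """Split total_gift proportionally to the square root of each balance.

    balances: mapping account -> snapshot balance (hypothetical data).
    Rounding down to whole tokens is an assumption of this sketch.
    """
    weights = {acc: math.sqrt(bal) for acc, bal in balances.items() if bal > 0}
    total_weight = sum(weights.values())
    return {acc: int(total_gift * w / total_weight) for acc, w in weights.items()}


# A 100x difference in balance becomes a 10x difference in the gift.
snapshot = {"alice": 100, "bob": 1, "carol": 25}
gift = quadratic_gift(snapshot, 1600)
print(gift)
```

In this toy snapshot, an account holding 100 times more than another receives only 10 times more tokens, which is the flattening effect the quadratic curve is meant to achieve.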
\titleSection{Distribution Games}\label{Distribution Games}
Overall the distribution process is split into 3 games, each with a different purpose of network deployment:
\item Game of Links is for early believers and Genesis validators
\item Game of Thrones is for speculators
\item cyber•Auction is for all web agents
The Game of Links is a game between cyber•Congress and Cosmos stakeholders for a place in the Genesis. The game is over when either 600 000 ATOM have been donated or 90 days have passed since the funding started. The key idea is: the better the Game of Links performs, the larger the share of the network Cosmos hodlers acquire and the more payouts the participants in the disciplines receive. Depending on the results, up to 100 TCYB is allocated to the Game of Links. All the CYB tokens that remain at the end of the game are allocated to cyber•Congress. A detailed document with the rules and provisions of the game will be published.
The Game of Thrones is a game between ATOM and ETH hodlers for being the greatest. As a result of a 21-day auction after
Genesis, each community will earn 10\% of CYB tokens. To make the game run smoothly, we are consciously adding an arbitrage
opportunity in the form of a significant discount to ATOM hodlers, because the system needs provably professional validators and delegators at its inception and, basically, employs them for free. We can describe the discount in the following terms: currently, the buying power of all ATOMs against all ETHs, based on current market capitalizations, is about 1/24. Given that 10\% of CYB tokens will be distributed based on donations in ATOMs and 10\% of CYB tokens will be distributed based on donations in ETH, the discount for every ATOM donation during the Game of Thrones is about 24x. This is significant enough to encourage participation based on the arbitrage opportunity during the first 21 days of the Genesis auction, and to stimulate the price of ATOMs as an appreciation to the whole Cosmos community. The distribution of CYB tokens happens after cyber•Congress announces the end of the Game of Thrones.
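The 24x estimate can be written out as a short back-of-the-envelope calculation. It is only a sketch under an assumption that is not part of the protocol: donations on each side are taken to be proportional to the total market capitalization of the respective token. Let $S$ be the Genesis supply of CYB, and let $M_{A}$ and $M_{E}$ be the market capitalizations of ATOM and ETH, with $M_{A}/M_{E} \approx 1/24$. Each side receives $0.1\,S$, so the amount of CYB obtained per unit of donated value is
\[
r_{A} = \frac{0.1\,S}{M_{A}}, \qquad r_{E} = \frac{0.1\,S}{M_{E}}, \qquad \frac{r_{A}}{r_{E}} = \frac{M_{E}}{M_{A}} \approx 24 .
\]
Under this assumption, a unit of value donated in ATOM buys about 24 times more CYB than the same value donated in ETH. The realized ratio depends on how much is actually donated on each side.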
cyber•Auction starts after the end of the Game of Thrones and lasts for 500 rounds of 23 hours each (this is $23 \times 500 = 11\,500$ hours, or just a little over 479 days). Every round the participants “battle out” for 1 000 000 000 000 THC. During this phase, CYB tokens are continuously distributed, based on the vested THC tokens, until the end of the auction. Vested THC tokens give the ability to receive CYB tokens accordingly, as well as voting power within cyber•Foundation. After the end of the distribution, participants will be able to unlock their THC tokens and use them as they wish, i.e. transfer, trade, etc. As a result of the auction, the community will have access to all the donated ETH within the Aragon organization. The following rules apply to the CYB tokens under the cyber•Auction multisig:
\item it will not delegate its stake and, as a result, it will remain a passive stake until it is distributed
\item after the end of cyber•Auction, all the remaining CYB tokens must be provably burned
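The auction schedule can be sketched in a few lines of Python. The round count, round length and per-round THC amount come from the text above. The pro-rata split of a round between donors is an assumption of this sketch (the text does not specify the split rule), and the function names and donor data are hypothetical.

```python
from fractions import Fraction

ROUNDS = 500                       # number of auction rounds
ROUND_HOURS = 23                   # length of one round in hours
THC_PER_ROUND = 1_000_000_000_000  # THC offered in every round


def auction_duration_days():
    """Total auction length in days, as an exact fraction."""
    return Fraction(ROUNDS * ROUND_HOURS, 24)


def split_round(donations):
    """Split one round's THC pro rata to donations (assumed rule).

    donations: mapping donor -> donated amount in the smallest ETH unit.
    Integer division leaves any remainder undistributed.
    """
    total = sum(donations.values())
    return {donor: THC_PER_ROUND * amount // total
            for donor, amount in donations.items()}


print(float(auction_duration_days()))  # a little over 479 days
print(split_round({"donor_a": 3, "donor_b": 1}))
```

Using integer arithmetic for the split avoids floating-point drift in token amounts, at the cost of a small undistributed remainder in some rounds.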
The goal of creating an alternative to a Google-like structure requires extraordinary effort from different groups. Hence, we have decided to set up cyber~Foundation as a fund managed via a decentralized engine such as an Aragon DAO, charged with ETH, and managed by the agents who have participated in the initial distribution. This approach safeguards against excessive market dumping of the native platform token, CYB, within the first years of its work, thereby ensuring stable development. Additionally, it makes it possible to diversify the underlying platform and to extend the protocol to other consensus computing architectures, should such a need arise.
While choosing the token for donations, we followed three main criteria: the token must be (1) one of the most liquid, (2)
one of the most promising, so that the community can secure a solid investment bag and stay competitive even in comparison to giants such as
Google, and (3) technically able to execute an auction and run the resulting organization without relying on any third party. The only system that matches these criteria is Ethereum; hence, the primary token of donations will be ETH.
Prior to \hyperlink{genesis}{Genesis} cyber~Foundation has minted 700 000 000 000 000 THC (seven hundred terathc), which will be broken down as follows:
\item 500 000 000 000 000 THC tokens are allocated to the cyber~Auction contract
\item 100 000 000 000 000 THC tokens are allocated to the Game of Thrones contract
\item 100 000 000 000 000 THC tokens are allocated to the cyber~Congress contract
Burn and mint rights must be revoked after allocation.
All decisions by cyber~Foundation will be executed based on the results of THC votes. The following parameters will be
applied during deployment:
\item Support: 67\%
\item Quorum: 51\%
\item Vote duration: 500 hours
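To show how the two thresholds interact, here is a minimal Python sketch of a vote check. It assumes the common reading of these parameters: support is measured as the share of yes votes among the votes cast, and quorum as the share of yes votes in the total THC supply. The function name, the use of non-strict comparisons and the sample numbers are hypothetical; the deployed Aragon contracts are authoritative.

```python
SUPPORT_PCT = 67     # required share of yes votes among cast votes
QUORUM_PCT = 51      # required share of yes votes in the total supply
VOTE_HOURS = 500     # vote duration in hours


def vote_passes(yea, nay, total_supply):
    """Return True if a finished vote meets both thresholds (assumed semantics)."""
    cast = yea + nay
    if cast == 0:
        return False
    # Integer arithmetic avoids rounding issues near the thresholds.
    enough_support = yea * 100 >= SUPPORT_PCT * cast
    enough_quorum = yea * 100 >= QUORUM_PCT * total_supply
    return enough_support and enough_quorum


print(vote_passes(yea=520, nay=100, total_supply=1000))  # both thresholds met
print(vote_passes(yea=400, nay=10, total_supply=1000))   # quorum not reached
print(vote_passes(yea=600, nay=400, total_supply=1000))  # support too low
```

Under this reading, a proposal can fail with near-unanimous support if too little of the THC supply takes part, which is why the long vote duration matters.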
The genesis block of \code{cyber} protocol contains 1 000 000 000 000 000 CYB (one petacyb or 1 PCYB) broken down as follows:
\item 700 000 000 000 000 CYB tokens for those who stake THC tokens until the end of cyber•Auction: participants of cyber•Congress, Game of Thrones in ETH and cyber•Auction
\item 100 000 000 000 000 CYB tokens as a gift for the Ethereum, Cosmos and Urbit communities
\item 100 000 000 000 000 CYB tokens for the participants in the Game of Links
\item 100 000 000 000 000 CYB tokens for the participants in the Game of Thrones in ATOMs
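The Genesis breakdown above can be checked with a short Python sketch. The amounts are taken from the list; the dictionary keys and the helper name are hypothetical labels introduced only for this illustration.

```python
from fractions import Fraction

GENESIS_SUPPLY = 1_000_000_000_000_000  # 1 PCYB

# Amounts from the breakdown above; keys are illustrative labels.
GENESIS_ALLOCATION = {
    "thc_stakers": 700_000_000_000_000,
    "community_gift": 100_000_000_000_000,
    "game_of_links": 100_000_000_000_000,
    "game_of_thrones_atom": 100_000_000_000_000,
}


def allocation_shares(allocation, supply):
    """Return each allocation as an exact fraction of the supply."""
    return {name: Fraction(amount, supply) for name, amount in allocation.items()}


shares = allocation_shares(GENESIS_ALLOCATION, GENESIS_SUPPLY)
for name, share in shares.items():
    print(f"{name}: {float(share):.0%}")
print(sum(GENESIS_ALLOCATION.values()) == GENESIS_SUPPLY)
```

The four allocations add up to exactly one petacyb, i.e. 70\%, 10\%, 10\% and 10\% of the Genesis supply.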
After Genesis, CYB tokens can only be created by validators based on staking and slashing parameters. The basic consensus is that newly created CYB tokens are at the disposal of stakeholders.
There is currently no maximum amount of CYB tokens, because of the continuous inflation paid to the
network validators. Currently, the CYB token is implemented using 64-bit integers, so the creation of additional CYB tokens makes it significantly more expensive to compute state changes and rank. We expect that a lifelong monetary strategy will be established by the governance system after the complete initial distribution of CYB tokens and the activation of smart contract functionality.
Starting parameters of inflation will be defined during tests and simulations...
We assume that the proposed algorithm does not guarantee high-quality knowledge by default. Just like a newborn, it needs to acquire knowledge to develop further. The protocol itself provides just one simple tool: the ability to create a cyberlink with a certain weight between two content addresses.
Analysis of the semantic core, behavioural factors, anonymous data about the interests of agents, and other tools that determine the quality of search can be achieved via smart contracts and off-chain applications, such as: web3 browsers, decentralized social networks and content platforms. We believe that it is the aim of the community and the agents to build the initial knowledge graph and to maintain it, so that it can provide the most relevant search results.
Generally, we distinguish three types of applications for knowledge graphs:
\item Consensus apps. Can be run at the discretion of the consensus computer by adding intelligent abilities
\item On-chain apps. Can be run by the consensus computer in exchange for gas
\item Off-chain apps. Can be implemented by using the knowledge graph as an input within an execution environment
The following list of imaginable apps can combine the above-mentioned types:
\code{Web3 browsers}. In reality, the browser and search are inseparable. It is hard to imagine the emergence of a full-blown web3 browser which is based on web2 search. Currently, there are several efforts to develop browsers around blockchains and distributed tech. Among them are Beaker, \sout{Mist}, Brave, and Metamask. All of them suffer from trying to embed web2 in web3. Our approach is a bit different. We consider web2 as the unsafe subset of web3. So we develop a web3 browser, Cyb, showcasing the cyber approach to answer questions better and deliver content faster.
\code{Programmable semantics}. Currently, the most popular keywords in the gigantic semantic core of Google are the keywords of apps such as Youtube, Facebook, GitHub, etc. However, the developers of those successful apps have very limited ability to explain to Google how to structure search results in a better manner. The cyber approach gives this power back to developers. Developers are now able to target specific semantic cores and index their apps as they wish.
\code{Search actions}. The proposed design enables native support for activity related to blockchain (and tangle-like) assets. It is trivial to design applications which are (1) owned by the creators, (2) appear correctly in the search results and (3) allow a transactable action, with (4) provable attribution of a conversion to a search query. E-commerce has never been this easy for everyone.
\code{Off-line search}. IPFS makes it possible to easily retrieve a document from an environment without a global internet connection. cyberd itself can be distributed by using IPFS. This creates the possibility for ubiquitous, off-line search!
\code{Command tools}. Command-line tools can rely on relevant and structured answers from a search engine. Practically speaking, the following CLI tool is possible to implement:
> cyberd earn using 100 GB
Enjoy the following predictions:
- apt install go-filecoin: 0.001 BTC p/ month p/ GB
- apt install siad: 0.0007 BTC p/ month p/ GB
- apt install storjd: 0.0005 BTC p/ month p/ GB
According to the most desirable prediction, I decided to try `mine go-filecoin -limit 107374182400`
Git clone ...
Building go-filecoin
Starting go-filecoin
Creating a wallet using @xhipster seed
Your address is ...
Placing bids ...
Waiting for incoming storage requests ...
The search from within CLI tools will inevitably create a highly competitive market of a dedicated semantic core for robots.
\code{Autonomous robots}. Blockchain technology enables the creation of devices that can manage digital assets on their own.
If a robot can store, earn, spend and invest - they can do everything you can do
What is needed is a simple, yet powerful, state reality tool with the ability to find particular elements. \code{cyberd} offers a minimalistic but continuously self-improving data source, which provides the necessary tools for programming economically rational robots. According to \linkgreen{}{top-10,000 English words} the most popular word in the English language is the definite article \code{the}, which is a pointer to a particular item. This fact can be explained as follows: particular items are of most importance to us. Therefore, the nature of our current semantic computing is to find unique things. Hence, the understanding of unique things is essential for robots too.
\code{Language convergence}. A programmer should not care about what language an agent will be using. We don't need to know what language the agent is performing their search in. The entire UTF-8 spectrum is at work. The semantic core is open, so competition for answering queries can become distributed across different domain-specific areas, including the semantic cores for various languages. This unified approach creates an opportunity for cyber•Bahasa. Since the dawn of the Internet, we have observed a process of rapid language convergence. We use truly global words across the entire planet, independently of our nationality, language, race, name or Internet connection. The dream of a truly global language is hard to deploy because it is hard to agree on what means what. However, we have the tools to make this dream come true. It is not hard to predict that the shorter a word, the more powerful its cyber~rank will become. A global, publicly available list of symbols, words, and phrases, sorted by cyber~rank with a corresponding link provided by cyberd, can become the foundation for the emergence of a genuinely global language everybody can accept. Recent \linkgreen{}{scientific advances} in machine translation are breathtaking but meaningless to those who wish to apply them without a Google-like scale trained model. The proposed cyber~rank offers precisely this.
Our approach to the economics of a consensus computer is that agents will pay for gas as they wish to execute programs. OpenCypher-like language can be provided to query the knowledge graph, right from within smart contracts. \linkgreen{}{We can envision} the following smart contracts that can be built on top of a simple relevance machine with the support of on-chain WASM VM or CUDA VM:
\code{Self prediction}. A consensus computer can continuously build a knowledge graph on its own, predicting the existence of cyberlinks and applying these predictions to its state. Hence, a consensus computer can participate in the economic consensus of the cyber protocol.
\code{Universal oracle}. A consensus computer can store the most relevant data in a key-value storage, where the key is a CID and the value is the bytes of the actual content. This can be achieved by making a decision every round on which CID values the agents want to prune and which values they wish to apply, based on the utility measure of content addresses within the knowledge graph. To compute the utility measure, validators check the availability and the size of the content for the top-ranked content addresses within the knowledge graph, and then weigh the size of the content against its rank. The emergent key-value storage will be writable by the consensus computer only, and not by agents, but the values can be used in programs.
\code{Proof of location}. It is possible to construct cyberlinks with ‘proof of location’ based on remarkable existing protocols such as \linkgreen{}{Foam}. Consequently, a location-based search can also become provable, if web3-agents mine triangulations and attach ‘proof of location’ to every linked chain.
\code{Proof of web3-agent}. Agents are a subset of content addresses with one fundamental property: a consensus computer can prove the existence of private keys for content addresses for the subset of a knowledge graph, even if those addresses have never transacted on their chain. Therefore, it is possible to compute many provable things on top of that knowledge, e.g. any inflation can be distributed to addresses that have never transacted in the cyber network but have the required provable link.
\code{Motivation for read requests}. It would be great to create cybernomics not only for ‘write’ requests to consensus computers but for ‘read’ requests too. Thus, read requests can become orders of magnitude cheaper but still guaranteed. Read requests to a search engine can be served by a second tier of nodes which earn CYB tokens within state channels. We are considering implementing state channels based on HTLC and proof of verification, which unlocks the number of tokens earned for already served requests.
\code{Prediction markets on link relevance}. We can push this idea further by ranking the knowledge graph based on a prediction market on link relevance. An app that allows betting on link relevance can become a unique source of truth for the direction of terms, as well as motivate agents to submit more links.
\code{Private cyberlinks}. Privacy is fundamental. While we are committed to privacy, implementing private cyberlinks is unfeasible for our team before Genesis. Therefore, it is up to the community to work on WASM programs that can be executed on top of the protocol. The problem is to compute cyberRank based on the cyberlinks submitted by a web3-agent without revealing either their previous requests or the public keys of that web3-agent. Zero-knowledge proofs, in general, are very expensive. We believe that the privacy of search should be a feature by design, but we are unsure that we know how to implement it at this stage. \linkgreen{}{Coda}-like recursive snarks and \linkgreen{}{MimbleWimble} constructions can, in theory, solve part of the privacy concern, but they are new, untested and, in any case, more expensive in terms of computation than their transparent alternatives.
This is surely not an exhaustive list of all the possible applications, but a very exciting one indeed.
We foresee demand for the following protocol features, that the community could work on after launch:
\item cyber•Rank scaling
\item On-chain parametrization
\item On-chain upgrades
\item IBC
\item Universal oracle
\item WASM VM for gas
\item CUDA VM for gas
\item Privacy by default
\item PoRep/PoST
\item Tensority for SPV checkpoints
We define and implement a protocol for provable communication between consensus computers on relevance. The protocol is based on the simple idea of content-defined knowledge graphs, which are generated by web3-agents via the use of cyberlinks. Cyberlinks are processed by a consensus computer using a concept that we call ‘the relevance machine’. The \code{cyber} consensus computer is based on \code{CIDv1} and uses \code{go-IPFS} and \code{cosmos-SDK} as a foundation. IPFS provides significant benefits with regards to resource consumption. CIDv1 as primary objects are robust in their simplicity. For every CIDv1, cyber~rank is computed by a consensus computer without a single point of failure. Cyber•rank is a CYB (token) weighted PageRank, with economic protection from Sybil attacks and selfish voting. Every round the Merkle root of the rank tree is published. Consequently, every computer can prove to any other computer the relevance value for a given CID. Sybil resistance is based on bandwidth limiting. The embedded ability to execute programs offers inspiring applications. The starting primary goal is the indexing of peer-to-peer systems with self-authenticated data, either stateless, such as IPFS, Swarm, DAT, Git and BitTorrent, or stateful, such as Bitcoin, Ethereum and other blockchains and tangles. The proposed semantics of linking offers a robust mechanism for predicting meaningful relations between objects by the consensus computer itself. The relevance machine is open-source. Every bit of data accumulated by the consensus computer is available to anyone who has the resources to process it. The performance of the proposed software implementation is sufficient for seamless agents' interaction. The scalability of the proposed implementation is sufficient to index all self-authenticated data that exists today and to serve it to millions of web3-agents.
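The idea of a token-weighted PageRank can be illustrated with a minimal Python sketch. This is not the cyberd implementation: the damping factor, the iteration count, the handling of dangling nodes and the link weights (imagined here as the CYB stake behind each cyberlink) are assumptions, and the function and CID names are hypothetical.

```python
def weighted_pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over weighted links.

    links: list of (source, target, weight) with positive weights,
           e.g. weight = stake behind a cyberlink (assumption).
    Returns a mapping node -> rank; the ranks sum to 1.
    """
    nodes = sorted({n for src, dst, _ in links for n in (src, dst)})
    count = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for src, _, weight in links:
        out_weight[src] += weight

    rank = {node: 1.0 / count for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if out_weight[n] == 0.0)
        base = (1.0 - damping) / count + damping * dangling / count
        new_rank = {node: base for node in nodes}
        for src, dst, weight in links:
            new_rank[dst] += damping * rank[src] * weight / out_weight[src]
        rank = new_rank
    return rank


# Two answers linked from the same query; one link carries 9x more weight.
toy_links = [("query", "cid_a", 9.0), ("query", "cid_b", 1.0)]
ranks = weighted_pagerank(toy_links)
print(sorted(ranks, key=ranks.get, reverse=True))
```

In this toy graph, the answer backed by more weight ends up with the higher rank, which is the behaviour a stake-weighted rank is meant to provide.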
The blockchain is managed by a superintelligence, which functions under the Tendermint consensus algorithm with a standard governance module. Though the system provides the necessary utility to offer an alternative for a conventional search engine, it is not limited just to this use case. The system is extendable for numerous applications and makes it possible to design economically rational, self-owned robots, that can autonomously understand objects around them.
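The claim that a published Merkle root lets one computer prove a relevance value to another can be illustrated with a minimal Python sketch. It is a simplified binary Merkle tree over (CID, rank) pairs: the leaf encoding, the duplication of the last node on odd levels and all names are assumptions of this sketch, and it does not reproduce the exact tree layout used by cyberd.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def leaf_hash(cid: str, rank: int) -> bytes:
    # Domain-separated leaf; the "cid:rank" encoding is an assumption.
    return _h(b"\x00" + f"{cid}:{rank}".encode())


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)


def _padded(level):
    # Duplicate the last node when a level has an odd number of nodes.
    return level + [level[-1]] if len(level) % 2 else level


def build_levels(leaves):
    """Return all tree levels, from the leaves up to the single root."""
    levels = [leaves]
    while len(levels[-1]) > 1:
        cur = _padded(levels[-1])
        levels.append([node_hash(cur[i], cur[i + 1])
                       for i in range(0, len(cur), 2)])
    return levels


def prove(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs for the leaf at index."""
    proof = []
    for level in levels[:-1]:
        cur = _padded(level)
        sibling = index ^ 1
        proof.append((cur[sibling], sibling < index))
        index //= 2
    return proof


def verify(leaf, proof, root):
    acc = leaf
    for sibling, sibling_is_left in proof:
        acc = node_hash(sibling, acc) if sibling_is_left else node_hash(acc, sibling)
    return acc == root


# Hypothetical rank table for one round.
table = [("cid_a", 70), ("cid_b", 20), ("cid_c", 10)]
levels = build_levels([leaf_hash(c, r) for c, r in table])
root = levels[-1][0]
proof = prove(levels, 2)
print(verify(leaf_hash("cid_c", 10), proof, root))  # genuine value
print(verify(leaf_hash("cid_c", 99), proof, root))  # tampered value
```

A verifier only needs the published root, the claimed (CID, rank) pair and a proof whose length grows logarithmically with the number of ranked objects.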
\item \linkred{}{cyberd}
\item \linkgreen{}{Scholarly context adrift}
\item \linkgreen{QmNhaUrhM7KcWzFYdBeyskoNyihrpHvUEBQnaddwPZigcN.ipfs}{Scholarly context adrift}
\item \linkgreen{}{Web3 stack}
\item \linkgreen{}{Search engines information retrieval in practice}
\item \linkgreen{}{Motivating game for adversarial example research}
\item \linkred{}{An idea of decentralized search}
\item \linkgreen{}{IPFS}
\item \linkgreen{}{DAT}
\item \linkred{}{cosmos-sdk}
\item \linkred{}{CIDv0}
\item \linkgreen{}{Thermodynamics of predictions}
\item \linkred{}{DURA}
\item \linkgreen{}{Nebulas}
\item \linkgreen{}{Colony}
\item \linkgreen{}{Truebit}
\item \linkgreen{}{SpringRank presentation}
\item \linkred{}{PageRank}
\item \linkred{}{RFC-6962}
\item \linkgreen{}{IBC protocol}
\item \linkgreen{}{Tendermint}
\item \linkred{}{Comparison of web3 browsers}
\item \linkred{}{Cyb}
\item \linkred{}{Cyb virus}
\item \linkred{}{SpringRank}
\item \linkred{/docs/}{How to become validator in cyber protocol}
\item \linkred{}{Top 10000 english words}
\item \linkgreen{}{Multilingual neural machine translation}
\item \linkgreen{}{Foam}
\item \linkgreen{}{Coda}
\item \linkgreen{}{Mimblewimble}
\item \linkgreen{}{Tezos}
\item \linkred{}{Software 2.0}
\item \linkred{}{Proof-of-history}
\item @hleb-albau
\item @arturalbov
\item @jaekwon
\item @ebuchman
\item @npopeka
\item @belya
\item @serejandmyself