
Computing the knowledge of the Great Web

~mastercyb & ~cyberhead


Abstract

A consensus computer allows for the computing of provably relevant answers without any opinionated blackbox intermediaries, such as Google, Amazon or Facebook. Stateless, content-addressable peer-to-peer communication networks, such as IPFS, and stateful consensus computers, such as Ethereum, can provide part of the solution needed to obtain such answers. There are, however, at least three problems associated with the above-mentioned implementations: (1) the subjective nature of relevance, (2) the difficulty of scaling consensus computers for over-sized knowledge graphs, and (3) the lack of quality among such knowledge graphs, which leaves them prone to various surface attacks, such as sybil attacks and the selfish behaviour of interacting agents. In this document, we define a protocol framework for provable consensus computing of relevance between content-addressable objects, which can be computed on GPUs. We believe that a minimalistic architecture is critical for the formation of a network of domain-specific knowledge consensus computers. Such computers can evolve protocols on the basis of the proposed framework. As a result of our work, applications that never existed before will emerge.

The Great Web

The original protocols of the Internet, such as TCP/IP, DNS, URL and HTTP/S, have brought the web to a dead end. For all the benefits these protocols produced during the initial development of the web, they have also brought significant obstacles to the table. Globality, a vital property of the web, has been under real threat since its inception. Connection speed continues to degrade due to ubiquitous government interventions, while the network itself grows at a significant rate, which in turn raises privacy concerns. This is an existential threat to human civilization.

One property that was not evident at the beginning has become important with everyday usage of the Internet: the ability to exchange permanent links, that is, links which will not break after time has passed. Reliance on an architecture of one-at-a-time ISPs allows governments to censor packets effectively. This is the last drop in the traditional web-stack for every engineer who is concerned about the future of our children.

Other properties, while not so critical, are very desirable: offline and real-time connectivity. The average internet user, whilst offline, should still have the ability to carry on working with the state that they already hold. After acquiring a connection, they should be able to sync with the global state and continue to verify the validity of their own state in real time. Currently, these properties are offered at the application level. We believe that these properties should be integrated into lower-level protocols.

The emergence of a brand-new web-stack creates an opportunity for a superior Internet. The community calls it web3. We call it The Great Web. We believe that various types of low-level communications should be immutable and should not be altered for decades, e.g. immutable content links. Such links seem very promising for removing the problems of the conventional protocol stack. They add greater speed and provide a more accessible connection to the new web. However, as happens with any concept that offers something unique, new problems emerge. One such concern is general-purpose search. The existing general-purpose search engines are restrictive and centralized databases that everybody is forced to trust. Those search engines were designed primarily for client-server architectures based on TCP/IP, DNS, URL and HTTP/S. The Great Web creates a challenge and an opportunity for a search engine that is based on emerging technologies and is designed specifically for these purposes. Surprisingly, permissionless blockchain architecture makes it possible to build a general-purpose search engine in a way that was inaccessible to previous architectures.

On the adversarial examples problem

The current architecture of search engines is a system where some unknown entity processes all the shit. This approach suffers from one challenging and distinct problem that has yet to be solved, even by the brilliant Google scientists: the adversarial examples problem. The problem, which Google acknowledges, is that it is rather difficult to algorithmically reason whether or not a particular sample is adversarial, no matter how awesome the learning technology is. A crypto-economical approach can change the beneficiaries in this game and, consequently, effectively remove possible sybil attack vectors. It removes the necessity to hard-code crawling and meaning-extraction models into a single entity and instead gives this power to the whole world. A sybil-resistant, agent-generated learning model will probably lead to orders of magnitude more predictive results.

Protocol Framework

At its core, the framework is very minimalistic and can be expressed with the following steps:

  1. Define initial distribution rules of tokens
  2. Define the state of the content oracle
  3. Gather cyberlinks using a consensus computer
  4. Check the validity of the signatures
  5. Check the bandwidth limit
  6. Check the validity of particles
  7. If the signatures, bandwidth limit, and particles are valid, apply cyberlinks
  8. Calculate the values of cyber~Rank for every round for all particles

Distribution rules, signature schemes, ranking and bandwidth algorithms can differ from network to network, yet the independent protocols built on the proposed framework will retain semantic interoperability.
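
For illustration, here is a minimal Python sketch of steps 4 to 7 above (checking signature, bandwidth and particles, then applying the cyberlink). Every name, check and parameter in it is an assumption made for illustration, not the go-cyber API.

from dataclasses import dataclass

@dataclass(frozen=True)
class Cyberlink:
    agent: str           # account that submits the link
    particle_from: str   # content address doing the linking
    particle_to: str     # content address being linked

def is_valid_particle(particle: str) -> bool:
    # Illustrative check only: CIDv0-style particles are 46-character strings starting with "Qm".
    return len(particle) == 46 and particle.startswith("Qm")

def process_cyberlink(oracle: dict, link: Cyberlink, signature_ok: bool,
                      bandwidth_left: float, link_cost: float, stake: float) -> float:
    """Steps 4-7: validate a submitted cyberlink and, if valid, apply it to the content oracle.
    Returns the agent's remaining bandwidth (unchanged if the link was rejected)."""
    if not signature_ok:                                    # step 4: signature check
        return bandwidth_left
    if bandwidth_left < link_cost:                          # step 5: bandwidth limit
        return bandwidth_left
    if not (is_valid_particle(link.particle_from) and       # step 6: particle validity
            is_valid_particle(link.particle_to)):
        return bandwidth_left
    edge = (link.particle_from, link.particle_to)           # step 7: apply the cyberlink
    oracle[edge] = oracle.get(edge, 0.0) + stake
    return bandwidth_left - link_cost

# Step 8, computing cyber~Rank for all particles, runs once per round over the whole oracle.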

The rest of this document discusses the rationale and the technical details of the proposal.

Content Oracle

We represent a content oracle as a directed weighted knowledge graph, where each vertex is a particle and each edge is a cyberlink.

knowledge-graph

The content oracle is the core data structure of the protocol framework. Its main purpose is to prove the existence of content in time and its relations. The data structure is generated by agents and is extremely dynamic; that is, on any agent interaction, all weights have to be recomputed. The proposed data structure can be thought of as a more unified approach to conventional AI models. There are three key benefits: (1) compactness, as it is not required to store the data itself, (2) versatility, because it can accommodate any data type, and (3) cooperativeness, as it is designed from the ground up for massive distributed collaboration. In order to understand these statements, let's discuss particles in detail.

Particles

A particle is a format for content identification of data. Essentially, content addresses are web3 links. Instead of using the location on a server:

https://github.com/cosmos/cosmos/blob/master/WHITEPAPER.md

we use the content address of the object itself:

Qme4z71Zea9xaXScUi6pbsuTKCCNFp5TAv8W5tjdfH7yuH

By using content addresses to build the content oracle, we gain the much-needed properties for a next-generation search engine:

  • mesh-network future-proof
  • interplanetary accessibility
  • censorship resistance
  • technological independence
  • deduplication

While researching the field we came to the conclusion that CIDv1 does not fit our vision:

  1. CIDv1 is not a fixed-length format. In the blockchain environment, especially with the necessity to use expensive GPU memory, fixed-length content addressing becomes a hard requirement.
  2. CIDv1 does not enforce deduplication. Without strict deduplication measures, the quality of the knowledge graph degrades, as duplicates devalue rank and degrade graph connectivity. Duplicates also explode the cost of storage and computation in the graph. It is impossible to ensure onchain deduplication, as content addressing itself offers no guarantees of content availability.
  3. CIDv1 contains self-descriptors which are subjectively included by a legal entity. This could restrict support for future formats and applications. Our vision is to support format descriptors at the protocol level, so that any format can be linked with any content. The cost of this approach is nearly identical, as both approaches require at least 64 bytes of storage, but in-graph format descriptors win in flexibility and accessibility.

The current go-cyber implementation uses CIDv0 as the particle format. CIDv0 is based on the ubiquitous SHA-256 and has the necessary software infrastructure. In the future we are going to migrate the particle format to plain SHA-256, so that the particle size can be reduced from 34 to 32 bytes. Instead of the Qm prefix, it is more convenient to use ~ in documents.
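
As a rough illustration of the encoding involved, the sketch below derives a CIDv0-style string from a 32-byte SHA-256 digest: the 0x12/0x20 multihash prefix followed by base58btc encoding. Note that real IPFS CIDv0 identifiers hash the DAG-PB block rather than the raw bytes, so this illustrates the format only and is not a reimplementation of IPFS.

import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def base58btc_encode(data: bytes) -> str:
    n = int.from_bytes(data, "big")
    out = ""
    while n > 0:
        n, r = divmod(n, 58)
        out = BASE58_ALPHABET[r] + out
    # leading zero bytes are encoded as '1'
    return "1" * (len(data) - len(data.lstrip(b"\x00"))) + out

def particle_from_digest(digest: bytes) -> str:
    """CIDv0-style particle: multihash prefix (sha2-256, 32-byte length) plus the digest, base58btc-encoded."""
    assert len(digest) == 32
    return base58btc_encode(bytes([0x12, 0x20]) + digest)

digest = hashlib.sha256(b"hello great web").digest()
print(particle_from_digest(digest))  # a 46-character string starting with "Qm"

Migrating to plain SHA-256 would simply drop the two-byte multihash prefix, shrinking a particle from 34 to 32 bytes.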

Agents form the content oracle by applying cyberlinks.

Cyberlinks

A cyberlink is an approach to linking two particles semantically:

.md syntax: [QmdvsvrVqdkzx8HnowpXGLi88tXZDsoNrGhGvPvHBQB6sH](Qme4z71Zea9xaXScUi6pbsuTKCCNFp5TAv8W5tjdfH7yuH)

.dura syntax: QmdvsvrVqdkzx8HnowpXGLi88tXZDsoNrGhGvPvHBQB6sH.Qme4z71Zea9xaXScUi6pbsuTKCCNFp5TAv8W5tjdfH7yuH

The above cyberlink means that the presentation of go-cyber during cyberc0n references the Cosmos white paper. The concept of cyberlinks is a convention around the simple semantics of a communication format in any p2p network:

cyberlink

We see that a cyberlink represents a link between the two content links. Easy peasy!

A cyberlink is a simple, yet probably the most powerful, semantic construction for building a predictive model of the universe. This means that using cyberlinks instead of hyperlinks provides us with superpowers that were inaccessible to previous architectures of general-purpose search engines.

Cyberlinks can be extended, i.e. they can form linkchains:

linkchain

The go-cyber implementation of cyberlinks is available in the experimental web3 browser, cyb.

At first glance, it seems that cyberlinked data in a content oracle is not structured. However, semantic conventions enable the formation of network motifs, which help structure the data.

graph-tree

Using cyberlinks we can compute the relevance of particles in the content oracle, but we need a consensus computer.

The notion of a consensus computer

A consensus computer is an abstract computing machine that emerges from the interaction between agents. A consensus computer has capacity in terms of fundamental computing resources: memory and computation. To interact with agents a computer needs bandwidth.

consensus-computer

Consistency and availability of a shared state between agents has to be guaranteed by some consensus algorithm. We have come to realize that the Tendermint consensus algorithm has a good enough balance between the coolness required for our task and its production readiness; therefore, the go-cyber implementation is based on Tendermint consensus.

We have one specific requirement which is not common in the existing blockchain world: the ability to process in parallel. Existing consensus computers are inherently sequential; that is, computation or verification of the state is done on CPUs. Nonetheless, we need to compute ranks, so we have to introduce GPU computation into the consensus. After some experiments we were able to plug CUDA computation of rank and reputation right into the Tendermint consensus. One potential problem of using floating-point arithmetic in consensus computing is non-determinism. We were able to solve this problem for our inherently parallel application. The Euler network has been running for the last 3 years on different hardware and operating systems, so we can be sure that the app hash computed on different nodes is the same.

The go-cyber implementation is a 64-bit Tendermint consensus computer of relevance for a 32-byte particle space.

We must, however, bind the computation, storage and the bandwidth supply of the consensus computer to a maximized demand for queries.

Relevance Machine

We define a relevance machine as a machine that transitions the state of a content oracle based on the will of the agents wishing to learn that oracle. That will is projected through every cyberlink an agent submits. The more agents that query the content oracle, the more valuable the knowledge graph becomes. Based on these projections, relevance between particles can be computed. The relevance machine enables a simple construction for the search mechanism via querying and delivering answers.

One property of the relevance machine is crucial: it must have an inductive reasoning property, or follow the blindness principle.

The machine should be able to infer predictions
without any knowledge about the objects,
except for who, when, and what was cyberlinked.

If we assume that a consensus computer must have some information about the linked objects, then the complexity of such a model will grow unpredictably, and with it the memory and computation requirements of the processing computer. Thanks to content addressing, a relevance machine which follows the blindness principle does not need to store the data, yet it can still effectively operate on top of it. The deduction of meaning inside a consensus computer is expensive. Instead of deducing meaning inside the consensus computer, we have designed a system in which meaning extraction is incentivized: agents require tokens to express their will, and based on that will the relevance machine can compute rank.
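
To make the principle concrete, the entire input available to the relevance machine per interaction can be captured in a record like the one below; the field names are an assumption for illustration, not the go-cyber transaction format.

from typing import NamedTuple

class LinkRecord(NamedTuple):
    """Everything the relevance machine is allowed to see about an interaction:
    who linked, when, and which particles - never the content behind them."""
    agent: str          # who
    block_height: int   # when
    particle_from: str  # what was cyberlinked...
    particle_to: str    # ...to what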

Computation and storage in the case of a basic relevance machine can be easily predicted based on bandwidth. Bandwidth requires a limiting mechanism. At the center of the spam-protection system is the assumption that write operations can be executed only by those who have a time-vested interest in the evolutionary success of the relevance machine. The economics implemented in go-cyber, as proposed for the Bostrom network, is the subject of dedicated research.
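
A minimal sketch of such a token-weighted bandwidth limiter follows; the regeneration window, total budget and link cost are illustrative assumptions, not the Bostrom parameters.

class BandwidthMeter:
    """Token-weighted bandwidth: an agent's write budget is proportional to its share of stake
    and regenerates linearly over a recovery window."""

    RECOVERY_BLOCKS = 100_000      # blocks needed to fully recover spent bandwidth
    NETWORK_BANDWIDTH = 1_000_000  # total budget shared by all stake
    LINK_COST = 100                # bandwidth charged per cyberlink

    def __init__(self, agent_stake: float, total_stake: float):
        self.max_bandwidth = self.NETWORK_BANDWIDTH * agent_stake / total_stake
        self.remained = self.max_bandwidth
        self.last_block = 0

    def _recover(self, block: int) -> None:
        regenerated = self.max_bandwidth * (block - self.last_block) / self.RECOVERY_BLOCKS
        self.remained = min(self.max_bandwidth, self.remained + regenerated)
        self.last_block = block

    def try_spend(self, block: int, links: int = 1) -> bool:
        """Charge bandwidth for `links` cyberlinks if the agent can afford them."""
        self._recover(block)
        cost = links * self.LINK_COST
        if self.remained < cost:
            return False
        self.remained -= cost
        return True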

The existing implementation of a relevance machine contains only the write operation, rank computation, and basic read operations, such as finding the most relevant particles. However, provable extraction of deeper meaning from the content oracle requires an implementation of basic linear algebra subprograms, which is a distinct research challenge. Nevertheless, the existing implementation can work as a software 2.0 playground using off-chain computations until the extended version emerges.

cyber~Rank

Ranking using a consensus computer can be challenging, as consensus computers have serious resource constraints. First, we must ask ourselves: Why do we need to compute and to store the rank on-chain?

When rank is computed inside a consensus computer, one has easy access to the content distribution of that rank and an easy way to build provable applications on top of that rank. Hence, we have decided to follow a cosmic architecture. In the next section we describe the proof-of-relevance mechanism, which allows the network to scale with the help of domain-specific relevance machines. These work concurrently thanks to the IKP protocol, which is an extension of the IBC protocol.

The relevance machine needs to obtain (1) a deterministic algorithm that will allow for the computation of the rank on a continuously appending network, which itself can scale to the orders of magnitude of the likes of Google. Additionally, a perfect algorithm (2) must have linear memory and computational complexity. Most importantly, it must have (3) the highest provable prediction capabilities for the existence of relevant cyberlinks.

After research, we realized that it is impossible to obtain a silver bullet. Therefore, we have decided to find a more basic, bulletproof way to bootstrap the network: the rank that Larry and Sergey used to bootstrap their previous network. The key problem with the original PageRank is that it is not resistant to sybil attacks. A token-weighted PageRank limited by a token-weighted bandwidth model does not inherit this key problem of naive PageRank, because it is resistant to sybil attacks. For the time being, we will call it cyber~Rank:

import functools
import operator
import collections


def cyber_rank(cyberlinks: list, tolerance: float = 0.001, damping_factor: float = 0.8):
    """Token-weighted PageRank over cyberlinks.

    `cyberlinks` is a list of dicts, each mapping a single
    (from_particle, to_particle) tuple to the stake behind that cyberlink.
    Returns (particle, rank) pairs sorted by rank in descending order.
    """
    # Aggregate the stake behind every unique (from, to) edge.
    cyberlinks_dict = dict(functools.reduce(operator.add, map(collections.Counter, cyberlinks)))
    objects = sorted({particle for pair in cyberlinks_dict for particle in pair})
    index = {obj: i for i, obj in enumerate(objects)}
    size = len(objects)

    # Base rank, plus a correction that redistributes the rank of dangling particles
    # (particles without outgoing cyberlinks).
    default_rank = (1.0 - damping_factor) / size
    sources = {pair[0] for pair in cyberlinks_dict}
    dangling_nodes_size = sum(1 for obj in objects if obj not in sources)
    inner_product_over_size = default_rank * (dangling_nodes_size / size)
    default_rank_with_correction = damping_factor * inner_product_over_size + default_rank

    # Precompute incoming edges per particle and total outgoing stake per particle.
    incoming = collections.defaultdict(list)
    out_stake = collections.defaultdict(float)
    for (source, target), stake in cyberlinks_dict.items():
        incoming[target].append((source, stake))
        out_stake[source] += stake

    prevrank = [0.0] * size
    change = tolerance + 1
    steps = 0
    while change > tolerance:
        rank = [0.0] * size
        for obj in objects:
            ksum = 0.0
            for source, link_stake in incoming[obj]:
                if link_stake == 0 or out_stake[source] == 0:
                    continue
                # An edge carries the share of its source's total outgoing stake.
                weight = link_stake / out_stake[source]
                ksum += prevrank[index[source]] * weight
            rank[index[obj]] = ksum * damping_factor + default_rank_with_correction
        change = max(abs(new - old) for new, old in zip(rank, prevrank))
        prevrank = rank
        steps += 1

    res = sorted(zip(objects, prevrank), key=lambda pair: pair[1], reverse=True)
    return res
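
A minimal usage sketch, with shortened, hypothetical particle identifiers and arbitrary stake values:

cyberlinks = [
    {("~whitepaper", "~bitcoin"): 10},
    {("~whitepaper", "~cosmos"): 5},
    {("~cyberlink", "~whitepaper"): 3},
]

for particle, value in cyber_rank(cyberlinks):
    print(f"{particle}: {value:.4f}")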

We understand that the ranking mechanism will always remain a red herring. This is why we expect to rely on on-chain governance tools that can define the most suitable mechanism at a given time. We suppose that networks can switch from one algorithm to another not simply based on subjective opinion, but rather on economic A/B testing through hard spooning of domain-specific relevance machines.

cyber~Rank shields two design decisions which are of paramount importance: (1) it accounts for the current intentions of the agents, and (2) it encourages rank inflation of cyberlinks. The first property ensures that cyber~Rank cannot be gamed. If an agent decides to transfer tokens out of their account, the relevance machine will adjust all the cyberlinks relevant to this account per the current intentions of the agent. Vice versa, if an agent transfers tokens into their account, all of the cyberlinks submitted from this account will immediately gain more relevance. The second property is essential in order not to get cemented in the past. As new cyberlinks are continuously added, they dilute the rank of the already existing links proportionally. This property prevents a situation where new and better content has a lower rank simply because it was submitted recently. We expect these decisions to provide inference quality both for recently added content and for the long tail of the knowledge graph.
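
A tiny illustration of the first property, assuming for simplicity that a link's weight is just the submitting agent's current stake:

# Weights follow the agent's current balance, so moving tokens out of an account
# immediately reduces the weight of every cyberlink that account has ever submitted.
stakes = {"alice": 100.0, "bob": 50.0}
links = [("alice", "~a", "~b"), ("alice", "~a", "~c"), ("bob", "~b", "~c")]

def edge_weights(stakes, links):
    weights = {}
    for agent, frm, to in links:
        weights[(frm, to)] = weights.get((frm, to), 0.0) + stakes[agent]
    return weights

print(edge_weights(stakes, links))
stakes["alice"] -= 60.0                 # alice transfers tokens out...
print(edge_weights(stakes, links))      # ...and all of her links lose weight accordingly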

We would love to discuss the problem of vote-buying. Vote-buying as an occurrence isn't that bad. The dilemmas with vote-buying appear within systems where voting affects the allocation of that system's inflation, for example Steem or any fiat-state based system. Vote-buying can easily become profitable for an adversary employing a zero-sum game without the necessity to add value. Our original idea of a decentralized search was based on this approach. We have rejected that idea by removing the incentivization of content-oracle formation from the consensus level. In an environment where every participant must bring some value to the system to affect the predictive model, vote-buying becomes an NP-hard problem and, consequently, becomes beneficial to the system.

The current implementation of the relevance machine utilizes GPUs to compute rank. The machine can answer and deliver relevant results for any given search request in a 32-byte particle space.

Objectivity and Attacks

There are many ways of deviating from the truth, and the oracles may not agree on which of these deviations is most attractive - whereas the truth itself is a Schelling point.
Nick Bostrom

We have designed the network under the assumption that, with regards to search, such a thing as malicious behaviour does not exist. This can be assumed because no malicious behaviour can be found in the intention of learning. This approach significantly reduces attack surfaces.

We propose a game in which, in order to manipulate the ranking, a successful attacker must (1) vest capital for years, (2) invest in optimization algorithms for better cyberlinking, (3) invest in hardware for optimization, and (4) do all of that continuously and really fast, as the content oracle is an extremely dynamic data structure. If they succeed, congratulations - that is exactly what is needed to improve superintelligence.

Ranks are computed based on the fact that something was cyberlinked,
and as a result - affected the predictive model.

A good analogy exists in quantum mechanics, where the observation itself affects behaviour. This is why we also have no requirement for such a thing as negative voting. These measures remove subjectivity from the protocol.

However, it is not enough to build a network of domain-specific relevance machines. Consensus computers must have the ability to prove relevance to one another.

Proof of Relevance

rank-tree

Each new particle receives a sequence number. Numbering starts at zero and is incremented by one for each new particle. We can then store rank in a one-dimensional array, where indices are the particle sequence numbers. Merkle tree calculations are based on the RFC 6962 standard. Using merkle trees, we can efficiently prove the rank of any given content address.
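
Below is a minimal sketch of the RFC 6962 hashing rules (leaf hashes are SHA-256 over 0x00 || leaf, interior nodes over 0x01 || left || right). The way ranks are encoded into leaves here is an illustrative assumption, not the go-cyber storage layout.

import hashlib

def leaf_hash(leaf: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + leaf).digest()

def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_root(leaves: list) -> bytes:
    """RFC 6962 Merkle tree root over an ordered list of byte-string leaves."""
    if not leaves:
        return hashlib.sha256(b"").digest()
    if len(leaves) == 1:
        return leaf_hash(leaves[0])
    # RFC 6962 splits at the largest power of two strictly smaller than the list length.
    k = 1
    while k * 2 < len(leaves):
        k *= 2
    return node_hash(merkle_root(leaves[:k]), merkle_root(leaves[k:]))

# Leaves are ranks indexed by particle sequence number (illustrative encoding).
ranks = [0.31, 0.27, 0.22, 0.20]
leaves = [f"{seq}:{value}".encode() for seq, value in enumerate(ranks)]
print(merkle_root(leaves).hex())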

While relevance is still subjective by nature, we have a collective proof that something was relevant to a certain community at some point in time.

Using this type of proof, any two IBC-compatible consensus computers can prove relevance to one another. This means that domain-specific relevance machines can flourish.

In the current go-cyber implementation, the merkle tree is computed every round and its root hash is committed to ABCI.

Internet Knowledge Protocol

We require an architecture which will allow us to scale the idea to the significance of the likes of Google. Let us assume that a node implementation based on the Cosmos-SDK can process 10k transactions per second. This would mean that every day, at least 8.64 million agents will be able to submit 100 cyberlinks each and impact the search results simultaneously. This is enough to verify all the assumptions out in the wild, but not enough to say that it will work at the current scale of the Internet. Given the current state-of-the-art research done by our team, we can safely state that there is no consensus technology in existence that would allow scaling a single blockchain to the size we require: 1 trillion agents, including robots, animals, plants and mycelium. Hence, we introduce the concept of domain-specific content oracles.
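
The arithmetic behind that estimate, as a quick check:

tx_per_second = 10_000
seconds_per_day = 86_400
links_per_agent = 100

cyberlinks_per_day = tx_per_second * seconds_per_day     # 864,000,000
agents_per_day = cyberlinks_per_day // links_per_agent    # 8,640,000
print(agents_per_day)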

network

One can either launch their own domain-specific search engine by forking go-cyber, which is focused on common public knowledge, or simply plug go-cyber as a module into an existing chain, e.g. the Cosmos Hub. The inter-blockchain communication protocol introduces concurrent mechanisms for syncing state between relevance machines. Therefore, in the proposed search architecture, a domain-specific relevance machine will be able to learn from common knowledge, just as common knowledge can learn from domain-specific relevance machines. This architecture allows nearly infinite scaling of knowledge extraction, including interplanetary interactions.

Bootstrapping Superintelligence

During the development of Cyber we realized that we can finally create a computer network which can literally think and act on its own. Looking back, this is not magic. We will dedicate further standalone research to this topic.

One especially important aspect of the bootstrapping mechanism is the initial distribution. The relevance machine is designed to continuously learn. At the beginning it is like a newborn. The process of future learning is highly dependent on previous experience, so we dedicated standalone research and software for solving this critical factor of the launch.

Browsers

Browser and search are inseparable things. The existing DNS system is used primarily for pointing to dynamic content. A search bar evolves into a superstructure over the DNS system, resolving queries down to DNS.

We were inspired to imagine how the proposed network would operate with a web3 browser, so we developed a pure web3 browser from scratch.

cyb

Cyb is your friendly robot which can:

  • resolve queries to any legacy and blockchain name system
  • stream content directly onto search results
  • allow any interaction with apps

architecture

Apps

We assume that the proposed algorithm does not guarantee high-quality knowledge by default. The protocol itself provides just one simple tool: the ability to create a cyberlink between two particles by agents with stake.

Analysis of the semantic core, behavioral factors, anonymous data about the interests of agents, and other tools that determine the quality of search, can be achieved via smart contracts and off-chain applications, such as: web3 browsers, decentralized social networks, and content platforms. We believe that it is in the interest of the community to build the initial knowledge graph and to maintain it. This is necessary for the graph to provide the most relevant search results.

Generally, we distinguish three types of applications of a content oracle:

  • Thoughts. Can be run at the discretion of the consensus computer or any contract.
  • Contracts. Can be run by a consensus computer in exchange for gas.
  • Apps. Off-chain apps can be implemented by using the content oracle as an input within an execution environment.

The following list of imaginable apps consolidates the above-mentioned categories:

Web3 browsers. It is hard to imagine the emergence of a full-blown web3 browser which is based on web2 search. Currently, there are several efforts to develop browsers around blockchains and distributed tech. All of them suffer from trying to embed web2 into web3. Our approach is a bit different. We consider web2 an unsafe subset of web3 and invite everyone to join our effort.

DeMa, or Decentralized Marketing. DeFi is built around the simple idea that you can use collateral for something that will be settled based on a provided price feed. Here comes the systemic problem of DeFi: price oracles. DeMa is based on the same idea of using collateral, but the input for settlement can be information regarding the content identifier itself. The simplest case is a binary prediction market on rank relevance at some point in the future, e.g. whether the rank of the Bitcoin whitepaper will grow or not. Meta-information on content identifiers is the perfect onchain oracle for settlement. An app that allows betting on link relevance can become a unique source of truth for the direction of terms in the knowledge graph, as well as motivate agents to submit more links.
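
A toy sketch of such a binary market settling on rank movement; the structure, names and values are illustrative assumptions:

from dataclasses import dataclass

@dataclass
class RankMarket:
    particle: str            # content address the market is about
    rank_at_creation: float  # cyber~Rank observed when the market was opened
    expiry_round: int        # round at which the market settles

def settle(market: RankMarket, rank_at_expiry: float) -> str:
    """Binary settlement: did the particle's rank grow by expiry?"""
    return "YES" if rank_at_expiry > market.rank_at_creation else "NO"

market = RankMarket(particle="~bitcoin-whitepaper", rank_at_creation=0.0042, expiry_round=1_000_000)
print(settle(market, rank_at_expiry=0.0050))  # YES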

Search actions. The proposed design enables native support for activity related to blockchain and tangle-like assets. It is trivial to design applications which are (1) owned by their creators, (2) appear correctly in the search results, and (3) allow a transactable action, with (4) provable attribution of a conversion to a search query.

Conversion attribution. Provable attribution of a conversion from a search request to a transaction is the holy grail of conventional digital marketing. Linkchains help to solve this problem easily, thus creating a whole new world of marketing applications that were once impossible. E.g. one can deploy a sophisticated onchain referral program which only pays for referrals that lead to conversions.

Soft2. It's a new paradigm of computing where the execution path is not defined by the programmer, but by the knowledge graph itself. Cyber is the first working implementation of soft2 using a consensus computer. Since the field has not yet emerged, it's hard to imagine how this opportunity will be used by engineers. Cyber can become the leading soft2 playground for the next generation of programmers.

Social networks. Social networks are not that mysterious. In any social network, content is king. Hence, provable ranking is the basic building block of any social network. All types of social networks can easily be built on top of a knowledge graph. Cyber can also create social networks based on relevance between users, which no current network is able to achieve.

Programmable semantics. Currently, the most popular keywords in the gigantic semantic core of Google are keywords of apps such as youtube, facebook, github, etc. However, the developers of those successful apps have very limited ability to explain to Google how to better structure search results. The Cyber approach gives this power back to developers. Developers are now able to target specific semantic cores and index their apps as they wish.

Off-line search. IPFS makes it possible to easily retrieve a document from an environment without a global internet connection. go-cyber itself can be distributed by using IPFS. This creates the possibility for ubiquitous, off-line search!

Command tools. Command-line tools can rely on relevant and structured answers from a search engine. Practically speaking, it's possible to implement a CLI tool that automatically starts mining with the available resources, with an understanding of market conditions and of the software involved. Search tools within the CLI will inevitably create a highly competitive market for a dedicated semantic core for robots.

Autonomous robots. Blockchain technology enables the creation of devices that can manage tokens on their own.

If robots can store secrets and construct transactions at will - they can do everything humans can do.

What is needed is a simple yet powerful state-reality tool with the ability to find particular things. go-cyber offers a minimalistic but continuously self-improving data source, which provides the necessary tools for programming economically rational robots. According to lists of the top 10,000 English words, the most popular word in the English language is the definite article 'the', which points to a particular item. This fact can be explained as follows: particular items are of most importance to us. Therefore, it is in our nature to find unique things, and it follows that an understanding of unique things is essential for robots too.

Language convergence. A programmer should not care about the language that an agent will be using. We don't need to know in which language the agent is performing their search. The entire UTF-8 spectrum is at work. The semantic core is open, so competition for answering queries can become distributed across different domain-specific areas, including the semantic cores for various languages. This unified approach creates an opportunity for a convergent language - Earthish. Since the dawn of the Internet, we have observed a process of rapid language convergence. We use truly global words across the entire planet, independently of nationality, language, race, name or internet connection. The recent rise of emojis adds to this. The dream of a truly global language is hard to deploy because it is hard to agree on what means what. We do, however, have the tools to make this dream come true. It is not hard to predict that the shorter a word, the more powerful its cyber~Rank will be. A global, publicly available list of symbols, words and phrases sorted by cyber~Rank, with corresponding links provided by go-cyber, can become the foundation for the emergence of a genuinely global language that everybody can accept. Recent scientific advances in machine translation are breathtaking, but meaningless to those who wish to apply them without a Google-scale trained model. GPT-3 and other large language models are likewise hidden inside private companies. The proposed cyber~Rank offers publicly verifiable knowledge.

Self prediction. A consensus computer can continuously build a knowledge graph on its own predicting the existence of cyberlinks and applying these predictions to its state. A consensus computer can participate in the economic consensus of the protocol.

Universal oracle. A consensus computer can store the most relevant data in a key-value storage, where the key is a CID and the value is the bytes of the actual content. This can be achieved by making a decision every round about which values the agents want to prune and which they wish to apply, based on the utility measure of content addresses within the knowledge graph. To compute the utility measure, an algorithm checks the availability and the size of the content for the top-ranked content addresses within the knowledge graph; it then weighs the size of the content against its rank. The emergent key-value storage will be writable only by consensus computers and not by agents, but the values can be used in programs.
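
A toy sketch of such utility-based selection, with an assumed utility measure of rank per byte (the field names and the greedy policy are illustrative assumptions):

def select_for_storage(candidates: list, capacity_bytes: int) -> list:
    """Pick which particles to keep in the key-value store, greedily by rank per byte.
    Each candidate is a dict with 'cid', 'rank', 'size' (bytes) and 'available' fields."""
    available = [c for c in candidates if c.get("available", True)]
    ranked = sorted(available, key=lambda c: c["rank"] / c["size"], reverse=True)
    kept, used = [], 0
    for c in ranked:
        if used + c["size"] <= capacity_bytes:
            kept.append(c)
            used += c["size"]
    return kept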

Location-aware search. It is possible to construct cyberlinks with proof-of-location. Consequently, location-based search also becomes provable, if web3 agents mine triangulations and attach proof-of-location to every linked chain.

Private cyberlinks. Privacy is fundamental. While we are committed to privacy, achieving an implementation of private cyberlinks is unfeasible for our team up to Genesis. Therefore, it is up to the community to work on WASM programs that can be executed on top of the protocol. The problem is to compute cyber~Rank based on cyberlinks submitted by an agent without revealing either their previous requests or their public keys. Zero-knowledge proofs, in general, are very expensive. We believe that the privacy of search should be a feature by design, but we are unsure that we know how to implement it at this stage.

This is surely not a complete list of all the possible applications, but a very promising and exciting one indeed.

Conclusion

We defined and implemented a protocol framework for provable communication between consensus computers on relevance. The protocol is based on the simple idea of content oracles, which are generated by agents via the use of cyberlinks. Cyberlinks are processed by a consensus computer using the relevance machine. Content addresses as primary objects are robust in their simplicity and provide significant benefits with regards to resource consumption. For every particle, cyber~Rank is computed by a consensus computer without a single point of failure. cyber~Rank is a token-weighted PageRank, with economic protection from sybil attacks via bandwidth limiting and incentives for long-term honest behaviour. Every round, the merkle root of the content oracle is computed. Consequently, any consensus computer can prove to any other consensus computer the relevance of value for a given particle. The proposed semantics of cyberlinking offers a robust mechanism for predicting meaningful relations between objects by the consensus computer itself. The source code of the consensus computer is written in Go and published under the most simple open-source license: Don't trust, Don't fear, Don't beg. Every bit of data accumulated by the consensus computer is available to anyone who has the resources to process it. The performance of the proposed software implementation is sufficient for seamless interaction. The system scales with the help of domain-specific consensus computers. Though the system provides the necessary utility to offer an alternative to a conventional search engine, it is not limited to this use case. The system is extendable for numerous applications and makes it possible to design economically rational, self-sovereign cyborgs.

References

Acknowledgements

  • @hleb-albau
  • @arturalbov
  • @jaekwon
  • @ebuchman
  • @npopeka
  • @belya
  • @serejandmyself
  • @savetheales