THIS PROJECT HAS MOVED

Kamilata

A Peer-to-Peer Search Engine System

Scope

This project is an implementation as a library of the Kamilata protocol. Kamilata enables trustless search in open networks. This library can handle any type of data, and be easily integrated into your libp2p application.

Several use cases are possible:

Youtube-like video sharing platforms
Social networks
File sharing platforms
Web search engines

The ranking algorithm is up to you, as this library will only provide you a stream of unordered search results. Based on metadata you include in those results, you can rank them however you want.

Kamilata is the first system in the world to offer the properties described above, while still being scalable. Indeed, the network can include without problems more than hundreds of millions documents and hundreds of thousands of nodes. The actual limit is unknown.

This library powers the Admarus IPFS search engine.

General Technical Description

It all starts from the most naive approach, that I have optimized to the maximum. Imagine a network of peers each storing documents (these documents can be replicated on multiple peers if they are popular). When a peer wants to search for a document, it sends a query to every peer in the network. This stops working when there are too many peers, because the network is flooded with queries.

To solve this, I added a routing algorithm that allows the searcher to route queries only to the peers who have matching documents. Thanks to this, queries skip all the useless peers. Now, you can download a list of matching documents at constant speed regardless of the query. The search speed depends on the size of the network. New results are received every h hops, where h = ln(n)/ln(c) with n the number of peers and c the number of connections each peer has to others. This is very good as for h beeing more than 3 when c is 100, the network needs to have more than a million peers. Results can then be ranked freely based on the metadata they include.

The Kamilata routing algorithm is based on Attenuated Bloom Filters. Bloom filters are compact data structures used to determine if an element is present in a set. Here, we check the presence of words in documents. From a node's point of view, a Kamilata network is divided into virtual node groups of varying sizes. This divides the corpus into multiple sets ranging from a few documents to all documents of the corpus. Each having its corresponding Bloom filter, it is then easy to locate words in the network and know which nodes to query for given words.

Name		Name	Last commit message	Last commit date
Latest commit History 219 Commits
.github		.github
src		src
tests		tests
.gitignore		.gitignore
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

THIS PROJECT HAS MOVED

Kamilata

Scope

General Technical Description

About

Languages

License

Mubelotix/kamilata

Folders and files

Latest commit

History

Repository files navigation

THIS PROJECT HAS MOVED

Kamilata

Scope

General Technical Description

About

Topics

Resources

License

Stars

Watchers

Forks

Languages