This repository has been archived by the owner on Sep 7, 2023. It is now read-only.

P2P idea #792

Open
dalf opened this issue Dec 27, 2016 · 0 comments

dalf commented Dec 27, 2016

For one user request there are, most of the time, several engine requests that could be spread among other searx instances (in the general category: wikipedia, wikidata, qwant, google, etc.).

Purpose

For one user request, the purpose is to spread the individual engine requests across different searx instances. Of course, this requires that the searx administrator and the users trust the other searx instances.

How

Let's call:

  • the searx instance receiving the user request, the main instance;
  • the other instances, peer instances.

As a first step, searx should improve its response time measurements:

  • the main instance should measure the median response time and the 95th percentile response time (95% of the time, the response time is below that value):
    • for each engine of the main instance;
    • for each (peer instance, engine) couple.
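A minimal sketch of that bookkeeping, assuming an in-process store (the class and method names are hypothetical, not part of searx):

```python
import statistics
from collections import defaultdict


class ResponseTimeStats:
    """Track response times per (instance, engine) couple.

    `instance=None` denotes the main instance's own engines.
    """

    def __init__(self):
        self._samples = defaultdict(list)

    def record(self, instance, engine, seconds):
        self._samples[(instance, engine)].append(seconds)

    def median(self, instance, engine):
        samples = self._samples[(instance, engine)]
        return statistics.median(samples) if samples else None

    def p95(self, instance, engine):
        samples = self._samples[(instance, engine)]
        if len(samples) < 2:
            return None
        # quantiles(n=20) yields 19 cut points; index 18 is the 95th percentile
        return statistics.quantiles(samples, n=20)[18]
```

In practice the samples would need aging (e.g. a sliding window), so that an instance that was slow yesterday is not penalized forever.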

For example (YMMV): there are slow and fast engines; wikipedia is fast, google is slow (roughly twice as slow). So for a user request using the google and wikipedia engines, the main instance can send the request for the google engine itself and use a peer instance for wikipedia. The user-perceived response time won't change, since the google response time sets the upper limit.

So one way to spread the requests is:

  • for slow engines : do it in the normal way on the main instance.
  • for fast engines : use a peer instance to proxy the request.

The global response time should not change too much.
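The slow/fast split above can be sketched as follows, assuming per-engine median statistics are already available (function and parameter names are hypothetical):

```python
def plan_dispatch(engines, median_times, peers):
    """Assign fast engines to peers and keep the slowest engine local.

    engines: list of engine names in the user request
    median_times: dict engine -> median response time in seconds
    peers: list of peer instance URLs (may be empty)

    The slowest engine sets the overall response time, so proxying
    the faster engines through peers should not hurt latency much.
    """
    if not peers:
        return {engine: None for engine in engines}  # None = main instance
    slowest = max(engines, key=lambda e: median_times[e])
    plan = {}
    peer_index = 0
    for engine in engines:
        if engine == slowest:
            plan[engine] = None  # keep the latency-critical engine local
        else:
            plan[engine] = peers[peer_index % len(peers)]
            peer_index += 1
    return plan
```

With the google/wikipedia example above, google (the slowest) stays on the main instance and wikipedia is proxied through a peer.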

So when there is a user request on the main instance:

  • the main instance estimates the global response time without peers, using the maximum of the median response times of the engines involved in the user request (not sure if the median is the best choice);
  • the main instance selects a peer instance for each engine according to the estimated p95 response time of the (peer instance, engine) couple. Using the p95 gives the worst-case scenario (or nearly);
  • the main instance spreads the requests to the different peers;
  • the response time of each (peer instance, engine) request updates the statistics.
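The selection steps above can be sketched like this, assuming the statistics are kept as plain dicts (all names are hypothetical):

```python
def select_peers(engines, own_median, peer_p95):
    """For each engine, pick a peer whose p95 response time stays within
    the request's expected total time; otherwise keep the engine local.

    engines: list of engine names in the user request
    own_median: dict engine -> median response time on the main instance
    peer_p95: dict (peer, engine) -> p95 response time on that peer
    """
    # Step 1: estimate the global response time without peers
    # (the slowest engine sets the overall latency)
    budget = max(own_median[engine] for engine in engines)
    selection = {}
    for engine in engines:
        # Step 2: candidate peers whose worst case fits within the budget
        candidates = [
            (p95, peer)
            for (peer, eng), p95 in peer_p95.items()
            if eng == engine and p95 <= budget
        ]
        # Pick the peer with the lowest p95, or None to stay local
        selection[engine] = min(candidates)[1] if candidates else None
    return selection
```

A peer is only considered if its p95 for that engine fits within the latency the main instance would have anyway, so proxying should not degrade the user-perceived response time.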

Problems not solved

  • How to select a (peer instance, engine) couple if no request has been sent to it yet? Perhaps the main instance should bootstrap by sending one or two requests for each (peer instance, engine) couple. And here the flooding problem starts.
  • How should the main instance and the peer instances communicate? Using the RSS API? Using something similar to morty with a hash?

[EDIT] clarification.
