This repository has been archived by the owner on Oct 26, 2023. It is now read-only.

Added quickstart documentation
rtrevinnoc committed Jan 28, 2021
1 parent 8aab101 commit 0620187
Showing 35 changed files with 15,836 additions and 33 deletions.
4 changes: 2 additions & 2 deletions count_annoy_index.py → count_index.py
@@ -29,6 +29,6 @@

 with URLDBIndex.begin() as urlDBTransaction, imageDBIndex.begin(
 ) as imageDBTransaction:  #, analyticsDBIndex.begin() as analyticsDBTransaction:
-    print(urlDBTransaction.stat()["entries"])
-    print(imageDBTransaction.stat()["entries"])
+    print("Number of URL's: ", urlDBTransaction.stat()["entries"])
+    print("Number of images: ", imageDBTransaction.stat()["entries"])
     # print(analyticsDBTransaction.stat()["entries"])
File renamed without changes.
20 changes: 20 additions & 0 deletions docs/Makefile
@@ -0,0 +1,20 @@
# Minimal makefile for Sphinx documentation
#

# You can set these variables from the command line, and also
# from the environment for the first two.
SPHINXOPTS ?=
SPHINXBUILD ?= sphinx-build
SOURCEDIR = source
BUILDDIR = build

# Put it first so that "make" without argument is like "make help".
help:
	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

.PHONY: help Makefile

# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile
	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
Binary file added docs/build/doctrees/environment.pickle
Binary file not shown.
Binary file added docs/build/doctrees/index.doctree
Binary file not shown.
4 changes: 4 additions & 0 deletions docs/build/html/.buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: 8e34a4dcc3ae5a63cd0e08988cbe8a9e
tags: 645f666f9bcd5a90fca523b33c5a78b7
128 changes: 128 additions & 0 deletions docs/build/html/_sources/index.rst.txt
@@ -0,0 +1,128 @@
.. FUTURE documentation master file, created by
   sphinx-quickstart on Thu Jan 28 13:37:15 2021.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

Welcome to FUTURE's documentation!
==================================

.. toctree::
   :maxdepth: 2
   :caption: Contents:

FUTURE is a decentralized, open-source, privacy-focused search engine.
It can run completely standalone, but it usually complements its own results with others obtained through meta-search at public Searx instances.
It is most powerful when running as a node in a network of independent FUTURE instances, which share and complement each other's indexes and thereby also provide redundancy for the service.
The main instance is located at https://wearebuildingthefuture.com

Quickstart
==========

It is easy to set up and run a public FUTURE instance that contributes to the distributed network.
First, clone the repository:

.. code-block:: bash

   git clone https://github.com/rtrevinnoc/FUTURE.git
   cd FUTURE

Then add a ``config.py`` file, which lets you customize important parts of your instance without modifying the source code directly and struggling with updates.
We suggest starting with this configuration template, which is essentially the same as the one used for the main instance:

.. code-block:: python

   #!/usr/bin/env python3
   # -*- coding: utf8 -*-
   import secrets
   WTF_CSRF_ENABLED = True
   SECRET_KEY = secrets.token_urlsafe(16)
   HOST_NAME = "my_public_future_instance" # THE NAMES 'private' and 'wearebuildingthefuture.com' are reserved for private and main nodes, respectively.
   with open("tranco_JKGY.csv") as tranco:
       SEED_URLS = [x.strip() for x in tranco.readlines()]
   PEER_PORT = 3000
   CONCURRENT_REQUESTS = 10
   CONCURRENT_REQUESTS_PER_DOMAIN = 2.0
   CONCURRENT_ITEMS = 100
   REACTOR_THREADPOOL_MAXSIZE = 20
   DOWNLOAD_MAXSIZE = 10000000
   AUTOTHROTTLE = True
   TARGET_CONCURRENCY = 2.0
   MAX_DELAY = 30.0
   START_DELAY = 1.0
   DEPTH_PRIORITY = 1
   LOG_LEVEL = 'INFO'
   CONTACT = "rtrevinnoc@wearebuildingthefuture.com"
   MAINTAINER = "Roberto Treviño Cervantes"
   FIRST_NOTICE = "Written and Maintained By <a href='https://keybase.io/rtrevinnoc'>Roberto Treviño</a>"
   SECOND_NOTICE = "Proudly Hosted on <a href='https://uberspace.de/en/'>Uberspace</a>"
   DONATE = "<a href='https://www.buymeacoffee.com/searchatfuture'>DONATE</a>"
   COLABORATE = "<a href='https://github.com/rtrevinnoc/FUTURE'>COLLABORATE</a>"

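The comment on ``HOST_NAME`` in the template reserves two names. As a minimal sketch, assuming the settings are available as a plain dictionary, a check like the following could catch a reserved or missing host name before the instance is started. The function ``check_config`` and the ``RESERVED_HOST_NAMES`` set are hypothetical helpers written for illustration; they are not part of FUTURE.

```python
# Hypothetical pre-flight check for config.py-style settings (not part of FUTURE).
# The two reserved names are taken from the HOST_NAME comment in the template.
RESERVED_HOST_NAMES = {"private", "wearebuildingthefuture.com"}


def check_config(settings):
    """Return a list of human-readable problems found in a settings dict."""
    problems = []

    host_name = settings.get("HOST_NAME", "")
    if not host_name:
        problems.append("HOST_NAME is missing or empty")
    elif host_name in RESERVED_HOST_NAMES:
        problems.append(f"HOST_NAME {host_name!r} is reserved")

    # Assumption: the crawler needs at least one seed URL to start from.
    if not settings.get("SEED_URLS"):
        problems.append("SEED_URLS is empty")

    return problems


if __name__ == "__main__":
    example = {"HOST_NAME": "my_public_future_instance", "SEED_URLS": ["example.com"]}
    for problem in check_config(example):
        print("config problem:", problem)
```

An empty list means none of the checked settings looked wrong; the sketch does not validate any of the crawler tuning values.
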

After you have configured your FUTURE instance, but before you can start the server, you must add a minimum of about 25 URLs to your local index by executing:

.. code-block:: bash

   chmod +x bootstrap.sh
   ./bootstrap.sh
   ./build_index.sh

At any time, you can check how many web pages are in your local index by executing:

.. code-block:: bash

   python3 count_index.py

When needed, you can interrupt the crawler by executing:

.. code-block:: bash

   ./save_index.sh

Naturally, you can restart it using ``./build_index.sh``.
With the index in place, you can start your development server with:

.. code-block:: bash

   ./future.py

However, if you plan to contribute to the shared index by making your instance public, we recommend using uWSGI.
We suggest creating a ``uwsgi.ini`` file (for example with ``touch uwsgi.ini``) and filling it with this configuration template, which is the one used on the main instance:

.. code-block:: ini

   [uwsgi]
   module = future:app
   pidfile = future.pid
   http-socket = :3000
   chmod-socket = 660
   strict = true
   master = true
   enable-threads = true
   vacuum = true ; Delete sockets during shutdown
   single-interpreter = true
   die-on-term = true ; Shutdown when receiving SIGTERM (default is respawn)
   need-app = true
   disable-logging = true ; Disable built-in logging
   log-4xx = true ; but log 4xx's anyway
   log-5xx = true ; and 5xx's
   cheaper-algo = busyness
   processes = 6 ; Maximum number of workers allowed
   cheaper = 1 ; Minimum number of workers allowed
   cheaper-initial = 2 ; Workers created at startup
   cheaper-overload = 1 ; Length of a cycle in seconds
   cheaper-step = 1 ; How many workers to spawn at a time
   cheaper-busyness-multiplier = 30 ; How many cycles to wait before killing workers
   cheaper-busyness-min = 20 ; Below this threshold, kill workers (if stable for multiplier cycles)
   cheaper-busyness-max = 70 ; Above this threshold, spawn new workers
   cheaper-busyness-backlog-alert = 4 ; Spawn emergency workers if more than this many requests are waiting in the queue
   cheaper-busyness-backlog-step = 2 ; How many emergency workers to create if there are too many requests in the queue

Finally, start your public node to contribute to the shared network with the following command:

.. code-block:: bash

   uwsgi uwsgi.ini

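The ``uwsgi.ini`` template above describes ``cheaper`` as the minimum number of workers, ``cheaper-initial`` as the number created at startup, and ``processes`` as the maximum. As a rough sketch, those three values could be read with Python's standard ``configparser`` and checked for a sensible ordering before launching uWSGI. The function ``read_worker_limits`` is a hypothetical helper, and the ordering rule it enforces is an assumption derived from the template's comments, not a statement of what uWSGI itself accepts.

```python
import configparser

# A trimmed copy of the worker settings from the uwsgi.ini template.
SAMPLE_INI = """\
[uwsgi]
processes = 6 ; Maximum number of workers allowed
cheaper = 1 ; Minimum number of workers allowed
cheaper-initial = 2 ; Workers created at startup
"""


def read_worker_limits(ini_text):
    """Parse the [uwsgi] section and return the three worker limits as ints.

    Raises ValueError when minimum <= initial <= maximum does not hold
    (an assumed sanity rule based on the comments in the template).
    """
    # The template uses inline ';' comments, so strip them while parsing.
    parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(ini_text)
    section = parser["uwsgi"]

    limits = {
        "cheaper": section.getint("cheaper"),
        "cheaper-initial": section.getint("cheaper-initial"),
        "processes": section.getint("processes"),
    }
    if not limits["cheaper"] <= limits["cheaper-initial"] <= limits["processes"]:
        raise ValueError(f"inconsistent worker limits: {limits}")
    return limits


if __name__ == "__main__":
    print(read_worker_limits(SAMPLE_INI))
```

To check a real file, the same function can be called with the file's contents, for example ``read_worker_limits(open("uwsgi.ini").read())``.
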
