
Setup Algolia Search with Antora

Joseph Cayouette edited this page Dec 4, 2019 · 1 revision


Create an account on https://www.algolia.com.

Note

The default "Community" pricing tier of Algolia allows you to host up to 10k records (and 100k add/edit/delete operations/month). It has no time limit.

For Open Source projects, Algolia will extend this to 100k records and 200k operations. You just need to fill out the Algolia for Open Source form.

Keep in mind that this account is required only for running the DocSearch scraper (the Docker container or command-line tool from Algolia).

Using DocSearch.js in your UI does not require an Algolia account.

  1. Sign in to your account on https://www.algolia.com and create a new app.

  2. Next, navigate to Dashboard > Indices and create a new index. I named my index manager.

Clone the algolia/docsearch-scraper repository using git, then switch to that directory:

$ git clone https://github.com/algolia/docsearch-scraper && cd docsearch-scraper
  1. Install pipenv (you may need to upgrade pip first; pip prints the upgrade command when you attempt the install):

    pip install --user pipenv
  2. Add pipenv to your PATH (I did this via .bashrc):

    export PATH="$HOME/.local/bin:$PATH"
  3. Make sure you have the python-devel headers installed; otherwise the next step will fail with an error containing:

    src/twisted/test/raiser.c:4:10: fatal error: Python.h: No such file or directory', '     #include "Python.h"', '              ^~~~~~~~~~', '    compilation terminated.', "    error: command 'gcc' failed with exit status 1",
  4. Install the DocSearch scraper via pipenv, then activate your pipenv virtual environment:

    pipenv install
    pipenv shell
  5. Create a .env file in the docsearch-scraper repository root with the following contents. You can find these keys in your Algolia Dashboard → API Keys. Because the scraper writes to your Algolia index, you need your admin key, not the read-only search key:

    APPLICATION_ID=YOUR_APP_ID
    API_KEY=YOUR_ADMIN_API_KEY
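A quick sanity check on the .env file can catch a missing key before the scraper runs. The sketch below is a minimal shell helper. The function name check_env_file is hypothetical and not part of docsearch-scraper, and it only confirms that both variables have a non-empty value, not that the keys are valid.

```shell
# Hypothetical helper (not part of docsearch-scraper): verify that an
# env file defines non-empty APPLICATION_ID and API_KEY entries.
check_env_file() {
  env_file="${1:-.env}"
  if [ ! -f "$env_file" ]; then
    echo "missing file: $env_file" >&2
    return 1
  fi
  for key in APPLICATION_ID API_KEY; do
    # Require a line of the form KEY=<at least one character>
    if ! grep -Eq "^${key}=.+" "$env_file"; then
      echo "$key is not set in $env_file" >&2
      return 1
    fi
  done
  return 0
}
```

Calling check_env_file .env from the repository root prints nothing and returns 0 when both keys are present.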
Note
Make sure your user account is a member of the docker group. Since the virtual environment is not installed for the root user, using sudo will not help.
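Group membership can be checked from the shell before attempting the build. This is a minimal sketch; the function name in_group is hypothetical, and it assumes a Linux-style id command.

```shell
# Hypothetical helper: succeed if the current user belongs to the given group.
in_group() {
  # id -nG prints the names of all groups the current user belongs to
  for g in $(id -nG); do
    [ "$g" = "$1" ] && return 0
  done
  return 1
}
```

For example, in_group docker || echo "not in the docker group" reports a missing membership. On many Linux systems the usual fix is sudo usermod -aG docker "$USER" followed by logging out and back in, though the exact procedure depends on your distribution.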

If you have Docker installed, you can now run the following command to build the DocSearch Docker container (recommended):

./docsearch docker:build

In the docsearch-scraper repository root, create a file called config.json with the following content. Change index_name, start_urls, and sitemap_urls accordingly. If you have not done so already, place the sitemap.xml file you generated in the root of your project's gh-pages branch:

{
  "index_name": "manager",
  "start_urls": [
    "https://opensource.suse.com/doc-susemanager"
  ],
  "sitemap_urls": [
    "https://opensource.suse.com/doc-susemanager/sitemap.xml"
  ],
  "stop_urls": [],
  "selectors": {
    "lvl0": {
      "selector": "//nav[@class='crumbs']//li[@class='crumb'][last()-1]",
      "type": "xpath",
      "global": true,
      "default_value": "Home"
    },
    "lvl1": ".doc h1",
    "lvl2": ".doc h2",
    "lvl3": ".doc h3",
    "lvl4": ".doc h4",
    "text": ".doc p, .doc td.content, .doc th.tableblock"
  }
}
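Before running the scraper, it can help to confirm that the three fields you were asked to adapt are actually present in config.json. The sketch below is a minimal check under stated assumptions: check_config_keys is a hypothetical name, and the check is a plain text search, so it will not catch JSON syntax errors.

```shell
# Hypothetical helper: confirm that a DocSearch config file mentions the
# three fields this guide says to adapt. This is a plain text search,
# not a JSON parser, so it does not catch syntax errors.
check_config_keys() {
  config_file="${1:-config.json}"
  if [ ! -f "$config_file" ]; then
    echo "missing file: $config_file" >&2
    return 1
  fi
  for key in index_name start_urls sitemap_urls; do
    # Look for "key" followed by optional whitespace and a colon
    if ! grep -q "\"$key\"[[:space:]]*:" "$config_file"; then
      echo "$key not found in $config_file" >&2
      return 1
    fi
  done
  return 0
}
```

If Python 3 is available, python3 -m json.tool config.json complements this check by reporting JSON parse errors.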
  1. Run the following command to index your site:

    ./docsearch docker:run config.json
  2. The crawler starts at the URLs listed in start_urls, uses the sitemap.xml you provided, and pushes the content matched by your selectors into your Algolia index.

The following files show how to include the custom search bar in your Antora UI:

Tip

TODO: This process should be automated. Indexing and creating the sitemap.xml file should occur about once per week. If we do it too often, we will use up our monthly operations quota on Algolia. Indexing 3 to 4 times per month is sufficient for the online search.
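One way to keep an automated job from indexing too often is a small time guard that a scheduled task calls before running the scraper. The sketch below is only an illustration: the function name reindex_due, the seven-day default, and the .last_index stamp file are all assumptions, not part of docsearch-scraper.

```shell
# Hypothetical guard: succeed only when at least MIN_DAYS days separate
# the last indexing run from now. Times are Unix epoch seconds.
reindex_due() {
  last_run="$1"
  now="$2"
  min_days="${3:-7}"
  [ $(( now - last_run )) -ge $(( min_days * 86400 )) ]
}

# Example use from a scheduled job (the stamp file name is an assumption):
#   now=$(date +%s)
#   last=$(cat .last_index 2>/dev/null || echo 0)
#   if reindex_due "$last" "$now" 7; then
#     ./docsearch docker:run config.json && echo "$now" > .last_index
#   fi
```

With a seven-day minimum gap, the job runs at most four or five times per month regardless of how often the scheduler fires.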
