
A project for creating the search backend used on a site hosted by GitHub Pages. For now, the search engine is built using Typesense and hosted on a single free tier e2-instance on GCP.

dkharazi/site-search


site-search

The purpose of this project is to create a free, lightweight search engine for any Gatsby website hosted by GitHub Pages. For now, the search documents are scraped using Python and stored in an Elasticsearch database that is hosted on a single, free tier e2-instance on GCP. ReactiveSearch is used for its pre-built search UI components, which let users query notes and blog posts. In theory, any search database suited to a small website should be able to run on the e2-instance on GCP. Such search engines include Typesense, Lucene, etc.

To create and host a search engine on GCP for a website hosted on GitHub Pages, complete the following steps:

  1. Create a free tier account on GCP
  2. Instantiate a micro e2-instance (with Ubuntu)
  3. Install Docker on the e2-instance
  4. Install and run Elasticsearch on the e2-instance
  5. Run and schedule Python code for scraping our website and ingesting blog posts and notes into Elasticsearch
  6. Implement ReactiveSearch components in our site's code for querying our site's posts and notes saved in Elasticsearch
  7. Configure a firewall on GCP (using UFW on Ubuntu)
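Step 7 is not spelled out elsewhere in this README. As a minimal sketch, assuming the search service listens on port 8108 (the port used in the Typesense command later in this README) and that SSH on port 22 must stay reachable, the UFW rules might look like the following. The function name `print_ufw_rules` is hypothetical, and it only prints the commands so they can be reviewed before being run by hand.

```shell
#!/bin/sh
# Hypothetical helper: prints the UFW commands instead of running them,
# so the rules can be reviewed first.
# Assumptions: port 8108 is the search port (taken from the Typesense
# command in this README) and port 22 is used for SSH.
print_ufw_rules() {
    search_port="${1:-8108}"
    echo "sudo ufw default deny incoming"
    echo "sudo ufw default allow outgoing"
    echo "sudo ufw allow 22/tcp"
    echo "sudo ufw allow ${search_port}/tcp"
    echo "sudo ufw enable"
}

print_ufw_rules 8108
```

UFW only controls traffic on the instance itself. GCP applies its own VPC firewall rules in front of the instance, so the same port likely needs an ingress rule there as well.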

Installing Docker on an Ubuntu E2-Instance

  1. Update all existing packages on the e2-instance
  2. Install prerequisite packages that let apt use packages over HTTPS
  3. Add the GPG key for the official Docker repository to the system
  4. Add the Docker repository to apt resources
  5. Ensure the installation comes from the official Docker repo, rather than the default Ubuntu repo
  6. Install Docker CE
$ ###
$ ### COMMANDS FOR INSTALLING DOCKER
$ ###
$ 
$ # 1. Update all existing packages on the e2-instance
$ sudo apt update
$ # 2. Install pre-requisite packages
$ sudo apt install apt-transport-https ca-certificates curl software-properties-common
$ # 3. Add the GPG key to the system
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ # 4. Add the Docker repository to `apt` resources
$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"
$ # 5. Ensure the installation comes from the official Docker repo
$ apt-cache policy docker-ce
$ # 6. Install Docker CE
$ sudo apt install docker-ce
$
$ ###
$ ### OTHER USEFUL COMMANDS
$ ###
$
$ # Print if docker service is running
$ sudo systemctl status docker
$ # Stop docker service
$ sudo service docker stop
$ # Start docker service
$ sudo service docker start
$ # Print list of running containers
$ sudo docker ps

For additional information about installing Docker, read the detailed steps and overview found in this article. For up-to-date information about installing Docker on Ubuntu machines, please refer to the official Docker installation docs.
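Before moving on to the search service, it can help to confirm that the `docker` command is actually on the PATH. The following is a minimal sketch; the helper name `require_cmds` is hypothetical and not part of this project.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if every named command is on PATH.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing: $cmd" >&2
            missing=1
        fi
    done
    return "$missing"
}

if require_cmds docker; then
    echo "docker found: $(command -v docker)"
else
    echo "docker is not installed yet"
fi
```

This only checks that the binary exists. Whether the Docker service is running is still best checked with `sudo systemctl status docker`, as shown above.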

Installing Elasticsearch on an Ubuntu E2-Instance

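  1. Update all existing packages on the e2-instance
  2. Install Node.js and npm
  3. Verify the installation
  4. Create a directory for the Typesense search service
  5. Run the search service as a Typesense container
  6. Create a shell script for purging logs
  7. Open crontab
  8. Schedule the shell script to run at 2AM every day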
$ ###
$ ### COMMANDS FOR RUNNING TYPESENSE SERVICE
$ ###
$ 
$ # 1. Update all existing packages on the e2-instance
$ sudo apt update
$ # 2. Install Node.js and npm
$ sudo apt install nodejs npm
$ # 3. Verify the installation
$ sudo nodejs --version
$ # 4. Create directory for Typesense search service
$ mkdir -p /home/dkharazif/typesense-server-data && cd /home/dkharazif/typesense-server-data
$ # 5. Run search service as Typesense container
$ sudo nohup docker run -i -p 8108:8108 -v /home/dkharazif/typesense-server-data/:/data typesense/typesense:0.15.0 --data-dir /data --api-key=xyz --listen-port 8108 --enable-cors > typesense-server-data.log &
$ # 6. Create shell script for purging logs
$ touch purge-logs.sh
$ # 7. Open crontab
$ crontab -e
$ # 8. Schedule shell script at 2AM every day (add the line below inside the crontab editor; it is a crontab entry, not a shell command)
$ 0 2 * * * sh /home/dkharazif/typesense-server-data/purge-logs.sh

$ ###
$ ### OTHER USEFUL COMMANDS
$ ###
$
$ # Verify search engine service is running in docker container
$ sudo docker ps
$ # Verify which port docker container is running on
$ sudo netstat -nlp | grep 8108

For additional information about installing npm on an Ubuntu system, read the walkthrough outlined in this article. For additional steps about installing Gatsby-related packages and/or Typesense in an Ubuntu environment, please refer to this article.
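Step 6 in the commands above creates an empty `purge-logs.sh`, and this README does not show what goes in it. The following is one possible sketch of its contents, assuming the only log to purge is the `typesense-server-data.log` file written by the `docker run` redirect. The function name `purge_log` and the `LOG_FILE` variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical contents for purge-logs.sh.
# Assumption: the log path matches the redirect target of the
# docker run command in this README.
LOG_FILE="${LOG_FILE:-/home/dkharazif/typesense-server-data/typesense-server-data.log}"

purge_log() {
    # Truncate in place rather than delete, so the running container
    # keeps writing to the same file.
    if [ -f "$1" ]; then
        : > "$1"
    fi
}

purge_log "$LOG_FILE"
```

With the crontab entry from step 8, this would empty the log once a day at 2AM. If older log lines need to be kept, a rotation tool such as logrotate may be a better fit than truncation.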
