Clone the repository with
git clone https://github.com/MrPekar98/Simple-Search-Engine.git
Install Curl
sudo apt install curl -y && sudo apt install libcurl4-gnutls-dev
Install Microsoft C++ REST SDK
sudo apt-get install libcpprest-dev
To install on a platform that is not Debian-based, see the Getting Started section of the C++ REST SDK documentation.
Alternatively, to build the SDK from source, first install the necessary build tools
sudo apt-get install g++ make cmake git libboost-atomic-dev libboost-thread-dev libboost-system-dev libboost-date-time-dev libboost-regex-dev libboost-filesystem-dev libboost-random-dev libboost-chrono-dev libboost-serialization-dev libwebsocketpp-dev openssl libssl-dev ninja-build
Clone the C++ REST SDK repository
git clone https://github.com/Microsoft/cpprestsdk.git casablanca
Run the following commands to build and install the SDK (specify -DCMAKE_BUILD_TYPE=Release instead to build a release version)
cd casablanca
mkdir build.debug
cd build.debug
cmake -G Ninja .. -DCMAKE_BUILD_TYPE=Debug
ninja
sudo ninja install
To compile the project, run the following command from the root of the Simple-Search-Engine repository
make
This builds an executable named search in the project root.
Note that you can specify the crawler seed set in src/config.hpp. Choosing a set of web pages with a high out-degree is recommended.
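The out-degree of a page is the number of links leading out of it. As a rough way to compare candidate seed pages, the sketch below counts anchor tags in a page's HTML. The helper name count_links is hypothetical and not part of this project, and the count is only an approximation: it does not deduplicate links and misses anchors split across several lines.

```shell
# Hypothetical helper (not part of this project): count anchor tags that
# carry an href attribute in HTML read from standard input.
# This is only a rough proxy for the out-degree of a page.
count_links() {
    grep -o -i '<a[[:space:]][^>]*href=' | wc -l
}

# Usage (the URL is a placeholder for a candidate seed page):
# curl --silent https://example.org/ | count_links
```

Pages that report a higher count are likely to give the crawler more links to follow from the start.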
Alternatively, build the Docker image
docker build -t search .
Run the Docker container
docker run --name search -p <PORT>:<PORT> search
Set <PORT> to the port number specified in src/config.hpp. Add the flag -d to detach from the process.
A simple web page with a search bar is provided. Alternatively, you can send a POST request with the search keyword in the request body. The search result is returned as plain text listing the relevant document titles and their URLs.
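A POST request of this kind could be sent with curl, as in the minimal sketch below. Several details are assumptions rather than facts from this project: that the server is reachable on localhost, that it accepts the request at the root path, and that a text/plain body is acceptable. The helper name search_query is hypothetical.

```shell
# Hypothetical helper (not part of this project): send a search keyword
# as the body of a POST request.
# Assumptions: the server runs on localhost, accepts requests at the root
# path, and accepts a plain-text body. Adjust these to match your setup.
search_query() {
    keyword="$1"
    port="$2"
    curl --silent --request POST \
        --header "Content-Type: text/plain" \
        --data-raw "$keyword" \
        "http://localhost:${port}/"
}

# Usage (replace 8080 with the port set in src/config.hpp):
# search_query "graph databases" 8080
```

The --data-raw option sends the keyword exactly as given, so a keyword starting with @ is not interpreted as a file name.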