A semantic search engine that takes some input text and returns some (questionably) relevant (questionably) famous quotes.
Quotes from https://thewebminer.com/.
First, install the necessary dependencies into a Python 3 environment of your choice. For instance, to install the dependencies into a venv, run:

    python3 -m venv venv
    source venv/bin/activate
    pip install -r requirements.txt
There are additional native dependencies for FAISS:
libopenblas must be available (see the FAISS repo for install instructions).
All other commands should be run from within the virtual environment.
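A quick way to confirm the environment is usable is to try importing the key packages from inside the venv. The sketch below is not part of the repo: the helper name `try_import` is made up for illustration, and the module list is an assumption based on the tools this README mentions (the actual contents of requirements.txt are not shown here). A missing libopenblas usually surfaces as an `ImportError` when `faiss` is imported, which is why the check performs a real import rather than only looking the package up.

```python
import importlib


def try_import(name):
    """Try to import `name`; return None on success or the error text on failure."""
    try:
        importlib.import_module(name)
    except ImportError as exc:
        # Covers both a missing package and a missing shared library
        # (e.g. libopenblas not being found when faiss loads).
        return str(exc)
    return None


if __name__ == "__main__":
    # Assumed module names; adjust to match requirements.txt.
    for module in ["faiss", "pandas", "streamlit"]:
        error = try_import(module)
        status = "ok" if error is None else f"FAILED ({error})"
        print(f"{module}: {status}")
```

If `faiss` fails while the others succeed, the native dependency is the likely culprit.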
A Makefile is provided to make things nice and easy.
    make dirs
    make data   # downloads the raw quote data
    make model  # downloads ~350MB of BERT weights
Before we can run the app, we need embeddings of the quotes. To generate the embeddings and save them in a pickled pandas DataFrame, run the commands below. This will take a couple of hours on CPU.

    make serve  # this runs bert-as-service
    make embed  # this computes the embeddings
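To give a feel for what the embeddings are used for, here is a minimal sketch of semantic search: embed the query, score it against every quote embedding, and return the closest matches. This is a toy illustration, not the app's code. It assumes cosine similarity as the metric (the README does not say which one is used), uses brute-force search in pure Python where the project relies on FAISS, and the function names and two-dimensional vectors are made up for the example.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_quotes(query_vec, quotes, embeddings, k=3):
    """Return the k (score, quote) pairs most similar to query_vec, best first."""
    scored = (
        (cosine_similarity(query_vec, emb), quote)
        for quote, emb in zip(quotes, embeddings)
    )
    return heapq.nlargest(k, scored, key=lambda pair: pair[0])


if __name__ == "__main__":
    # Hypothetical toy data; real embeddings would come from the BERT model.
    quotes = ["quote A", "quote B", "quote C"]
    embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for score, quote in top_k_quotes([1.0, 0.1], quotes, embeddings, k=2):
        print(f"{score:.3f}  {quote}")
```

Brute force like this scales linearly with the number of quotes, which is the cost a FAISS index is meant to reduce for larger collections.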
Once the embeddings exist, we can run the Streamlit app with:

    make serve  # not needed if still running from above
    make app