Semantic search for quotes.
cjwallace Cache quote index
I had tried this previously. Streamlit can't cache the FAISS object, but that can be sidestepped by allowing output mutation, which seems to ultimately mean 'run this code only once', which works fine for our purposes.
Latest commit 95715ca Oct 29, 2019
Type Name Latest commit message Commit time
scripts Cache quote index Oct 29, 2019
.gitignore Initial commit Oct 24, 2019
LICENSE Create LICENSE Oct 28, 2019
Makefile Rename app.py to squote.py for browser title Oct 29, 2019
README.md Update README with FAISS details. Oct 28, 2019
requirements.txt Reduce tensorflow version for Ubuntu compatibility Oct 29, 2019


squote

A semantic search engine that takes some input text and returns some (questionably) relevant (questionably) famous quotes.

[Image: Squote finding relevant quotes]

Built with bert-as-service, FAISS, and Streamlit.

Quotes from https://thewebminer.com/.

setup

First, install the necessary dependencies into a Python 3 environment of your choice. For instance, to install the dependencies into a venv, run

python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

There are additional native dependencies for FAISS: libomp and libopenblas must be available (see the FAISS repo for install instructions). All other commands should be run from within the virtual environment.
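
As a quick sanity check before moving on, a small script along these lines can report whether the interpreter is running inside a virtual environment and whether the expected packages are visible. The module names in `ASSUMED_MODULES` are assumptions (the contents of requirements.txt are not reproduced here), so adjust the list to match. Note that this only checks that the modules can be found; actually importing `faiss` is what would exercise the native libomp and libopenblas libraries.

```python
import importlib.util
import sys

# Assumed import names for the project's dependencies; edit to match requirements.txt.
ASSUMED_MODULES = ["faiss", "streamlit", "pandas", "tensorflow", "bert_serving"]


def in_virtualenv():
    """Return True when the interpreter is running inside a venv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def missing_modules(names=ASSUMED_MODULES):
    """Return the names that cannot be located on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("inside a virtual environment:", in_virtualenv())
    print("modules not found:", missing_modules() or "none")
```

An empty "modules not found" list suggests the pip install step succeeded; it does not guarantee that the native FAISS dependencies load correctly.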

A Makefile is provided to make things nice and easy.

make dirs
make data   # downloads the raw quote data
make model  # downloads ~350MB of BERT weights

running

Before we can run the app, we need embeddings of the quotes. To generate the embeddings and save them in a pickled pandas DataFrame, run the commands below. This will take some time (a couple of hours) on CPU.

make serve  # this runs bert-as-service
make embed  # this computes the embeddings
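
To give a feel for what the embeddings are used for, here is a minimal sketch of semantic search as brute-force cosine similarity over toy vectors. Everything in it is illustrative: the three-dimensional vectors and placeholder quote labels stand in for real BERT embeddings and quotes, and the linear scan stands in for the FAISS index, which the app uses instead.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, quote_vecs, k=2):
    """Return the k quote labels whose vectors are most similar to the query."""
    scored = [
        (cosine_similarity(query_vec, vec), quote)
        for quote, vec in quote_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [quote for _, quote in scored[:k]]


# Hypothetical toy embeddings; real ones would come from the embedding step.
TOY_EMBEDDINGS = {
    "quote about courage": [0.9, 0.1, 0.0],
    "quote about the sea": [0.0, 0.2, 0.9],
    "quote about bravery": [0.8, 0.3, 0.1],
}

if __name__ == "__main__":
    # A query vector pointing roughly in the "courage" direction.
    print(top_k([1.0, 0.2, 0.0], TOY_EMBEDDINGS))
```

A linear scan like this is fine for a handful of vectors; an index such as FAISS becomes worthwhile once the quote collection is large.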

Once the embeddings exist, we can run the streamlit app with:

make serve  # not needed if still running from above
make app
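
The latest commit message notes that the quote index is cached so that it is effectively built only once. The app's actual caching code is not shown here, so the following is only a sketch of the "build once, reuse afterwards" idea, with the standard library's `functools.lru_cache` standing in for Streamlit's cache decorator. The `load_index` function and its placeholder contents are hypothetical.

```python
import functools

# Records each time the expensive build step actually runs.
build_log = []


@functools.lru_cache(maxsize=1)
def load_index():
    """Build the (placeholder) quote index once; later calls reuse the result."""
    build_log.append("built")
    # Stand-in for the expensive step of loading embeddings and building an index.
    return {"quotes": ["placeholder quote A", "placeholder quote B"]}


if __name__ == "__main__":
    first = load_index()
    second = load_index()
    print("same object:", first is second)
    print("times built:", len(build_log))
```

Because the cached object is returned by reference, any mutation of it is visible to later callers, which is roughly the trade-off the commit message describes accepting.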

Have fun!
