GoldenRetriever - Information retrieval using fine-tuned semantic similarity
GoldenRetriever is part of the HotDoc NLP project, which provides a series of open-source AI tools for natural language processing. HotDoc NLP is part of the AI Makerspace program. Please visit the demo page, where you will be able to query a sample knowledge base.
GoldenRetriever is a framework for an information retrieval engine (QnA, knowledge base query, etc.) that works in four steps:
- Step 1: The knowledge base is separated into "documents", or clauses. Each clause is an indexed unit of information, e.g. a sentence or a paragraph.
- Step 2: The clauses and the query are encoded with the same encoder (InferSent, Google USE, or Google USE-QA).
- Step 3: A similarity score is calculated for each clause (cosine distance, arccos distance, dot product, or nearest neighbors).
- Step 4: The clauses with the highest scores (or the nearest neighbors) are returned as the retrieved documents, as sketched below.
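A minimal sketch of these four steps, assuming the Universal Sentence Encoder from TensorFlow Hub as the shared encoder (any of the encoders above could be substituted; the sample clauses are illustrative):

```python
import numpy as np
import tensorflow_hub as hub

# Step 1: the knowledge base, already split into clauses.
clauses = [
    "The tenant must give 30 days' written notice before vacating.",
    "A late fee of 5% applies to payments more than 7 days overdue.",
    "The landlord is responsible for structural repairs.",
]

# Step 2: encode the clauses and the query with the same encoder.
encoder = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")
clause_vecs = np.asarray(encoder(clauses))
query_vec = np.asarray(encoder(["What happens if I pay my rent late?"]))[0]

# Step 3: cosine similarity between the query and every clause.
scores = clause_vecs @ query_vec / (
    np.linalg.norm(clause_vecs, axis=1) * np.linalg.norm(query_vec)
)

# Step 4: return the top-k highest-scoring clauses.
k = 2
top_k = np.argsort(scores)[::-1][:k]
print([clauses[i] for i in top_k])
```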
`model_finetuning.py` currently optimizes the framework for retrieving clauses from a contract or a set of terms and conditions, given a natural language query.
There is potential for fine-tuning, following Yang et al.'s (2018) paper on learning textual similarity from conversations.
A fully connected layer is inserted after the clauses are encoded and is trained to maximize the dot product between the transformed clauses and the encoded query, as sketched below.
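One common way to train such a head, in the style of Yang et al. (2018), is an in-batch softmax over dot products; the layer and function names below are illustrative, not the repo's actual API, and a 512-dim encoder output is assumed:

```python
import tensorflow as tf

# Fully connected layer applied to the encoded clauses
# (512-dim embeddings assumed, matching Google USE output).
projection = tf.keras.layers.Dense(512, use_bias=False)

def dot_product_loss(query_emb, clause_emb):
    """In-batch softmax over dot products: the i-th query's correct
    clause is the i-th clause; the rest of the batch act as negatives."""
    logits = tf.matmul(query_emb, projection(clause_emb), transpose_b=True)
    labels = tf.range(tf.shape(logits)[0])
    return tf.reduce_mean(
        tf.keras.losses.sparse_categorical_crossentropy(
            labels, logits, from_logits=True))
```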
In the transfer-learning use case, the Google USE-QA model is further fine-tuned using a triplet cosine loss, which pushes correct question-knowledge pairs closer together while maintaining a margin between the angles of question and wrong-knowledge pairs. This method can be used to overfit to any fixed FAQ dataset without losing the semantic-similarity capabilities of the sentence encoder.
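A hedged sketch of such a triplet cosine loss, operating on pre-computed embeddings (the function and argument names are illustrative):

```python
import tensorflow as tf

def triplet_cosine_loss(query, pos, neg, margin=0.3):
    """Pull question/correct-clause pairs together while keeping a
    cosine margin from question/wrong-clause pairs."""
    query = tf.math.l2_normalize(query, axis=1)
    pos = tf.math.l2_normalize(pos, axis=1)
    neg = tf.math.l2_normalize(neg, axis=1)
    pos_sim = tf.reduce_sum(query * pos, axis=1)  # cos(query, correct clause)
    neg_sim = tf.reduce_sum(query * neg, axis=1)  # cos(query, wrong clause)
    # Zero loss once the correct pair beats the wrong pair by `margin`.
    return tf.reduce_mean(tf.maximum(0.0, margin - pos_sim + neg_sim))
```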
The model is served as a Flask app. Run `python app.py` to launch a web interface from which you can query some pre-set documents.
To run the Flask app using Docker:

- Clone this repository.
- Build the container image:

```
docker build -t goldenretriever .
```

- Run the container:

```
docker run -p 5000:5000 goldenretriever
```

- Access the web interface by navigating to http://localhost:5000 in your browser.
For comparison, we apply three sentence-encoding models to the InsuranceQA corpus. Each test case consists of a question and 100 candidate answers, of which one or more are correct.
The model evaluation metric is a top-k score, where k is the number of clauses the model returns for a given query. A score of 1 indicates that at least one of the top k clauses contains a correct answer to the query, and a score of 0 indicates that none of the k returned clauses does.
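As a concrete illustration, the metric can be computed as below (the function name is ours, not the repo's):

```python
def score_at_k(ranked_clause_ids, correct_ids, k):
    """Return 1 if any of the top-k retrieved clauses is a correct
    answer to the query, else 0."""
    return int(any(cid in correct_ids for cid in ranked_clause_ids[:k]))

# Example: the only correct answer is ranked 3rd of 100 candidates.
ranked = [41, 7, 13] + list(range(50, 147))
print(score_at_k(ranked, correct_ids={13}, k=3))  # 1
print(score_at_k(ranked, correct_ids={13}, k=2))  # 0
```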