Kylo - The FAQ Bot

How does it work?

Using Facebook's InferSent model, we create sentence embeddings of the existing data. When a new text is queried, we calculate the cosine similarity between the query text's embedding and each existing embedding. The entry with the highest similarity is treated as the match, and its answer is returned.
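
The matching step can be sketched as follows. This is a minimal illustration in plain Python, not the project's actual code: the vectors are tiny made-up stand-ins for real InferSent embeddings, and `cosine_similarity` and `best_match` are hypothetical names chosen for this example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction; treat as no similarity
    return dot / (norm_a * norm_b)


def best_match(query_vec, stored):
    """Return (answer, score) for the stored entry most similar to the query.

    `stored` is a list of (embedding, answer) pairs.
    """
    return max(
        ((answer, cosine_similarity(query_vec, emb)) for emb, answer in stored),
        key=lambda pair: pair[1],
    )


# Made-up 3-dimensional vectors standing in for real sentence embeddings.
stored = [
    ([1.0, 0.0, 0.0], "Yes, the project is open source."),
    ([0.0, 1.0, 0.0], "We are open 9 to 5."),
]
answer, score = best_match([0.9, 0.1, 0.0], stored)
print(answer, round(score, 3))
```

Picking the maximum similarity is equivalent to picking the minimum cosine distance, since distance is one minus similarity.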

InferSent

InferSent is a sentence embeddings method that provides semantic representations for English sentences. It is trained on natural language inference data and generalizes well to many different tasks.

Read the original paper on arXiv.

Setup

Make sure you have Python 3. Install the Python requirements:

pip install -r requirements.txt

Run setup.sh to download the InferSent model and the word vectors. This project uses GloVe:

./setup.sh

Training

To train, add any new data to the data/ directory. Check data/README.md for format instructions. Then run the training script to save the embeddings in the embeddings/ directory:

python train.py
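
Conceptually, training here means encoding each known question once and storing the vectors for later lookup. The sketch below illustrates that idea only and makes several assumptions: `toy_encode` is a hypothetical stand-in for the real sentence encoder, and the JSON layout and the `faq.json` file name are invented for this example and may not match what train.py actually writes.

```python
import json
import os
import tempfile


def toy_encode(sentence):
    """Hypothetical stand-in for a sentence encoder: letter counts for a-z."""
    counts = [0.0] * 26
    for ch in sentence.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def save_embeddings(faq_pairs, out_dir):
    """Encode every question and write the vectors with their answers as JSON."""
    os.makedirs(out_dir, exist_ok=True)
    records = [
        {"question": q, "answer": a, "embedding": toy_encode(q)}
        for q, a in faq_pairs
    ]
    path = os.path.join(out_dir, "faq.json")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        json.dump(records, f)
    return path


faq_pairs = [("Are you open source?", "Yes, under the MIT license.")]
# Write into a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    path = save_embeddings(faq_pairs, os.path.join(tmp, "embeddings"))
    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)
print(len(loaded), len(loaded[0]["embedding"]))
```

Precomputing the embeddings this way means a query only needs one encoder call at lookup time, followed by cheap vector comparisons.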

Inference

To try it out, import find_best_match from inference:

from inference import find_best_match

print(find_best_match(text="are you open source"))

Server Deployment

This repository comes with a Tornado API server that exposes a REST API and a WebSocket endpoint. To run the server:

python server.py

Todo

Check CONTRIBUTING.md for more details.

Build a dataset to evaluate accuracy
Evaluate GloVe vs FastText
Work on improving accuracy
Handle common typos

Name

License

The mighty MIT license. Check LICENSE for more details.
