Pytorch NLP library based on FastAI


Quick NLP

Quick NLP is a deep learning NLP library inspired by the fast.ai library.

It follows the same API as fastai and extends it, allowing NLP models to be run quickly and easily.



Installation of the fastai library is required. Please install it by following the fastai installation instructions. It is important to use the latest version of fastai and not the pip version, which is not up to date.

After setting up an environment using those instructions, clone the quick-nlp repo and install the package with pip as follows:

git clone
cd quick-nlp
pip install .

Docker Image

A Docker image with the latest master is available. To use it, run:

docker run --runtime nvidia -it -p 8888:8888 --mount type=bind,source="$(pwd)",target=/workspace agispof/quicknlp:latest

This mounts your current directory to /workspace and starts a Jupyter Lab session in that directory.

Usage Example

The main goal of quick-nlp is to provide the easy interface of the fastai library for seq2seq models.

For example, let's assume that we have a dataset_path with separate folders for training and validation files. Each file is a tsv file in which each row holds two sentences separated by a tab. A file inside the train folder could be an eng_to_fr.tsv file whose first few lines look like this:

Go.         Va !
Run!        Cours !
Run!        Courez !
Wow!        Ça alors !
Fire!       Au feu !
Help!       À l'aide !
Jump.       Saute.
Stop!       Ça suffit !
Stop!       Stop !
Stop!       Arrête-toi !
Wait!       Attends !
Wait!       Attendez !
I see.      Je comprends.
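
As a quick sanity check before handing the directory to quick-nlp, the layout described above can be reproduced and read back with the standard library alone. This is only a sketch: the helper names `write_sample_dataset` and `read_pairs` are hypothetical, the `validation` folder name is an assumption, and none of this is part of quick-nlp itself.

```python
import csv
import tempfile
from pathlib import Path


def read_pairs(tsv_path):
    """Return a list of (source, target) sentence pairs from a two-column TSV file."""
    pairs = []
    with open(tsv_path, encoding="utf-8", newline="") as handle:
        # QUOTE_NONE keeps apostrophes and quotes in the sentences untouched
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) != 2:
                continue  # skip blank or malformed rows
            pairs.append((row[0].strip(), row[1].strip()))
    return pairs


def write_sample_dataset(root):
    """Create <root>/train and <root>/validation, each holding an eng_to_fr.tsv file."""
    samples = {
        "train": [("Go.", "Va !"), ("Run!", "Cours !"), ("Wow!", "Ça alors !")],
        "validation": [("Help!", "À l'aide !")],  # folder name is an assumption
    }
    for split, rows in samples.items():
        folder = Path(root) / split
        folder.mkdir(parents=True, exist_ok=True)
        with open(folder / "eng_to_fr.tsv", "w", encoding="utf-8", newline="") as handle:
            for english, french in rows:
                handle.write(f"{english}\t{french}\n")
    return Path(root)


with tempfile.TemporaryDirectory() as tmp:
    dataset_path = write_sample_dataset(tmp)
    train_pairs = read_pairs(dataset_path / "train" / "eng_to_fr.tsv")
    print(train_pairs[0])
```

Rows that do not contain exactly two tab-separated columns are skipped, which makes it easy to spot files that were saved with spaces instead of tabs: they come back as an empty list.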

Loading the data from the directory is as simple as:

from fastai.plots import *
from torchtext.data import Field
from fastai.core import SGD_Momentum
from fastai.lm_rnn import seq2seq_reg
from quicknlp import SpacyTokenizer, print_batch, S2SModelData
INIT_TOKEN = "<sos>"
EOS_TOKEN = "<eos>"
DATAPATH = "dataset_path"
fields = [
    ("english", Field(init_token=INIT_TOKEN, eos_token=EOS_TOKEN, tokenize=SpacyTokenizer('en'), lower=True)),
    ("french", Field(init_token=INIT_TOKEN, eos_token=EOS_TOKEN, tokenize=SpacyTokenizer('fr'), lower=True))
]

batch_size = 64
data = S2SModelData.from_text_files(path=DATAPATH, fields=fields,
                                    source_names=["english", "french"],
                                    bs=batch_size)
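
To make the field options above more concrete, the following sketch shows roughly what `lower`, `init_token` and `eos_token` do to a single sentence. It is an illustration only: `prepare` is a hypothetical helper, and plain whitespace splitting stands in for `SpacyTokenizer`, which separates punctuation more carefully.

```python
INIT_TOKEN = "<sos>"
EOS_TOKEN = "<eos>"


def prepare(sentence, init_token=INIT_TOKEN, eos_token=EOS_TOKEN, lower=True):
    """Tokenize a sentence on whitespace and wrap it in start and end markers."""
    tokens = sentence.split()  # stand-in for the spaCy tokenizer
    if lower:
        tokens = [token.lower() for token in tokens]
    return [init_token] + tokens + [eos_token]


print(prepare("Ça alors !"))
```

The start and end markers give the decoder of a seq2seq model an explicit signal for where a target sentence begins and where generation should stop.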

Finally, to train a seq2seq model with the data we only need to do:

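emb_size = 300
nh = 1024
nl = 3
# The nhid/nlayers keyword names are assumed; this listing was truncated after emb_sz
learner = data.get_model(opt_fn=SGD_Momentum(0.7), emb_sz=emb_size,
                         nhid=nh, nlayers=nl)
clip = 0.3
# Assumed: the regularizer is the seq2seq_reg function imported above
learner.reg_fn = seq2seq_reg
learner.clip = clip
# The learning rate and cycle count are missing from this listing; these are placeholders
lr, n_cycles = 1e-3, 1
learner.fit(lr, n_cycles, wds=1e-6)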