
Predict next word based on last N words typed with N-Gram / LSTM model

nonococoleo/NextWordPredictor

Next Word Predictor

Predict the next word from the last N words typed, using either an N-gram model or an LSTM model.

Usage

  1. N-gram model
  • Train and test the model with the following parameters: n, training_file, test_file, output_file, backoff
python3 NGram.py [--N n] [--training_file training_file] [--test_file test_file] [--output_file output_file] [--backoff True/False]
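
To illustrate what the `--N` and `--backoff` options control, here is a minimal sketch of N-gram next-word prediction with a simple backoff. It is not taken from NGram.py: the function names `train_ngrams` and `predict_next` are hypothetical, and the backoff shown (drop the oldest context word and retry) is one common scheme, assumed here for illustration only.

```python
from collections import Counter, defaultdict


def train_ngrams(tokens, n):
    """Count next-word frequencies for every context of length 0..n-1."""
    counts = defaultdict(Counter)
    for i in range(len(tokens)):
        for k in range(n):  # k is the context length
            if i - k < 0:
                break
            context = tuple(tokens[i - k:i])
            counts[context][tokens[i]] += 1
    return counts


def predict_next(counts, history, n, backoff=True):
    """Return the most frequent word seen after the last n-1 words, or None."""
    context = tuple(history[-(n - 1):]) if n > 1 else ()
    while True:
        if context in counts:
            return counts[context].most_common(1)[0][0]
        if not backoff or not context:
            return None
        context = context[1:]  # back off: drop the oldest word and retry


if __name__ == "__main__":
    corpus = "the cat sat on the mat the cat ran".split()
    counts = train_ngrams(corpus, 3)
    print(predict_next(counts, ["on", "the"], 3))                  # trigram context was seen
    print(predict_next(counts, ["dog", "the"], 3))                 # falls back to the bigram context
    print(predict_next(counts, ["dog", "the"], 3, backoff=False))  # no fallback allowed
```

With backoff disabled, an unseen context yields no prediction; with it enabled, the sketch falls back to progressively shorter contexts until one has been observed.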
  2. LSTM model
  • Download the pretrained GloVe data
cd data
chmod +x download_glove.sh
./download_glove.sh
  • Build the vocabulary dictionary (data/vocab) from the pretrained GloVe data (e.g. data/glove.6B.300d.txt) and the training corpus (e.g. corpus/train_all.pkl)
python3 vocabulary.py [Glove_file] [corpus_file]
  • Train the neural network on the training corpus (e.g. corpus/train_all.pkl) and save the model (data/model_all.pt)
    required: vocabulary file (data/vocab)
    optional: window length
python3 LSTM.py --train corpus_file [--length N]
  • Test a trained model (e.g. data/model_all.pt) on the test corpus
    required: vocabulary file (data/vocab)
    optional: window length
python3 LSTM.py --test model_file [--length N]
  • Run the Next Word Predictor
    required: vocabulary file (data/vocab), model file (e.g. data/model_all.pt)
python3 app.py
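
The optional window length (`--length N`) in the LSTM steps sets how many preceding words the model sees for each prediction. As a rough sketch of that idea, the snippet below slices a sequence of token ids into (context, target) training pairs. The helper name `make_windows` is hypothetical and LSTM.py may prepare its data differently; this only shows the general sliding-window technique.

```python
def make_windows(token_ids, length):
    """Return (context, target) pairs.

    Each context holds `length` consecutive ids and the target is the id
    that immediately follows it in the sequence.
    """
    pairs = []
    for i in range(len(token_ids) - length):
        context = token_ids[i:i + length]
        target = token_ids[i + length]
        pairs.append((context, target))
    return pairs


if __name__ == "__main__":
    # Token ids would normally come from a vocabulary lookup; these are made up.
    for context, target in make_windows([1, 2, 3, 4, 5], 3):
        print(context, "->", target)
```

A sequence shorter than or equal to the window length produces no pairs, so very short sentences contribute nothing to training under this scheme.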
