Trigram/Bigram Typing Prediction Program

Project abstract

This project is a typing prediction program that combines trigram and bigram probabilistic models to suggest words based on user input. It has two components written in Python: model training and a graphical user interface (GUI).

The model training component takes a raw text file of well-formed sentences as training data. The text is first split into words, then special characters and symbols are removed with a regular expression. Moving through the words in order, the program counts how often each word follows its immediate predecessor (for the bigram model) or its two immediate predecessors (for the trigram model) and stores the counts in Python dictionaries. Both models are then trimmed to reduce their size: for each entry, only the three following words with the highest occurrence counts are kept. Trimming also reduces noise in the model, since the data is not processed sentence by sentence and some counted word sequences therefore span sentence boundaries. The trimmed models are saved as JSON files, which enables features such as shipping and reusing pre-trained models.

The GUI component contains two primary sections: the text input field, where users type, and the suggestion labels at the bottom of the interface. Each time the user presses the spacebar, the program reads the most recently typed one or two words, strips special characters from them, and looks them up in the bigram and trigram models loaded from the JSON files to find the three most likely next words (that is, the words with the highest occurrence counts). The suggested words are displayed as clickable labels; clicking a label enters that word into the text.
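The training steps described in the abstract can be sketched as follows. This is a minimal illustration, not the code in train.py: the function names, the lowercasing step, the exact regular expression, and the choice to key the trigram model by the two preceding words joined with a space are all assumptions made for the example.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text on whitespace, then strip special characters from each word.

    Lowercasing and keeping apostrophes are assumptions of this sketch.
    """
    words = []
    for raw in text.split():
        cleaned = re.sub(r"[^a-z0-9']", "", raw.lower())
        if cleaned:
            words.append(cleaned)
    return words


def build_models(words):
    """Count how often each word follows its one or two predecessors."""
    bigram = defaultdict(Counter)
    trigram = defaultdict(Counter)
    for i in range(1, len(words)):
        bigram[words[i - 1]][words[i]] += 1
        if i >= 2:
            # Assumed key format: the two preceding words joined by a space.
            trigram[words[i - 2] + " " + words[i - 1]][words[i]] += 1
    return bigram, trigram


def trim(model, keep=3):
    """Keep only the `keep` most frequent following words for each entry."""
    return {ctx: dict(counts.most_common(keep)) for ctx, counts in model.items()}


words = tokenize("The cat sat. The cat ran! The cat sat, the dog sat.")
bigram, trigram = build_models(words)
bigram_trimmed = trim(bigram)
trigram_trimmed = trim(trigram)
```

Because the words are processed as one continuous sequence, the pair "sat the" is counted even though the two words belong to different sentences. This is the kind of noise that trimming helps to suppress.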

Program requirements

Python 3.5+
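tkinter (standard library; some Linux distributions package it separately)
numpy
json (standard library)
re (standard library)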

Instructions

Start the GUI for the Trigram + Bigram model

python typepred.py

The program loads pre-built models from trigram_model.json and bigram_model.json and starts the GUI application automatically.
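One way the two models could be combined at lookup time is sketched below. The README does not spell out how typepred.py merges the trigram and bigram results, so the strategy here (take trigram suggestions first, then fill any remaining slots from the bigram model) is an assumption, as are the function names and the small hand-written models.

```python
import re

# Hypothetical trimmed models in the shape assumed for this sketch:
# context -> {following word: occurrence count}.
BIGRAM = {"cat": {"sat": 2, "ran": 1}, "the": {"cat": 3, "dog": 1}}
TRIGRAM = {"the cat": {"sat": 2, "ran": 1}}


def clean(word):
    """Strip special characters from a typed word (lowercasing is assumed)."""
    return re.sub(r"[^a-z0-9']", "", word.lower())


def ranked(counts):
    """Return the words of a count dictionary, most frequent first."""
    return sorted(counts, key=counts.get, reverse=True)


def suggest(typed_text, bigram, trigram, limit=3):
    """Suggest up to `limit` next words from the last one or two typed words."""
    words = [w for w in (clean(t) for t in typed_text.split()) if w]
    suggestions = []
    if len(words) >= 2:
        # Try the trigram model first, using the last two words as context.
        suggestions.extend(ranked(trigram.get(words[-2] + " " + words[-1], {})))
    if words:
        # Fill remaining slots from the bigram model, skipping duplicates.
        for word in ranked(bigram.get(words[-1], {})):
            if word not in suggestions:
                suggestions.append(word)
    return suggestions[:limit]
```

For example, `suggest("The cat ", BIGRAM, TRIGRAM)` looks up "the cat" in the trigram model, while `suggest("Hello, cat!", BIGRAM, TRIGRAM)` finds no trigram entry for "hello cat" and relies on the bigram entry for "cat" alone.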

Start the GUI for the Bigram model

python typepredBI.py

The program loads the pre-built model from bigram_model.json and starts the GUI application automatically.

Train the Trigram + Bigram model

python train.py

The program reads training data from the file input.txt and creates trigram_model.json and bigram_model.json.
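Saving and reloading a trained model can be sketched with the standard json module. The helper names and the model layout below are assumptions for illustration; only the file name bigram_model.json comes from this README. The example writes to a temporary directory so that it does not overwrite a real model file.

```python
import json
import os
import tempfile


def save_model(model, path):
    """Write a trimmed model (nested dictionaries) to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(model, f)


def load_model(path):
    """Read a model back from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Hypothetical trimmed bigram model: word -> {following word: count}.
bigram = {"the": {"cat": 3, "dog": 1}}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "bigram_model.json")
    save_model(bigram, path)
    restored = load_model(path)
```

JSON object keys must be strings, so a trigram model keyed by a tuple of two words would need its keys converted (for example, joined with a space) before saving.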

Train the Bigram model

python trainBI.py

The program reads training data from the file input.txt and creates bigram_model.json.

Training data

Training data is stored in the input.txt file. There are no specific format requirements, but well-formed natural-language sentences are preferred.

GUI sample

Sample training data

To generate the relevant models:

Unzip this zip file
Copy the contents of final/en_US/en_US.news.txt
Paste the contents into input.txt
Run train.py or trainBI.py
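The copy-and-paste step can also be scripted, which may be more practical for a large corpus file. The sketch below assumes the archive has already been unzipped; the function name fill_input is hypothetical, and the source path in the usage comment is the one listed in the steps above.

```python
import shutil
from pathlib import Path


def fill_input(source, target="input.txt"):
    """Copy the contents of `source` into `target` and return the size in bytes."""
    source = Path(source)
    if not source.is_file():
        raise FileNotFoundError(f"training corpus not found: {source}")
    shutil.copyfile(source, target)
    return Path(target).stat().st_size


# Example usage, run from the project directory after unzipping:
# fill_input("final/en_US/en_US.news.txt")
```

After the copy, train.py or trainBI.py can be run as described in the instructions.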


XDDz123/TypePred
