XDDz123/TypePred

Trigram/Bigram Typing Prediction Program

Project abstract

Our project is a typing prediction program that combines trigram and bigram probabilistic models to suggest words based on user input. The program has two components, both written in Python: model training and a graphical user interface (GUI).

The model training component takes a raw text file of well-formed sentences as training data. The text is first split into words, and special characters and symbols are then removed with a regular expression. The words are processed sequentially: each occurrence of a word is counted against its immediate predecessor to build the bigram model, and against its two immediate predecessors to build the trigram model. Both models are stored as Python dictionaries. The models are then trimmed to reduce their size: for each entry, only the three most frequent following words are kept. Trimming also reduces noise, since the data is not processed on a sentence-by-sentence basis and word sequences that span sentence boundaries are counted as well. The trimmed models are saved as JSON files, which makes it possible to reuse pre-trained models.

The GUI component has two primary sections: a text input field and a suggestion area. The user types in the text input field. Each time the user presses the spacebar, the program reads the most recently typed one or two words, strips special characters from them, and looks them up in the bigram and trigram models loaded from the JSON files to find the three most likely next words (that is, the words with the highest occurrence counts). The suggested words are displayed as clickable labels at the bottom of the interface, and clicking a label inserts that word into the text.
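The training steps described above can be sketched roughly as follows. This is a minimal illustration rather than the code in train.py: the function names, the exact regular expression, the lowercasing step, and the choice to join a two-word context with a space are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict

TOP_K = 3  # the abstract states that only the three most frequent followers are kept


def tokenize(text):
    """Split on whitespace, then strip special characters from each word.

    The pattern and the lowercasing are assumptions for this sketch.
    """
    words = (re.sub(r"[^A-Za-z0-9']", "", w).lower() for w in text.split())
    return [w for w in words if w]


def build_models(words):
    """Count, for every one-word and two-word context, which word follows it."""
    bigram = defaultdict(Counter)
    trigram = defaultdict(Counter)
    for i in range(1, len(words)):
        bigram[words[i - 1]][words[i]] += 1
        if i >= 2:
            # Assumed key format: the two predecessors joined by a single space.
            trigram[words[i - 2] + " " + words[i - 1]][words[i]] += 1
    return bigram, trigram


def trim(model, k=TOP_K):
    """Keep only the k most frequent followers of each context."""
    return {ctx: dict(counts.most_common(k)) for ctx, counts in model.items()}
```

Storing counts per context keeps the lookup at prediction time to a single dictionary access, and trimming to the top three followers bounds the size of each entry regardless of how large the training corpus is.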

Program requirements

  • Python 3.5+
  • tkinter (included with most Python installations)
  • numpy
  • json (standard library)
  • re (standard library)

Instructions

Start the GUI for the Trigram + Bigram model

python typepred.py

The program loads pre-built models from trigram_model.json and bigram_model.json and starts the GUI application automatically.
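As a rough sketch of what the lookup behind the GUI might look like once the two JSON files are loaded: the README does not say exactly how the trigram and bigram results are combined, so the strategy below (trigram suggestions first, then bigram suggestions to fill any remaining slots) is an assumption. The function names and the space-joined trigram key are also hypothetical.

```python
import json
import re


def load_model(path):
    """Load a trimmed model, assumed to have the shape {context: {next_word: count}}."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def clean(word):
    """Strip special characters from a typed word (pattern is an assumption)."""
    return re.sub(r"[^A-Za-z0-9']", "", word).lower()


def suggest(typed_text, bigram, trigram, k=3):
    """Return up to k candidate next words for the text typed so far."""
    words = [w for w in map(clean, typed_text.split()) if w]
    contexts = []
    if len(words) >= 2:
        # Assumed key format: the last two words joined by a space.
        contexts.append((trigram, words[-2] + " " + words[-1]))
    if words:
        contexts.append((bigram, words[-1]))

    suggestions = []
    # Assumed combination rule: trigram matches first, bigram matches fill the rest.
    for model, key in contexts:
        followers = model.get(key, {})
        for word in sorted(followers, key=followers.get, reverse=True):
            if word not in suggestions:
                suggestions.append(word)
    return suggestions[:k]
```

In a GUI, a function like `suggest` would be called from the spacebar key handler, with the returned words written to the clickable labels.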

Start the GUI for the Bigram model

python typepredBI.py

The program loads the pre-built model from bigram_model.json and starts the GUI application automatically.

Train the Trigram + Bigram model

python train.py

The program reads training data from the file input.txt and creates trigram_model.json and bigram_model.json.
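One detail worth illustrating is how a model can be written to JSON. JSON object keys must be strings, so a trigram context of two words cannot be stored as a tuple key. The sketch below assumes the two words are joined by a space; the actual layout of trigram_model.json is not documented here, and `save_model` and `read_model` are hypothetical helper names.

```python
import json


def save_model(model, path):
    """Write a {context: {next_word: count}} mapping to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(model, f, ensure_ascii=False)


def read_model(path):
    """Read a model back from a JSON file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# A tuple key such as ("the", "cat") would make json.dump raise TypeError,
# so this sketch uses the space-joined string "the cat" instead (an assumption).
example_trigram = {"the cat": {"sat": 2, "ran": 1}}
```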

Train the Bigram model

python trainBI.py

The program reads training data from the file input.txt and creates bigram_model.json.

Training data

Training data is stored in the input.txt file. There are no specific format requirements, but well-formed natural sentences are preferred.

GUI sample

[Image: screenshot of the GUI]

Sample training data

To generate the relevant models:

  1. Unzip this zip file
  2. Copy the contents of final/en_US/en_US.news.txt
  3. Paste the contents into input.txt
  4. Run train.py or trainBI.py
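Steps 2 and 3 can also be done with a short script instead of copying and pasting by hand, which may be more practical for a large corpus file. This is a sketch only: `prepare_input` is a hypothetical helper, and it assumes the archive has already been unzipped into the current directory.

```python
import shutil
from pathlib import Path


def prepare_input(corpus_path, input_path="input.txt"):
    """Copy the corpus file over input.txt, replacing any previous contents."""
    corpus = Path(corpus_path)
    if not corpus.is_file():
        raise FileNotFoundError(f"corpus not found: {corpus}")
    shutil.copyfile(corpus, input_path)
    return Path(input_path)


# Example call, using the path from the steps above:
# prepare_input("final/en_US/en_US.news.txt")
```

After the copy, run train.py or trainBI.py as in step 4.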
