A program that guesses the next word based on the user's input. Suggestions are the words most likely to follow what has already been written, with probabilities calculated from n-grams of different sizes.

KajaBraz/NextWordPredictor

NextWordPredictor

A program that suggests possible next words based on the user's input.

Using NLTK's Brown corpus as training data, the program preprocesses the text and builds n-grams, which serve as the basis for predicting the words the user is most likely to choose next. The maximum n-gram size is specified in advance. To choose the best matches, the program follows the n-gram hierarchy: it first calculates probabilities from the largest n-grams and, if data is missing, repeats the process one level down with smaller n-grams until it reaches the smallest ones (bigrams). If the desired number of suggestions has still not been reached, the remaining slots are filled with default words selected by frequency.
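
The n-gram hierarchy described above can be illustrated with a minimal sketch. It is not the repository's actual code: the function names `build_ngram_counts` and `suggest`, the parameter `k`, and the toy token list are assumptions made for illustration, and raw counts stand in for probabilities (within a single context the ranking is the same). With NLTK installed and the corpus downloaded, the tokens could instead come from `nltk.corpus.brown.words()`.

```python
from collections import Counter, defaultdict


def build_ngram_counts(tokens, max_n):
    """Map each context (a tuple of 1..max_n-1 words) to a Counter of next words."""
    counts = defaultdict(Counter)
    for n in range(2, max_n + 1):
        for i in range(len(tokens) - n + 1):
            context = tuple(tokens[i:i + n - 1])
            counts[context][tokens[i + n - 1]] += 1
    return counts


def suggest(history, counts, defaults, max_n, k=3):
    """Return up to k next-word suggestions, backing off to smaller n-grams."""
    suggestions = []
    # Start with the largest n-grams and step down to bigrams (n = 2).
    for n in range(max_n, 1, -1):
        if len(history) < n - 1:
            continue  # not enough typed words to form this context
        context = tuple(history[-(n - 1):])
        for word, _ in counts.get(context, Counter()).most_common():
            if word not in suggestions:
                suggestions.append(word)
            if len(suggestions) == k:
                return suggestions
    # Still short of k suggestions: fill up with frequent default words.
    for word in defaults:
        if word not in suggestions:
            suggestions.append(word)
        if len(suggestions) == k:
            break
    return suggestions


# Toy corpus used only for illustration (assumption, not the Brown corpus).
tokens = "the cat sat on the mat and the cat ate the fish".split()
counts = build_ngram_counts(tokens, max_n=3)
defaults = [word for word, _ in Counter(tokens).most_common()]

print(suggest(["on", "the"], counts, defaults, max_n=3))  # ['mat', 'cat', 'fish']
```

Here the trigram context "on the" yields only "mat", so the sketch backs off to the bigram context "the" to collect "cat" and "fish". For an unseen context, all suggestions would come from the frequency-based defaults.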

The user is shown suggestions for possible next words and, if none of them fits, can type a word of their own.

To run the program, execute the "main.py" file.

