kumquatexpress/nGramBabbler
Babbler

A simple n-gram babbler that produces probabilistic word choices. It primes itself on a corpus file and takes an integer giving the number of words to base each guess on.

To use, create a Babbler by passing it a corpus file and an integer:

    b = Babbler("test.txt", 2)

Then call generate_next_word repeatedly, passing it the list of words the Babbler has generated so far:

    word_list = []
    while [condition]:
        word_list.append(b.generate_next_word(word_list))
    return " ".join(word_list)

nGram and nGramReader

An nGram object holds a given number of preceding words, mapped to the word that follows them and that word's count in the corpus. nGramReader reads a corpus file and stores the relevant word-progression information in a dictionary, which it then saves as many nGram objects in a database transaction.

To use, create an nGramReader:

    n = nGramReader("test.txt", 2)

where the integer is the number of preceding words to read for each nGram. Then call n.make_db_transactions() to create nGrams from the dictionary and load them into a database (currently sqlite3).
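The idea behind the two pieces above (counting which word follows each run of preceding words, then sampling in proportion to those counts) can be sketched in a few lines of Python. This is an illustrative sketch only, not the repository's code: the function names build_ngram_counts and sample_next_word are hypothetical, the corpus is an inline string rather than a file, and the counts live in an in-memory dictionary instead of sqlite3.

```python
import random
from collections import Counter, defaultdict


def build_ngram_counts(words, n):
    """Map each tuple of n preceding words to a Counter of the words that follow it.

    Hypothetical helper for illustration; assumes n >= 1.
    """
    counts = defaultdict(Counter)
    for i in range(len(words) - n):
        context = tuple(words[i:i + n])
        counts[context][words[i + n]] += 1
    return counts


def sample_next_word(counts, word_list, n, rng=random):
    """Pick a next word with probability proportional to its count
    after the last n words of word_list.

    Returns None when the context is too short or was never seen.
    """
    context = tuple(word_list[-n:])
    followers = counts.get(context)
    if not followers:
        return None
    choices, weights = zip(*followers.items())
    return rng.choices(choices, weights=weights, k=1)[0]


# Usage: prime on a tiny corpus, seed with two words, then babble.
corpus = "the cat sat on the mat and the cat ran".split()
counts = build_ngram_counts(corpus, 2)

word_list = ["the", "cat"]
while len(word_list) < 8:
    next_word = sample_next_word(counts, word_list, 2)
    if next_word is None:  # dead end: this context never appeared in the corpus
        break
    word_list.append(next_word)

print(" ".join(word_list))
```

In this sketch the generated list is seeded with n words so that a context always exists; how the actual Babbler handles an empty or short word_list is not stated in this README.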
About
A simple probabilistic guesser for word corpora