The notebook takes as input a pointer to an S3 bucket containing a set of text files and outputs a set of JSON files. Each JSON file maps prior strings (e.g. 1, 2, 3 or 4 prior words separated by spaces) to arrays of the most common next words in the sequence. The JSON files can be used in next-word prediction applications, as demonstrated in this blog post.
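To make the output format concrete, here is a minimal single-machine sketch of the mapping described above. It is not the notebook's EMR code: the function name `build_next_word_table`, the parameters `n_prior` and `top_k`, the lowercase/whitespace tokenization, and the sample sentences are all assumptions made for illustration.

```python
import json
from collections import Counter, defaultdict


def build_next_word_table(texts, n_prior=2, top_k=3):
    """Map each run of n_prior words to its top_k most frequent next words.

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split, which the real notebook may not use.
    """
    counts = defaultdict(Counter)
    for text in texts:
        words = text.lower().split()
        # Slide a window of n_prior words and count the word that follows it.
        for i in range(len(words) - n_prior):
            prior = " ".join(words[i:i + n_prior])
            counts[prior][words[i + n_prior]] += 1
    # Keep only the most common continuations, most frequent first.
    return {
        prior: [word for word, _ in counter.most_common(top_k)]
        for prior, counter in counts.items()
    }


# Made-up sample input standing in for the text files in the bucket.
texts = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "the cat ran away",
]
table = build_next_word_table(texts, n_prior=2, top_k=2)
print(json.dumps(table["the cat"]))  # ["sat", "ran"]
```

A prediction application would load a table like this from JSON, look up the last few typed words as the key, and offer the array entries as suggestions.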
bricof/word_predictor
About
Next-word predictor development via Amazon ElasticMapReduce (EMR)