Next-word predictor development via Amazon ElasticMapReduce (EMR)

bricof/word_predictor


The notebook takes as input a pointer to an S3 bucket containing a set of text files and outputs a set of JSON files. Each JSON file maps prior strings (e.g. 1, 2, 3 or 4 prior words separated by spaces) to arrays of the most common occurrences of the next word in the sequence. The JSON files can be used in next-word prediction applications, as demonstrated in this blog post.
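As a rough illustration of the JSON structure described above, the following is a minimal local sketch, not the notebook's EMR code. It assumes each array is ordered from most to least common, and it uses a simple longest-prior-first backoff for lookup. The names `build_table` and `predict_next`, the `max_prior` and `top_k` parameters, and the sample sentences are all hypothetical.

```python
import json
from collections import Counter, defaultdict


def build_table(lines, max_prior=4, top_k=3):
    """Count next words for every prior of 1..max_prior words (in-memory stand-in for the EMR job)."""
    counts = defaultdict(Counter)
    for line in lines:
        words = line.lower().split()
        for i in range(1, len(words)):
            for n in range(1, max_prior + 1):
                if i - n < 0:
                    break
                prior = " ".join(words[i - n:i])
                counts[prior][words[i]] += 1
    # Keep only the top_k most common next words per prior, most common first.
    return {prior: [w for w, _ in c.most_common(top_k)] for prior, c in counts.items()}


def predict_next(table, text, max_prior=4):
    """Try the longest available prior first, then back off to shorter ones."""
    words = text.lower().split()
    for n in range(min(max_prior, len(words)), 0, -1):
        candidates = table.get(" ".join(words[-n:]))
        if candidates:
            return candidates
    return []


# Hypothetical sample text; the real input would be text files in an S3 bucket.
sample_lines = [
    "one of the most common words",
    "one of the most used words",
    "one of the best",
]

# Round-trip through JSON to mimic loading one of the output files.
table = json.loads(json.dumps(build_table(sample_lines)))

print(predict_next(table, "he is one of the"))  # ['most', 'best']
```

The backoff order is a design choice for this sketch: longer priors carry more context, so they are tried first, and the single-word prior acts as the fallback.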
