Next-Word Predictor Development via Amazon Elastic MapReduce (EMR)

The notebook (word_predictor_EMR.ipynb) takes as input a pointer to an S3 bucket containing a set of text files, and outputs a set of JSON files. Each JSON file maps prior strings (e.g. 1, 2, 3 or 4 prior words separated by spaces) as keys to arrays of the most common occurrences of the next word in the sequence. The JSON files can be used in next-word prediction applications, as demonstrated in this blog post.
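
To make the output format concrete, here is a minimal in-memory sketch of the kind of mapping described above, plus one way a client application might query it. This is not the notebook's EMR code: the function names `build_predictor` and `predict_next`, the `top_k` cutoff, lowercasing, whitespace tokenization, and the longest-match-first lookup are all assumptions made for illustration.

```python
import json
from collections import Counter, defaultdict


def build_predictor(texts, max_prior=4, top_k=3):
    """Map each prior string (1..max_prior words) to its most common next words.

    Hypothetical helper: tokenization here is a plain lowercase whitespace
    split, which may differ from what the notebook does.
    """
    counts = defaultdict(Counter)
    for text in texts:
        words = text.lower().split()
        for n in range(1, max_prior + 1):
            # Slide a window of n prior words followed by one next word.
            for i in range(len(words) - n):
                prior = " ".join(words[i:i + n])
                counts[prior][words[i + n]] += 1
    # Keep only the top_k next words per prior, most frequent first.
    return {
        prior: [word for word, _ in counter.most_common(top_k)]
        for prior, counter in counts.items()
    }


def predict_next(table, typed, max_prior=4):
    """Look up suggestions, trying the longest available prior first."""
    words = typed.lower().split()
    for n in range(min(max_prior, len(words)), 0, -1):
        key = " ".join(words[-n:])
        if key in table:
            return table[key]
    return []


corpus = [
    "the cat sat on the mat",
    "the cat sat on the sofa",
    "the cat ate",
]
table = build_predictor(corpus)

# The mapping serializes directly to JSON: keys are prior strings,
# values are arrays of candidate next words.
serialized = json.dumps(table)

print(predict_next(table, "the cat"))
print(predict_next(table, "a dog sat on"))
```

Falling back from longer to shorter priors lets a lookup still return suggestions when the full typed context never appeared in the corpus. On a real corpus the counting step is what would be distributed across the cluster; the lookup side stays this simple.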

bricof/word_predictor
