A natural language processing app that predicts most likely next words


nlp_next_word

This project produced a predictive text algorithm and a demonstration web interface as part of the Data Science Capstone by Johns Hopkins University on Coursera (view certificate).

Background

Because language is complex, subtle and ever-changing, the most successful predictive text algorithms tend to train models on a large body of text collected "in the wild" rather than rely on alternatives such as grammatical rules (although combining the two approaches has the potential to be even better).

To this end we use a large body of text (corpus) provided by SwiftKey as the training source for our predictive text models. Here we report on the nature of the data and look for insight into effective strategies for building predictive text algorithms.
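To make the corpus-driven approach concrete, here is a minimal sketch of next-word prediction from bigram counts in base R. It is an illustration only, not the algorithm in this repository: the toy `corpus` and the helper names `tokenize`, `build_bigrams` and `predict_next` are assumptions made up for this example.

```r
# Toy corpus standing in for the much larger SwiftKey data (hypothetical sentences).
corpus <- c(
  "the cat sat on the mat",
  "the cat ate the fish",
  "the dog sat on the rug"
)

# Lowercase the text, then split on anything that is not a letter or apostrophe.
tokenize <- function(text) {
  tokens <- unlist(strsplit(tolower(text), "[^a-z']+"))
  tokens[tokens != ""]
}

# Count every adjacent word pair (bigram) within each sentence.
build_bigrams <- function(sentences) {
  pairs <- do.call(rbind, lapply(sentences, function(s) {
    w <- tokenize(s)
    if (length(w) < 2) return(NULL)
    data.frame(first = head(w, -1), second = tail(w, -1),
               stringsAsFactors = FALSE)
  }))
  counts <- aggregate(n ~ first + second, data = cbind(pairs, n = 1), FUN = sum)
  # Most frequent follower first; ties broken alphabetically.
  counts[order(counts$first, -counts$n, counts$second), ]
}

# Return up to k of the most frequent words seen after `word`.
predict_next <- function(bigrams, word, k = 3) {
  hits <- bigrams[bigrams$first == tolower(word), ]
  as.character(head(hits$second, k))
}

bigrams <- build_bigrams(corpus)
predict_next(bigrams, "the")  # "cat" "dog" "fish"
predict_next(bigrams, "sat")  # "on"
```

A word that never appears in the training text returns an empty result here; a practical model would need a fallback for unseen contexts, such as backing off to shorter n-grams or to overall word frequencies.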

Data

The data for this project were kindly provided by SwiftKey (large zip archive).

Code

Analyses were performed using R. Reports were written in R Markdown and rendered to HTML using knitr.

Usage

A brief explanation of this project and how to use the app can be found on RPubs.

The predictive text web app is hosted on shinyapps.io.