
Naive Solution for Course 10 JHU Data Science Specialization


aashay15/DS-Capstone


DS-Capstone

This is a simple frequency-based approach to the Data Science Capstone problem (Johns Hopkins Data Science Specialization on Coursera).

Link to the working final project (hosted on shinyapps.io)

What was the DS Capstone problem?

It was an NLP project where the task was to predict (or guess) the next word based on the word entered by the user.

For example, if the user entered the word "Happy", the system should guess the word most likely to follow it, in this case "Birthday".

What was my approach to solving the problem?

I first visualized the most frequent n-grams (both keeping and ignoring stopwords) and then decided to go with the simplest approach.
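The n-gram counting step can be sketched roughly as follows. This is a minimal Python illustration, not the code used in this project: the `tokenize` and `top_ngrams` helpers, the tiny `STOPWORDS` set, and the sample sentences are all assumptions made for the example.

```python
from collections import Counter
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer).
STOPWORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "on"}

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def top_ngrams(texts, n=2, k=3, drop_stopwords=False):
    """Return the k most frequent n-grams, optionally ignoring stopwords."""
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        if drop_stopwords:
            tokens = [t for t in tokens if t not in STOPWORDS]
        # zip over n shifted copies of the token list yields the n-grams
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts.most_common(k)

# Hypothetical sample data, only for demonstration.
sample = ["the cat sat on the mat", "the cat ate the fish"]
print(top_ngrams(sample, n=2, k=1))                       # keeping stopwords
print(top_ngrams(sample, n=2, k=5, drop_stopwords=True))  # ignoring stopwords
```

Comparing the two outputs shows how much of the top of the frequency table is taken up by stopword pairs, which is the kind of comparison the visualization step was meant to support.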

My approach simply returns the word that most frequently follows the user's input. It is not a reliable method, but it gives fascinating results.

It performed very well with common words and even identified names (for example, input: Barack, output: Obama).
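The idea of returning the most frequent following word can be sketched as below. This is a hedged Python illustration under stated assumptions, not the project's actual implementation: `build_model`, `predict_next`, and the toy corpus are hypothetical names and data introduced only for this example.

```python
from collections import Counter, defaultdict
import re

def build_model(texts):
    """Count, for every word, how often each other word follows it."""
    followers = defaultdict(Counter)
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        for current, nxt in zip(tokens, tokens[1:]):
            followers[current][nxt] += 1
    return followers

def predict_next(model, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = model.get(word.lower())
    if not counts:
        return None
    return counts.most_common(1)[0][0]

# Hypothetical toy corpus, only for demonstration.
corpus = [
    "happy birthday to you",
    "happy birthday dear friend",
    "happy new year",
    "barack obama gave a speech",
]
model = build_model(corpus)
print(predict_next(model, "Happy"))   # birthday
print(predict_next(model, "Barack"))  # obama
print(predict_next(model, "zebra"))   # None
```

A lookup like this only ever sees the single previous word, which helps explain why it does well on common pairs and names but is not reliable in general.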

Is the solution reliable enough?

No. There are far more advanced methods and models, and this approach does not come close to them. I chose it because I was very new to the field and did what was possible for me at the time.