NLP analysis and visualization of scholarly articles (Flask, Heroku, PostgreSQL, NLTK, Markovify)

doguma/Doby

Doby

NLP analysis and visualization of scholarly articles sourced from PubMed.

Introduction

Doby uses unigram, bigram, and trigram frequencies from the content of the searched articles to help users see a topic from a fresh perspective. It also provides a random thesis generator that uses a Markov chain to create a sentence from the collected abstracts of those articles. In addition, trending and queried articles can be exported as CSV files.
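To make the n-gram frequency idea concrete, here is a minimal sketch of counting unigrams, bigrams, and trigrams over a set of abstracts. It is not Doby's actual code: it uses only the Python standard library in place of NLTK and spaCy, and the stop-word list, the function names `tokenize` and `ngram_counts`, and the sample abstracts are all made up for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption); a real pipeline
# would use a fuller list such as the one shipped with NLTK.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def ngram_counts(texts, n):
    """Count n-grams (tuples of n consecutive tokens) across all texts.

    Stop words are removed first, so an n-gram may span a removed word.
    """
    counts = Counter()
    for text in texts:
        tokens = tokenize(text)
        counts.update(zip(*(tokens[i:] for i in range(n))))
    return counts


# Hypothetical sample abstracts, not real PubMed data.
abstracts = [
    "Gene expression in tumor cells is altered.",
    "Altered gene expression drives tumor growth.",
]

top_bigrams = ngram_counts(abstracts, 2).most_common(1)
print(top_bigrams)  # [(('gene', 'expression'), 2)]
```

Calling `ngram_counts(abstracts, 1)` or `ngram_counts(abstracts, 3)` gives the unigram and trigram counts in the same way.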

Flow Chart

As can be seen in the following diagram, Doby uses Flask as its web framework and Heroku for PostgreSQL hosting and deployment.

Packages used in Doby:

  • Selenium and Beautiful Soup are used to access PubMed and pull text from the available articles, which is then stored in the database.

  • NLTK, spaCy, and re are used to clean the text and to remove stop words and unnecessary tokens.

  • Markovify is used to build a Markov chain from the given text and its word tokens, and to generate new sentences based on the number of states and the word limit.

  • The Heroku Chrome and ChromeDriver buildpacks were added so that Selenium can run on Heroku.
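The roles of the "number of states" and the "word limit" in the Markov chain step can be illustrated with a small standard-library sketch. This is a stand-in for what Markovify does, not Doby's implementation or Markovify's internals; the names `build_chain` and `generate` and the sample sentences are hypothetical. In Markovify itself, the corresponding settings are the `state_size` argument of `markovify.Text` and the `max_words` argument of `make_sentence`.

```python
import random
from collections import defaultdict


def build_chain(sentences, state_size=2):
    """Map each state (a tuple of state_size words) to the words seen after it."""
    chain = defaultdict(list)
    starts = []
    for sentence in sentences:
        words = sentence.split()
        if len(words) <= state_size:
            continue  # too short to yield a transition
        starts.append(tuple(words[:state_size]))
        for i in range(len(words) - state_size):
            state = tuple(words[i:i + state_size])
            chain[state].append(words[i + state_size])
    return chain, starts


def generate(chain, starts, state_size=2, max_words=20, rng=random):
    """Walk the chain from a random start state until a dead end or the word limit."""
    if not starts:
        return ""
    words = list(rng.choice(starts))[:max_words]
    while len(words) < max_words:
        followers = chain.get(tuple(words[-state_size:]))
        if not followers:
            break  # no observed continuation for this state
        words.append(rng.choice(followers))
    return " ".join(words)


# Hypothetical sample sentences, not real abstracts.
sentences = [
    "the protein binds the receptor in vitro",
    "the protein binds calcium in vivo",
]
chain, starts = build_chain(sentences, state_size=2)
print(generate(chain, starts, state_size=2, max_words=10))
```

A larger state size keeps the output closer to the source sentences, while a smaller one produces more varied but less coherent text.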

Content

Home Page

The home page includes the search bar, trending articles from PubMed (as of the current date), unigram, bigram, and trigram lists built from their content, and an option to export them as CSV files.

Search Page

The search page includes the queried keywords, an automatic sentence generator based on a Markov chain (with a refresh button), the list of matching articles, unigram, bigram, and trigram lists built from those articles, and an option to export the articles as CSV files.
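As a rough illustration of the CSV export option, the sketch below serializes a list of article records to CSV text with Python's standard `csv` module. The function name `articles_to_csv`, the field names `title`, `journal`, and `url`, and the sample record are assumptions made for the example; the columns Doby actually exports are not specified here.

```python
import csv
import io


def articles_to_csv(articles, fields=("title", "journal", "url")):
    """Serialize article dicts to CSV text, keeping only the listed fields."""
    buffer = io.StringIO(newline="")
    # extrasaction="ignore" drops keys that are not in `fields`;
    # missing keys are written as empty cells.
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    for article in articles:
        writer.writerow(article)
    return buffer.getvalue()


# Hypothetical record, not real PubMed data.
sample = [{"title": "Markers, models", "journal": "Example Journal", "url": "https://example.org/1"}]
csv_text = articles_to_csv(sample)
print(csv_text)
```

In a Flask view, text like this would typically be returned in a response with a CSV content type so that the browser downloads it as a file.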

