PubMed-Text-Mining

Text-mining close to 800,000 PubMed Articles

This project extracts titles from PubMed articles and uses Latent Dirichlet Allocation (LDA) to classify them into topics. The aim was to see how machine-learning topic modeling compares to the topics classified manually under Medical Subject Headings (MeSH) terms by the NIH PubMed library. This was done by calculating the frequency and probability of words in the titles from Cardiovascular Case Reports and comparing them with the topics modeled by LDA.
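The word frequency and probability step can be illustrated with a minimal sketch. It is written in Python with only the standard library, not in the R used by the repository's scripts, and it is not the repository's implementation. The example titles, the stop-word list, and the function names `tokenize` and `word_probabilities` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical example titles; NOT taken from the repository's data.
titles = [
    "Acute myocardial infarction in a young adult: a case report",
    "Takotsubo cardiomyopathy after surgery: a case report",
    "Myocardial infarction with normal coronary arteries",
]

# Small illustrative stop-word list (an assumption, not the repository's list).
STOP_WORDS = {"a", "an", "the", "in", "of", "with", "after", "and"}


def tokenize(title):
    """Lowercase a title, keep alphabetic tokens, and drop stop words."""
    return [w for w in re.findall(r"[a-z]+", title.lower()) if w not in STOP_WORDS]


def word_probabilities(titles):
    """Map each word to (count, count / total number of kept tokens)."""
    counts = Counter(w for t in titles for w in tokenize(t))
    total = sum(counts.values())
    return {w: (n, n / total) for w, n in counts.items()}


stats = word_probabilities(titles)

# Print the five most frequent words with their empirical probabilities.
for word, (n, p) in sorted(stats.items(), key=lambda kv: -kv[1][0])[:5]:
    print(f"{word}: count={n}, p={p:.3f}")
```

The resulting ranking of words can then be set side by side with the highest-probability words of each LDA topic to judge how closely the modeled topics follow the manually assigned category.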

Files in the repository:

Rentrez and PubMed Help Sources
LDA Classifier.png
Most Common Words in Titles.R
Pubmed Articles as dataframes.R
README.md
Text Ming Titles and Abstracts.R
Text Miningg Titles and Abstracts.R
junk.R
recs2.RData

kevinchen27/PubMed-Text-Mining
