kevinchen27/PubMed-Text-Mining

PubMed-Text-Mining

Text-mining close to 800,000 PubMed Articles

Extracting titles from PubMed articles and using Latent Dirichlet Allocation (LDA) to classify topics. The aim was to see how machine-learning topic modeling compares with the topics classified manually under Medical Subject Headings (MeSH) terms by the NIH PubMed library. This was done by calculating the frequency and probability of words in the titles of Cardiovascular Case Reports and comparing them with the topics modeled by LDA.
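The word-frequency side of that comparison can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the code used in this repository: the function names (`tokenize`, `word_probabilities`, `top_word_overlap`), the stop-word list, the overlap measure, and the sample titles are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "in", "of", "the", "with", "for", "to", "on"}


def tokenize(title):
    """Lowercase a title and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", title.lower()) if w not in STOP_WORDS]


def word_probabilities(titles):
    """Return (counts, probs): raw word counts and each word's share of all tokens."""
    counts = Counter(w for t in titles for w in tokenize(t))
    total = sum(counts.values())
    probs = {w: c / total for w, c in counts.items()} if total else {}
    return counts, probs


def top_word_overlap(probs, topic_words, k=10):
    """Fraction of a topic's words found among the k most probable title words."""
    # Sort by descending probability, breaking ties alphabetically.
    ranked = sorted(probs.items(), key=lambda kv: (-kv[1], kv[0]))
    top_k = {w for w, _ in ranked[:k]}
    topic = set(topic_words)
    return len(top_k & topic) / len(topic) if topic else 0.0


# Made-up titles for illustration only; not taken from the PubMed data.
titles = [
    "Acute myocardial infarction in a young patient",
    "Myocardial infarction with normal coronary arteries",
    "Coronary artery dissection: a case report",
]
counts, probs = word_probabilities(titles)
```

In this sketch, `top_word_overlap(probs, topic_words)` would be called once per LDA topic, passing that topic's highest-weighted words, to get a rough score for how closely the topic matches the most frequent words in a MeSH-defined set of titles. The overlap score is only one possible way to make the comparison.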
