A tool to visually browse co-occurrence of MeSH terms in PubMed.
Publications indexed in PubMed have human-curated MeSH (Medical Subject Headings) terms associated with them. We use these terms to build a visual search tool for finding articles in PubMed. The idea is that visually inspecting which terms co-occur is helpful for exploratory PubMed queries.
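To make the notion of co-occurrence concrete, here is a minimal sketch that counts how often two MeSH terms are attached to the same article. It is an illustration only, not the code MeSHgram runs: the function name `cooccurrence_counts` and the toy article data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(articles):
    """Count how often each unordered pair of MeSH terms shares an article.

    `articles` is an iterable of term lists, one list per article
    (a hypothetical in-memory stand-in for the real database).
    """
    counts = Counter()
    for terms in articles:
        # De-duplicate and sort so every pair has a single canonical key.
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return counts


# Toy data, made up for illustration.
articles = [
    ["Humans", "Neoplasms", "Mutation"],
    ["Humans", "Neoplasms"],
    ["Mutation", "Mice"],
]
counts = cooccurrence_counts(articles)
print(counts.most_common(3))
```

Pairs with high counts are the ones a visual browser would naturally draw attention to; how MeSHgram actually ranks or renders them is not specified here.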
We recently launched our website!
To see MeSHgram in action, go to meshgram.org. The site is still under development. Please leave your comments / issues here.
Citation details will be posted soon. For now please cite the repository or the website directly.
The server code was tested with Python 3.5, and the web client was tested in all major browsers except Firefox.
url_gen.py - generates PubMed XML archive URLs that can be fed to wget for download.
pm2mdb.py - parses the downloaded PubMed XML archives and loads them into MongoDB.
server.py - CherryPy-based server that provides JSON endpoints for the web front end.
config.txt - CherryPy config file.
terms.txt - list of all MeSH terms, alphabetically sorted, extracted from the database.
mesh_stopwords.txt - "Stop words" among MeSH terms. We calculated the 100 most frequent MeSH terms across the entire corpus and manually curated some terms out.
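The stop-word step described for mesh_stopwords.txt can be sketched roughly as follows. This assumes that "curated out" means manually removing some terms from the list of most frequent terms; the function name `candidate_stopwords`, the `keep` parameter, and the toy data are hypothetical and not taken from the repository.

```python
from collections import Counter


def candidate_stopwords(articles, top_n=100, keep=()):
    """Return the `top_n` most frequent terms, minus manually excluded ones.

    `keep` holds terms a curator decided should NOT be treated as stop
    words even though they are frequent (an assumed curation step).
    """
    freq = Counter()
    for terms in articles:
        # Count each term once per article (document frequency).
        freq.update(set(terms))
    keep = set(keep)
    return [term for term, _ in freq.most_common(top_n) if term not in keep]


# Toy corpus, made up for illustration.
articles = [
    ["Humans", "Animals", "Neoplasms"],
    ["Humans", "Animals"],
    ["Humans", "Mice"],
]
stopwords = candidate_stopwords(articles, top_n=2, keep={"Animals"})
print(stopwords)
```

In the real pipeline the frequencies would come from the whole corpus in the database rather than an in-memory list.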
External Libraries / Packages
lxml - Python bindings to the C libraries libxml2 and libxslt, for fast XML parsing.
MongoDB - Scalable NoSQL database.
PyMongo - Python driver for MongoDB.
CherryPy - A lightweight HTTP server. Used for REST/JSON in our project.
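As a rough illustration of how these pieces fit together in a loader like pm2mdb.py, the sketch below pulls the PMID and MeSH descriptor names out of a PubMed XML fragment. To stay self-contained it uses the standard library's `xml.etree.ElementTree` rather than lxml (whose `lxml.etree` module offers a largely compatible API). The function name `extract_mesh` and the output field names `pmid` and `mesh` are assumptions, not the repository's actual schema.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written fragment in the PubMed XML layout (not real data).
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>12345</PMID>
      <MeshHeadingList>
        <MeshHeading><DescriptorName>Humans</DescriptorName></MeshHeading>
        <MeshHeading><DescriptorName>Neoplasms</DescriptorName></MeshHeading>
      </MeshHeadingList>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def extract_mesh(xml_text):
    """Return one dict per article with its PMID and MeSH descriptor names."""
    root = ET.fromstring(xml_text)
    docs = []
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID")
        path = "MedlineCitation/MeshHeadingList/MeshHeading/DescriptorName"
        terms = [node.text for node in article.iterfind(path)]
        # Field names are hypothetical; the real schema may differ.
        docs.append({"pmid": pmid, "mesh": terms})
    return docs


docs = extract_mesh(SAMPLE)
print(docs)
```

Dictionaries of this shape could then be passed to a PyMongo collection, for example with `insert_many`, though the actual loading logic in the repository may differ.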