Web scraping of the PubMed database and visualization of paper information
PubMed.ipynb add explanation Oct 9, 2017
PubMed_Scraping.py add explanation Oct 9, 2017
README.md Update README.md Oct 27, 2017
pubmed.png add pic Oct 27, 2017
pubmed16.csv add data Oct 9, 2017


PubMed

Introduction:

A web scraping script was created to extract article information from the PubMed database: https://www.ncbi.nlm.nih.gov/pubmed/.

The data is first stored in MongoDB, then extracted for data preprocessing, manipulation and visualization. More information can be found at http://woodenleaves.com/pages/pubmed.html
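The store-then-extract flow can be sketched roughly as follows. This is a minimal illustration, not the code in PubMed_Scraping.py: the field names (`title`, `journal`, `pub_date`), the helper names, and the connection URI, database and collection names are all assumptions made for the example.

```python
from typing import Any, Dict, Iterable, List


def to_document(title: str, journal: str, pub_date: str) -> Dict[str, Any]:
    """Normalize one scraped article into a dict ready for MongoDB.

    The three fields are hypothetical; the real scraper may store others.
    """
    return {
        "title": " ".join(title.split()),  # collapse stray whitespace from HTML
        "journal": journal.strip(),
        "pub_date": pub_date.strip(),
    }


def save_articles(collection, docs: Iterable[Dict[str, Any]]) -> int:
    """Insert documents into a pymongo-style collection; return how many."""
    docs = list(docs)
    if not docs:  # pymongo's insert_many rejects an empty list
        return 0
    result = collection.insert_many(docs)
    return len(result.inserted_ids)


def load_articles(collection) -> List[Dict[str, Any]]:
    """Read every stored article back, leaving out MongoDB's _id field."""
    return list(collection.find({}, {"_id": 0}))


def get_collection(uri: str = "mongodb://localhost:27017",
                   db: str = "pubmed", name: str = "articles"):
    """Open a collection. URI, database and collection names are placeholders."""
    from pymongo import MongoClient  # lazy import: helpers above do not need it
    return MongoClient(uri)[db][name]
```

With a MongoDB server running, `save_articles(get_collection(), docs)` would write the scraped records and `load_articles(get_collection())` would return them as plain dictionaries for the preprocessing step.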

1. PubMed_Scraping.py:

Tools:

Python (Selenium, BeautifulSoup, Requests, Multiprocessing, Pandas, pymongo, re, bokeh, matplotlib)

MongoDB

ECharts.js

2. PubMed.ipynb:

Data preprocessing, statistical analysis and data visualization
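As a small illustration of what a preprocessing and counting step might look like, the sketch below pulls a four-digit year out of a free-text date with `re` and counts articles per year. It is not taken from PubMed.ipynb: the `pub_date` field name and both function names are hypothetical, and the notebook may well use Pandas for this instead.

```python
import re
from collections import Counter

# Matches a standalone four-digit year starting with 19 or 20.
YEAR_RE = re.compile(r"\b(?:19|20)\d{2}\b")


def extract_year(pub_date):
    """Return the first year found in a date string, or None if absent."""
    match = YEAR_RE.search(pub_date or "")
    return int(match.group(0)) if match else None


def articles_per_year(records):
    """Count records by publication year, skipping those without a year.

    Each record is a dict with a hypothetical "pub_date" key.
    """
    years = (extract_year(record.get("pub_date")) for record in records)
    return Counter(year for year in years if year is not None)
```

The resulting `Counter` maps each year to its article count, which is a convenient shape to hand to a plotting library such as matplotlib or bokeh.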

3. Demo

demo