TLDWTutorials/PubmedAPI


PubMed Data Extraction Script

This repository contains a Python script to search and extract PubMed records based on specific authors and topics. The extracted data is stored in a Pandas DataFrame and saved to an Excel file.

YouTube video overview of code: https://youtu.be/sGC66q45BX4

Installation

  1. Clone the repository:

    git clone https://github.com/TLDWTutorials/PubMed-Data-Extraction.git
    cd PubMed-Data-Extraction
  2. Install the required dependencies (openpyxl is the engine pandas uses to write .xlsx files):

    pip install pandas biopython openpyxl

Usage

  1. Set the email address to your own. NCBI uses it to contact you about problems with your requests before blocking access to Entrez:

    Entrez.email = 'your.email@example.com'
  2. Customize the list of authors and topics as needed:

    authors = ['Bryan Holland', 'Mehmet Oz', 'Anthony Fauci']
    topics = ['RNA', 'cardiovascular']
  3. Run the script:

    python pubmed_extraction.py
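For readers who want to see what a search-and-fetch step can look like, here is a minimal sketch using Biopython's Entrez module. It is not taken from pubmed_extraction.py: the function names fetch_pubmed_records and chunked, and the retmax and batch_size defaults, are assumptions made for illustration. Nothing is sent to NCBI until fetch_pubmed_records is called.

```python
def chunked(items, size):
    """Yield successive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_pubmed_records(query, email, retmax=100, batch_size=50):
    """Search PubMed for `query` and return the parsed article records.

    Hypothetical helper: the names and defaults are illustrative assumptions.
    """
    from Bio import Entrez  # imported here so chunked() works without Biopython

    Entrez.email = email

    # Step 1: esearch returns the PMIDs that match the query.
    handle = Entrez.esearch(db="pubmed", term=query, retmax=retmax)
    search_result = Entrez.read(handle)
    handle.close()
    pmids = list(search_result["IdList"])

    # Step 2: efetch downloads the full records, one batch of PMIDs at a time.
    articles = []
    for batch in chunked(pmids, batch_size):
        handle = Entrez.efetch(db="pubmed", id=",".join(batch), retmode="xml")
        fetched = Entrez.read(handle)
        handle.close()
        articles.extend(fetched["PubmedArticle"])
    return articles


# Example call (requires network access and Biopython):
# records = fetch_pubmed_records("RNA[Title/Abstract]", "your.email@example.com")
```

Fetching in batches keeps each request small; the batch size of 50 is an arbitrary choice here, not a documented limit.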

Customization

  • Authors: Modify the authors list with the names of authors you want to include in the search.
  • Topics: Modify the topics list with the topics you want to include in the search.
  • Date Range: Adjust the date_range variable to the desired date range for your search.
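One way the three settings above could be combined into a single PubMed query is sketched below. The build_query function, the (start, end) tuple format for date_range, and the choice of the [Author], [Title/Abstract], and [Date - Publication] field tags are assumptions for illustration; the actual script may build its query differently.

```python
def build_query(authors, topics, date_range=None):
    """Combine authors, topics, and an optional date range into one query string.

    Hypothetical helper: date_range is assumed to be a (start, end) tuple of
    "YYYY/MM/DD" strings, which may not match the script's own format.
    """
    parts = []
    if authors:
        # Any of the listed authors may match.
        parts.append("(" + " OR ".join(f'"{name}"[Author]' for name in authors) + ")")
    if topics:
        # Any of the listed topics may appear in the title or abstract.
        parts.append("(" + " OR ".join(f'"{topic}"[Title/Abstract]' for topic in topics) + ")")
    if date_range is not None:
        start, end = date_range
        parts.append(f'("{start}"[Date - Publication] : "{end}"[Date - Publication])')
    # All groups must hold at once, so they are joined with AND.
    return " AND ".join(parts)


authors = ['Bryan Holland', 'Mehmet Oz', 'Anthony Fauci']
topics = ['RNA', 'cardiovascular']
query = build_query(authors, topics, date_range=("2012/03/01", "2022/12/31"))
print(query)
```

Joining names with OR inside each group and the groups with AND returns records by any listed author on any listed topic within the date window.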

Output

The script will create an Excel file named PubMed_results.xlsx containing the following columns:

  • PMID
  • Title
  • Abstract
  • Authors
  • Journal
  • Keywords
  • URL
  • Affiliations
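The sketch below shows one way a parsed PubMed record could be flattened into these eight columns and written to Excel. The helper names record_to_row and save_rows, the separators used to join multiple values, and the URL pattern are assumptions for illustration, and the sample record is made up rather than real PubMed data. The nested keys follow the structure Biopython produces when parsing PubMed XML.

```python
COLUMNS = ["PMID", "Title", "Abstract", "Authors", "Journal",
           "Keywords", "URL", "Affiliations"]


def record_to_row(record):
    """Flatten one parsed PubMed article (nested dicts and lists) into a flat dict."""
    citation = record.get("MedlineCitation", {})
    article = citation.get("Article", {})
    pmid = str(citation.get("PMID", ""))

    # Abstracts may be split into several sections; join them into one string.
    abstract_parts = article.get("Abstract", {}).get("AbstractText", [])

    names, affiliations = [], []
    for author in article.get("AuthorList", []):
        name = f"{author.get('ForeName', '')} {author.get('LastName', '')}".strip()
        if name:
            names.append(name)
        for info in author.get("AffiliationInfo", []):
            affiliation = info.get("Affiliation", "")
            if affiliation and affiliation not in affiliations:
                affiliations.append(affiliation)  # keep each affiliation once

    # KeywordList is a list of keyword lists, so flatten both levels.
    keywords = [str(kw) for group in citation.get("KeywordList", []) for kw in group]

    return {
        "PMID": pmid,
        "Title": str(article.get("ArticleTitle", "")),
        "Abstract": " ".join(str(part) for part in abstract_parts),
        "Authors": ", ".join(names),
        "Journal": str(article.get("Journal", {}).get("Title", "")),
        "Keywords": ", ".join(keywords),
        "URL": f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/" if pmid else "",
        "Affiliations": "; ".join(affiliations),
    }


def save_rows(rows, path="PubMed_results.xlsx"):
    """Write the rows to an Excel file with the columns in a fixed order."""
    import pandas as pd  # to_excel also needs an Excel engine such as openpyxl

    pd.DataFrame(rows, columns=COLUMNS).to_excel(path, index=False)


# Made-up record for illustration only; not real PubMed data.
sample = {
    "MedlineCitation": {
        "PMID": "12345678",
        "Article": {
            "ArticleTitle": "An example title",
            "Abstract": {"AbstractText": ["First part.", "Second part."]},
            "AuthorList": [{
                "ForeName": "Jane",
                "LastName": "Doe",
                "AffiliationInfo": [{"Affiliation": "Example University"}],
            }],
            "Journal": {"Title": "Journal of Examples"},
        },
        "KeywordList": [["RNA", "cardiovascular"]],
    }
}
row = record_to_row(sample)
```

Calling save_rows([row]) would then write a one-row spreadsheet. Missing fields fall back to empty strings, so records without an abstract or keywords still produce a complete row.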

License

This project is licensed under the MIT License.

About

Python script to extract abstracts from PubMed using the Entrez data API.
